I built an AI Photography Coach using Google Gemini 3 Pro — it analyzes photos across five dimensions, exposes the AI's reasoning chain, and lets you chat with a mentor that remembers your analysis context. The full project is open-source and the writeup is on Medium.
But the single most transferable lesson from building it wasn't about photography or multimodal AI. It was about how to ask Gemini for structured data without it breaking on you in production.
Here's what I learned.
Quick Project Highlights
📸 Live app: Search "Photography Coach AI" on Google AI Studio
🐙 GitHub: https://github.com/prasadt1/photography-coach-ai-gemini3
📖 Full writeup: https://medium.com/@prasad.sgsits/i-built-an-ai-photography-coach-with-google-gemini-3-pro-heres-everything-i-learned-45411abef25c
The Problem
Early in development I asked Gemini for structured data the way most developers do the first time — in plain English inside the prompt:
"Analyze this photo and return a JSON object with five dimension scores..."
It worked perfectly in testing. It broke constantly in production — markdown fences, preamble text, explanation paragraphs, inconsistent field names. Every variation that JSON.parse() couldn't handle.
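To make those failure modes concrete, here is a minimal sketch. The three sample strings are hypothetical stand-ins for the kinds of responses described above, not output captured from the real app, and `parsesAsJson` is an illustrative helper name.

```javascript
// Three backticks, built at runtime so this snippet stays easy to embed.
const fence = '`'.repeat(3);

// Hypothetical raw responses (illustrative, not captured from the real app).
const samples = {
  clean: '{"score": 7, "feedback": "Good composition"}',
  fenced: `${fence}json\n{"score": 7, "feedback": "Good composition"}\n${fence}`,
  preamble: 'Here is the analysis:\n{"score": 7, "feedback": "Good composition"}',
};

// Returns true when JSON.parse accepts the text, false when it throws.
function parsesAsJson(text) {
  try {
    JSON.parse(text);
    return true;
  } catch {
    return false;
  }
}

const results = Object.fromEntries(
  Object.entries(samples).map(([name, text]) => [name, parsesAsJson(text)])
);
console.log(results); // { clean: true, fenced: false, preamble: false }
```

Only the bare JSON string survives. A fence or a single sentence of preamble is enough to make `JSON.parse()` throw.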
The fix is simple once you know it, but it's not obvious from the docs.
The Wrong Way
// Wrong Way: Unstructured Prompt
const ai = await getGenAIClient();
const prompt = `Analyze this image and tell me if it's good or bad.
Give me some feedback.`;
// No responseMimeType or responseSchema: the model is free to answer in any format
const result = await ai.models.generateContent({
  model: 'gemini-3-pro-preview',
  contents: prompt
});
console.log(result.text);
// Output: "It's okay. The lighting is a bit dark..."
// 💥 Unpredictable format, breaks JSON.parse() in production
The Right Way
// Right Way: Schema-First
import { Type } from '@google/genai';

const ai = await getGenAIClient();
const prompt = 'Analyze this photo and give coaching feedback.'; // example prompt
// The schema is the contract the response must follow
const schema = {
  type: Type.OBJECT,
  properties: {
    score: { type: Type.NUMBER },
    feedback: { type: Type.STRING },
    improvements: { type: Type.ARRAY, items: { type: Type.STRING } }
  },
  required: ['score', 'feedback', 'improvements']
};
const result = await ai.models.generateContent({
  model: 'gemini-3-pro-preview',
  contents: { role: 'user', parts: [{ text: prompt }] },
  config: {
    responseMimeType: 'application/json',
    responseSchema: schema
  }
});
console.log(JSON.parse(result.text));
// Output: { score: 7, feedback: "Good composition...",
//           improvements: ["Increase exposure"] }
Why This Works
responseMimeType: "application/json" tells Gemini at the API level to return pure JSON — no markdown fences, no preamble, no trailing explanation. This alone eliminates most production failures.
responseSchema defines the exact contract. Gemini will not return fields outside it or omit required ones. Your frontend parsing becomes deterministic.
Together they shift the reliability burden from your parsing code to the API itself — which is exactly where it belongs.
The Deeper Lesson
Schema enforcement changes how you design prompts. When you define the schema first, you're forced to think clearly about what you actually need from the model. That clarity produces better prompts, better outputs, and fewer surprises at 2am.
Define your schema before you write your first prompt. Not after.
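One way to put "schema first" into practice is to derive the prompt's field list from the schema, so the two cannot drift apart. The sketch below is an assumption-laden illustration rather than the project's actual code: `buildPrompt` is a hypothetical helper, and the local `Type` object is a stand-in for the SDK's `Type` enum so the snippet runs without any dependency.

```javascript
// Local stand-in for the SDK's Type enum (assumption: used only to keep this self-contained).
const Type = { OBJECT: 'OBJECT', ARRAY: 'ARRAY', STRING: 'STRING', NUMBER: 'NUMBER' };

// Step 1: define the contract.
const schema = {
  type: Type.OBJECT,
  properties: {
    score: { type: Type.NUMBER },
    feedback: { type: Type.STRING },
    improvements: { type: Type.ARRAY, items: { type: Type.STRING } },
  },
  required: ['score', 'feedback', 'improvements'],
};

// Step 2: build the prompt from the schema's own property names (hypothetical helper).
function buildPrompt(task, objectSchema) {
  const fields = Object.keys(objectSchema.properties).join(', ');
  return `${task}\nReturn these fields: ${fields}.`;
}

const prompt = buildPrompt('Analyze this photo as a photography coach.', schema);
console.log(prompt);
// Analyze this photo as a photography coach.
// Return these fields: score, feedback, improvements.
```

Renaming a field in the schema then renames it in the prompt automatically, which avoids a prompt that asks for one name while the schema declares another.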
In Photography Coach AI, this schema-first approach is what made it possible to drive five separate UI tabs — Overview, Detailed Analysis, Mentor Chat, AI Enhancement, Economics — all from a single structured Gemini response. No ambiguity, no defensive parsing, no fallback logic for malformed outputs.
Troubleshooting Tips
Empty responses after adding responseSchema: Check that your schema property names exactly match what you're asking for in the prompt. Mismatches between prompt language and schema field names are the most common cause of silent failures.
Nested objects failing: Define required arrays at every level of nesting, not just the top level.
Numbers returning as strings: Explicitly set type: Type.NUMBER for all numeric fields. Gemini may fall back to a string if the type is ambiguous.
Schema too complex: If your schema has more than ~15 fields, consider splitting it into two sequential API calls rather than one monolithic schema.
Testing tip: Validate your schema against Gemini's output in AI Studio's playground before wiring it into your frontend. Iterate on the schema there, not in code.
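The nested-objects tip can be checked mechanically before a schema ever reaches the API. The following is a rough sketch of such a check: `findObjectsMissingRequired` and `nestedSchema` are hypothetical names for illustration, and the local `Type` object stands in for the SDK's `Type` enum so the snippet has no dependencies.

```javascript
// Local stand-in for the SDK's Type enum (assumption: keeps the sketch dependency-free).
const Type = { OBJECT: 'OBJECT', ARRAY: 'ARRAY', STRING: 'STRING', NUMBER: 'NUMBER' };

// Walks a schema and returns the path of every OBJECT node that has no `required` array.
function findObjectsMissingRequired(schema, path = 'root') {
  const missing = [];
  if (schema.type === Type.OBJECT) {
    if (!Array.isArray(schema.required) || schema.required.length === 0) {
      missing.push(path);
    }
    for (const [name, child] of Object.entries(schema.properties ?? {})) {
      missing.push(...findObjectsMissingRequired(child, `${path}.${name}`));
    }
  } else if (schema.type === Type.ARRAY && schema.items) {
    missing.push(...findObjectsMissingRequired(schema.items, `${path}[]`));
  }
  return missing;
}

// Hypothetical nested schema: `composition` lacks its own `required` array.
const nestedSchema = {
  type: Type.OBJECT,
  properties: {
    score: { type: Type.NUMBER },
    composition: {
      type: Type.OBJECT,
      properties: {
        score: { type: Type.NUMBER },
        notes: { type: Type.STRING },
      },
    },
  },
  required: ['score', 'composition'],
};

const missingPaths = findObjectsMissingRequired(nestedSchema);
console.log(missingPaths); // [ 'root.composition' ]
```

Running a check like this in a unit test flags the nested object that would otherwise allow fields to be omitted.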
CTAs
⭐ Star the repo: https://github.com/prasadt1/photography-coach-ai-gemini3
📖 Full project writeup on Medium: https://medium.com/@prasad.sgsits/i-built-an-ai-photography-coach-with-google-gemini-3-pro-heres-everything-i-learned-45411abef25c
🚀 Try the live app: https://ai.studio/apps/drive/1v2uJziWHPOHRES4EmmWXavydKZAe8ary?fullscreenApplet=true
🐛 Open an issue with questions or schema edge cases you've hit