This is a submission for the Google AI Studio Multimodal Challenge
What I Built
I built GeoGenius, an AI-powered interactive learning companion designed to make learning geography dynamic, engaging, and fun. The problem it solves is simple: traditional learning from static textbooks can be dry and fails to capture the dynamic nature of our planet. GeoGenius transforms any geography topic—from a simple text prompt or even a photo of a textbook page—into a personalized, multi-faceted learning module in seconds.
The user flow is straightforward:
- A user enters a topic like "Causes of Urbanization" or uploads their study notes.
- They click "Generate Lesson."
- GeoGenius uses AI to create a rich, structured lesson plan that includes:
  - Key Concepts: A clear, concise breakdown of the most important ideas.
  - Simple Analogies: Complex topics are related to everyday life to make them easier to understand.
  - Visual Moments: Each analogy is paired with a unique, AI-generated image to visually reinforce the concept.
  - Interactive Animations: For dynamic processes like the water cycle or tectonic plate movement, a simple, hands-on animation is provided.
  - Test Your Knowledge: A short, multiple-choice quiz helps the user check their understanding and solidify what they've learned.
It's not just a Q&A bot; it's a creative partner that builds a personalized and comprehensive lesson from scratch, every time.
Demo
Link to the applet: https://geogenius-interactive-geography-lessons-598974168521.us-west1.run.app
For a complete walkthrough, here is a short video that shows the entire process from entering a prompt to interacting with the final generated lesson.
How I Used Google AI Studio
Google AI Studio was central to the development of GeoGenius, particularly for prototyping and refining the AI prompts.
My workflow was centered around structured output. I knew I needed a reliable JSON object from the AI to render the different components in my React app.
I used the AI Studio playground to:
- Craft the System Prompt: I iterated on the main instruction ("You are a world-class geography teacher...") in AI Studio to find the perfect persona and set of instructions for the model. This ensured the tone and quality of the content were consistently high.
- Define and Test the JSON Schema: The "Structured Prompt" feature was a game-changer. I designed my entire lessonPlanSchema directly in AI Studio, defining the types, required fields, and even providing descriptions for each property. This allowed me to test prompts and see how the gemini-2.5-flash model would populate the schema, ensuring the output was always predictable and ready for my frontend to consume.
- Prototype Multimodal Inputs: I also used the playground to test how the model would respond when given an image (like a page from a textbook) along with a text prompt, which helped me build the file upload feature with confidence.
After perfecting the prompts and schema in AI Studio, I simply transferred that logic into my application code using the @google/genai SDK. This process saved hours of development time by allowing me to separate prompt engineering from application logic.
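A schema of this kind might look roughly like the sketch below. The post does not show lessonPlanSchema itself, so every field name except visualPrompt is an assumption based on the lesson sections described earlier. The string type names mirror the values of the SDK's Type enum, and responseMimeType and responseSchema are the config fields that request JSON output.

```typescript
// Hypothetical reconstruction of a lesson-plan schema. Field names other than
// visualPrompt are assumptions, not the author's actual schema.
type SchemaType = "OBJECT" | "ARRAY" | "STRING" | "INTEGER";

interface Schema {
  type: SchemaType;
  description?: string;
  properties?: Record<string, Schema>;
  items?: Schema;
  required?: string[];
}

// Small helper so each string field carries a description for the model.
const str = (description: string): Schema => ({ type: "STRING", description });

const lessonPlanSchema: Schema = {
  type: "OBJECT",
  properties: {
    title: str("Short title of the lesson"),
    keyConcepts: { type: "ARRAY", items: str("One key idea, stated concisely") },
    analogies: {
      type: "ARRAY",
      items: {
        type: "OBJECT",
        properties: {
          text: str("Everyday analogy for a concept"),
          visualPrompt: str("Prompt for an illustration of the analogy"),
        },
        required: ["text", "visualPrompt"],
      },
    },
    quiz: {
      type: "ARRAY",
      items: {
        type: "OBJECT",
        properties: {
          question: str("Multiple-choice question"),
          options: { type: "ARRAY", items: str("One answer option") },
          answerIndex: { type: "INTEGER", description: "Index of the correct option" },
        },
        required: ["question", "options", "answerIndex"],
      },
    },
  },
  required: ["title", "keyConcepts", "analogies", "quiz"],
};

// Config object in the shape used to request schema-constrained JSON output.
const generationConfig = {
  systemInstruction: "You are a world-class geography teacher...",
  responseMimeType: "application/json",
  responseSchema: lessonPlanSchema,
};
```

An object like generationConfig would be passed as the config of a generateContent call, alongside the model name and the user's contents.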
Multimodal Features
GeoGenius leverages multimodality to create a richer and more effective learning experience than text-only applications can offer.
Multimodal Input (Image + Text → Lesson): The app’s ability to accept an image (like a .png or .jpg of a map or textbook page) alongside a text prompt is its most powerful multimodal feature. A student can literally take a picture of their homework and ask the app to build a lesson around it. This turns static, offline content into a dynamic and interactive digital experience, meeting the user exactly where they are.
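As a minimal sketch of the image-plus-text input, the request parts could be assembled as below. The function names buildLessonParts and dataUrlToBase64 are hypothetical; the inlineData and text part shapes follow the Gemini API's format for mixed content, with the image supplied as raw base64.

```typescript
// Parts in the shape the Gemini API uses for mixed image + text input.
type Part =
  | { text: string }
  | { inlineData: { mimeType: string; data: string } };

interface UploadedImage {
  mimeType: string;   // e.g. "image/png" or "image/jpeg"
  base64Data: string; // raw base64, without a "data:...;base64," prefix
}

// Strips the prefix from a data URL, such as the string produced by
// FileReader.readAsDataURL in the browser.
function dataUrlToBase64(dataUrl: string): string {
  const comma = dataUrl.indexOf(",");
  return comma === -1 ? dataUrl : dataUrl.slice(comma + 1);
}

// Builds the parts for one request: the optional image first, then the prompt.
function buildLessonParts(topic: string, image?: UploadedImage): Part[] {
  const prompt = topic.trim();
  if (!image && prompt.length === 0) {
    throw new Error("Provide a topic, an image, or both");
  }
  const parts: Part[] = [];
  if (image) {
    if (image.mimeType !== "image/png" && image.mimeType !== "image/jpeg") {
      throw new Error(`Unsupported image type: ${image.mimeType}`);
    }
    parts.push({ inlineData: { mimeType: image.mimeType, data: image.base64Data } });
  }
  // Fall back to a generic instruction when only an image was uploaded.
  parts.push({ text: prompt || "Build a lesson from the attached material." });
  return parts;
}
```

The returned array would form the parts of a single user turn, so the model sees the textbook photo and the topic together.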
Text-to-Structured Data (JSON): At its core, the app transforms an unstructured text request into a highly structured JSON object. This is a crucial modality that allows for a reliable and sophisticated UI. Instead of getting a single block of markdown, the AI provides a predictable data structure that the React frontend can map to distinct, beautifully styled components for concepts, analogies, quizzes, and more.
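Even with a schema-constrained response, it can be worth checking the JSON before handing it to React components. The sketch below shows one way to do that; the LessonPlan shape and the parseLessonPlan helper are hypothetical, with only visualPrompt taken from the post.

```typescript
// Assumed lesson shape; only visualPrompt is named in the post itself.
interface Analogy { text: string; visualPrompt: string }
interface QuizQuestion { question: string; options: string[]; answerIndex: number }
interface LessonPlan { keyConcepts: string[]; analogies: Analogy[]; quiz: QuizQuestion[] }

const isString = (v: unknown): v is string => typeof v === "string";
const isRecord = (v: unknown): v is Record<string, unknown> =>
  typeof v === "object" && v !== null && !Array.isArray(v);

function isAnalogy(v: unknown): v is Analogy {
  if (!isRecord(v)) return false;
  const { text, visualPrompt } = v;
  return isString(text) && isString(visualPrompt);
}

function isQuizQuestion(v: unknown): v is QuizQuestion {
  if (!isRecord(v)) return false;
  const { question, options, answerIndex } = v;
  return (
    isString(question) &&
    Array.isArray(options) &&
    options.every(isString) &&
    typeof answerIndex === "number" &&
    Number.isInteger(answerIndex) &&
    answerIndex >= 0 &&
    answerIndex < options.length // the answer must point at a real option
  );
}

// Parses the model's JSON text and throws if any section is malformed,
// so the UI never renders a half-valid lesson.
function parseLessonPlan(json: string): LessonPlan {
  const data: unknown = JSON.parse(json);
  if (!isRecord(data)) throw new Error("Lesson plan must be a JSON object");
  const { keyConcepts, analogies, quiz } = data;
  if (!Array.isArray(keyConcepts) || !keyConcepts.every(isString)) {
    throw new Error("Invalid keyConcepts");
  }
  if (!Array.isArray(analogies) || !analogies.every(isAnalogy)) {
    throw new Error("Invalid analogies");
  }
  if (!Array.isArray(quiz) || !quiz.every(isQuizQuestion)) {
    throw new Error("Invalid quiz");
  }
  return { keyConcepts, analogies, quiz };
}
```

Each validated array can then be mapped to its own component, for example one card per analogy and one form per quiz question.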
Text-to-Image Generation: To make abstract concepts more tangible, the AI first generates a creative visualPrompt for each analogy in the lesson plan. My application then takes this text prompt and uses the imagen-4.0-generate-001 model to generate a custom illustration. This on-the-fly image generation ensures that every lesson has unique, contextually relevant visuals, which are far more engaging and effective for learning than generic stock photos.
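The hand-off from lesson plan to image model could be sketched as follows. The helper names are hypothetical, and the request shape (model, prompt, config.numberOfImages) is an assumption about what the SDK's generateImages call accepts, so check it against the SDK reference before relying on it.

```typescript
interface Analogy { text: string; visualPrompt: string }

// Assumed request shape for an image-generation call.
interface ImageRequest {
  model: string;
  prompt: string;
  config: { numberOfImages: number };
}

// Keeps the analogy index so each generated image can be matched back
// to the analogy it illustrates.
interface PendingImage {
  analogyIndex: number;
  request: ImageRequest;
}

const IMAGE_MODEL = "imagen-4.0-generate-001";

// Turns each analogy's visualPrompt into one image request,
// skipping analogies whose prompt is empty.
function buildImageRequests(analogies: Analogy[]): PendingImage[] {
  const pending: PendingImage[] = [];
  analogies.forEach((analogy, analogyIndex) => {
    const prompt = analogy.visualPrompt.trim();
    if (prompt.length === 0) return; // nothing to draw for this analogy
    pending.push({
      analogyIndex,
      request: { model: IMAGE_MODEL, prompt, config: { numberOfImages: 1 } },
    });
  });
  return pending;
}

// Wraps base64 image bytes in a data URL usable as an <img> src.
function toImageSrc(base64Bytes: string, mimeType = "image/png"): string {
  return `data:${mimeType};base64,${base64Bytes}`;
}
```

Issuing the requests in parallel and filling in each image as it arrives would let the text of the lesson render before the illustrations are ready.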
 
 
              

 
    