DEV Community

Mikkel Frimer-Rasmussen

Vegi: Vegetables are not Aliens

This is a submission for the Google AI Studio Multimodal Challenge

What I Built

I built Vegi, a charming and interactive web applet designed to help young, pre-literate children discover the world of vegetables. The app aims to solve the common challenge of getting kids interested in healthy foods by creating a delightful and playful learning experience.

This could turn into a treasure hunt at the supermarket vegetable aisles!

AI generated cartoon dish image

The guide for this experience is Vegi, a friendly vegetable mascot (a “cool rahbi” in her own words). Using their voice or the device's camera, children can interact with Vegi to identify real-world vegetables, learn facts about them, and get ideas for tasty dishes. The app's motto is "Vegetables are not Aliens," which turns unknown foods from intimidating to intriguing. Everything that is not classified as a vegetable might be an alien!

Demo

You can try the live applet here:

https://vegi-967579883759.us-west1.run.app/

Find the hidden easter egg!

Video Demonstration:

A short two-minute video showcases the app's core multimodal features in action. Watching it with the sound on is highly recommended to get the full, interactive experience.

https://drive.google.com/drive/folders/1QC6gmJDK-4dvGi24LZdTv7wSGv0vxxne?usp=sharing

Screenshots:

Main Screen
The main screen, featuring Vegi and the two primary interaction buttons.

Live View
The live camera view capturing a real-world vegetable for recognition.

AI Generated Cartoon Dish
The app displaying a generated image of a delicious dish after a voice query.

Note: Vegi handles more than one vegetable at a time.

Known limitations:

  • The app uses the browser's built-in text-to-speech. On several smartphones, the app had to be loaded and then reloaded before speech worked.
  • The AI guardrails are currently very simple.
  • The app is currently limited to English to keep it simple for the competition; there is no other reason not to use the full language capabilities of Gemini 2.5 Flash.
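
On the text-to-speech limitation: in several browsers the voice list of the Web Speech API loads asynchronously, so `getVoices()` can return an empty array on first load until the `voiceschanged` event fires. Whether that is what causes the reload behavior in Vegi is an assumption, not something verified against the app. The sketch below shows one way to wait for voices before speaking. The names `VoiceInfo`, `VoiceSource`, `pickEnglishVoice`, and `whenVoicesReady` are hypothetical and not taken from the app's code.

```typescript
// Minimal shapes so the sketch also type-checks outside a browser.
// In a page, window.speechSynthesis is what would be passed as the source.
interface VoiceInfo {
  name: string;
  lang: string;
}

interface VoiceSource<V extends VoiceInfo> {
  getVoices(): V[];
  addEventListener(type: "voiceschanged", listener: () => void): void;
  removeEventListener(type: "voiceschanged", listener: () => void): void;
}

// Prefer a voice whose language tag starts with "en"; otherwise fall back
// to the first available voice (undefined if the list is empty).
function pickEnglishVoice<V extends VoiceInfo>(voices: V[]): V | undefined {
  for (const voice of voices) {
    if (voice.lang.toLowerCase().indexOf("en") === 0) return voice;
  }
  return voices[0];
}

// Call onReady as soon as voices are available. If the list is empty, wait
// for "voiceschanged", but give up after timeoutMs (an assumed default) and
// continue with whatever the browser reports at that point.
function whenVoicesReady<V extends VoiceInfo>(
  source: VoiceSource<V>,
  onReady: (voices: V[]) => void,
  timeoutMs: number = 2000
): void {
  const initial = source.getVoices();
  if (initial.length > 0) {
    onReady(initial);
    return;
  }
  let done = false;
  const finish = (): void => {
    if (done) return;
    done = true;
    clearTimeout(timer);
    source.removeEventListener("voiceschanged", finish);
    onReady(source.getVoices());
  };
  const timer = setTimeout(finish, timeoutMs);
  source.addEventListener("voiceschanged", finish);
}
```

In the browser, the callback would create a `SpeechSynthesisUtterance`, set its `voice` to the result of `pickEnglishVoice`, and pass it to `speechSynthesis.speak()`. Some mobile browsers also only allow speech after a user gesture such as a tap, so triggering the first utterance from a button handler may help as well.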

How I Used Google AI Studio

The entire application was prototyped and built in Google AI Studio, which served as the central hub for development. Gemini 2.5 Pro acted as a development strategist and thinking partner for errors that AI Studio could not resolve.

I started by crafting and testing the core multimodal prompts for Vegi's personality and logic directly within the AI Studio environment. Super easy! This allowed for rapid iteration on the AI's responses and behavior.

I then leveraged AI Studio's app-building capabilities to scaffold the entire React frontend and the backend logic. The core intelligence is powered by Gemini 2.5 Flash for its speed and powerful multimodal understanding of both image and audio inputs.

Finally, the application was seamlessly deployed as a containerized applet on Google Cloud Run, directly from the AI Studio workflow.

Multimodal Features

Vegi is built from the ground up on Gemini's multimodal capabilities to create an intuitive experience for children who cannot read or write.

  • Visual Vegetable Recognition (Image-to-Text & Speech): A child can point their device's camera at a real vegetable and snap a picture. Gemini 2.5 Flash then analyzes the image, identifies the vegetable, and generates a simple, fun fact. Vegi then speaks this information aloud. This creates a magical "See & Say" experience that connects the child's physical world with digital learning. Older children might look at the letters below and learn the spelling.
  • Voice-Activated Discovery (Audio-to-Text-to-Image & Speech): A child can simply say the name of a vegetable into the microphone. The app transcribes the speech to text, which is then sent to Gemini to generate a kid-friendly description of its taste and uses. The app then uses this context to generate a cartoon image of a dish featuring that vegetable. This empowers children to lead their discovery with their voice alone, making learning self-directed and highly engaging.
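
The two flows above can be sketched as a handful of small functions. This is a minimal illustration under stated assumptions, not the app's actual code: the request objects follow the `contents` / `parts` / `inlineData` layout used by the Gemini JavaScript SDK, while the prompt wording, the JSON reply format, and every function name are hypothetical. The sketch handles a single vegetable per picture, and the image-generation model is left out because the post does not name it.

```typescript
// Hypothetical reply format that the prompt asks the model to follow.
interface VegiReply {
  isVegetable: boolean;
  name: string;
  funFact: string;
}

interface Part {
  text?: string;
  inlineData?: { mimeType: string; data: string };
}

interface GenerateRequest {
  model: string;
  contents: { role: "user"; parts: Part[] }[];
}

const MODEL = "gemini-2.5-flash";
const REPLY_FORMAT =
  'Reply only with JSON: {"isVegetable": boolean, "name": string, "funFact": string}.';

// "See & Say": a base64-encoded camera snapshot plus a short instruction.
function buildImageRequest(base64Jpeg: string): GenerateRequest {
  return {
    model: MODEL,
    contents: [{
      role: "user",
      parts: [
        { inlineData: { mimeType: "image/jpeg", data: base64Jpeg } },
        { text: "Which vegetable is in this photo? Add one fun fact for a young child. " + REPLY_FORMAT },
      ],
    }],
  };
}

// Voice flow, step 1: ask for a kid-friendly description of the spoken vegetable.
function buildDescriptionRequest(transcript: string): GenerateRequest {
  const prompt =
    'A child said: "' + transcript + '". Describe how this vegetable tastes ' +
    "and how it is used, in two short sentences for a young child.";
  return { model: MODEL, contents: [{ role: "user", parts: [{ text: prompt }] }] };
}

// Voice flow, step 2: turn that description into a prompt for an image model.
function buildDishImagePrompt(vegetable: string, description: string): string {
  return "A friendly cartoon picture of a tasty dish featuring " + vegetable +
    ". Context: " + description;
}

// Parse the model's text reply. Anything that is not a valid vegetable
// answer is treated as an "alien", following the app's motto.
function parseVegiReply(raw: string): VegiReply {
  const alien: VegiReply = { isVegetable: false, name: "alien", funFact: "" };
  const start = raw.indexOf("{");
  const end = raw.lastIndexOf("}");
  if (start < 0 || end < start) return alien;
  try {
    const data = JSON.parse(raw.slice(start, end + 1));
    if (data && data.isVegetable === true &&
        typeof data.name === "string" && typeof data.funFact === "string") {
      return { isVegetable: true, name: data.name, funFact: data.funFact };
    }
    return alien;
  } catch {
    return alien;
  }
}

// The line Vegi would speak aloud through text-to-speech.
function toSpokenLine(reply: VegiReply): string {
  return reply.isVegetable
    ? "Look, " + reply.name + "! " + reply.funFact
    : "Hmm, that might be an alien!";
}
```

Keeping request building and reply parsing as pure functions means the actual network call can be swapped or mocked, and a malformed model reply degrades to the playful "alien" answer instead of an error the child would see.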

Next steps

The next step is a higher-quality graphics version with an animated Vegi mascot, which could make the app more engaging, though not more multimodal.

Vegi in Zen - High Definition image

Developed by:
https://dev.to/mikkel_frimerrasmussen_9
Frimer-Rasmussen Consulting
https://frimer-rasmussen.dk/
