
Look & Learn: a Google AI Multimodal Challenge Entry


This is a submission for the Google AI Studio Multimodal Challenge

What I Built

I created Look & Learn, a daily challenge app for language learners. Each day, one new image is generated, and a quiz is created for every combination of language and skill level. You can always go back and play quizzes from earlier days, and if a quiz for that day doesn't exist yet, it will be generated when you first try it. You can also see which days you've already played, so you can get a streak going!

The app will then ask you some questions about the image, in the language you’re trying to learn. For beginners, all the questions are multiple choice. For intermediate and advanced levels, you’ll have to type out your answers.
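Under the hood, each quiz is keyed by day, target language, and level, and is only generated the first time someone plays that combination. Here's a minimal sketch of that get-or-generate flow; `quizStore` and `generateQuiz` are hypothetical stand-ins for the app's actual storage and Gemini-backed generator, not the real code.

```typescript
// Sketch only: quizStore and generateQuiz stand in for the app's real
// persistence layer and Gemini-backed question generator.
type Level = "beginner" | "intermediate" | "advanced";

interface Quiz {
  date: string;     // e.g. "2025-09-01"
  language: string; // target language code, e.g. "nl"
  level: Level;
  questions: unknown[];
}

const quizStore = new Map<string, Quiz>();

async function getOrCreateQuiz(
  date: string,
  language: string,
  level: Level,
  generateQuiz: (date: string, language: string, level: Level) => Promise<Quiz>
): Promise<Quiz> {
  const key = `${date}:${language}:${level}`;
  const existing = quizStore.get(key);
  if (existing) return existing; // someone already played this combination today

  // First play for this day/language/level: generate and cache the quiz.
  const quiz = await generateQuiz(date, language, level);
  quizStore.set(key, quiz);
  return quiz;
}
```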

Demo

Try Look & Learn here

Screenshots:

- The start screen, where the user selects the language they speak, the language they want to learn, and their level of fluency, with a button at the bottom to start the quiz.
- A multiple choice question where the user has picked the wrong answer: the wrong answer is highlighted in red, the correct one in green, and an explanation in the user's native language is shown at the bottom.
- A question from an intermediate level Dutch quiz where the user had to type in an answer, with a message at the bottom indicating that the answer is correct but explaining a verb conjugation issue.

How I Used Google AI Studio

I wanted to see how far I could take Google AI Studio while touching the code by hand as little as possible. While I'm mostly skeptical of vibe coding, this challenge felt like an interesting opportunity to give it a try. So I mostly wrote prompts and gave the model feedback in natural language.

Multimodal Features

At the start of each daily challenge, the application always uses the image for that day, generating it if it doesn’t exist yet. That image is then the basis for all the quizzes created that day.
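Here's a rough sketch of that lazy daily-image step, assuming the @google/genai SDK and an Imagen model. The post doesn't pin down the exact image model or storage layer, so both are placeholders:

```typescript
import { GoogleGenAI } from "@google/genai";

// Assumptions: @google/genai SDK, an Imagen model, and an in-memory cache
// standing in for whatever storage the app actually uses.
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
const dailyImages = new Map<string, string>(); // date -> base64 image bytes

async function getDailyImage(date: string): Promise<string> {
  const cached = dailyImages.get(date);
  if (cached) return cached;

  // Generate today's image once; every quiz that day reuses it.
  const response = await ai.models.generateImages({
    model: "imagen-3.0-generate-002", // assumed model name
    prompt:
      "A lively everyday scene with people, objects and actions that a language learner could describe",
    config: { numberOfImages: 1 },
  });

  const imageBytes = response.generatedImages?.[0]?.image?.imageBytes;
  if (!imageBytes) throw new Error("Image generation returned no image");

  dailyImages.set(date, imageBytes);
  return imageBytes;
}
```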

Gemini 2.5 Flash is used to generate questions about the image, with prompts that include guidelines for the questions as well as the user's fluency level. For multiple choice questions, the correct answer is known up front, so feedback is immediate. For text entry questions, the user's response is sent back to Gemini 2.5 Flash to evaluate correctness and provide grammar and vocabulary feedback.
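For the curious, here's roughly what that question-generation call looks like with the @google/genai SDK. The prompt wording and JSON shape are illustrative, not the app's exact ones; grading typed answers follows the same pattern, with a second generateContent call that receives the question, the learner's answer, and instructions to check correctness and comment on grammar.

```typescript
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

type Level = "beginner" | "intermediate" | "advanced";

async function generateQuestions(imageBase64: string, targetLanguage: string, level: Level) {
  // Beginners get multiple choice; higher levels type their answers.
  const questionStyle =
    level === "beginner"
      ? "multiple choice questions with four options and exactly one correct answer"
      : "open questions that the learner answers by typing a sentence";

  const response = await ai.models.generateContent({
    model: "gemini-2.5-flash",
    contents: [
      {
        role: "user",
        parts: [
          { inlineData: { mimeType: "image/png", data: imageBase64 } },
          {
            text:
              `Write 5 quiz questions in ${targetLanguage} about this image for a ${level} learner. ` +
              `Use ${questionStyle}. Respond with JSON: ` +
              `[{"question": "...", "options": ["..."], "answer": "..."}]`,
          },
        ],
      },
    ],
    config: { responseMimeType: "application/json" },
  });

  return JSON.parse(response.text ?? "[]");
}
```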

Finally, the image is also passed to Gemini 2.5 Flash to generate alt text. This description contains all the information needed to answer the quiz questions, but it's provided in the learner's native language so that they still have to make the effort of translating and connecting the details. I've also made sure that any text in another language is wrapped with the correct lang attribute, so screen readers pronounce it properly.
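To illustrate the markup side, here's a hedged TSX sketch. The component names are made up, "nl" stands in for a Dutch quiz, and the actual AI Studio-generated components likely look different:

```tsx
import React from "react";

// Illustrative React/TSX only: the alt text stays in the learner's native
// language, while target-language text gets its own lang attribute so
// screen readers switch pronunciation.
function QuizImage({ src, altText }: { src: string; altText: string }) {
  // altText is the Gemini-generated description in the learner's native language
  return <img src={src} alt={altText} />;
}

function QuestionPrompt({ text, targetLang }: { text: string; targetLang: string }) {
  // e.g. targetLang = "nl" for a Dutch quiz
  return <p lang={targetLang}>{text}</p>;
}
```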



Top comments (4)

Pravesh Sudha

Great Project, fun way to test language proficiency!

Frederik πŸ‘¨β€πŸ’»βž‘οΈπŸŒ Creemers

Thx!

Glenn Trojan

Looks great

Frederik πŸ‘¨β€πŸ’»βž‘οΈπŸŒ Creemers

Thx!
