This is a submission for the Google AI Studio Multimodal Challenge
What I Built
I built a small tool that generates alt text for any image you upload. It looks simple on the surface, but it solves a real problem. A lot of people leave alt text empty because they do not know what to write. Others, including me, just throw in something random without thinking about how it affects people using screen readers. This tool takes that burden away. You upload an image, the app analyzes it, and you instantly get accurate alt text.
Demo
How I Used Google AI Studio
I used Google AI Studio to build the application. I dropped in my prompt and it built the app right away. From there I made a few refinements. First, I changed the design because I did not like the initial color choice. The chat suggestions were also very useful: they pointed me toward things like adding image format validation and even creating an app icon, which made the app feel more complete. I then refined the alt text generation logic to follow the Accessibilitychecker.org checklist.
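To give a sense of the format validation step, here is a minimal sketch of what such a check might look like. The accepted MIME types, size cap, and function name are my own assumptions for illustration, not the code AI Studio actually generated:

```ts
// Hypothetical validation helper; the allowed formats and 10 MB cap
// are assumptions, not the app's exact rules.
const SUPPORTED_TYPES = ["image/png", "image/jpeg", "image/webp"];

function validateImageFile(file: File): string | null {
  if (!SUPPORTED_TYPES.includes(file.type)) {
    return `Unsupported format: ${file.type || "unknown"}. Please upload a PNG, JPEG, or WebP image.`;
  }
  // Keep uploads reasonably small before sending them to the model.
  if (file.size > 10 * 1024 * 1024) {
    return "Image is larger than 10 MB. Please upload a smaller file.";
  }
  return null; // null means the file is fine to process
}
```

Rejecting bad uploads up front keeps the model call itself simple, since by that point the image is already known to be something Gemini can handle.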
Multimodal Features
The core multimodal feature is image understanding. The app processes an uploaded image and converts it into text that works as alt text. It shows off how well Gemini understands visuals and how it can describe them in a useful way. The fact that I can take any random image, drop it into the app, and instantly get an accessibility-ready description is the real highlight.
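For readers curious what that image-to-text step looks like in code, here is a minimal sketch assuming the @google/genai TypeScript SDK and a Gemini Flash model. The helper name, model choice, and prompt wording are my own, not the app's exact implementation:

```ts
import { GoogleGenAI } from "@google/genai";

// Hypothetical helper for illustration only.
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

async function generateAltText(base64Image: string, mimeType: string): Promise<string> {
  const response = await ai.models.generateContent({
    model: "gemini-2.5-flash",
    contents: [
      // The image goes in as inline base64 data alongside a text instruction.
      { inlineData: { mimeType, data: base64Image } },
      {
        text:
          "Write concise, descriptive alt text for this image. " +
          "Describe the key subject and context in one sentence, " +
          "without starting with 'Image of' or 'Picture of'.",
      },
    ],
  });
  return response.text ?? "";
}
```

Keeping the prompt focused on a single concise sentence is what makes the output usable directly as an alt attribute instead of a long caption.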