VIKAS

VisionVoice: Making Signs Speak for the Visually Impaired with Google AI Studio

This is a submission for the Google AI Studio Multimodal Challenge

What I Built

I built VisionVoice: Multilingual Visual Aid for the Visually Impaired, an applet designed to break language and accessibility barriers.

The app helps visually impaired users by detecting emergency and public signs, translating them into multiple languages, and narrating them aloud. This supports safety and independence in real-world scenarios, such as navigating public spaces or understanding critical instructions.

Demo

๐ŸŒ Live App: https://visionvoice-1073180550844.us-west1.run.app/
๐Ÿ”— GitHub Repo: https://github.com/vikasmukhiya1999/VisionVoice---Multilingual-Visual-Aid-for-the-Visually-Impaired
โ–ถ๏ธ Video Demo: https://youtu.be/N95jVdkpWbo

How I Used Google AI Studio

I leveraged Google AI Studio with Gemini 2.5 Flash Image to process multilingual visual inputs (a sketch of the core call follows the list below).

  • The model reads text from uploaded or live camera images.

  • Translates the detected text into the user's preferred language.

  • Converts translated text into audio narration, making it accessible for visually impaired users.
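
To make this concrete, here is a minimal sketch of what the read-and-translate call can look like with the `@google/genai` JavaScript SDK. The model id, prompt wording, and names like `readAndTranslateSign` and `base64Image` are my illustrative assumptions, not code from the VisionVoice repo:

```typescript
import { GoogleGenAI } from "@google/genai";

// Hypothetical sketch: model id, prompt, and helper names are assumptions.
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

async function readAndTranslateSign(
  base64Image: string,   // camera frame, base64-encoded JPEG
  targetLanguage: string // e.g. "Hindi"
): Promise<string> {
  const response = await ai.models.generateContent({
    // Assumed model id; the post names "Gemini 2.5 Flash Image".
    model: "gemini-2.5-flash",
    contents: [
      // The image and the instruction travel in one multimodal request.
      { inlineData: { mimeType: "image/jpeg", data: base64Image } },
      {
        text:
          `Read any sign or public notice in this image and translate it ` +
          `into ${targetLanguage}. Reply with the translation only.`,
      },
    ],
  });
  return response.text ?? "";
}
```

Folding text extraction and translation into one multimodal prompt avoids a separate OCR pass, which matters for latency when the aid is used live.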

Multimodal Features

  • Image-to-Text Extraction: Captures emergency signs, directions, or public notices.

  • Text Translation: Supports multiple languages for global accessibility.

  • Text-to-Speech Narration: Gives voice output so users can understand signs without needing to read them (see the sketch after this list).

  • Mobile-First UI: Simple, modern, and accessible design optimized for quick use.
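
For the narration step, here is a minimal browser-side sketch using the standard Web Speech API; whether VisionVoice actually uses `speechSynthesis` or a server-side TTS service is an assumption on my part:

```typescript
// Minimal narration helper built on the browser's Web Speech API.
// The language code and sample sentence below are illustrative only.
function narrate(text: string, langCode: string): void {
  const utterance = new SpeechSynthesisUtterance(text);
  utterance.lang = langCode;       // BCP-47 tag, e.g. "hi-IN" for Hindi
  utterance.rate = 0.9;            // slightly slower speech for clarity
  window.speechSynthesis.cancel(); // stop any narration still playing
  window.speechSynthesis.speak(utterance);
}

// Example: speak a translated emergency notice aloud in Hindi.
narrate("आपातकालीन निकास बाईं ओर है", "hi-IN");
```

Cancelling before speaking keeps narrations from queueing up when the user scans several signs in quick succession.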

Together, these multimodal features transform how visually impaired individuals interact with their environment, making everyday spaces more accessible and inclusive.
