DEV Community

Kinara

Vaani — Making Communication More Inclusive with Generative AI and Indian Sign Language

Introduction

As artificial intelligence continues to evolve, accessibility remains one of the areas where technology can create the most meaningful real-world impact. While modern communication tools have become increasingly advanced, many individuals still face barriers in participating fully in everyday conversations due to hearing or speech-related challenges.

To explore how Generative AI, speech recognition, and computer vision can improve communication accessibility, I built Vaani — an Android-based accessibility application focused on real-time speech transcription and Indian Sign Language (ISL) interaction.

Vaani combines AI-powered transcription, offline transcript management, notification-based accessibility controls, and MediaPipe-powered gesture interaction into a single platform designed to support more inclusive communication experiences.

The Problem

Communication accessibility is still a major challenge in:

  • classrooms,
  • public interactions,
  • meetings,
  • workplaces,
  • and daily conversations.

For individuals with hearing impairments, following spoken communication in real time can often be difficult. At the same time, support for Indian Sign Language remains limited across many mainstream digital platforms.

Existing accessibility tools are often:

  • expensive,
  • difficult to access,
  • dependent on continuous internet connectivity,
  • or lacking localized accessibility features.

I wanted to build a lightweight and practical solution that could use AI technologies to improve accessibility directly on mobile devices.

What is Vaani?

Vaani is an Android application that combines:

  • real-time speech transcription,
  • offline transcript storage,
  • Indian Sign Language interaction,
  • and accessibility-focused controls

to create a more inclusive communication system.

The application is designed to help users interact more effectively using AI-powered speech and gesture technologies while maintaining a simple and accessible mobile experience.

Key Features

🎤 Real-Time Speech Transcription

Vaani converts speech into text captions in real time.

This allows users to follow conversations more easily and improves accessibility during communication.

The application is designed to provide smooth and responsive transcription directly on Android devices.
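To illustrate how live captions can stay readable while recognition is still in progress, here is a minimal sketch of a caption buffer. It assumes the recognizer emits revisable partial results followed by a final result, as Android's `RecognitionListener` does through `onPartialResults` and `onResults`. The class and method names are hypothetical and are not taken from the Vaani repository.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical caption buffer: committed segments plus one revisable partial. */
final class CaptionBuffer {
    private final List<String> finalized = new ArrayList<>();
    private String partial = "";

    /** Replaces the in-progress hypothesis; earlier partials are discarded. */
    void onPartial(String text) {
        partial = (text == null) ? "" : text.trim();
    }

    /** Commits a final result (if non-blank) and clears the pending partial. */
    void onFinal(String text) {
        String t = (text == null) ? "" : text.trim();
        if (!t.isEmpty()) {
            finalized.add(t);
        }
        partial = "";
    }

    /** Text to display: committed segments followed by the current partial. */
    String render() {
        StringBuilder sb = new StringBuilder(String.join(" ", finalized));
        if (!partial.isEmpty()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(partial);
        }
        return sb.toString();
    }
}
```

Keeping the partial separate from the committed text means a revised hypothesis overwrites only the unfinished tail of the caption, so earlier sentences do not flicker.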

💾 Offline Transcript Saving

One of the important features of Vaani is offline transcript management.

Users can:

  • save transcripts,
  • view recent conversations,
  • clear transcript history,
  • and access saved transcriptions locally on the device.

Because transcripts are stored on the device, saved conversations remain available even when internet connectivity is limited or unstable.
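The save, view-recent, and clear operations above can be sketched with a single local file. This is an illustration under stated assumptions, not Vaani's actual storage layer: it stores one transcript per line, and the class name `TranscriptStore` is hypothetical. On Android the file would typically live under the app's private directory (for example, one resolved from `Context.getFilesDir()`).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical on-device transcript store: one transcript per line. */
final class TranscriptStore {
    private final Path file;

    TranscriptStore(Path file) {
        this.file = file;
    }

    /** Appends a transcript; line breaks are flattened to spaces. */
    void save(String transcript) {
        String line = transcript.replaceAll("\\R", " ").trim() + System.lineSeparator();
        try {
            Files.write(file, line.getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns up to {@code limit} transcripts, newest first. */
    List<String> recent(int limit) {
        if (!Files.exists(file)) {
            return new ArrayList<>();
        }
        try {
            List<String> all = Files.readAllLines(file, StandardCharsets.UTF_8);
            int from = Math.max(0, all.size() - Math.max(0, limit));
            List<String> tail = new ArrayList<>(all.subList(from, all.size()));
            Collections.reverse(tail);
            return tail;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Deletes all saved transcripts. */
    void clear() {
        try {
            Files.deleteIfExists(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A flat file keeps the sketch small. A real app with search or per-session metadata would more likely use a local database.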

🔔 Notification-Based Accessibility Controls

Vaani also includes notification-level accessibility controls for easier interaction.

Users can directly:

  • stop transcription,
  • mute transcription,
  • and manage accessibility sessions

through Android notifications, without having to reopen the application.

This improves usability and allows quicker access to important controls during live interactions.
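One way to keep notification controls predictable is to route every button press through a small state machine. The sketch below is plain Java and assumes the notification buttons deliver action strings (on Android these would usually arrive as `Intent` actions). The action names and the `TranscriptionSession` class are hypothetical, not Vaani's actual identifiers.

```java
/** Hypothetical session state driven by notification actions. */
final class TranscriptionSession {
    enum State { ACTIVE, MUTED, STOPPED }

    // Assumed action names; a real app would define its own constants.
    static final String ACTION_STOP = "vaani.action.STOP";
    static final String ACTION_TOGGLE_MUTE = "vaani.action.TOGGLE_MUTE";

    private State state = State.ACTIVE;

    State state() {
        return state;
    }

    /** Applies an action and returns the resulting state. */
    State handle(String action) {
        // A stopped session is terminal; unknown or null actions are ignored.
        if (state == State.STOPPED || action == null) {
            return state;
        }
        switch (action) {
            case ACTION_STOP:
                state = State.STOPPED;
                break;
            case ACTION_TOGGLE_MUTE:
                state = (state == State.MUTED) ? State.ACTIVE : State.MUTED;
                break;
            default:
                break;
        }
        return state;
    }

    /** Captions are shown only while the session is active. */
    boolean shouldShowCaptions() {
        return state == State.ACTIVE;
    }
}
```

Treating `STOPPED` as terminal means a stale notification tapped after the session ends cannot restart transcription by accident.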

🤟 Indian Sign Language (ISL) Interaction

Vaani includes an ISL interaction module powered by MediaPipe-based hand tracking and gesture processing.

Using real-time camera input, the application tracks hand movement to support sign-based interaction and gesture visualization.

The application also includes a Teach Sign feature for interactive sign demonstration and learning support.
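As a rough illustration of what gesture processing on top of hand tracking can look like, the sketch below counts extended fingers from the 21 hand landmarks that MediaPipe's hand model reports (index fingertip at 8, middle at 12, ring at 16, pinky at 20, with the matching PIP joints at 6, 10, 14, and 18). It is a toy heuristic, not ISL recognition and not Vaani's actual pipeline: it assumes an upright hand, normalized image coordinates where y grows downward, and it ignores the thumb. The `FingerCounter` class is hypothetical.

```java
/** Hypothetical heuristic over MediaPipe-style hand landmarks. */
final class FingerCounter {
    static final int LANDMARK_COUNT = 21;

    // Fingertip and PIP-joint indices for index, middle, ring, and pinky.
    private static final int[] TIPS = {8, 12, 16, 20};
    private static final int[] PIPS = {6, 10, 14, 18};

    private FingerCounter() {
    }

    /**
     * Counts non-thumb fingers whose tip is above its PIP joint.
     *
     * @param landmarks 21 entries of {x, y} in normalized image coordinates
     */
    static int countExtended(float[][] landmarks) {
        if (landmarks == null || landmarks.length != LANDMARK_COUNT) {
            throw new IllegalArgumentException("expected 21 landmarks");
        }
        int count = 0;
        for (int i = 0; i < TIPS.length; i++) {
            float tipY = landmarks[TIPS[i]][1];
            float pipY = landmarks[PIPS[i]][1];
            // Smaller y means higher in the image, so the finger is raised.
            if (tipY < pipY) {
                count++;
            }
        }
        return count;
    }
}
```

Recognizing real ISL signs needs far more than finger counts (hand shape, orientation, motion, and often both hands), so a trained classifier over the landmark stream would be the usual next step.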

Technologies Used

Android Development

  • Android Studio
  • Java
  • XML

Generative AI and AI Technologies

  • Speech Recognition APIs
  • Real-Time Transcription
  • Generative AI-assisted interaction workflows

Computer Vision

  • MediaPipe
  • Hand Tracking
  • Gesture Processing

Accessibility Features

  • Live Captions
  • Offline Transcript Storage
  • Notification Controls
  • ISL Interaction Support

How Vaani Works

  1. The application requests microphone and camera permissions.
  2. Speech input is captured in real time.
  3. Spoken audio is converted into live text captions.
  4. Users can save and manage transcripts locally.
  5. ISL mode enables gesture-based interaction using camera input.
  6. Notification controls allow quick accessibility management during active sessions.
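Step 1 determines which of the later steps are possible. A minimal sketch of that gating logic, assuming captions need the microphone permission, ISL mode needs the camera permission, and saved transcripts need neither, might look like the following. The permission strings are the standard Android values. The `FeatureGate` class and `Feature` enum are hypothetical.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical mapping from granted permissions to usable features. */
final class FeatureGate {
    enum Feature { LIVE_CAPTIONS, ISL_MODE, SAVED_TRANSCRIPTS }

    // Values of Manifest.permission.RECORD_AUDIO and Manifest.permission.CAMERA.
    static final String RECORD_AUDIO = "android.permission.RECORD_AUDIO";
    static final String CAMERA = "android.permission.CAMERA";

    private FeatureGate() {
    }

    /** Returns the features that can run with the given granted permissions. */
    static Set<Feature> available(Set<String> granted) {
        // Reading saved transcripts from app-private storage needs no permission.
        EnumSet<Feature> features = EnumSet.of(Feature.SAVED_TRANSCRIPTS);
        if (granted.contains(RECORD_AUDIO)) {
            features.add(Feature.LIVE_CAPTIONS);
        }
        if (granted.contains(CAMERA)) {
            features.add(Feature.ISL_MODE);
        }
        return features;
    }
}
```

Gating each feature separately lets a user who declines camera access still use captions, instead of blocking the whole app on a single denied permission.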

Challenges Faced During Development

Developing Vaani involved multiple technical and design challenges, including:

  • handling continuous speech recognition efficiently,
  • reducing transcription latency,
  • optimizing Android background services,
  • integrating real-time camera processing,
  • and managing gesture interaction smoothly on mobile devices.

Balancing performance, accessibility, and usability while maintaining a lightweight Android experience was one of the most valuable parts of the development process.
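One common way to make per-frame gesture output feel smooth is to vote over a short sliding window so that a single misclassified frame does not change the displayed result. The sketch below shows that technique in plain Java. It is an assumption about how smoothing could be done, not a description of Vaani's code, and the `GestureSmoother` class is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical majority-vote smoother for per-frame gesture labels. */
final class GestureSmoother {
    private final int window;
    private final Deque<String> recent = new ArrayDeque<>();

    GestureSmoother(int window) {
        if (window < 1) {
            throw new IllegalArgumentException("window must be >= 1");
        }
        this.window = window;
    }

    /**
     * Records one frame's label (must be non-null) and returns the label that
     * holds a strict majority of the full window, or null if none does yet.
     */
    String push(String label) {
        recent.addLast(label);
        if (recent.size() > window) {
            recent.removeFirst();
        }
        Map<String, Integer> counts = new HashMap<>();
        String best = null;
        int bestCount = 0;
        for (String l : recent) {
            int c = counts.merge(l, 1, Integer::sum);
            if (c > bestCount) {
                best = l;
                bestCount = c;
            }
        }
        return (bestCount * 2 > window) ? best : null;
    }
}
```

A larger window gives steadier output at the cost of added delay, so the window size is a direct trade-off between stability and latency.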

Why This Project Matters

Accessibility technology should not be limited to specialized systems or expensive platforms.

Artificial intelligence and Generative AI have the potential to make communication more inclusive by helping users interact more naturally and efficiently.

Vaani represents an attempt to use:

  • Generative AI,
  • speech recognition,
  • computer vision,
  • and accessibility-focused design

to make everyday conversations easier for users who face communication barriers.

The project focuses not only on technical implementation, but also on creating practical and meaningful social impact through technology.

Future Improvements

Planned future enhancements for Vaani include:

  • multilingual transcription support,
  • improved ISL gesture recognition,
  • AI-powered conversation summarization,
  • offline gesture processing,
  • enhanced accessibility customization,
  • and broader support for inclusive communication workflows.

Conclusion

Vaani is an accessibility-focused Android application that demonstrates how Generative AI, speech recognition, and computer vision technologies can be combined to improve inclusive communication experiences.

By integrating real-time transcription, offline transcript access, notification-based controls, and Indian Sign Language interaction into a single platform, the project explores practical ways AI can address real-world accessibility challenges.

As AI continues to evolve, projects like Vaani highlight the importance of building technology that is not only intelligent, but also inclusive and accessible for everyone.

Project Links

GitHub Repository

https://github.com/Kinara2020/Vaani

Demo Video

https://youtu.be/0nIeDy_3_RM?si=HCxG2MNuqK-lstTL
