Day 23 Building GoDavaii: Why Language Barriers Aren't Just Translation Problems in Health AI

#healthtech #ai #buildinpublic #startup

In my experience, GPT-4's Hindi output is barely functional for nuanced medical queries. Ask any native speaker. That's a problem, especially when explaining a child's medicine dosage correctly depends on cultural context, not just direct translation. I'm Pururva Agarwal, founder of GoDavaii, and on Day 23 of our public sprint, I'm thinking about the hidden complexities that make health AI for India fundamentally different.

My cousin's newborn had a persistent cough. The local pharmacy provided a cough syrup. What wasn't immediately obvious, and wasn't flagged, was that it was an adult formulation. This wasn't a malicious error, but a system gap - a lack of a universal 'second pair of eyes' that could account for age, context, and language.

It's this blend of individual context and systemic oversight that drives GoDavaii. We're building India's Advanced Health AI, not just a chatbot, but a comprehensive platform with an AI Health Chat in 22+ Indian languages, a robust Drug Interaction Checker, AI-verified Desi Ilaaj, and more. And the core technical challenge isn't just about parsing medical terms; it's about understanding health in a way that respects diverse linguistic and cultural realities.

The "Next Billion" Speaks a Different AI Language

When we talk about the 'next billion' coming online, we're not just talking about more users. We're talking about users who primarily communicate in their mother tongue. English-first AI models, even the frontier ones, often stumble here. Their training data skews heavily towards English, leading to:

Semantic Drift: A phrase like "ang dukhte" in Marathi (literally "the body aches") isn't just "not feeling well"; it carries nuances of vague, diffuse discomfort that a generic English-trained classifier might miss. Our AI Health Chat needs to interpret these specific, often colloquial, symptom descriptions accurately. This requires dedicated language models and extensive, context-rich datasets in each of our 22+ languages.
Medical Terminology Discrepancies: A single ailment can have multiple names across different Indian languages, not to mention the blend of English, Hindi, and local terms within a single sentence (Hinglish, Tanglish, etc.). Our architecture uses a multi-layered NLP approach, combining transformer models fine-tuned on medical texts from India, alongside a carefully curated medical ontology that maps these variations.
Voice-First UX Challenges: Many of our users will interact through voice. Building robust voice-to-text for medical queries in diverse Indian accents and dialects is a monumental task. We're experimenting with open-source speech-to-text models like Whisper, but with heavy post-processing and domain-specific language model adaptation to ensure medical accuracy. This is where a lot of our current research and development effort is focused.
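To make the ontology-mapping idea concrete, here is a minimal Python sketch of how code-mixed symptom words might be normalized to canonical concepts. Everything in it is an assumption for illustration: the `SURFACE_TO_CONCEPT` table, the concept IDs, the `extract_concepts` function, and the 0.85 similarity cutoff are hypothetical, and the fuzzy matching from the standard-library `difflib` module is only a stand-in for the fine-tuned models described above. It is not GoDavaii's actual pipeline.

```python
import difflib
import re

# Hypothetical mini-ontology: romanized surface form -> canonical concept ID.
# Entries and IDs are illustrative placeholders, not clinical data.
SURFACE_TO_CONCEPT = {
    "bukhar": "SYMPTOM_FEVER",
    "bukhaar": "SYMPTOM_FEVER",
    "fever": "SYMPTOM_FEVER",
    "khansi": "SYMPTOM_COUGH",
    "khaansi": "SYMPTOM_COUGH",
    "cough": "SYMPTOM_COUGH",
    "sir dard": "SYMPTOM_HEADACHE",
    "headache": "SYMPTOM_HEADACHE",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def lookup(candidate, cutoff):
    """Return the concept for an exact or close surface-form match, else None."""
    if candidate in SURFACE_TO_CONCEPT:
        return SURFACE_TO_CONCEPT[candidate]
    # Fuzzy fallback to absorb spelling or transcription variants.
    close = difflib.get_close_matches(
        candidate, list(SURFACE_TO_CONCEPT), n=1, cutoff=cutoff
    )
    return SURFACE_TO_CONCEPT[close[0]] if close else None


def extract_concepts(text, cutoff=0.85):
    """Scan left to right, preferring two-word phrases over single words."""
    tokens = tokenize(text)
    found = []
    i = 0
    while i < len(tokens):
        step = 1
        for n in (2, 1):
            if i + n > len(tokens):
                continue
            concept = lookup(" ".join(tokens[i:i + n]), cutoff)
            if concept is not None:
                if concept not in found:
                    found.append(concept)
                step = n  # skip past the words that were consumed
                break
        i += step
    return found


if __name__ == "__main__":
    print(extract_concepts("Mujhe bukhar hai aur sir dard bhi, plus cough"))
```

A flat dictionary like this obviously won't scale to 22+ languages or native scripts, but it shows why the mapping layer sits apart from the language model: the same canonical concept can be reached from Hindi, English, or a misspelled mix of both.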

AI-Verified Desi Ilaaj: Bridging Centuries of Knowledge

This is perhaps our most distinctive and technically challenging moat. Integrating traditional Indian home remedies (Desi Ilaaj) and Ayurvedic practices with modern allopathic medicine isn't just about translation; it's about cross-verification. How do you, as an AI, reconcile the concept of 'cooling' herbs with pharmaceutical drug interactions?

Our approach involves building a sophisticated knowledge graph that links:

Ayurvedic/Traditional Ingredients: Their known properties, potential side effects, and traditional uses.
Allopathic Medicines: Active compounds, mechanisms of action, and established drug interactions.
Scientific Literature: Studies (where available) on traditional remedies, and their chemical interactions.

The 'AI-verification' isn't about validating efficacy (that's for clinical trials) but flagging potential conflicts. Does this Ayurvedic remedy increase the sedative effect of an allopathic drug? Could it interfere with absorption? This demands a reasoning engine capable of complex inference across disparate data sources - a data science nightmare, but a critical need for holistic family health in India.
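Here is a rough Python sketch of the conflict-flagging idea, using a tiny in-memory stand-in for the knowledge graph. The node names (`remedy_a`, `drug_x`, and so on), the property labels, and the two rules are deliberately made-up placeholders, not real pharmacological data, and a production reasoning engine would need far richer inference than this property lookup. Treat it as an illustration of the shape of the problem under those assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A remedy or drug with a set of coarse property labels."""
    name: str
    kind: str  # "remedy" or "drug"
    properties: frozenset


# Hypothetical rules: (remedy property, drug property) -> flag message.
CONFLICT_RULES = {
    ("sedative", "sedative"): "possible additive sedation",
    ("slows_absorption", "oral"): "possible reduced absorption",
}

# Placeholder graph nodes; names and properties are invented for the example.
GRAPH = {
    "remedy_a": Node("remedy_a", "remedy", frozenset({"sedative"})),
    "remedy_b": Node("remedy_b", "remedy", frozenset({"slows_absorption"})),
    "drug_x": Node("drug_x", "drug", frozenset({"sedative", "oral"})),
    "drug_y": Node("drug_y", "drug", frozenset({"topical"})),
}


def flag_conflicts(remedy_name, drug_name):
    """Return sorted flag messages for a remedy/drug pair (empty if none)."""
    remedy = GRAPH[remedy_name]
    drug = GRAPH[drug_name]
    flags = [
        message
        for (remedy_prop, drug_prop), message in CONFLICT_RULES.items()
        if remedy_prop in remedy.properties and drug_prop in drug.properties
    ]
    return sorted(flags)


if __name__ == "__main__":
    for pair in [("remedy_a", "drug_x"), ("remedy_b", "drug_x"), ("remedy_a", "drug_y")]:
        print(pair, flag_conflicts(*pair))
```

Note that the function only ever returns flags, never a verdict on whether the remedy works. That mirrors the split above: efficacy is a question for clinical trials, while the graph's job is to surface pairs worth raising with a doctor.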

Building in Public, Day 23: The Long Game

We're on Day 23 of a 30-day sprint, publicly tracking our progress. Our current user count is 0, as we're focusing intensely on refining the core architecture and ensuring safety and accuracy before a wider launch. This honesty might seem counterintuitive to some, but I know the Dev.to community appreciates transparency.

Placing as a Top 14 Global Finalist at Startup Flight Vietnam 2025 gave us valuable external validation for the vision, but the real work is in the trenches, wrestling with data quality, model biases, and the sheer complexity of language at scale.

We're not just building a product; we're trying to build a new standard for accessible, context-aware health AI. It's a preparation tool for families, designed to help them ask more precise questions of their doctors and navigate their health journey with more information.

Try GoDavaii at godavaii.com. I'm curious what this community thinks about the unique challenges of building AI for truly multilingual, multi-system health contexts.