The Problem
Doctors in India spend 2–3 hours daily on clinical documentation. Most EMR systems are glorified data entry forms that add to the burden rather than reduce it.
We built MedScribe — an AI-powered ambient clinical documentation tool that listens to doctor-patient conversations and generates structured SOAP notes, ICD-10 codes, and prescriptions in real time.
The Tech Stack
- Frontend: React JS
- Database: PostgreSQL 18 with a headless CMS layer
- AI Pipeline: Multilingual ASR → Medical NER → LLM structured output
- Deployment: Azure App Service (web) + on-premise GPU nodes (inference)
Key Technical Challenges
1. Multilingual Voice Capture
Indian doctors code-switch between English, Hindi, and regional languages mid-sentence. Off-the-shelf ASR models struggle with this kind of mixed-language speech, so we fine-tuned Whisper on 500+ hours of Indian clinical audio with domain-specific vocabulary.
2. Structured Output from Messy Conversations
A 10-minute consultation produces unstructured dialogue. The LLM pipeline extracts:
- Chief complaint & history of present illness
- Differential diagnosis
- SOAP note sections
- ICD-10 codes
- Prescriptions with dosage
We combine constrained decoding with JSON Schema validation so that the output is always machine-parseable.
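To make the validation half of that concrete, here is a minimal sketch using only the Python standard library. The field names, the `NOTE_SCHEMA` mapping, and the `validate_note` helper are assumptions made up for illustration; they are not MedScribe's actual schema, and a production setup would more likely use a full JSON Schema validator.

```python
import json
from typing import Optional

# Hypothetical, simplified schema: field name -> expected Python type.
# These names are illustrative; the real schema is not published in this post.
NOTE_SCHEMA = {
    "chief_complaint": str,
    "differential_diagnosis": list,
    "soap": dict,
    "icd10_codes": list,
    "prescriptions": list,
}
SOAP_KEYS = {"subjective", "objective", "assessment", "plan"}


def validate_note(raw: str) -> tuple[Optional[dict], list[str]]:
    """Parse raw LLM output; return (note, errors). note is None if invalid."""
    try:
        note = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, [f"invalid JSON: {exc.msg}"]
    if not isinstance(note, dict):
        return None, ["top-level value must be an object"]

    errors = []
    # Every top-level field must be present with the expected type.
    for field, expected in NOTE_SCHEMA.items():
        if field not in note:
            errors.append(f"missing field: {field}")
        elif not isinstance(note[field], expected):
            errors.append(f"wrong type for {field}")

    # The SOAP object must contain all four sections.
    soap = note.get("soap")
    if isinstance(soap, dict):
        missing = SOAP_KEYS - soap.keys()
        if missing:
            errors.append(f"soap missing: {sorted(missing)}")

    # Each prescription must name a drug and a dosage.
    prescriptions = note.get("prescriptions")
    if isinstance(prescriptions, list):
        for i, rx in enumerate(prescriptions):
            if not isinstance(rx, dict) or not {"drug", "dosage"} <= rx.keys():
                errors.append(f"prescription {i} needs drug and dosage")

    return (None, errors) if errors else (note, [])
```

A caller would typically retry generation or flag the note for manual review when `errors` is non-empty, rather than passing a partial note downstream.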
3. Privacy-First Architecture
Healthcare data can't leave the clinic network. Our inference runs on-premise with zero data retention — no patient audio or text hits external servers. The EMR syncs only anonymized metadata for analytics.
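One way to enforce "only anonymized metadata leaves the clinic" is an explicit allow-list at the sync boundary. The sketch below is an assumption about how that could look, not our production code: `ALLOWED_FIELDS`, `to_sync_payload`, and the record field names are all hypothetical. Note that the keyed hash gives a pseudonymous reference, not true anonymization; whether that is acceptable is a compliance decision.

```python
import hashlib
import hmac

# Hypothetical allow-list: only these non-identifying fields may be synced.
ALLOWED_FIELDS = {"specialty", "duration_sec", "icd10_codes", "note_language"}


def to_sync_payload(record: dict, clinic_secret: bytes) -> dict:
    """Return only the part of a record that may leave the clinic network."""
    # Anything not on the allow-list (names, transcript, audio paths) is dropped.
    payload = {k: v for k, v in record.items() if k in ALLOWED_FIELDS}
    # Keyed hash of the encounter ID: lets analytics count distinct encounters
    # without exposing the real ID. The secret never leaves the clinic.
    digest = hmac.new(
        clinic_secret, str(record["encounter_id"]).encode(), hashlib.sha256
    )
    payload["encounter_ref"] = digest.hexdigest()[:16]
    return payload
```

An allow-list fails closed: a new field added to the internal record stays local until someone deliberately approves it for sync.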
4. ABDM Compliance
India's Ayushman Bharat Digital Mission requires specific health record formats (FHIR bundles), ABHA ID linking, and consent management. We built a middleware layer that translates our internal models to ABDM-compliant payloads.
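As a rough illustration of the translation step, the sketch below maps a hypothetical internal note to a generic FHIR R4 Bundle. The `to_fhir_bundle` function, the note shape, and the ABHA identifier system URI are placeholders made up for this example. A real ABDM payload has to follow ABDM's published FHIR profiles and consent flow, which this sketch does not attempt.

```python
import uuid
from datetime import datetime, timezone

# Placeholder identifier system; the real ABHA system URI comes from ABDM's profiles.
ABHA_SYSTEM = "https://example.org/abha"
# Standard FHIR code system URI for ICD-10.
ICD10_SYSTEM = "http://hl7.org/fhir/sid/icd-10"


def to_fhir_bundle(note: dict, abha_id: str) -> dict:
    """Map a hypothetical internal note to a minimal FHIR R4 Bundle."""
    patient_ref = f"urn:uuid:{uuid.uuid4()}"
    patient = {
        "resourceType": "Patient",
        "identifier": [{"system": ABHA_SYSTEM, "value": abha_id}],
    }
    # One Condition resource per ICD-10 code, each pointing back at the patient.
    conditions = [
        {
            "fullUrl": f"urn:uuid:{uuid.uuid4()}",
            "resource": {
                "resourceType": "Condition",
                "subject": {"reference": patient_ref},
                "code": {"coding": [{"system": ICD10_SYSTEM, "code": code}]},
            },
        }
        for code in note["icd10_codes"]
    ]
    return {
        "resourceType": "Bundle",
        "type": "collection",
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "entry": [{"fullUrl": patient_ref, "resource": patient}, *conditions],
    }
```

Keeping this mapping in a separate middleware layer means the internal models can change without touching the compliance-facing payload format.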
Results
- Documentation time reduced from 15 min to 2 min per patient
- 95%+ accuracy on ICD-10 coding for common specialties
- Works offline — no internet dependency during consultations
What I Learned
- Healthcare AI is a regulatory problem first and an ML problem second. Get compliance right before optimizing accuracy.
- Doctors won't change workflows. Build around how they already work, not how you think they should.
- On-premise isn't dead. When data sensitivity is non-negotiable, cloud-only is a dealbreaker.
We're building this at VivaLyn Labs. If you're working on healthcare AI or building for the Indian market, I'd love to connect.
What's the hardest compliance/privacy challenge you've faced while building AI products?