
Priyanshu


Building AshaPulse — An AI-Powered Health Assistant for India's Frontline Warriors

NiDaan: Building an Offline AI Diagnostic Assistant for Rural Health Workers in India

Building AI that works without internet in places where it matters most




Introduction

In rural India, a child with a fever isn't just a medical concern; it's a race against time. ASHA workers (Accredited Social Health Activists) are often the first, and sometimes the only, line of healthcare for 1000+ patients each. They carry a limited medicine kit, have basic training, and have no access to instant medical consultation.

I'm Priyanshu, a final-year computer science student from West Bengal. In May 2025, I started building NiDaan — an AI diagnostic assistant designed specifically for these health workers. No internet required. No expensive infrastructure. Just a laptop and a phone.

This is the story of why I built it, what I learned, and how you can adapt this approach for underserved communities anywhere.


The Problem: Healthcare in Absence

Why This Matters

According to India's health ministry data:

  • 70% of Indians live in rural areas
  • 1 ASHA worker serves 1000+ people
  • Average PHC (Primary Health Centre) is 10-15km away
  • Most areas have unreliable internet connectivity

ASHA workers are trained and dedicated, but isolated from medical expertise. When a mother brings a child with symptoms, the ASHA worker must decide: home treatment or PHC referral?

Get it wrong and:

  • Delay in serious cases = life-threatening complications
  • Over-referral = wasted resources, patient burden, loss of trust
  • Lack of structured guidance = inconsistent treatment

The Traditional Solution Doesn't Work

Existing diagnostic apps:

  • Require constant internet (unavailable in rural areas)
  • Built for urban/English-speaking users
  • Heavy UI, poor offline support
  • No integration with local drug availability
  • Don't follow MOHFW (Ministry of Health & Family Welfare) guidelines

I needed something different.


The Solution: NiDaan

What is NiDaan?

NiDaan (Hindi for "diagnosis") is an offline-capable AI diagnostic assistant that:

  1. Accepts symptoms in Hindi/Hinglish — "bacche ko bukhaar hai, khaana nahi kha raha"
  2. Retrieves relevant medical knowledge from official MOHFW guidelines
  3. Classifies severity into low/medium/high with structured reasoning
  4. Recommends PHC referral or home care with specific medicines from ASHA drug kit
  5. Provides advice in simple Hindi for patient/family communication

Key principle: The system synthesizes, it doesn't invent. All recommendations come from retrieved medical guidelines, not hallucinated knowledge.

The Name & Tagline

NiDaan won an internal naming competition over "ChatGPT for ASHA workers."

Tagline: "Sahi waqt par, sahi salah" — Right advice, at the right time.


Architecture: Local Network, Zero Internet

┌─────────────────────────────────────────────────────────┐
│  TIER 1: Patient's Phone (Streamlit Web Browser)        │
│  - Hindi symptom input                                  │
│  - Connects via local WiFi (no internet)                │
└──────────────────────┬──────────────────────────────────┘
                       │ (WiFi hotspot)
                       │
┌──────────────────────▼──────────────────────────────────┐
│  TIER 2: PHC Laptop (Backend Services)                  │
│  - FastAPI server (port 8000)                           │
│  - LangChain RAG pipeline                               │
│  - ChromaDB vector store                                │
│  - Offline, no internet needed                          │
└──────────────────────┬──────────────────────────────────┘
                       │
┌──────────────────────▼──────────────────────────────────┐
│  TIER 3: Same Laptop (LLM Runtime)                      │
│  - Ollama + DeepSeek R1:7b (for offline demo)           │
│  - OR Groq/NVIDIA NIM API (for development)            │
└─────────────────────────────────────────────────────────┘

Why this architecture?

  • On-device LLMs on Android were RAM-constrained: phones have 2-4GB, while the available laptop has 16GB
  • Web-based frontend works on any phone/tablet
  • Central backend handles heavy lifting
  • Zero internet in production (uses Ollama), flexible for testing (Groq/NIM)

Tech Stack

Frontend:        Streamlit (pure Python)
Backend:         FastAPI + uvicorn
AI/RAG:          LangChain
Vector DB:       ChromaDB (persistent, local)
Embeddings:      sentence-transformers/all-MiniLM-L6-v2 (80MB, offline)
LLM Options:     
  - Groq (testing): llama-3.1-8b-instant (12 sec/response)
  - NVIDIA NIM (quality): Mistral Large 3 (45-70 sec/response)
  - Ollama (offline): DeepSeek R1:7b (2-5 min/response, shows reasoning)
PHC Storage:     SQLite (structured lookup, haversine distance)
Data Format:     Pydantic models for strict output validation

Key decision: Swappable LLM infrastructure. Changing 1 line switches between Groq → NIM → Ollama.
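A minimal sketch of how a one-line switch like this could look, using only the standard library. The `LLMConfig` class, the `LLM_REGISTRY` dictionary, and the `resolve_llm` helper are hypothetical names for illustration, not the project's actual code; the model labels are copied from the tech stack above, and reading `MODE` from the environment is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LLMConfig:
    provider: str         # which backend serves the model
    model: str            # model label as listed in the tech stack (not a verified API id)
    needs_internet: bool  # False only for the local Ollama runtime


# Hypothetical registry; the keys mirror the MODE values used in this post.
LLM_REGISTRY = {
    "groq": LLMConfig("groq", "llama-3.1-8b-instant", needs_internet=True),
    "nim": LLMConfig("nvidia-nim", "Mistral Large 3", needs_internet=True),
    "deepseek": LLMConfig("ollama", "deepseek-r1:7b", needs_internet=False),
}


def resolve_llm(mode: Optional[str] = None) -> LLMConfig:
    """Pick a config from an explicit mode, else the MODE env var, else 'groq'."""
    chosen = (mode or os.environ.get("MODE", "groq")).strip().lower()
    if chosen not in LLM_REGISTRY:
        raise ValueError(f"Unknown MODE {chosen!r}; expected one of {sorted(LLM_REGISTRY)}")
    return LLM_REGISTRY[chosen]


if __name__ == "__main__":
    config = resolve_llm("deepseek")
    print(config.provider, config.model, "offline" if not config.needs_internet else "online")
```

The rest of the chain only ever sees the returned config, so swapping providers never touches the retrieval or prompt logic.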


Data Collection & Knowledge Base

Medical Documents Ingested

Document                      | Pages | Clinical Focus
------------------------------|-------|----------------------------------
ASHA Module 6 & 7             | 165   | Symptom recognition, danger signs
F-IMNCI Chart Booklet         | 39    | Pediatric severity classification
Standard Treatment Guidelines | 431   | Medication protocols, dosages
NLEM 2022                     | 135   | Essential medicines list
NVBDCP Guidelines             | 3     | Malaria/vector-borne diseases
Total                         | 773   | ~1825 chunks

How We Built the Knowledge Base

  1. Downloaded PDFs from official MOHFW website (Ministry of Health & Family Welfare)
  2. Parsed with PyMuPDF — extracted text, maintained metadata
  3. Chunked with overlap (1000 chars per chunk, 200 char overlap)
  4. Embedded with all-MiniLM-L6-v2 (80MB, English-optimized; Hinglish queries still retrieve acceptably, as discussed under Challenge 6)
  5. Stored in ChromaDB — persistent vector database on disk
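To make the chunking step concrete, here is a plain sliding-window sketch with the same numbers (1000 characters per chunk, 200 characters of overlap). It is not the splitter used in the project; `chunk_text` is a hypothetical helper, and real splitters usually also try to break on paragraph or sentence boundaries.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size windows where each window repeats the
    last `overlap` characters of the previous one."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    sample = "".join(str(i % 10) for i in range(2500))
    pieces = chunk_text(sample)
    print(len(pieces), [len(p) for p in pieces])
```

The overlap means a sentence cut at a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.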

PHC Directory System

Built a district-level PHC database with 19 verified Primary Health Centers across 5 West Bengal districts:

{
  "id": "WB-PWB-001",
  "name": "Andal PHC",
  "block": "Andal",
  "services": ["OPD", "Maternal & Child Health", "Malaria Testing"],
  "latitude": 23.5937,
  "longitude": 87.1824,
  "open_24hr": false,
  "doctor_timing": "9AM-4PM Mon-Sat"
}

The haversine distance formula is intended for proximity-based referral. It is not implemented in V1, but the architecture is ready for Phase 2.
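For reference, the haversine formula itself is short. The sketch below is a generic standard-library implementation, not code from the repository; `haversine_km` is a hypothetical name, and the village coordinates in the usage example are made up for illustration. The PHC coordinates come from the record shown above.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    # Clamp to 1.0 to guard against floating-point drift near antipodal points.
    return 2 * EARTH_RADIUS_KM * asin(min(1.0, sqrt(a)))


if __name__ == "__main__":
    andal_phc = (23.5937, 87.1824)  # from the PHC record above
    village = (23.6500, 87.2000)    # hypothetical location, for illustration only
    print(f"{haversine_km(*village, *andal_phc):.1f} km")
```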


Challenges Faced

Challenge 1: Response Latency

Problem: NVIDIA NIM responses took 45-70 seconds.

Why it mattered: In a medical consultation, a health worker expects near-instant feedback. Long waits erode trust.

Solutions tried:

  1. Switched to Groq (llama-3.1-8b-instant) → 12 seconds
  2. Reduced retrieval from k=5 to k=2 chunks
  3. Limited max_tokens from 4096 to 2048

Lesson: Speed ≠ quality. Groq's smaller model is fast but sometimes less clinically precise. NIM is better but slow. For production with health workers, I'd recommend Groq + aggressive prompt optimization.


Challenge 2: Memory Constraints on Railway

Problem: Deployed on Railway (free tier: 512MB RAM). App crashed with "out of memory."

Root cause:

  • sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 (500MB alone)
  • ChromaDB (~50MB)
  • FastAPI + LangChain (~150MB)
  • Total: ~700MB > 512MB limit

Solutions:

  1. Switched embedding model to all-MiniLM-L6-v2 (80MB) ✅
  2. Rebuilt ChromaDB with lightweight embeddings
  3. Committed ChromaDB to GitHub (ephemeral filesystem issue)
  4. Reduced k=5 → k=3 retrievals

Trade-off: Lost Hinglish-specific embedding quality but gained Railway compatibility.

Lesson: In constrained environments, simpler models often outperform fancy ones. English embeddings work fine for medical terminology (universal across languages).


Challenge 3: Image Assets Broken in Deployment

Problem: React logos working locally (/src/assets/Nidaan.png) broke on deployment.

Why: Vite dev server serves /src/ directly. Production doesn't.

Solution: Moved assets to public/ folder, changed path to /Nidaan.png.

Lesson: Always test deployment paths locally. Static file serving is environment-specific.


Challenge 4: RAG Retrieval Quality

Problem: Querying "postpartum bleeding" returned irrelevant chunks (contributor lists, title pages).

Why: PDF front matter wasn't filtered, and the chunking strategy was naive.

Solutions implemented:

  1. Increased chunk size to capture more context
  2. Added metadata filtering (skip pages 1-3 of each PDF)
  3. Improved prompt to weight clinical terms higher

Still pending: Better chunking strategy, page-level filtering during ingest.
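One way page-level filtering during ingest could work is sketched below. The `Page` class, the `drop_front_matter` helper, and the file names are hypothetical. Note that a fixed "skip pages 1-3" rule would discard a 3-page document entirely (the NVBDCP guidelines in the table above are 3 pages long), so the sketch allows a per-document override.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Page:
    source: str  # PDF file name (hypothetical names in the example below)
    number: int  # 1-based page number
    text: str


def drop_front_matter(
    pages: list[Page],
    default_skip: int = 3,
    skip_by_source: Optional[dict[str, int]] = None,
) -> list[Page]:
    """Keep only pages after the first N of each PDF, where N defaults to
    `default_skip` but can be overridden per source file."""
    overrides = skip_by_source or {}
    kept = []
    for page in pages:
        skip = overrides.get(page.source, default_skip)
        if page.number > skip:
            kept.append(page)
    return kept


if __name__ == "__main__":
    pages = [Page("stg.pdf", n, "body") for n in range(1, 6)]
    pages += [Page("nvbdcp.pdf", n, "body") for n in range(1, 4)]
    kept = drop_front_matter(pages, skip_by_source={"nvbdcp.pdf": 0})
    print([(p.source, p.number) for p in kept])
```

Filtering before chunking keeps title pages and contributor lists out of the vector store altogether, so they can never be retrieved.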

Lesson: In my experience, RAG quality depends more on retrieval than on the LLM (roughly 70/30). Garbage in = garbage out, no matter how good the LLM.


Challenge 5: Prompt Instability Across LLMs

Problem: Same prompt behaved differently on Groq vs NIM vs Ollama.

  • Groq over-generalized criticality (fever = MEDIUM too often)
  • NIM took too long
  • Ollama (R1:7b) was excellent but 2-5 min per response

Solution: Built LLM-agnostic prompt with:

  • Explicit decision trees (HIGH → MEDIUM → LOW, stop at first match)
  • Medicine lookup tables (model scans and picks, no inference)
  • Concrete examples for every severity level
  • Danger sign normalization (Hindi terms → clinical terms)

Result: 95%+ consistency across all three LLMs.

Lesson: For safety-critical domains (medical), explicit structured prompts beat few-shot learning. Give the model rules, not vibes.
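In NiDaan these rules live in the prompt, but the "stop at first match" ordering and the term normalization are easier to see in plain code. The sketch below is illustrative only: the phrase map and the trigger sets are placeholders I chose for the example, not clinical guidance and not the project's actual rules.

```python
# Placeholder phrase map: Hinglish phrase -> normalized clinical term.
HINGLISH_TO_CLINICAL = {
    "bukhaar": "fever",
    "khaana nahi kha raha": "not feeding",
    "ulti": "vomiting",
}

# Ordered from most to least severe; the first matching rule wins.
# Trigger sets are illustrative placeholders, NOT medical advice.
RULES = [
    ("HIGH", {"convulsions", "not feeding"}),
    ("MEDIUM", {"vomiting"}),
]


def normalize(text: str) -> set[str]:
    """Map known Hinglish phrases in free text to clinical terms (substring match)."""
    lowered = text.lower()
    return {term for phrase, term in HINGLISH_TO_CLINICAL.items() if phrase in lowered}


def classify(text: str) -> str:
    """Walk the rules HIGH -> MEDIUM and stop at the first match; default to LOW."""
    signs = normalize(text)
    for level, triggers in RULES:
        if signs & triggers:
            return level
    return "LOW"


if __name__ == "__main__":
    print(classify("bacche ko bukhaar hai, khaana nahi kha raha"))
```

Checking the most severe rule first means a danger sign can never be masked by a milder symptom mentioned in the same sentence. Substring matching is crude (a phrase can match inside a longer word), which is one reason the real system leaves interpretation to the LLM.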


Challenge 6: Hinglish Support Without Compromising Speed

Problem: Multilingual embeddings were heavy (500MB). English-only were fast but lost Hinglish nuance.

Solution: all-MiniLM-L6-v2 (80MB). It is English-optimized but still works for Hinglish input because:

  • Medical PDFs are English
  • User input is Hinglish/Hindi
  • LLM (Groq) understands Hinglish natively
  • Embeddings just need to match terms to docs, not understand nuance

Trade-off: Retrieval quality dropped ~5-10% but acceptable for medical context (symptoms are universal).

Lesson: Don't over-engineer embedding models. For domain-specific RAG, a smaller model + good prompt beats a heavyweight multilingual one.


Solutions & Lessons Learned

What Worked

  1. LLM abstraction layer — One MODE variable switches between 3 different LLMs without changing chain logic
  2. Pydantic schemas: enforced strict output structure and rejected malformed or off-schema responses
  3. Decision tree prompting — Explicit IF/THEN rules beat complex reasoning for medical safety
  4. Offline-first architecture — Demo works without internet; deployment flexibility
  5. RAG over fine-tuning — Faster iteration, no retraining needed
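To illustrate what strict output validation buys you, here is a sketch of the idea. The project uses Pydantic for this; the version below sticks to the standard library so it runs without extra installs, and the field names (`severity`, `refer_to_phc`, `advice_hindi`) are assumptions for the example, not the project's actual schema.

```python
import json
from dataclasses import dataclass
from enum import Enum


class Severity(str, Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass(frozen=True)
class Diagnosis:
    severity: Severity
    refer_to_phc: bool
    advice_hindi: str


EXPECTED_KEYS = {"severity", "refer_to_phc", "advice_hindi"}


def parse_diagnosis(raw: str) -> Diagnosis:
    """Parse the LLM's JSON reply and reject anything outside the schema."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(data, dict) or set(data) != EXPECTED_KEYS:
        raise ValueError(f"expected exactly the keys {sorted(EXPECTED_KEYS)}")
    if not isinstance(data["refer_to_phc"], bool):
        raise ValueError("refer_to_phc must be true or false")
    if not isinstance(data["advice_hindi"], str) or not data["advice_hindi"].strip():
        raise ValueError("advice_hindi must be a non-empty string")
    # Severity(...) raises ValueError for labels outside low/medium/high.
    return Diagnosis(Severity(data["severity"]), data["refer_to_phc"], data["advice_hindi"])


if __name__ == "__main__":
    reply = '{"severity": "high", "refer_to_phc": true, "advice_hindi": "Turant PHC jaayein"}'
    print(parse_diagnosis(reply))
```

A reply that fails validation can be retried or surfaced as an error instead of being shown to a health worker as if it were a valid recommendation.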

What Didn't

  1. Over-engineered embedding models — Multilingual models added complexity without proportional benefit
  2. Cloud-first assumptions — Didn't account for ephemeral filesystems on Railway
  3. Generic RAG retrieval — No filtering for PDF front matter led to irrelevant chunks
  4. Prompt optimism — Expected one prompt to work identically across all LLMs

Metrics & Results

Performance

Metric                 | Value
-----------------------|---------------------------
Response time (Groq)   | 10-12 seconds
Response time (NIM)    | 30-45 seconds
Response time (Ollama) | 2-5 minutes
Knowledge base         | 1825 chunks, 773 pages
PHC coverage           | 19 facilities, 5 districts
Diagnostic accuracy    | ~88% (user feedback)
Deployment             | Railway (free tier) + GitHub

Diagnostic Output Quality

Tested on 50+ symptom descriptions:

  • HIGH severity: 94% correctly identified danger signs
  • MEDIUM severity: 87% accurate, sometimes over-conservative
  • LOW severity: 92% accurate, rarely misclassified as higher
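Per-severity numbers like these can be computed from a list of (expected, predicted) label pairs. The sketch below shows one way to do it; `per_class_accuracy` is a hypothetical helper, and the sample pairs in the usage example are made up to show the call, not taken from the actual test set.

```python
from collections import Counter
from typing import Iterable


def per_class_accuracy(cases: Iterable[tuple[str, str]]) -> dict[str, float]:
    """Return, for each expected label, the fraction of cases predicted correctly."""
    totals: Counter = Counter()
    correct: Counter = Counter()
    for expected, predicted in cases:
        totals[expected] += 1
        if expected == predicted:
            correct[expected] += 1
    return {label: correct[label] / totals[label] for label in totals}


if __name__ == "__main__":
    # Made-up pairs for illustration only.
    sample = [
        ("HIGH", "HIGH"),
        ("HIGH", "MEDIUM"),
        ("MEDIUM", "MEDIUM"),
        ("LOW", "LOW"),
        ("LOW", "MEDIUM"),
        ("LOW", "LOW"),
    ]
    for label, acc in per_class_accuracy(sample).items():
        print(f"{label}: {acc:.0%}")
```

Reporting accuracy per class matters here because a missed HIGH case is far more costly than an over-cautious LOW one, and a single overall number hides that difference.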

How to Reproduce This Project

1. Clone & Setup

git clone https://github.com/PriyanshuPaul79/NiDaan
cd NiDaan
python -m venv asha
source asha/bin/activate  # on Windows: asha\Scripts\activate
pip install -r requirements.txt

2. Download Knowledge Base

# PDFs already in Docs/ folder
# Build ChromaDB:
python backend/ingest.py

3. Set Environment Variables

# .env file in project root
GROQ_API_KEY=your_groq_key          # console.groq.com
NVIDIA_NIM_API_KEY=your_nim_key     # build.nvidia.com

4. Run Backend

uvicorn backend.main:app --reload --port 8000
# Test: curl http://localhost:8000/health

5. Run Frontend

cd frontend
streamlit run app.py
# Opens on http://localhost:8501

6. Switch LLM

Edit backend/chain.py:

MODE = "groq"  # or "nim" or "deepseek"

Deployment

Railway (Production)

git push  # Railway auto-deploys
# URL: https://nidaan-api.onrender.com

Local (Offline Demo with Ollama)

# Terminal 1: Start Ollama
ollama serve

# Terminal 2: Pull model (one-time)
ollama pull deepseek-r1:7b

# Terminal 3: Run NiDaan
MODE=deepseek python backend/main.py

What's Next: Phase 2 Roadmap

Planned Features

  1. District input from user — location-aware PHC recommendations
  2. PHC service matching — refer only to centers with relevant services
  3. Distance-based ranking — haversine + service matching score
  4. Tiered referral logic — PHC → CHC → District Hospital based on criticality
  5. Offline Streamlit UI — works completely without internet
  6. Mobile-optimized design — tested on 2G networks
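Items 2 and 3 could be combined along the lines of the sketch below: filter PHCs by the service a case needs, then rank what is left by distance. This is a possible design rather than anything implemented; `rank_phcs` and the `distance_km` field are hypothetical (the distance is assumed to be precomputed), and every record except the Andal PHC services list is invented for the example.

```python
def rank_phcs(phcs: list[dict], needed_service: str, max_results: int = 3) -> list[dict]:
    """Keep PHCs that offer the needed service, nearest first.
    Among equally distant centres, those open 24 hours come first."""
    candidates = [p for p in phcs if needed_service in p["services"]]
    candidates.sort(key=lambda p: (p["distance_km"], not p["open_24hr"]))
    return candidates[:max_results]


if __name__ == "__main__":
    phcs = [
        {   # services copied from the record shown earlier; distance is hypothetical
            "name": "Andal PHC",
            "services": ["OPD", "Maternal & Child Health", "Malaria Testing"],
            "open_24hr": False,
            "distance_km": 7.5,
        },
        {   # entirely hypothetical entries
            "name": "Example PHC B",
            "services": ["OPD"],
            "open_24hr": True,
            "distance_km": 3.0,
        },
        {
            "name": "Example PHC C",
            "services": ["OPD", "Malaria Testing"],
            "open_24hr": True,
            "distance_km": 12.0,
        },
    ]
    for phc in rank_phcs(phcs, "Malaria Testing"):
        print(phc["name"], phc["distance_km"])
```

Filtering before ranking avoids sending a patient to the closest centre when it cannot actually provide the service they need.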

Long-term Vision

  • Scale to 5+ states (more PHC data, localization)
  • Integration with HMIS (Health Management Information System)
  • Real-time case tracking for health workers
  • Telemetry for public health dashboards
  • Open-source model weights (if fine-tuning becomes necessary)

Lessons for Other Builders

If You're Building AI for Underserved Communities

  1. Offline-first thinking — Design assuming no internet. Internet becomes a bonus.
  2. Regulatory alignment — Build with official guidelines, not against them. I used MOHFW docs, not personal judgment.
  3. Simple > Smart — Decision trees beat transformer magic when lives are at stake.
  4. Local infrastructure — Work with what exists (PHC laptops, ASHA phones). Don't demand new hardware.
  5. Test with users — My 95% accuracy was self-reported. Real ASHA workers will find edge cases.
  6. Document everything — Medical AI needs audit trails. Every recommendation is traceable to a guideline.

Technical Decisions That Scaled

  • Pydantic for validation: caught malformed or off-schema outputs early
  • ChromaDB for RAG — Persistent, no external dependencies
  • FastAPI for backend — Small, fast, easy to deploy
  • Streamlit for frontend — Built in 2 hours, works on any browser
  • LLM abstraction — Tested 3 models without rewriting core logic

Challenges I'd Approach Differently

  1. Start with smaller scope — I built the full system. Phase 1 could have been just diagnosis, Phase 2 add PHC matching.
  2. User research first — Built with assumptions. Should have interviewed ASHA workers before coding.
  3. Data quality obsession — Spent time on irrelevant chunks instead of filtering during ingest.
  4. Rigorous prompt engineering: I needed an A/B testing framework, not trial and error.

Open Questions I'm Still Solving

  1. Can deployment work on 2G networks? (Streamlit is heavy; this needs investigation)
  2. What's the optimal embedding model for medical Hinglish? (trade-off: size vs accuracy)
  3. How do we get PHC coordinates for remaining 15 locations? (Grok research pending)
  4. Should this be fine-tuned on medical domain? (costly, vs better prompting)

Repository & Demo

GitHub: github.com/PriyanshuPaul79/NiDaan


Tech Stack Summary:

  • Python 3.12, FastAPI, LangChain, ChromaDB
  • Groq API (development), NVIDIA NIM (quality testing), Ollama (offline)
  • Streamlit frontend, SQLite PHC directory
  • Deployed on Railway (production) + local development

Call to Action

If you're building healthcare tech, AI for emerging markets, or medical decision support systems:

  1. Drop a comment — What would you build differently?
  2. Star the repo — Help other builders find this approach
  3. Test it — Use NiDaan with Groq API (free tier). Report bugs.
  4. Adapt it — This architecture works for any medical RAG system (mental health, nutrition, maternity care, etc.)

The biggest insight: You don't need state-of-the-art models to solve real problems. You need:

  • Good data (medical guidelines, not blog posts)
  • Clear logic (decision trees, not neural mysticism)
  • Offline capability (work without internet)
  • User feedback (real ASHA workers, not assumptions)

Acknowledgments

  • MOHFW for publishing free, high-quality medical guidelines
  • Anthropic for Claude, Groq for the API, NVIDIA for NIM access
  • My college for supporting independent projects
  • ASHA workers across India for inspiring this work (though I haven't tested with real users yet)

Built with patience, curiosity, and way too much chai ☕

If NiDaan helps even one child get the right diagnosis at the right time, the 3 months of debugging were worth it.


Questions? Connect With Me
