
Manoj Kumar Pendem


HealthMate – A Voice Agent That Thinks and Reasons Before Answering for Medical Awareness and Decision Support

AssemblyAI Voice Agents Challenge: Domain Expert


Submission for AssemblyAI Voice Agents Challenge – July 2025



🚀 What is HealthMate?

HealthMate is more than a voice assistant: it is a health companion designed to reason step by step, as a clinician would, before it answers, so that the medical information it gives is safe and reliable. Built to serve everyone, especially underserved communities, it provides instant, ethical health guidance through voice interaction, bridging gaps in health literacy and accessibility.

  • 🎙️ Voice-First Queries: Ask health questions naturally, no typing required.
  • 🧠 Reasoned Responses: Simulates clinical reasoning for clear, evidence-based answers.
  • 🌍 Global Impact: Targets rural, low-literacy, and non-English-speaking users.
  • 🚨 Ethical Core: Never diagnoses, always escalates emergencies, and refers to professionals.

💡 Imagine a medical mentor in your pocket—available 24/7, powered by AI, and grounded in trust.


🎯 Why HealthMate?

The world is grappling with a health information crisis:

  • 3.6 billion people lack access to basic healthcare.
  • 90% of online health info is misleading or false.
  • 50% of rural areas remain underserved, with language and literacy barriers widening the gap.
  • Patients often delay care or self-medicate, risking lives.

HealthMate’s Mission: To democratize health literacy with voice-first, ethical, and accessible medical guidance, powered by AI that thinks and reasons before responding, ensuring safety and clarity for all.


Demos & Links


⚙️ How HealthMate Works

🧭 System Workflow

HealthMate is designed to reason step by step, as a clinician would, before it answers, with the aim of keeping every response safe, accurate, and helpful. Here’s the flow:

graph TD
  A[User Speaks Query] --> B[LiveKit: Streams Audio]
  B --> C[AssemblyAI: Speech-to-Text]
  C --> D[RAG + LLM Reasoning Engine]
  D --> E[ChromaDB: Vector Database]
  E --> F[Clinical Reasoning Layer]
  F --> G[Safe, Ethical Voice Response]

Step-by-Step Breakdown

| Step | Component | What It Does |
|------|-----------|--------------|
| 1 | LiveKit | Captures and streams real-time voice from the browser |
| 2 | AssemblyAI | Converts speech to accurate text, including medical terms |
| 3 | RAG + LLM | Interprets the user query and retrieves clinical context |
| 4 | ChromaDB | Performs vector search in a curated medical knowledge base |
| 5 | Reasoning | Simulates safe, step-wise clinical thinking |
| 6 | Voice Output | Returns the AI response with red-flag checks and explanations |
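To make step 6 more concrete, here is a minimal sketch of how a red-flag check could sit between the drafted answer and the final reply. It is an illustration under assumptions, not HealthMate's actual code: the pattern list, the `screen_reply` function, and the emergency wording are all hypothetical, and a real list would need clinical review.

```python
import re
from dataclasses import dataclass

# Hypothetical red-flag phrases; a production list would need clinical review.
RED_FLAG_PATTERNS = [
    r"\bchest pain\b",
    r"\b(can't|cannot|can not) breathe\b",
    r"\bdifficulty breathing\b",
    r"\bunconscious\b",
    r"\bsevere bleeding\b",
    r"\bsuicid(e|al)\b",
]

# Hypothetical escalation wording.
EMERGENCY_MESSAGE = (
    "This may be an emergency. Please contact your local emergency "
    "services or go to the nearest hospital right away."
)


@dataclass
class ScreenedReply:
    text: str
    escalated: bool


def screen_reply(user_query: str, draft_answer: str) -> ScreenedReply:
    """Replace the drafted answer with an escalation message when the
    query matches a red-flag pattern; otherwise pass the draft through."""
    lowered = user_query.lower()
    for pattern in RED_FLAG_PATTERNS:
        if re.search(pattern, lowered):
            return ScreenedReply(text=EMERGENCY_MESSAGE, escalated=True)
    return ScreenedReply(text=draft_answer, escalated=False)
```

Running the check on the user's own words, rather than on the model's draft, means an emergency is escalated even if the model's answer happens to miss it.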

Tech Stack

| Tech | Use / Magic |
|------|-------------|
| LiveKit | Realtime voice streaming (WebRTC, ~300ms latency) |
| AssemblyAI | Universal-Streaming ASR tuned for accurate, medical speech input |
| Gemini / GPT | Interprets clinical language and logic, ensures safe RAG-based output |
| ChromaDB | Blazing fast vector search over reliable, curated medical data |
| FastAPI | Python backend that handles core logic, API routing, and security |
| React + Tailwind | Clean, responsive, and user-friendly frontend interface |
| Railway | Effortless cloud deployment and auto-scaling for backend services |
| .env / Vercel | Secures environment variables and config for safe deployment |
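The ChromaDB row describes vector search over a curated knowledge base. As a rough illustration of the idea only, the sketch below ranks a few snippets by cosine similarity to the query. It deliberately does not use ChromaDB's API: the bag-of-words `embed` function is a toy stand-in for a real embedding model, and the `DOCUMENTS` snippets and function names are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical knowledge-base snippets, for illustration only.
DOCUMENTS = {
    "hydration": "Drink fluids regularly to prevent dehydration during fever or diarrhea.",
    "fever": "A fever is a body temperature above normal and often signals infection.",
    "iron": "Iron rich foods include lentils, spinach, and red meat.",
}


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the ids of the k snippets most similar to the query."""
    q = embed(query)
    ranked = sorted(
        DOCUMENTS,
        key=lambda doc_id: cosine(q, embed(DOCUMENTS[doc_id])),
        reverse=True,
    )
    return ranked[:k]
```

In a RAG setup, the retrieved snippets would then be placed in the LLM prompt as context, so the answer is grounded in the curated data rather than in the model's memory alone.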

Core Logic: LiveKit + AssemblyAI + Gemini LLM

Purpose:

Enable real-time voice streaming, detect when the user has finished speaking (end-of-turn detection), convert speech to text via AssemblyAI, and route the transcript to Gemini through the FastAPI backend.

Key Components

# Configure AssemblyAI streaming STT (LiveKit plugin) with end-of-turn detection
import os

import requests
from livekit.plugins import assemblyai

# Variable name assumed; the key is read from the environment, never hard-coded
ASSEMBLYAI_API_KEY = os.environ["ASSEMBLYAI_API_KEY"]

stt = assemblyai.STT(
    api_key=ASSEMBLYAI_API_KEY,
    end_of_turn_confidence_threshold=0.7,        # confidence required to end a turn
    min_end_of_turn_silence_when_confident=160,  # ms of silence required once confident
    max_turn_silence=2400,                       # ms of silence that always ends the turn
)

# Define the LLM function that calls FastAPI's /api/query
def webhook_llm_function(prompt: str) -> str:
    response = requests.post(
        "http://localhost:8000/api/query", json={"query": prompt}, timeout=30
    )
    response.raise_for_status()  # fail loudly on backend errors
    return response.json().get("answer", "No answer received.")

# LiveKit agent setup
# (FunctionLLM, Agent, and AgentSession are used as in the project; their imports are omitted in the source)
llm = FunctionLLM(func=webhook_llm_function)

agent = Agent(
    name="HealthMate Voice Agent",
    session_factory=lambda: AgentSession(
        stt=stt,    # Real-time transcription
        llm=llm,    # Calls backend for Gemini response
        tts=None    # No text-to-speech (yet)
    ),
)
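The three end-of-turn settings passed to the STT interact in a way that is easier to see in code. The sketch below is a simplified model of that decision, written from the parameter descriptions rather than from AssemblyAI's implementation; `TurnConfig` and `should_end_turn` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TurnConfig:
    end_of_turn_confidence_threshold: float = 0.7
    min_end_of_turn_silence_when_confident: int = 160  # milliseconds
    max_turn_silence: int = 2400  # milliseconds


def should_end_turn(
    confidence: float, silence_ms: int, cfg: TurnConfig = TurnConfig()
) -> bool:
    """Simplified model of the end-of-turn decision (an assumption,
    not AssemblyAI's actual code)."""
    # A long enough silence always ends the turn, however unsure the model is.
    if silence_ms >= cfg.max_turn_silence:
        return True
    # Otherwise require both a confident prediction and a short pause.
    return (
        confidence >= cfg.end_of_turn_confidence_threshold
        and silence_ms >= cfg.min_end_of_turn_silence_when_confident
    )
```

Under this model, lowering the confidence threshold or the minimum silence makes the agent respond sooner but risks cutting the speaker off, while raising `max_turn_silence` gives hesitant speakers more time at the cost of slower replies.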

Challenges Faced & How We Solved Them

| Challenge | Description | Solution |
|-----------|-------------|----------|
| Voice Cutoff Timing | User speech was getting cut too early or too late. | Tuned end_of_turn_confidence_threshold and silence timings in AssemblyAI. |
| Audio Sync | Voice stream sometimes lagged between LiveKit and AssemblyAI. | Optimized buffer settings and ensured proper threading in voice stream. |
| Slow LLM Response | Gemini API responses created noticeable lags in conversation. | Implemented loading states on frontend and added response caching. |
| CORS Errors | Frontend couldn’t connect to FastAPI backend due to CORS policy blocks. | Used fastapi.middleware.cors with permissive settings during dev. |
| API Key Leaks | Accidentally committed .env with secrets. | Added .env to .gitignore and rotated all leaked API keys immediately. |
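The response caching mentioned for slow LLM responses could look roughly like the sketch below: a small in-memory cache with a time-to-live, keyed by the normalized query text. This is one possible approach under assumptions, not the project's actual cache; `TTLCache`, its default TTL, and the normalization rule are all hypothetical.

```python
import time
from typing import Callable, Dict, Tuple


class TTLCache:
    """Small in-memory cache keyed by normalized query text."""

    def __init__(
        self,
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[str, Tuple[float, str]] = {}

    @staticmethod
    def _key(query: str) -> str:
        # Collapse case and whitespace so trivially different queries share an entry.
        return " ".join(query.lower().split())

    def get_or_compute(self, query: str, compute: Callable[[str], str]) -> str:
        """Return a fresh cached answer, or call compute and store its result."""
        key = self._key(query)
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        answer = compute(query)
        self._store[key] = (now, answer)
        return answer
```

A call such as `cache.get_or_compute(prompt, webhook_llm_function)` would then skip the backend round trip for repeated questions. A short TTL keeps stale answers from lingering, which matters more for health content than for most caches.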

Screenshots & Demo Flow

| Screen | Preview |
|--------|---------|
| Home / Intro | Home |
| How it Works | Voice Activation |
| Ethical & Safe Guards | Transcript |
| Impact Potential | Reasoning |
| Voice Activation | Emergency Trigger |
| AssemblyAI Transcript & LiveKit | Safe Response |
| Clinical Reasoning & Output | Mobile Screenshot |

👥 Team

👨‍💻 Manoj Kumar Pendem

Solo builder, driven to bridge health gaps through voice-first AI solutions.

Built from scratch with 💪 and ☕ during the AssemblyAI Voice Agents Challenge.


🤝 Let’s Connect!

  • 🛠️ GitHub – Explore the code, file feedback, or contribute ideas
  • 🔗 LinkedIn – Let’s connect professionally
  • 🌍 Collaborations: Open to NGOs, health orgs, and language localization partners

🚀 Conclusion

HealthMate isn’t just another AI project—it’s a step toward making trusted health guidance accessible to every voice, everywhere.

Built with purpose, designed for impact. Let’s reimagine healthcare, one conversation at a time.
