<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Srimukh Vishnubotla</title>
    <description>The latest articles on DEV Community by Srimukh Vishnubotla (@srimukh_vishnubotla_77c92).</description>
    <link>https://dev.to/srimukh_vishnubotla_77c92</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3900188%2Ff611870b-f573-415b-b854-d7083f372480.png</url>
      <title>DEV Community: Srimukh Vishnubotla</title>
      <link>https://dev.to/srimukh_vishnubotla_77c92</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/srimukh_vishnubotla_77c92"/>
    <language>en</language>
    <item>
      <title>Building MedAI — An AI-Powered Disease Prediction &amp; Clinical Decision Support System</title>
      <dc:creator>Srimukh Vishnubotla</dc:creator>
      <pubDate>Mon, 27 Apr 2026 14:44:58 +0000</pubDate>
      <link>https://dev.to/srimukh_vishnubotla_77c92/building-medai-an-ai-powered-disease-prediction-clinical-decision-support-system-46l3</link>
      <guid>https://dev.to/srimukh_vishnubotla_77c92/building-medai-an-ai-powered-disease-prediction-clinical-decision-support-system-46l3</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fewp3x8rmyxlvip81p353.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fewp3x8rmyxlvip81p353.png" alt="The MedAI dashboard — built entirely on open-source tools, running 100% on-device." width="800" height="438"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;MedAI is a Flask-served Python application backed by MongoDB for clinical data persistence. The system combines RAG-powered disease retrieval across 14,000+ Human Disease Ontology entries, rule-based vital risk scoring, medical imaging analysis, and multi-turn AI chat — all running entirely on-device with no cloud dependency.&lt;/p&gt;




&lt;h2&gt;Team Members&lt;/h2&gt;

&lt;p&gt;This project was developed by:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://dev.to/srimukh_vishnubotla_77c92"&gt;@v_srimukh&lt;/a&gt; — V. Srimukh&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/vamshidhar_reddymalgari_"&gt;@vamshidhar_reddy&lt;/a&gt; — M. Vamshidhar Reddy&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/narendhar_s_dcc253f6bbbdb"&gt;@s_narendhar&lt;/a&gt; — S. Narendhar&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/sai_pavansriramreddyal"&gt;@pavan_sri_ram&lt;/a&gt; — A. Pavan Sri Ram&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We would like to express our sincere gratitude to &lt;a href="https://dev.to/chanda_rajkumar"&gt;@chanda_rajkumar&lt;/a&gt; for their invaluable guidance and support throughout this project. Their insights into system design, architecture, and the RAG pipeline played a key role in shaping MedAI.&lt;/p&gt;




&lt;h2&gt;The Problem We Set Out to Solve&lt;/h2&gt;

&lt;p&gt;Clinical decision support tools are expensive. Epic, Cerner, UpToDate — the software most hospitals use comes with licensing fees that only large institutions can justify.&lt;/p&gt;

&lt;p&gt;When we sat down to plan our PFSD project, we kept coming back to one question: &lt;em&gt;what can four students actually ship in a semester, using only open-source tools?&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Turns out, quite a lot. A doctor or student should be able to enter patient vitals, symptoms, and history and get back a structured, grounded risk assessment — in under a minute, on their own laptop. No internet required. No external API calls. Nothing that breaks the moment a service goes down.&lt;/p&gt;




&lt;h2&gt;Our Solution&lt;/h2&gt;

&lt;p&gt;We built &lt;strong&gt;MedAI&lt;/strong&gt; — a local-first clinical intelligence system. Enter patient vitals, symptoms, and medical history, and MedAI returns a structured risk assessment grounded in over &lt;strong&gt;14,000 standardised disease definitions&lt;/strong&gt; from the Human Disease Ontology, powered by a locally-running LLM via Ollama.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 &lt;strong&gt;The core insight: combining RAG with a structured ontology means the LLM never needs to hallucinate disease definitions — they're all already in &lt;code&gt;HumanDO.obo&lt;/code&gt;.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;🟢 Diseases in knowledge base&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;14,000+&lt;/strong&gt; from Human Disease Ontology&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API endpoints&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;7&lt;/strong&gt; (including 2 streaming SSE)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Service modules&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;10&lt;/strong&gt; focused modules&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Python dependencies&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;5&lt;/strong&gt; packages total&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyicwvwv43warsky6nuq9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyicwvwv43warsky6nuq9.png" alt="The patient intake form — age, vitals, symptoms, and medical history all feed into one structured object" width="800" height="410"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;Tech Stack&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Web layer:&lt;/strong&gt; Flask 3.x + Flask-CORS&lt;br&gt;&lt;br&gt;
&lt;strong&gt;LLM runtime:&lt;/strong&gt; Ollama (local) · &lt;code&gt;llama3.2&lt;/code&gt; (text) · &lt;code&gt;llava / llama3.2-vision&lt;/code&gt; (imaging)&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Knowledge base:&lt;/strong&gt; Human Disease Ontology · &lt;code&gt;HumanDO.obo&lt;/code&gt; — 14,000+ diseases, ICD codes, synonyms&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Database:&lt;/strong&gt; MongoDB · pymongo · 6 collections · TTL-indexed analytics cache&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Frontend:&lt;/strong&gt; Jinja2 templates · Dark/light theme · streaming SSE responses&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Dependencies:&lt;/strong&gt; Flask · Flask-Cors · requests · Pillow · pymongo — that's the entire tree&lt;/p&gt;
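&lt;p&gt;For a sense of how lightweight the knowledge base is: &lt;code&gt;HumanDO.obo&lt;/code&gt; is a plain-text file of &lt;code&gt;[Term]&lt;/code&gt; stanzas, so loading it needs only a small parser. A minimal sketch based on the public OBO flat-file format (the fields kept here are the ones this post mentions; MedAI's actual loader may differ):&lt;/p&gt;

```python
# Minimal parser for OBO [Term] stanzas. Keeps id, name, def, synonyms,
# xrefs (e.g. ICD codes), and is_a parents; everything else is skipped.
def parse_obo_terms(text):
    terms, cur = [], None
    for raw in text.splitlines():
        line = raw.strip()
        if line == "[Term]":
            cur = {"synonyms": [], "parents": [], "xrefs": []}
            terms.append(cur)
        elif cur is not None and ": " in line:
            key, _, val = line.partition(": ")
            if key in ("id", "name", "def"):
                cur[key] = val
            elif key == "synonym":
                cur["synonyms"].append(val.split('"')[1])   # quoted synonym text
            elif key == "xref":
                cur["xrefs"].append(val)                    # e.g. ICD10CM:E11
            elif key == "is_a":
                cur["parents"].append(val.split(" ! ")[0])  # parent DOID
    return terms
```

&lt;p&gt;Each stanza becomes one dict, which is exactly the shape the retrieval step scores against.&lt;/p&gt;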


&lt;h2&gt;Why We Chose MongoDB&lt;/h2&gt;

&lt;p&gt;When we started building MedAI, the biggest challenge wasn't the LLM integration — it was managing clinical data that doesn't fit neatly into rigid schemas. A patient assessment document contains vitals, a list of retrieved disease matches with scores, risk percentages, a full LLM-generated report, and metadata about which Ollama model was used. An imaging record looks completely different. A chat conversation is different again.&lt;/p&gt;

&lt;p&gt;With a SQL database, we'd be writing migrations every time we added a new field to the assessment schema. MongoDB let us store each record as a self-contained document that matches the shape of the data naturally — and because each collection is cleanly separated, the system is still easy to query and aggregate across.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 MongoDB is also entirely optional in MedAI. If it's not installed, the app starts cleanly and all assessment and chat endpoints still work — data just isn't persisted long-term. That decoupling made the dev loop significantly smoother, especially across machines with different setups.&lt;/p&gt;
&lt;/blockquote&gt;
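&lt;p&gt;That optional-persistence pattern can be sketched as a guarded connection attempt at startup (a hypothetical shape, not MedAI's exact code; the one-second timeout and the &lt;code&gt;medai&lt;/code&gt; database name are illustrative):&lt;/p&gt;

```python
# Sketch: probe MongoDB once at startup; fall back to "no persistence"
# if the driver is missing or no server is reachable.
def get_store():
    try:
        from pymongo import MongoClient
        client = MongoClient(serverSelectionTimeoutMS=1000)
        client.admin.command("ping")  # raises if no server is reachable
        return client["medai"]
    except Exception:
        return None  # callers treat None as "skip long-term persistence"
```

&lt;p&gt;Write paths can then check for &lt;code&gt;None&lt;/code&gt; before touching a collection, so a missing MongoDB never takes the app down.&lt;/p&gt;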

&lt;p&gt;We also used MongoDB's &lt;code&gt;expireAfterSeconds&lt;/code&gt; TTL indexing on the &lt;code&gt;analytics_cache&lt;/code&gt; collection for automatic cache cleanup — no cron jobs, no scheduled tasks, no manual scripts.&lt;/p&gt;
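&lt;p&gt;Setting that up with pymongo is a single &lt;code&gt;create_index&lt;/code&gt; call. A hedged sketch, assuming a &lt;code&gt;created_at&lt;/code&gt; field on each cache document and an illustrative 24-hour expiry window:&lt;/p&gt;

```python
# TTL cache setup sketch: MongoDB deletes each document automatically
# once created_at is older than expireAfterSeconds.
from datetime import datetime, timezone

TTL_SECONDS = 24 * 60 * 60  # illustrative: expire entries after one day

def ttl_index_spec():
    """Arguments for Collection.create_index that enable auto-expiry."""
    return ("created_at", {"expireAfterSeconds": TTL_SECONDS})

def cache_doc(key, payload):
    """A cache entry; the TTL clock starts at created_at."""
    return {"cache_key": key,
            "created_at": datetime.now(timezone.utc),
            "data": payload}

# With a live connection this becomes:
#   field, opts = ttl_index_spec()
#   db.analytics_cache.create_index(field, **opts)
#   db.analytics_cache.insert_one(cache_doc("dashboard_overview", data))
```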


&lt;h2&gt;Application Architecture&lt;/h2&gt;

&lt;p&gt;MedAI is a single Flask app — it serves the Jinja-rendered frontend and handles a REST/SSE API under &lt;code&gt;/api/*&lt;/code&gt;, all from one process. No microservices, no Kubernetes, no cloud infra to manage. The whole thing runs on a regular laptop without breaking a sweat.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"runtime"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Flask 3.x + Jinja2 templates"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"llm_serving"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Ollama (local, port 11434)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"models"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"llama3.2 (recommended, ~2GB)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"vision"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"llava / llama3.2-vision (optional)"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"knowledge_base"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"HumanDO.obo — Human Disease Ontology (14,000+ entries)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mongodb_collections"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"patients"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"clinical_assessments"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"ai_conversations"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"reports"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"imaging_records"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"analytics_cache"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"total_api_endpoints"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"real_time"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Server-Sent Events (SSE) for streaming assessment tokens"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"deployment"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"python app.py → http://localhost:5000/dashboard"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
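&lt;p&gt;In code, the single-process layout is just one Flask app registering both the template route and the API routes. A toy sketch of that shape (the handler bodies are placeholders, not MedAI's actual handlers):&lt;/p&gt;

```python
# One process serves the Jinja frontend and the /api/* REST endpoints.
from flask import Flask, jsonify, render_template

app = Flask(__name__)

@app.route("/dashboard")
def dashboard():
    return render_template("dashboard.html")  # Jinja-rendered frontend

@app.route("/api/assess", methods=["POST"])
def assess():
    return jsonify({"status": "ok"})          # placeholder API handler

if __name__ == "__main__":
    app.run(port=5000)  # python app.py, then open /dashboard
```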






&lt;h2&gt;MongoDB Data Model in MedAI&lt;/h2&gt;

&lt;p&gt;MongoDB sits at the heart of MedAI's persistence layer because clinical data naturally arrives in different shapes. Patient assessments, imaging records, and conversations all have different fields. A rigid SQL design would require joined tables for a single assessment and would break every time we extended the schema.&lt;/p&gt;

&lt;p&gt;With MongoDB, each record stays a self-contained document, grouped into six collections for retrieval and analytics.&lt;/p&gt;

&lt;h3&gt;Collection 1 — clinical_assessments&lt;/h3&gt;

&lt;p&gt;Every assessment — risk scores, RAG matches, and the LLM report — is stored as a single document after the stream completes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"assessment_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"a3f2b1c4-70cf-423b-8e61-153d63756d43"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"patient_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pt-00142"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"timestamp"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-04-07T11:32:09Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"vitals"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"age"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;54&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"glucose"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;148&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"blood_pressure"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"142/91"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"temperature"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;37.2&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"chief_complaint"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"fatigue and increased thirst"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"risk_scores"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"diabetes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;72&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"hypertension"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;65&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"rag_matches"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"doid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"DOID:9352"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"type 2 diabetes mellitus"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"score"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;27&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"icd"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"E11"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"model_used"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"llama3.2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"assessment_text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Based on the presented vitals..."&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;Collection 2 — imaging_records&lt;/h3&gt;

&lt;p&gt;Each imaging analysis stores the modality type, model confidence scores, and triage urgency:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"record_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"img-00089"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"patient_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pt-00142"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"timestamp"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-04-07T12:10:44Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"modality"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"chest_xray"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"vision_model"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"llava"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"confidence_scores"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Pneumonia"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.72&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Pleural effusion"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.18&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Normal"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.06&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"finding"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Opacity in right lower lobe consistent with consolidation."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"triage_urgency"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"high"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;Collection 3 — analytics_cache&lt;/h3&gt;

&lt;p&gt;Analytics aggregations are cached with a TTL so MongoDB auto-expires stale entries — no cron jobs needed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"cache_key"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"dashboard_overview_2026-04-07"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"created_at"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-04-07T00:00:00Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"expires_at"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-04-08T00:00:00Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"data"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"total_assessments"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;341&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"avg_risk_diabetes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;58.3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"top_conditions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"type 2 diabetes mellitus"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"essential hypertension"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Collections Summary:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Collection&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;patients&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Demographics and base patient records&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;clinical_assessments&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Full assessment documents — vitals, risk scores, RAG matches, LLM report&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;ai_conversations&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Multi-turn chat history with DOID references&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;reports&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Structured lab report analysis outputs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;imaging_records&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Vision model outputs — confidence scores, triage urgency&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;analytics_cache&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;TTL-indexed aggregation cache for dashboard metrics&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
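&lt;p&gt;Because every assessment is one self-contained document, cross-patient analytics reduce to a single aggregation pipeline. An illustrative example against &lt;code&gt;clinical_assessments&lt;/code&gt; (field names follow the sample document above; the 50% cutoff is arbitrary):&lt;/p&gt;

```python
# Aggregation sketch: average diabetes risk per patient, highest first,
# restricted to assessments with an elevated diabetes score.
pipeline = [
    {"$match": {"risk_scores.diabetes": {"$gte": 50}}},
    {"$group": {"_id": "$patient_id",
                "avg_diabetes_risk": {"$avg": "$risk_scores.diabetes"},
                "assessments": {"$sum": 1}}},
    {"$sort": {"avg_diabetes_risk": -1}},
]
# With a live connection:
#   results = db.clinical_assessments.aggregate(pipeline)
```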




&lt;h2&gt;The RAG-to-LLM Pipeline&lt;/h2&gt;

&lt;p&gt;How a full assessment works, step by step:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1 — Patient data ingested&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Age, vitals, symptoms, history, and chief complaint come in via &lt;code&gt;POST /api/assess&lt;/code&gt; or the web form. Each field is separate but feeds one structured object.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2 — RAG retrieval&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Symptoms become search terms. &lt;code&gt;rag_search()&lt;/code&gt; scans all 14,000+ OBO entries using a weighted keyword system and returns the top 8 disease matches:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# rag_search — keyword scoring (api_routes.py)
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;words&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;  &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt;   &lt;span class="c1"&gt;# exact word in name
&lt;/span&gt;    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;        &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;   &lt;span class="c1"&gt;# partial name hit
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;syns&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;          &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;    &lt;span class="c1"&gt;# synonym match
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;defn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;          &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;    &lt;span class="c1"&gt;# definition match
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;pars&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;          &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;    &lt;span class="c1"&gt;# parent category
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;An exact word match in the disease name earns 15 points and a partial name hit 10. A synonym hit scores 8, a definition hit 4, and a parent-category hit 2. Entries with both ICD codes and written definitions receive a quality bonus of 2 extra points.&lt;/p&gt;
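&lt;p&gt;Here is that scoring loop as a self-contained, runnable sketch; the two entries below are illustrative stand-ins for the parsed OBO index:&lt;/p&gt;

```python
# Weighted keyword scoring over ontology entries, with the same weights
# as the excerpt above plus the ICD/definition quality bonus.
def rag_score(words, entry):
    name = entry["name"].lower()
    syns = " ".join(entry["synonyms"]).lower()
    defn = entry["definition"].lower()
    pars = " ".join(entry["parents"]).lower()
    score = 0
    for w in words:
        if w in name.split():
            score += 15  # exact word in the disease name
        elif w in name:
            score += 10  # partial name hit
        if w in syns:
            score += 8   # synonym match
        if w in defn:
            score += 4   # definition match
        if w in pars:
            score += 2   # parent category
    if entry["icd"] and entry["definition"]:
        score += 2       # quality bonus: has both ICD code and definition
    return score

entries = [
    {"name": "type 2 diabetes mellitus", "icd": "E11",
     "definition": "a diabetes characterized by high blood glucose",
     "synonyms": ["adult-onset diabetes"], "parents": ["diabetes mellitus"]},
    {"name": "essential hypertension", "icd": "I10",
     "definition": "an artery disease with persistently high blood pressure",
     "synonyms": [], "parents": ["hypertension"]},
]
ranked = sorted(entries, key=lambda e: rag_score(["diabetes", "thirst"], e),
                reverse=True)
```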

&lt;p&gt;&lt;strong&gt;Step 3 — Rule-based risk scoring&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Before the LLM touches the data, &lt;code&gt;calculate_risk_scores()&lt;/code&gt; converts vitals into risk percentages. Glucose feeds a diabetes score; blood pressure feeds a hypertension score. Each caps at 95%.&lt;/p&gt;
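&lt;p&gt;A minimal sketch of what such a rule-based mapper can look like. The thresholds and slopes below are made-up placeholders, not MedAI's clinical cutoffs; only the shape matters: pure rules, no LLM, every score capped at 95:&lt;/p&gt;

```python
# Illustrative vitals-to-risk mapping. Deterministic and transparent:
# the same vitals always produce the same percentages.
def calculate_risk_scores(vitals):
    risks = {}
    if "glucose" in vitals:
        # e.g. one point per mg/dL of glucose above a 100 mg/dL baseline
        risks["diabetes"] = min(95, max(0, vitals["glucose"] - 100))
    if "blood_pressure" in vitals:
        systolic = int(vitals["blood_pressure"].split("/")[0])
        # e.g. one point per mmHg of systolic pressure above 110
        risks["hypertension"] = min(95, max(0, systolic - 110))
    return risks
```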

&lt;p&gt;&lt;strong&gt;Step 4 — Prompt construction&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
A structured 700–1100 word prompt is built with the ontology context, risk scores, and patient snapshot, then sent to Ollama.&lt;/p&gt;
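&lt;p&gt;The prompt assembly can be pictured as simple string templating over those three inputs. The wording below is a hypothetical stand-in for the real template in MedAI's service layer:&lt;/p&gt;

```python
# Sketch of prompt construction: ontology context, rule-based scores,
# and the patient snapshot, stitched into one grounded instruction.
def build_prompt(patient, risk_scores, rag_matches):
    context = "\n".join(
        f"- {m['name']} ({m['doid']}, ICD {m['icd']})" for m in rag_matches
    )
    return (
        "You are a clinical decision support assistant.\n"
        f"Patient snapshot: {patient}\n"
        f"Rule-based risk scores: {risk_scores}\n"
        "Relevant Human Disease Ontology entries:\n"
        f"{context}\n"
        "Write a structured risk assessment grounded only in the entries above."
    )
```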

&lt;p&gt;&lt;strong&gt;Step 5 — Streamed response&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Tokens come back via &lt;code&gt;/api/assess/stream&lt;/code&gt; as SSE events. The UI renders them in real time and saves the completed assessment to MongoDB.&lt;/p&gt;
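&lt;p&gt;Each SSE event is just a &lt;code&gt;data:&lt;/code&gt; line carrying a JSON payload, terminated by a blank line. A tiny framing helper shows the idea (the type/data payload shape follows the endpoint code later in this post):&lt;/p&gt;

```python
# SSE wire format: "data: " + payload + blank line marks one event.
import json

def sse_event(event_type, data):
    return f"data: {json.dumps({'type': event_type, 'data': data})}\n\n"
```

&lt;p&gt;A Flask generator can yield these strings directly with the &lt;code&gt;text/event-stream&lt;/code&gt; mimetype, and the browser's &lt;code&gt;EventSource&lt;/code&gt; (or a fetch reader) consumes them as they arrive.&lt;/p&gt;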


&lt;h2&gt;Key Features&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;🟢 Streaming clinical assessment&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Responses start showing up almost instantly — tokens stream to the browser via SSE as Ollama generates them. First words appear in under a second, so there's never a blank loading screen.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdjvd7fvyueebc56zq9ua.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdjvd7fvyueebc56zq9ua.png" alt="A live clinical assessment streaming in real time " width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🔴 RAG disease search&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Keyword-scored retrieval across 14,000+ ontology entries, matching disease names, synonyms, ICD codes, and parent categories to surface the most clinically relevant results for each query.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🟡 Medical report analysis&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Paste any lab report or upload an image of one. The system picks out key findings, flags abnormal values, suggests possible conditions, and outlines next steps — structured output every time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🔵 Medical imaging pipeline&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Upload chest X-rays, CT scans, or MRIs. If &lt;code&gt;llava&lt;/code&gt; is installed, the vision model reads the image directly. If not, an ontology-backed text fallback kicks in automatically.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb9cp585cdmxu1szy9l7p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb9cp585cdmxu1szy9l7p.png" alt="The imaging pipeline — upload a chest X-ray, CT, or MRI and get back condition confidence scores and a triage urgency level" width="800" height="533"&gt;&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;🟢 Risk scoring engine&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
A rule-based system converts vitals into risk percentages before the LLM is involved at all. Fast, deterministic, and transparent.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🔵 AI chat assistant&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Multi-turn medical Q&amp;amp;A that remembers your conversation and checks live ontology terms in real time. Every response includes sourced DOID references and an explicit AI disclaimer.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3pjy4uucs2unxq0dp2ie.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3pjy4uucs2unxq0dp2ie.png" alt="Multi-turn medical Q&amp;amp;A — every response pulls live ontology terms, cites DOID references, and closes with an explicit AI disclaimer. No hallucinated disease definitions." width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  Ollama Integration &amp;amp; Resilience
&lt;/h2&gt;

&lt;p&gt;Working with a local LLM means dealing with all the ways it can quietly fail — the model might still be loading, inference on CPU might run past the request timeout, or someone configured a model that isn't installed. This burned us early in testing, so we built three layers of resilience, plus a deterministic last resort:&lt;/p&gt;

&lt;p&gt;A &lt;strong&gt;background probe thread&lt;/strong&gt; keeps checking Ollama's health every few seconds and updates a shared status dictionary. An &lt;strong&gt;exponential backoff retry loop&lt;/strong&gt; handles transient connection hiccups. A &lt;strong&gt;model fallback chain&lt;/strong&gt; tries &lt;code&gt;llama3.2 → llama3.1 → phi3:mini → phi3&lt;/code&gt; in sequence. And if everything falls apart, a deterministic &lt;code&gt;fallback_assessment()&lt;/code&gt; builds a structured report directly from the ontology matches — no LLM required.&lt;br&gt;
&lt;/p&gt;
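&lt;p&gt;A minimal sketch of the backoff-plus-fallback logic — function names and the injected &lt;code&gt;call&lt;/code&gt; transport are illustrative, not MedAI's actual code:&lt;/p&gt;

```python
import time

# Sketch of the model fallback chain with exponential backoff.
# Names are illustrative; the real implementation wraps Ollama's HTTP API.
FALLBACK_MODELS = ["llama3.2", "llama3.1", "phi3:mini", "phi3"]

def generate_with_fallback(prompt, call, retries=3, base_delay=0.01):
    """Try each model in order; retry transient failures with backoff.

    `call(model, prompt)` performs the actual request. Injecting it keeps
    the chain testable without a running Ollama server.
    """
    for model in FALLBACK_MODELS:
        for attempt in range(retries):
            try:
                return model, call(model, prompt)
            except ConnectionError:
                time.sleep(base_delay * (2 ** attempt))  # exponential backoff
    return None, None  # caller falls back to fallback_assessment()

# Fake transport: pretend only phi3:mini is installed
def fake_call(model, prompt):
    if model == "phi3:mini":
        return "ok: " + prompt
    raise ConnectionError(model)

print(generate_with_fallback("chest pain", fake_call))
# ('phi3:mini', 'ok: chest pain')
```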

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Streaming SSE endpoint — /api/assess/stream
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="c1"&gt;# 1. Send metadata immediately (risk scores, RAG results)
&lt;/span&gt;    &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;meta&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;data&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt; &lt;span class="n"&gt;meta&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="c1"&gt;# 2. Stream tokens as they arrive from Ollama
&lt;/span&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;stream_ollama&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;90&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;token&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="c1"&gt;# 3. Persist to MongoDB, signal completion
&lt;/span&gt;    &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insert_one&lt;/span&gt;&lt;span class="p"&gt;({...})&lt;/span&gt;
    &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;done&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
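&lt;p&gt;On the client side, consuming that stream is mostly a matter of reading &lt;code&gt;data:&lt;/code&gt; lines and dispatching on the &lt;code&gt;type&lt;/code&gt; field. A minimal sketch (the event shapes match the endpoint above; the HTTP transport itself is omitted):&lt;/p&gt;

```python
import json

def consume_sse(lines):
    """Parse SSE 'data:' lines from /api/assess/stream into events.

    `lines` can be any iterable of strings, e.g. the body of a
    streaming HTTP response read line by line.
    """
    tokens, meta = [], None
    for line in lines:
        if not line.startswith("data: "):
            continue                      # skip keepalives / blank lines
        event = json.loads(line[6:])
        if event["type"] == "meta":
            meta = event["data"]          # risk scores, RAG results
        elif event["type"] == "token":
            tokens.append(event["text"])  # incremental LLM output
        elif event["type"] == "done":
            break
    return meta, "".join(tokens)

sample = [
    'data: {"type": "meta", "data": {"risk": 42}}',
    'data: {"type": "token", "text": "Low "}',
    'data: {"type": "token", "text": "risk."}',
    'data: {"type": "done"}',
]
print(consume_sse(sample))  # ({'risk': 42}, 'Low risk.')
```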






&lt;h2&gt;
  
  
  API Reference
&lt;/h2&gt;

&lt;p&gt;MedAI exposes six REST endpoints, one of which (&lt;code&gt;POST /api/assess/stream&lt;/code&gt;) streams its response via Server-Sent Events:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Endpoint&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;POST /api/assess&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Full patient risk assessment (buffered). Returns risk scores, retrieved diseases, RAG reference count, and LLM summary.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;POST /api/assess/stream&lt;/code&gt; ⚡&lt;/td&gt;
&lt;td&gt;Same as above via SSE. Streams tokens one by one. Saves to MongoDB on completion.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;POST /api/chat&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Multi-turn medical Q&amp;amp;A with conversation history, DOID references, and AI disclaimer.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;POST /api/analyze-report&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Structured analysis of lab or clinical report text.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;POST /api/analyze-image&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Medical imaging analysis. Falls back to text analysis if no vision model is installed.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;GET /api/search-diseases&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Direct ontology search. Returns up to 50 scored matches with definitions, synonyms, ICD codes, and DOID identifiers.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
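&lt;p&gt;As a usage sketch, here is roughly what a buffered assessment request could look like from Python — the payload field names are assumptions, so check the project itself for the exact schema:&lt;/p&gt;

```python
import json
import urllib.request

BASE = "http://localhost:5000"

def build_assess_request(symptoms, vitals):
    """Build a POST /api/assess request. Field names are illustrative."""
    payload = {"symptoms": symptoms, "vitals": vitals}
    return urllib.request.Request(
        BASE + "/api/assess",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_assess_request(
    ["persistent cough", "fever"],
    {"heart_rate": 104, "temp_c": 38.2},
)
print(req.full_url)  # http://localhost:5000/api/assess
# urllib.request.urlopen(req) would return the buffered JSON assessment
```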




&lt;h2&gt;
  
  
  Running MedAI Locally
&lt;/h2&gt;

&lt;p&gt;Getting it running takes four steps. Python 3.11+ is recommended, Ollama serves models over HTTP on port 11434, and MongoDB is entirely optional.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 1. Install Python dependencies&lt;/span&gt;
pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt

&lt;span class="c"&gt;# 2. Start Ollama and pull a model&lt;/span&gt;
ollama serve
ollama pull llama3.2          &lt;span class="c"&gt;# ~2GB, recommended&lt;/span&gt;
ollama pull llava             &lt;span class="c"&gt;# optional: enables image analysis&lt;/span&gt;

&lt;span class="c"&gt;# 3. Run the Flask app&lt;/span&gt;
python app.py

&lt;span class="c"&gt;# 4. Open the dashboard&lt;/span&gt;
&lt;span class="c"&gt;# http://localhost:5000/dashboard&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Set &lt;code&gt;MONGO_ENABLED=false&lt;/code&gt; in your &lt;code&gt;.env&lt;/code&gt; file if MongoDB isn't installed — the app keeps running without it. To switch models, set &lt;code&gt;OLLAMA_MODEL&lt;/code&gt; in &lt;code&gt;.env&lt;/code&gt; to override the default &lt;code&gt;llama3.2&lt;/code&gt;.&lt;/p&gt;
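&lt;p&gt;For example, a minimal &lt;code&gt;.env&lt;/code&gt; for a machine without MongoDB (values illustrative):&lt;/p&gt;

```shell
# .env -- example values only
MONGO_ENABLED=false        # run without persistence
OLLAMA_MODEL=phi3:mini     # override the default llama3.2
```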




&lt;h2&gt;
  
  
  Challenges We Faced
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Challenge 1: Ollama resilience&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
We underestimated how differently Ollama behaves across machines. On one machine the model loaded fine; on another it timed out halfway through a response. Getting the retry logic, fallback chain, and background probe working together reliably took a lot of iteration.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Challenge 2: Streaming + MongoDB writes&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
The SSE streaming endpoint was trickier than it looked — not the streaming itself, but making sure MongoDB writes happened cleanly after the stream completed without blocking the response.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Challenge 3: Vision model output parsing&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Different LLaVA versions format confidence scores differently. &lt;code&gt;parse_confidence_scores()&lt;/code&gt; ended up with four regex patterns and a keyword fallback before it was reliable across model versions. The keepalive thread (pinging the vision model every 4 minutes) was added after we noticed the first image request of a session was always slow — a cold-start issue.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Challenge 4: Team coordination&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Four people working on one codebase with strong opinions about route structure meant merge conflicts were a regular part of the process. We eventually settled on a rule: no one touches &lt;code&gt;api_routes.py&lt;/code&gt; and a service module in the same branch. That helped a lot.&lt;/p&gt;




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;The RAG engine is currently keyword-based. Moving to a hybrid dense + sparse approach — FAISS embeddings alongside BM25 — would meaningfully improve recall for unusual or rare symptoms that don't map cleanly to ontology terms. That's the highest-value technical improvement still on the table.&lt;/p&gt;
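&lt;p&gt;One simple way to merge sparse and dense rankings without tuning score scales is reciprocal rank fusion. A sketch over plain ranked lists — in practice the sparse list would come from BM25 and the dense list from FAISS nearest-neighbour search, and these DOID ids are made up:&lt;/p&gt;

```python
# Reciprocal rank fusion: merge a sparse (BM25) and a dense (FAISS)
# ranking without comparing raw scores. Illustrative only.
def rrf_merge(rankings, k=60):
    """rankings: list of ranked doc-id lists, best first. Returns fused order."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits = ["DOID:552", "DOID:1324", "DOID:2841"]   # keyword ranking
dense_hits = ["DOID:1324", "DOID:9352", "DOID:552"]  # embedding ranking
print(rrf_merge([bm25_hits, dense_hits]))
# ['DOID:1324', 'DOID:552', 'DOID:9352', 'DOID:2841']
```

&lt;p&gt;The document both retrievers agree on rises to the top, which is the behaviour you want when keyword and embedding signals disagree on rare symptoms.&lt;/p&gt;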

&lt;p&gt;The imaging pipeline is promising, but &lt;code&gt;llava&lt;/code&gt; on CPU alone is too slow for real clinical use. Enabling GPU support via &lt;code&gt;OLLAMA_NUM_GPU&lt;/code&gt; is the single biggest improvement we could make to the vision side — the pipeline is already built for it.&lt;/p&gt;

&lt;p&gt;The audit log and analytics cache infrastructure is already in place. Wiring the aggregation service to real patient data over time could make the risk trend and cohort charts genuinely useful for small clinics that can't afford enterprise analytics tools — which is exactly the kind of impact this project was always aiming for.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;MedAI is a Flask-based Python full-stack project that uses MongoDB for clinical persistence — assessments, imaging records, and conversations — across six collections, with TTL-indexed analytics caching and a 14,000-entry Human Disease Ontology powering its RAG retrieval layer.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Built with Flask · Ollama · Human Disease Ontology · MongoDB · April 2026&lt;/em&gt;&lt;br&gt;&lt;br&gt;
&lt;em&gt;Team: V. Srimukh, M. Vamshidhar Reddy, S. Narendhar, A. Pavan Sri Ram · Mentor: Chanda Raj Kumar Sir&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>rag</category>
      <category>showdev</category>
    </item>
  </channel>
</rss>
