<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Manish Kotha</title>
    <description>The latest articles on DEV Community by Manish Kotha (@manish_kotha_0ceaa3fe05bb).</description>
    <link>https://dev.to/manish_kotha_0ceaa3fe05bb</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3826445%2Fb6f604eb-0fc3-42de-95a0-273524ab2188.png</url>
      <title>DEV Community: Manish Kotha</title>
      <link>https://dev.to/manish_kotha_0ceaa3fe05bb</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/manish_kotha_0ceaa3fe05bb"/>
    <language>en</language>
    <item>
      <title>How I Built an AI System to Reduce Healthcare No-Shows Using Flask, Random Forest &amp; SimPy</title>
      <dc:creator>Manish Kotha</dc:creator>
      <pubDate>Mon, 16 Mar 2026 06:11:58 +0000</pubDate>
      <link>https://dev.to/manish_kotha_0ceaa3fe05bb/how-i-built-an-ai-system-to-reduce-healthcare-no-shows-using-flask-random-forest-simpy-45k1</link>
      <guid>https://dev.to/manish_kotha_0ceaa3fe05bb/how-i-built-an-ai-system-to-reduce-healthcare-no-shows-using-flask-random-forest-simpy-45k1</guid>
      <description>&lt;h1&gt;
  
  
  How I Built an AI System to Reduce Healthcare No-Shows Using Flask, Random Forest &amp;amp; SimPy
&lt;/h1&gt;

&lt;p&gt;&lt;em&gt;A walkthrough of my final year project — from problem statement to working simulation&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem I Wanted to Solve
&lt;/h2&gt;

&lt;p&gt;Anyone who has visited a clinic knows the frustration — long wait times, overbooked doctors, and yet somehow, empty slots because patients didn't show up.&lt;/p&gt;

&lt;p&gt;No-shows are one of the biggest inefficiencies in healthcare. Clinics lose revenue. Doctors waste time. Other patients who actually needed that slot couldn't get one.&lt;/p&gt;

&lt;p&gt;I wanted to build something that tackles this with a data-driven approach. The result: an &lt;strong&gt;AI-Based Healthcare Appointment Scheduling Optimization System&lt;/strong&gt; — my final year project built with Python, Flask, scikit-learn, and SimPy.&lt;/p&gt;

&lt;p&gt;Here's how I built it, what I learned, and what I'd do differently.&lt;/p&gt;




&lt;h2&gt;
  
  
  What the System Does
&lt;/h2&gt;

&lt;p&gt;At its core, the system does three things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Predicts&lt;/strong&gt; which patients are likely to miss their appointment (no-show prediction)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Uses that prediction&lt;/strong&gt; to assign slots smartly (priority-based scheduling)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Simulates&lt;/strong&gt; a full clinic day to prove the approach actually works (SimPy simulation)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;There are two portals:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A &lt;strong&gt;Patient Portal&lt;/strong&gt; where patients register, book appointments, and see their no-show risk&lt;/li&gt;
&lt;li&gt;An &lt;strong&gt;Admin Dashboard&lt;/strong&gt; where clinic staff manage doctors, generate slots, and run simulations&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Tech Stack
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Technology&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Backend&lt;/td&gt;
&lt;td&gt;Python 3.11, Flask 3.0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Database&lt;/td&gt;
&lt;td&gt;SQLite + SQLAlchemy ORM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Machine Learning&lt;/td&gt;
&lt;td&gt;scikit-learn (Random Forest)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Simulation&lt;/td&gt;
&lt;td&gt;SimPy (Discrete-Event)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Frontend&lt;/td&gt;
&lt;td&gt;Bootstrap 5, Chart.js&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Data&lt;/td&gt;
&lt;td&gt;pandas, numpy&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Part 1: The No-Show Predictor
&lt;/h2&gt;

&lt;p&gt;This is the heart of the project.&lt;/p&gt;

&lt;p&gt;I trained a &lt;strong&gt;Random Forest Classifier&lt;/strong&gt; to predict the probability that a patient will miss their appointment. The model outputs a score between 0 and 1, which I then bucket into three risk levels:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;LOW&lt;/strong&gt;: probability below 40%&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MEDIUM&lt;/strong&gt;: probability from 40% up to, but not including, 70%&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HIGH&lt;/strong&gt;: probability of 70% or higher&lt;/li&gt;
&lt;/ul&gt;
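
&lt;p&gt;The bucketing can be sketched as a small function. The name &lt;code&gt;risk_level&lt;/code&gt; is hypothetical, and treating exactly 40% as MEDIUM and exactly 70% as HIGH is an assumption read off the thresholds listed here:&lt;/p&gt;

```python
def risk_level(no_show_prob: float) -> str:
    """Map a no-show probability in [0, 1] to LOW, MEDIUM or HIGH."""
    if no_show_prob > 1.0 or 0.0 > no_show_prob:
        raise ValueError("probability must be between 0 and 1")
    if no_show_prob >= 0.70:
        return "HIGH"
    if no_show_prob >= 0.40:
        return "MEDIUM"
    return "LOW"


levels = [risk_level(p) for p in (0.10, 0.40, 0.85)]
# levels is ["LOW", "MEDIUM", "HIGH"]
```

&lt;p&gt;Checking the highest threshold first keeps each branch to a single comparison.&lt;/p&gt;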

&lt;h3&gt;
  
  
  Features used:
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- previous_no_shows       (how many times they've missed before)
- days_until_appointment  (further away = higher risk)
- appointment_hour        (early morning slots have higher no-show rates)
- day_of_week             (Mondays and Fridays are worse)
- age
- gender
- reminder_sent           (did they get a reminder?)
- distance_km             (how far they live from the clinic)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Model config:
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nc"&gt;RandomForestClassifier&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;n_estimators&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;max_depth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;class_weight&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;balanced&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;  &lt;span class="c1"&gt;# important — no-shows are a minority class
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I used &lt;code&gt;class_weight='balanced'&lt;/code&gt; because no-shows are naturally less common than shows. Without this, the model would just learn to predict "will show up" for everyone and get high accuracy while being useless.&lt;/p&gt;
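
&lt;p&gt;To make that concrete, here is a minimal, dependency-free sketch. The 850/150 label split is hypothetical, and the helper names are mine. The weight formula, &lt;code&gt;n_samples / (n_classes * class_count)&lt;/code&gt;, is the heuristic scikit-learn documents for &lt;code&gt;class_weight='balanced'&lt;/code&gt;:&lt;/p&gt;

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight per class: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def accuracy(y_true, y_pred):
    """Share of predictions that match the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Share of actual positives that were predicted positive."""
    predicted_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    hits = sum(1 for p in predicted_for_positives if p == positive)
    return hits / len(predicted_for_positives)


# Hypothetical labels: 1 = no-show, 0 = showed up
y_true = [0] * 850 + [1] * 150
always_show = [0] * len(y_true)  # a "model" that never predicts a no-show

weights = balanced_class_weights(y_true)
baseline_accuracy = accuracy(y_true, always_show)  # 0.85
baseline_recall = recall(y_true, always_show)      # 0.0
```

&lt;p&gt;The always-show baseline scores 85% accuracy while catching none of the no-shows. Balanced weights make each no-show count several times more than a show during training, which pushes the model away from that shortcut.&lt;/p&gt;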

&lt;h3&gt;
  
  
  Training data:
&lt;/h3&gt;

&lt;p&gt;I generated &lt;strong&gt;1,200 synthetic patient records&lt;/strong&gt; using a custom &lt;code&gt;generate_data.py&lt;/code&gt; script. Obviously, real hospital data would be better — but for a final year project, synthetic data with realistic distributions works well enough to demonstrate the concept.&lt;/p&gt;
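
&lt;p&gt;A generator along these lines might look like the sketch below. It is not the project's actual &lt;code&gt;generate_data.py&lt;/code&gt;: the function name, value ranges, and risk coefficients are all invented for illustration. They only follow the directions given in the feature list, plus my own assumption that reminders lower the risk and distance raises it:&lt;/p&gt;

```python
import random


def generate_records(n=1200, seed=42):
    """Return n synthetic appointment rows as dicts (all distributions are made up)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        row = {
            "previous_no_shows": rng.randint(0, 5),
            "days_until_appointment": rng.randint(0, 30),
            "appointment_hour": rng.randint(8, 16),
            "day_of_week": rng.randint(0, 4),  # 0 = Monday ... 4 = Friday
            "age": rng.randint(1, 90),
            "gender": rng.choice(["F", "M"]),
            "reminder_sent": rng.randint(0, 1),
            "distance_km": round(rng.uniform(0.5, 40.0), 1),
        }
        # Hypothetical no-show probability built from the features above
        p = 0.05
        p += 0.08 * row["previous_no_shows"]
        p += 0.005 * row["days_until_appointment"]
        p += 0.05 if row["appointment_hour"] == 8 else 0.0
        p += 0.05 if row["day_of_week"] in (0, 4) else 0.0
        p -= 0.10 * row["reminder_sent"]
        p += 0.003 * row["distance_km"]
        p = min(max(p, 0.0), 0.95)
        row["no_show"] = int(p > rng.random())  # 1 = missed the appointment
        rows.append(row)
    return rows


records = generate_records()
```

&lt;p&gt;Seeding the generator keeps the dataset reproducible, so a retrained model can be compared against earlier runs.&lt;/p&gt;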




&lt;h2&gt;
  
  
  Part 2: Priority-Based Slot Allocation
&lt;/h2&gt;

&lt;p&gt;Once I have the no-show probability, I use it to compute a &lt;strong&gt;priority score&lt;/strong&gt; for each booking request:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Score = 0.5 × (urgency / 5) + 0.3 × (wait_days / 30) + 0.2 × (1 - no_show_prob)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Breaking this down:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Urgency (50% weight)&lt;/strong&gt; — a patient with a critical condition gets priority&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Wait time (30% weight)&lt;/strong&gt; — patients waiting longer get bumped up&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reliability (20% weight)&lt;/strong&gt; — lower no-show probability = more trustworthy booking&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The system then assigns the highest-priority patient to the best available slot.&lt;/p&gt;
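
&lt;p&gt;Here is a minimal sketch of the scoring and ranking step. The function name and the three sample requests are hypothetical, and I am assuming urgency is rated on a 1 to 5 scale, since the formula divides it by 5:&lt;/p&gt;

```python
def priority_score(urgency, wait_days, no_show_prob):
    """Score = 0.5*(urgency/5) + 0.3*(wait_days/30) + 0.2*(1 - no_show_prob)."""
    return 0.5 * (urgency / 5) + 0.3 * (wait_days / 30) + 0.2 * (1 - no_show_prob)


# Hypothetical requests: (patient, urgency 1-5, days waiting, no-show probability)
requests = [
    ("A", 2, 10, 0.10),
    ("B", 5, 3, 0.60),
    ("C", 4, 15, 0.20),
]

# Highest score first
ranked = sorted(requests, key=lambda r: priority_score(r[1], r[2], r[3]), reverse=True)
order = [r[0] for r in ranked]
# order is ["C", "B", "A"], with scores of roughly 0.71, 0.61 and 0.48
```

&lt;p&gt;Patient B is the most urgent but still ranks below C, because C has waited longer and is more likely to show up. Note that the formula as written does not cap the wait term, so a wait of more than 30 days contributes more than 0.3.&lt;/p&gt;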

&lt;h3&gt;
  
  
  Overbooking Strategy
&lt;/h3&gt;

&lt;p&gt;This is where it gets interesting. Based on the risk tier:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;HIGH risk (≥70%)&lt;/strong&gt;: The slot stays open after booking — another patient can fill it if needed&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MEDIUM risk (40–70%)&lt;/strong&gt;: Booked normally, but a reminder flag is set&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LOW risk (&amp;lt;40%)&lt;/strong&gt;: Normal booking, slot is closed&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is a simplified version of how airlines overbook flights — except here, we're trying to ensure sick people actually get seen, not maximize revenue.&lt;/p&gt;
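
&lt;p&gt;The three tiers can be sketched as one decision function. &lt;code&gt;BookingDecision&lt;/code&gt; and &lt;code&gt;decide_booking&lt;/code&gt; are hypothetical names. The tiers above do not say whether a HIGH-risk booking also sets the reminder flag, so this sketch leaves it unset:&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BookingDecision:
    risk: str
    slot_stays_open: bool  # True means another patient may also take the slot
    reminder_flag: bool


def decide_booking(no_show_prob: float) -> BookingDecision:
    """Apply the overbooking rule for a single booking request."""
    if no_show_prob >= 0.70:
        return BookingDecision("HIGH", slot_stays_open=True, reminder_flag=False)
    if no_show_prob >= 0.40:
        return BookingDecision("MEDIUM", slot_stays_open=False, reminder_flag=True)
    return BookingDecision("LOW", slot_stays_open=False, reminder_flag=False)


decision = decide_booking(0.82)
# decision.slot_stays_open is True, so the slot can be double-booked
```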




&lt;h2&gt;
  
  
  Part 3: SimPy Simulation
&lt;/h2&gt;

&lt;p&gt;The ML model tells us &lt;em&gt;who&lt;/em&gt; is likely to no-show. But does the overall strategy actually improve clinic efficiency? That's where &lt;strong&gt;SimPy&lt;/strong&gt; comes in.&lt;/p&gt;

&lt;p&gt;SimPy is a Python library for &lt;strong&gt;discrete-event simulation&lt;/strong&gt;. I used it to simulate an entire 8-hour clinic day.&lt;/p&gt;

&lt;h3&gt;
  
  
  What the simulation models:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Patients arriving at scheduled times&lt;/li&gt;
&lt;li&gt;Doctors processing appointments (with variable duration)&lt;/li&gt;
&lt;li&gt;No-shows happening at a defined rate&lt;/li&gt;
&lt;li&gt;Queue buildup and wait times&lt;/li&gt;
&lt;/ul&gt;
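
&lt;p&gt;To show the mechanics without any dependencies, here is a plain-Python stand-in for one doctor's day. It is not the project's SimPy code, and the slot length, service-time range, and seed are assumptions. The 25% and 10% no-show rates are plugged in directly rather than derived from the overbooking logic:&lt;/p&gt;

```python
import random


def simulate_day(no_show_rate, slot_minutes=20, day_minutes=480,
                 service_range=(15.0, 25.0), seed=7):
    """Simulate one doctor's day and return seen, no_shows, avg_wait, utilisation."""
    rng = random.Random(seed)
    doctor_free_at = 0.0  # minute at which the doctor finishes the current patient
    busy = 0.0            # total minutes spent with patients
    waits = []
    no_shows = 0
    for slot_start in range(0, day_minutes, slot_minutes):
        # Draw both numbers for every slot so runs with the same seed stay comparable
        shows_up = rng.random() >= no_show_rate
        duration = rng.uniform(*service_range)
        if not shows_up:
            no_shows += 1
            continue
        start = max(float(slot_start), doctor_free_at)  # wait if the doctor is running late
        waits.append(start - slot_start)
        doctor_free_at = start + duration
        busy += duration
    return {
        "seen": len(waits),
        "no_shows": no_shows,
        "avg_wait": sum(waits) / len(waits) if waits else 0.0,
        "utilisation": busy / max(float(day_minutes), doctor_free_at),
    }


baseline = simulate_day(no_show_rate=0.25)
optimized = simulate_day(no_show_rate=0.10)
```

&lt;p&gt;Because both runs share a seed and draw the same random numbers per slot, every patient who shows up at the 25% rate also shows up at the 10% rate, which makes the two results directly comparable.&lt;/p&gt;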

&lt;h3&gt;
  
  
  Comparing baseline vs. optimized:
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Baseline&lt;/th&gt;
&lt;th&gt;AI-Optimized&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;No-show rate&lt;/td&gt;
&lt;td&gt;25%&lt;/td&gt;
&lt;td&gt;~10% effective&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Avg wait time&lt;/td&gt;
&lt;td&gt;Higher&lt;/td&gt;
&lt;td&gt;Lower&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Doctor utilization&lt;/td&gt;
&lt;td&gt;Lower&lt;/td&gt;
&lt;td&gt;Higher&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Patients seen&lt;/td&gt;
&lt;td&gt;Fewer&lt;/td&gt;
&lt;td&gt;More&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The simulation confirms that the overbooking + priority strategy meaningfully improves throughput and reduces wasted slots.&lt;/p&gt;




&lt;h2&gt;
  
  
  Project Structure
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;healthcare_scheduler/
├── app.py                    # Main Flask application
├── config.py
├── seed_db.py                # Populates DB with sample data
├── RUN_PROJECT.bat           # One-click Windows launcher
│
├── ai_modules/
│   ├── no_show_predictor.py  # Random Forest model
│   ├── scheduler.py          # Priority slot allocator
│   └── simulation.py         # SimPy simulation
│
├── models/                   # SQLAlchemy DB models
├── routes/                   # Flask API endpoints
├── templates/                # HTML templates
└── data/
    └── generate_data.py      # Synthetic dataset generator

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;




&lt;h2&gt;
  
  
  How to Run It Locally (Windows)
&lt;/h2&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Option 1: Just double-click RUN_PROJECT.bat
# It handles everything automatically

# Option 2: Manual
python -m venv venv
venv\Scripts\activate
pip install -r requirements.txt
python data\generate_data.py
python ai_modules\no_show_predictor.py
python seed_db.py
python app.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then open &lt;code&gt;http://127.0.0.1:5000&lt;/code&gt; in your browser.&lt;/p&gt;

&lt;p&gt;Demo credentials:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Patient: &lt;code&gt;ravi@mail.com&lt;/code&gt; / &lt;code&gt;pass123&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Admin: &lt;code&gt;http://127.0.0.1:5000/admin/&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What I Learned
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;The ML pipeline is the easy part.&lt;/strong&gt; Training the model took a few hours. Getting Flask, SQLAlchemy, and the ML model to work together cleanly took much longer. Integration is where real projects live.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Synthetic data has real limits.&lt;/strong&gt; My model performs well on my synthetic test set. Whether it would hold up on real patient data is a completely different question. Real-world class imbalance, missing values, and biases would make this much harder.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SimPy is underrated.&lt;/strong&gt; Most developers have never heard of discrete-event simulation. But for modeling anything with queues, arrivals, and service times (clinics, call centers, manufacturing lines), SimPy is incredibly powerful and worth learning.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;class_weight='balanced'&lt;/code&gt; matters.&lt;/strong&gt; Before I added this, my model had 85% accuracy but was nearly useless, because it just predicted "will show up" every time. Balanced class weights fixed this. Always check your class distribution before celebrating accuracy scores.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  What I'd Improve With More Time
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Real dataset&lt;/strong&gt;: The &lt;a href="https://www.kaggle.com/joniarroba/noshowappointments"&gt;Kaggle Healthcare No-Show dataset&lt;/a&gt; has 110,000 real records. Training on that would make the model actually meaningful.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cross-validation &amp;amp; hyperparameter tuning&lt;/strong&gt;: I mostly used defaults. GridSearchCV would squeeze more performance out of the model.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Better features&lt;/strong&gt;: Weather on appointment day, insurance type, and appointment type (follow-up vs. new patient) are all predictive in research literature.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deploy it&lt;/strong&gt;: Currently Windows-only. Dockerizing it and deploying to Render or Railway would make it actually accessible.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Send real reminders&lt;/strong&gt;: Right now the &lt;code&gt;reminder_sent&lt;/code&gt; flag is manual. Integrating Twilio or email would make the overbooking strategy actually work end-to-end.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  GitHub
&lt;/h2&gt;

&lt;p&gt;The full source code is here: &lt;strong&gt;&lt;a href="https://github.com/ManishKumar981/-healthcare-scheduler"&gt;https://github.com/ManishKumar981/-healthcare-scheduler&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you found this useful, a ⭐ on the repo goes a long way!&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Thanks for reading. If you have questions about the ML approach, the SimPy simulation, or the Flask architecture, drop them in the comments. Happy to discuss.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>machinelearning</category>
      <category>beginners</category>
      <category>webdev</category>
    </item>
  </channel>
</rss>
