sumit saraswat
I Built a Clinical AI Bias Detector With Zero Code Using MeDo

I Discovered That a Cancer Prediction AI Was Using the Hospital Name — Not the Tumor — To Make Its Diagnosis. So I Built a Tool to Catch It.

Built for the Build with MeDo Hackathon · #BuiltWithMeDo


The Moment That Changed Everything

Last week I was studying how machine learning models are deployed in hospitals. I stumbled across a research paper that stopped me cold.

A cancer prediction model — the kind that tells doctors whether a patient needs aggressive treatment — was tested across multiple hospitals. It worked brilliantly at Hospital A. Terrible at Hospital B. Same cancer. Same biology. Different results.

Why?

The model had learned to use hospital_site_id as its strongest predictor. Not tumor size. Not mitotic count. Not any clinical feature that a doctor would recognize. The AI had essentially memorized which hospital a patient came from and used that to predict cancer outcomes.

This is called proxy bias, and it's one of the most dangerous failure modes in clinical AI. The model looks accurate during training because hospital location correlates with outcomes (richer hospitals → better outcomes → lower risk scores). But it's not doing medicine. It's doing geography.

And nobody caught it — because the model was a black box.

"What If I Could Build a Tool That Catches This Instantly?"

That question became my hackathon project. I wanted to build something that:

  1. Takes any clinical ML model's prediction output
  2. Runs real mathematical analysis on which features the model actually used
  3. Flags the moment a non-clinical feature (like hospital ID or timestamp) dominates the prediction
  4. Gives clinical teams a clear, actionable compliance workflow

The tool needed to be more than a calculator. It needed to be something a hospital data science team would open every morning. A compliance operations platform.

Enter MeDo

I'd never built a full-stack app from scratch before. Six pages, user authentication, database persistence, an AI chat interface, drag-and-drop Kanban boards — that's a week of work for an experienced developer.

With MeDo, I described what I wanted in plain English, and it scaffolded the entire thing.

Here's what I told MeDo in my first prompt:

"Create a sleek, modern, full-stack medical audit dashboard with a dark mode futuristic clinical aesthetic. Include a sidebar navigation, a large JSON file drag-and-drop upload container, and a dynamic audit results section."

MeDo generated the React frontend, the backend, the database models, and the routing — all in one shot. No boilerplate. No config files. No dependency hell.

The Math That Makes It Real

Here's where it gets technical. I didn't want a fake "AI score." I wanted real game-theoretic feature attribution — the same mathematical framework behind SHAP (Shapley Additive exPlanations), which is the gold standard for ML interpretability.

The core logic:

Step 1: Normalization. When a model outputs feature weights, they don't naturally sum to the prediction. My audit engine normalizes each feature's contribution so that:

```
Σ all_contributions = prediction - base_value
```

This mirrors exact Shapley value semantics — every feature's "credit" adds up to explain exactly why the model moved from its baseline.
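The normalization step can be sketched in a few lines of Python. This is a minimal illustration of the idea, not the app's actual engine; the function name and input shape are my assumptions:

```python
def normalize_contributions(raw_weights, prediction, base_value):
    """Rescale raw per-feature weights so their sum equals exactly
    (prediction - base_value), matching Shapley additivity."""
    total = sum(raw_weights.values())
    if total == 0:
        raise ValueError("raw weights sum to zero; cannot normalize")
    target = prediction - base_value
    return {feature: w * target / total for feature, w in raw_weights.items()}
```

After this rescaling, every feature's "credit" sums back to the gap between the model's baseline and its prediction, which is what makes per-feature percentages meaningful.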

Step 2: Proxy Detection. The engine maintains a list of known non-clinical proxy features (hospital_site_id, system_timestamp, zip_code, etc.). If any proxy feature contributes more than 15% of the total prediction variance, the system triggers:

```
⚠ PROXY_BIAS_DETECTED
```
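The proxy check itself is simple once contributions are normalized. A sketch of the idea (the proxy list and 15% threshold come from the description above; the function name is mine):

```python
# Known non-clinical proxy features the engine watches for
PROXY_FEATURES = {"hospital_site_id", "system_timestamp", "zip_code"}
PROXY_SHARE_THRESHOLD = 0.15  # a proxy may claim at most 15% of total attribution

def detect_proxy_bias(contributions):
    """Return ('PROXY_BIAS_DETECTED', offenders) if any known proxy
    exceeds 15% of the total absolute attribution, else ('CLEAN', [])."""
    total = sum(abs(v) for v in contributions.values())
    offenders = [f for f in PROXY_FEATURES & contributions.keys()
                 if abs(contributions[f]) / total > PROXY_SHARE_THRESHOLD]
    return ("PROXY_BIAS_DETECTED", sorted(offenders)) if offenders else ("CLEAN", [])
```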

Step 3: Trust Index. A 0-100 score calculated from two penalty components:

  • Proxy Contamination Penalty: Sum of all proxy feature variance (capped at 60 points)
  • Skew Penalty: Number of clinical features with outsized influence (5 points each, capped at 20)

A Trust Index below 50 means the model should not be deployed.
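Putting the two penalties together, the Trust Index might look like this. The caps (60 and 20) and the 5-points-per-feature rule are from the description above; the 25% "outsized influence" cutoff is my assumption, since the post doesn't specify one:

```python
def trust_index(contributions, proxy_features, skew_threshold=0.25):
    """Trust Index sketch: 100 minus a proxy-contamination penalty
    (total proxy share as points, capped at 60) minus a skew penalty
    (5 points per dominant clinical feature, capped at 20)."""
    total = sum(abs(v) for v in contributions.values())
    shares = {f: abs(v) / total for f, v in contributions.items()}
    proxy_penalty = min(60.0, 100.0 * sum(s for f, s in shares.items()
                                          if f in proxy_features))
    skewed = [f for f, s in shares.items()
              if f not in proxy_features and s > skew_threshold]
    skew_penalty = min(20, 5 * len(skewed))
    return max(0, round(100 - proxy_penalty - skew_penalty))
```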

I described this math to MeDo in plain English, and it generated a working implementation. That still blows my mind.

The "Holy Shit" Demo

When I uploaded my first test payload — a simulated cancer prediction model with a deliberately leaked hospital_site_id — the dashboard lit up like a Christmas tree. In red.

The results:

  • 🔴 Raw Prediction Risk: 91.0%
  • 🔴 Trust Index: 46/100
  • 🔴 Status: PROXY_BIAS_DETECTED

The feature attribution table revealed the terrifying truth: hospital_site_id accounted for 39% of the prediction. The actual tumor size? Only 17.9%.

The model was predicting cancer based on where the patient was treated, not what their cancer looked like.

Then I uploaded a clean model — one trained only on clinical features. Everything turned green. Trust Index: 95/100. Status: CLEAN. The contrast was visceral.

Beyond a Calculator: The Compliance Workflow

A one-shot audit tool isn't a productivity platform. Here's what makes MeDo Audit a real product:

📊 Trust Index Trend Chart

Every audit is tracked over time. The dashboard shows a line chart of Trust Index scores — green above 75, amber between 50-75, red below 50. At a glance, a compliance officer can see if a model is degrading.
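The color banding reduces to a tiny mapping. The cutoffs are the ones stated above; the function name is my own:

```python
def trend_band(trust_index):
    """Map a Trust Index to the dashboard's color band:
    green above 75, amber from 50 to 75, red below 50."""
    if trust_index > 75:
        return "green"
    return "amber" if trust_index >= 50 else "red"
```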

🔄 Compare View

Select any two audits and see them side-by-side. Trust Index 46 vs 95. PROXY_BIAS_DETECTED vs CLEAN. Top features ranked. This is how teams track whether their model retraining actually fixed the bias.

🎫 Compliance Hub (Kanban Board)

When an audit fails — Trust Index below 50 or proxy bias detected — the system automatically creates a compliance ticket. The Kanban board has three columns: Open → Under Review → Resolved. Teams can drag tickets, add investigation notes, and track remediation.
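The auto-ticketing rule could be sketched like this. The class name, fields, and failure reasons are my assumptions about the workflow, not the app's real data model:

```python
from dataclasses import dataclass, field

@dataclass
class ComplianceTicket:
    audit_id: str
    reason: str
    status: str = "Open"          # Open -> Under Review -> Resolved
    notes: list = field(default_factory=list)

def ticket_for_failed_audit(audit_id, trust_index, proxy_detected):
    """Open a ticket when an audit fails: proxy bias detected,
    or Trust Index below 50. Returns None for passing audits."""
    if proxy_detected:
        return ComplianceTicket(audit_id, "proxy bias detected")
    if trust_index < 50:
        return ComplianceTicket(audit_id, "trust index below 50")
    return None
```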

🤖 AI Audit Explainer

This is the feature I'm most proud of. After an audit runs, an AI chat panel appears. But it's not a generic chatbot — it has the full audit context. Ask it "What should we fix?" and it responds:

"Remove hospital_site_id and system_timestamp from the feature set and retrain the model using only clinical variables. hospital_site_id alone accounts for 39.0% of the model's decision — a non-clinical identifier should never be a top driver."

That's not a template. That's a context-aware AI clinical data scientist, built into a no-code platform.

Why This Matters Right Now

The EU AI Act (effective 2025) classifies clinical AI as "high-risk" and requires transparency documentation for deployed models. The FDA has issued draft guidance demanding that AI/ML-based medical devices provide feature-level interpretability.

There is no widely available, easy-to-use tool that gives clinical teams this workflow. Enterprise solutions cost six figures. Open-source SHAP libraries require Python expertise and can't be deployed as a team workflow tool.

MeDo Audit fills that gap. Built with MeDo. Deployed with one click. Zero code.

Try It Yourself

🔗 Live App: https://app-bnht1hj8irk1.appmedo.com/

Quick test:

  1. Click "Load Sample" or upload a JSON payload
  2. Watch the dashboard light up with audit results
  3. Ask the AI: "Is this model safe to deploy?"
  4. Check the Compliance Hub for auto-generated tickets

The sample data is pre-loaded with a biased model. Upload it and watch everything go red. That red screen is the difference between catching bias before deployment and letting a discriminatory AI make cancer diagnoses.
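If you want to build your own test payload, here's a hypothetical shape based on the numbers from the demo above. The live app's actual JSON schema may differ, so treat this as illustrative only:

```python
import json

# Hypothetical payload for a deliberately biased model
payload = {
    "model_name": "cancer_risk_demo",
    "base_value": 0.11,
    "prediction": 0.91,
    "feature_weights": {
        "hospital_site_id": 0.39,   # deliberately leaked non-clinical proxy
        "tumor_size": 0.179,
        "mitotic_count": 0.25,
        "patient_age": 0.091,
    },
}
print(json.dumps(payload, indent=2))
```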


What I Learned

  1. No-code doesn't mean no-depth. MeDo handled full-stack generation, but the mathematical logic (SHAP normalization, proxy detection, trust scoring) was my design. The platform amplified my domain knowledge into a deployable product.

  2. The scariest bugs aren't in code — they're in data. A model can be 95% accurate and still be fundamentally biased. Accuracy metrics don't catch proxy bias. Feature attribution does.

  3. AI transparency isn't optional anymore. It's law. And the tools need to be as accessible as the models they audit.


Built for the Build with MeDo Hackathon · Try MeDo at medo.dev

#BuiltWithMeDo #AI #Healthcare #MachineLearning #NoCode #Hackathon
