DEV Community

Temiloluwa Valentine

What happens when the AI trained to save lives was never trained on yours?


This is a submission for the Gemma 4 Challenge: Build with Gemma 4

What I Built

Growing up in Nigeria, malaria was everywhere. People missed school because of it. Families spent nights in hospitals because of it. Yet most healthcare AI systems today are trained on datasets that barely represent African patients.

Fisibel is a multimodal African synthetic health data infrastructure platform powered by Google Gemma 4. Upload any health record image. Gemma 4 reads it, extracts clinical patterns, and generates privacy-safe synthetic datasets grounded in real WHO and World Bank statistics, with full OpenMetadata governance and fidelity scoring.

No real patient data ever leaves the platform. Only privacy-safe synthetic equivalents that preserve the statistical reality of African health.

Generation Page

The pipeline:

  • Upload a real health record image
  • Gemma 4 reads it multimodally and extracts clinical signals
  • WHO + World Bank APIs ground the generation in real African statistics
  • Synthetic dataset generated and scored for fidelity (94% for the Lagos malaria dataset)
  • Automatically registered in OpenMetadata with full lineage tracking
  • Data Quality analysis before training

Demo

Check Live Demo

Code

GitHub: https://github.com/Valentinetemi/fisibel

How I Used Gemma 4

I chose Gemma 4 26B MoE (gemma-4-26b-a4b-it) for 5 specific reasons:

1. Multimodal understanding for health records
Fisibel's core feature is uploading a real hospital health record image and having Gemma 4 read it directly. This is Layer 1 of the pipeline — Multimodal Ingestion. The image is compressed client-side and converted to base64. Gemma 4 processes it natively, extracting diagnoses, symptoms, treatments, geographic identifiers, and clinical patterns directly from the document without any OCR preprocessing. Gemma 4 does not just generate data; it first reads and understands real clinical documents.

2. Statistical grounding before generation
Before Gemma 4 generates a single row, Fisibel fetches live African health indicators from the WHO Global Health Observatory API and World Bank Open Data API. This is Layer 2 — Statistical Grounding. Gemma 4 uses these verified statistics as ground-truth constraints, ensuring every synthetic row aligns with real African population distributions rather than invented patterns.

Live Dataset Generation

Scientific Validation Mirror
During generation, Fisibel displays a live Scientific Validation Mirror, a real-time chart comparing the synthetic dataset's age distribution against the WHO/World Bank baseline. The blue line represents Fisibel's synthetic output. The grey line represents the real-world WHO statistical baseline. When they align closely, the data is statistically grounded. This gives researchers immediate visual confidence in their data before they even download it.
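The post does not say how "align closely" is measured. One simple way to put a number on the gap between the two lines is total variation distance. The sketch below assumes both distributions arrive as counts or shares per age bucket; the bucket labels and function names are hypothetical and not taken from Fisibel's code.

```typescript
// Counts or shares per age bucket, e.g. { '0-4': 50, '5-14': 30 }.
// Bucket labels are illustrative, not WHO categories.
type Distribution = Record<string, number>

// Rescale the values so they sum to 1.
function normalize(dist: Distribution): Distribution {
  const keys = Object.keys(dist)
  const total = keys.reduce((sum, k) => sum + dist[k], 0)
  if (total <= 0) throw new Error('distribution has no mass')
  const out: Distribution = {}
  keys.forEach(k => { out[k] = dist[k] / total })
  return out
}

// Total variation distance: half the sum of absolute differences.
// 0 means identical distributions, 1 means no overlap at all.
function totalVariationDistance(a: Distribution, b: Distribution): number {
  const p = normalize(a)
  const q = normalize(b)
  const buckets = Object.keys({ ...p, ...q })
  const sum = buckets.reduce(
    (acc, k) => acc + Math.abs((p[k] ?? 0) - (q[k] ?? 0)),
    0
  )
  return sum / 2
}
```

A value near 0 would correspond to the blue and grey lines overlapping; what threshold counts as "close enough" is a judgment call for the researcher.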

Scientific Validation Mirror

3. Synthetic generation with clinical logic
In Layer 3, Gemma 4 combines the extracted clinical patterns with the WHO and World Bank distributions to generate synthetic CSV datasets. It enforces strict categorical consistency, realistic age distributions, geographically accurate Nigerian LGAs, and clinically coherent symptom-treatment pairs.

4. Fidelity scoring
Every generated dataset is scored 0–100 using a two-layer algorithm before it reaches any training pipeline.

Layer 1 — Gemma 4 AI Evaluation (80% of score)

Gemma 4 evaluates 50 randomly sampled rows against 4 criteria:

| Criterion | Weight | What It Checks |
| --- | --- | --- |
| Feature relationship coherence | 40 pts | Do the columns make sense together? |
| Realistic distributions | 30 pts | Are values varied, not uniform or random? |
| Risk factor consistency | 20 pts | Do risk indicators align logically? |
| Logical coherence | 10 pts | Are there contradictions across rows? |

Layer 2 — Completeness Check (20% of score)

A statistical completeness function scans every cell in the full dataset:

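// Percentage of non-empty cells across the whole CSV (header row included).
// Note: splits on raw commas, so quoted fields containing commas are miscounted.
function getCompleteness(csv: string): number {
  const rows = csv.split('\n').filter(r => r.trim() !== '')
  let total = 0
  let filled = 0
  rows.forEach(r => {
    r.split(',').forEach(cell => {
      total++
      if (cell.trim() !== '') filled++
    })
  })
  // Avoid NaN (0 / 0) when the CSV has no rows
  return total === 0 ? 0 : (filled / total) * 100
}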

Final Score Formula:

finalScore = (0.8 × gemmaScore) + (0.2 × completenessScore)

This penalizes missing values even when the AI-evaluated rows look good. A dataset with flawless clinical patterns but 30% empty cells has a completeness score of 70, so it cannot score above 0.8 × 100 + 0.2 × 70 = 94.
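As a minimal sketch of the two-layer formula, assuming both inputs are already on a 0–100 scale: the criterion maxima come from the table above, while the names `CRITERIA_MAX`, `evaluationScore`, and `finalScore` are hypothetical and may not match Fisibel's code.

```typescript
// Maximum points per criterion, taken from the criteria table (sum = 100).
const CRITERIA_MAX = {
  featureCoherence: 40,
  realisticDistributions: 30,
  riskConsistency: 20,
  logicalCoherence: 10,
}

type CriterionKey = keyof typeof CRITERIA_MAX
type CriterionPoints = Record<CriterionKey, number>

// Sum the awarded points, clamping each criterion to its allowed range.
function evaluationScore(points: CriterionPoints): number {
  const keys = Object.keys(CRITERIA_MAX) as CriterionKey[]
  return keys.reduce(
    (sum, k) => sum + Math.min(Math.max(points[k], 0), CRITERIA_MAX[k]),
    0
  )
}

// 80% AI evaluation + 20% completeness.
function finalScore(evaluation: number, completeness: number): number {
  return 0.8 * evaluation + 0.2 * completeness
}
```

With a perfect evaluation of 100 and a completeness of 70, `finalScore(100, 70)` works out to 80 + 14 = 94, which is the ceiling for a dataset with 30% empty cells.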

Catalog page

The Lagos malaria dataset achieved 94%.

5. AI-powered data quality recommendations
Before any dataset is handed to a model, it passes through a dedicated quality layer that produces a Model Readiness Score (0–100). Every penalty is calculated explicitly — no black box.

Model Readiness Score Breakdown

The score starts at 100 and deducts penalties across five dimensions:

| Penalty | Max Deduction | Logic |
| --- | --- | --- |
| Missing values | 40 pts | Scales with avg missing % across all columns |
| Duplicate rows | 30 pts | Scales with % of duplicate rows in dataset |
| Low row count | 20 pts | < 100 rows = −20, 100–999 rows = −10 |
| PII exposure | variable | Scales with % of columns flagged as PII |
| Inconsistent data | 20 pts | Columns with > 50% missing or high internal duplicates |

modelReadinessScore = max(0, 100 − totalPenalty)
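The table gives the caps but not the exact scaling, so the following is only a sketch under stated assumptions: every "scales with" penalty is treated as linear in its percentage, the inconsistent-data penalty is assumed to scale with the share of flagged columns, and `piiMaxPenalty` is a hypothetical cap standing in for the "variable" PII deduction. The row-count thresholds are the ones listed.

```typescript
interface QualityStats {
  avgMissingPct: number          // average missing % across columns, 0-100
  duplicateRowPct: number        // % of duplicate rows, 0-100
  rowCount: number
  piiColumnPct: number           // % of columns flagged as PII, 0-100
  inconsistentColumnPct: number  // % of columns flagged inconsistent, 0-100
}

// Fixed deductions for small datasets, as listed in the table.
function rowCountPenalty(rowCount: number): number {
  if (rowCount < 100) return 20
  if (rowCount < 1000) return 10
  return 0
}

// Assumption: linear scaling up to each cap; piiMaxPenalty is hypothetical.
function modelReadinessScore(s: QualityStats, piiMaxPenalty = 20): number {
  const totalPenalty =
    0.4 * s.avgMissingPct +                  // up to 40 pts
    0.3 * s.duplicateRowPct +                // up to 30 pts
    rowCountPenalty(s.rowCount) +            // 0, 10, or 20 pts
    (piiMaxPenalty / 100) * s.piiColumnPct + // up to piiMaxPenalty pts
    0.2 * s.inconsistentColumnPct            // up to 20 pts
  return Math.max(0, 100 - totalPenalty)
}
```

Because the caps can add up to more than 100, the `Math.max(0, …)` floor is what keeps a very poor dataset from going negative.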

Per-Column Analysis

Every column is individually profiled:

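// Data type detection: strict matching, not guessing
function detectDataType(values: any[]): string {
  // Normalize to trimmed strings and ignore empty cells
  const strs = values.map(v => String(v ?? '').trim()).filter(s => s !== '')
  if (strs.length === 0) return 'text'

  // Assumption: a column gets a type only if every non-empty value matches it

  // boolean: exact true/false/yes/no match only
  if (strs.every(s => /^(true|false|yes|no)$/.test(s))) return 'boolean'

  // numeric: must be a valid number, not just contain digits
  if (strs.every(s => !isNaN(Number(s)))) return 'numeric'

  // date: standard date pattern matching
  if (strs.every(s => /^\d{1,2}[-/]\d{1,2}[-/]\d{2,4}$/.test(s))) return 'date'

  // everything else
  return 'text'
}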

For numeric columns, Fisibel computes:

  • Mean, median, standard deviation
  • Min / max range
  • Outlier count using the IQR method (values below Q1 − 1.5×IQR or above Q3 + 1.5×IQR)
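The IQR rule above can be sketched as follows. Quartiles can be computed under several conventions; this version assumes linear interpolation between the closest ranks, which may differ from what Fisibel uses, and the function names are hypothetical.

```typescript
// Quantile of an already sorted, non-empty array, using linear
// interpolation between the two closest ranks.
function quantile(sorted: number[], q: number): number {
  const pos = (sorted.length - 1) * q
  const lower = Math.floor(pos)
  const upper = Math.ceil(pos)
  return sorted[lower] + (sorted[upper] - sorted[lower]) * (pos - lower)
}

// Count values below Q1 - 1.5*IQR or above Q3 + 1.5*IQR.
function countIqrOutliers(values: number[]): number {
  if (values.length === 0) return 0
  const sorted = values.slice().sort((a, b) => a - b)
  const q1 = quantile(sorted, 0.25)
  const q3 = quantile(sorted, 0.75)
  const iqr = q3 - q1
  const low = q1 - 1.5 * iqr
  const high = q3 + 1.5 * iqr
  return sorted.filter(v => v < low || v > high).length
}
```

For the values 1 through 9 plus 100, the quartiles are 3.25 and 7.75, the fences are −3.5 and 14.5, and only 100 falls outside them.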

PII Detection

Every column header is scanned against a pattern library. Columns flagged as PII (names, phone numbers, national IDs, addresses) are surfaced as warnings and factored into the readiness penalty — so sensitive data never silently reaches a training pipeline.
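A header scan of this kind might look like the sketch below. The pattern list is entirely hypothetical, since the post does not show Fisibel's actual pattern library, and a header-only check will miss PII stored under neutral column names.

```typescript
// Hypothetical header patterns for common PII fields.
const PII_HEADER_PATTERNS: RegExp[] = [
  /(patient|full|first|last)[_\s-]?name|surname/i,
  /phone|mobile/i,
  /national[_\s-]?id|passport/i,
  /address|street/i,
  /e-?mail/i,
]

// Return the headers that match at least one PII pattern.
function findPiiColumns(headers: string[]): string[] {
  return headers.filter(h => PII_HEADER_PATTERNS.some(p => p.test(h)))
}

// Share of columns flagged as PII, as a percentage (0 for no columns).
function piiColumnPercentage(headers: string[]): number {
  if (headers.length === 0) return 0
  return (findPiiColumns(headers).length / headers.length) * 100
}
```

The percentage returned here is the kind of input the "PII exposure" row of the readiness penalty would consume.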

Duplicate Detection

Row-level deduplication using JSON serialization:

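function findDuplicates(rows: any[]) {
  // Serialize each row so identical rows map to the same string
  // (assumes every row lists its keys in the same order)
  const rowStrings = rows.map(row => JSON.stringify(row))
  const unique = new Set(rowStrings)
  const duplicateCount = rows.length - unique.size
  return {
    duplicateCount,
    // Guard against 0 / 0 on an empty dataset
    duplicatePercentage: rows.length === 0 ? 0 : (duplicateCount / rows.length) * 100
  }
}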

Readiness Verdict

| Score | Status |
| --- | --- |
| 80–100 | ✅ Ready for model training |
| 60–79 | ⚠️ Some data quality issues: review before training |
| 0–59 | ✗ Significant issues: address recommendations first |
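The verdict bands translate into a small lookup. This is a sketch only; the `Verdict` labels are hypothetical, and a fractional score such as 79.5 is assumed to fall into the lower band.

```typescript
type Verdict = 'ready' | 'review' | 'blocked'

// Map a 0-100 readiness score to its verdict band.
function readinessVerdict(score: number): Verdict {
  if (score >= 80) return 'ready'    // ready for model training
  if (score >= 60) return 'review'   // review before training
  return 'blocked'                   // address recommendations first
}
```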

Why MoE efficiency matters
The 26B MoE architecture activates only about 4B parameters per token during inference, so each request needs far less compute than a dense model of the same size. This matters for African healthcare researchers, who often work in resource-constrained environments. Strong clinical reasoning at a low inference cost is a good fit for this problem.

What Gemma 4 unlocked:

  • Direct reading of Nigerian hospital health records as images
  • Clinically coherent synthetic data generation grounded in WHO Global Health Observatory and World Bank statistics
  • Fidelity scoring — Gemma 4 evaluates its own outputs against real population distributions
  • AI-powered data quality recommendations before training

Data quality page

Data analyzer

OpenMetadata Integration

Fisibel uses OpenMetadata as its core governance layer via the REST API.

Every generated dataset is automatically:

  • Registered as a Table Entity under the fisibel-synthetic Database Service
  • Given full column schema with data types and descriptions
  • Tagged with the Tier3 governance tag
  • Stored with a complete description including domain, country, fidelity score, and the exact prompt used to generate it
  • Tracked with lineage — the Gemma 4 Generator pipeline is registered as the source, with an edge connecting it to every output table
  • Discoverable in the Fisibel Catalog, which pulls live from the OpenMetadata API

Entity Structure

Database Service: fisibel-synthetic (CustomDatabase)
  └── Database: default
        └── Schema: synthetic_datasets
              ├── healthcare_dataset__nigeria
              └── ...

Pipeline Service: fisibel-pipelines (CustomPipeline)
  └── Pipeline: gemma-generator
        ├── → healthcare_dataset__nigeria  (Lineage Edge)
        └── ...
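The request bodies behind this structure might be built roughly as follows. This is a sketch based on my understanding of the OpenMetadata REST API (`PUT /api/v1/tables` and `PUT /api/v1/lineage`), not Fisibel's actual code: the helper names are hypothetical, the `Tier.Tier3` tag FQN is an assumption about how "Tier3" is spelled in the API, and field names should be checked against the OpenMetadata version in use.

```typescript
interface ColumnSpec {
  name: string
  dataType: string      // an OpenMetadata column type, e.g. 'INT' or 'STRING'
  description: string
}

// Fully qualified name of the schema shown in the entity structure.
const SCHEMA_FQN = 'fisibel-synthetic.default.synthetic_datasets'

// Body for creating or updating a table entity (assumed shape).
function buildCreateTableRequest(
  name: string,
  description: string,
  columns: ColumnSpec[]
) {
  return {
    name,
    description,
    databaseSchema: SCHEMA_FQN,
    columns,
    tags: [{ tagFQN: 'Tier.Tier3' }],
  }
}

// Body for a lineage edge from the generator pipeline to an output table.
// Both ids are the entity UUIDs returned by OpenMetadata.
function buildLineageRequest(pipelineId: string, tableId: string) {
  return {
    edge: {
      fromEntity: { id: pipelineId, type: 'pipeline' },
      toEntity: { id: tableId, type: 'table' },
    },
  }
}
```

Keeping the payload builders separate from the HTTP calls makes them easy to unit test without a running OpenMetadata instance.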

Dataset stored in openmetadata

📈 Results

  • 500 synthetic patient records generated from a single Lagos hospital record
  • 94% fidelity score against WHO population distributions
  • Zero PII exposure — no real patient data stored or transmitted
  • Full OpenMetadata lineage tracking from source record to output dataset

Why It Matters
Behind every missing dataset is a real person:
a child misdiagnosed,
a treatment delayed,
a patient overlooked.

Systems built to detect disease often show reduced performance on underrepresented populations. Fisibel exists so African patients are no longer invisible to the AI systems increasingly shaping modern healthcare.

African patients deserve to be seen by the AI systems built to protect them. Fisibel is where that starts.
