Monika Sadlok

MedScan Assistant — AI medication label reader for seniors, powered by Gemma 4

This is a submission for the Gemma 4 Challenge: Build with Gemma 4

What I Built

MedScan Assistant helps elderly and visually impaired people understand their medication labels. Half of all medication errors happen because patients misread or misunderstand the label — especially people with low vision or cognitive decline.

Point your phone at any medication label → Gemma 4 reads and interprets it → the app reads the result aloud in plain language.

Two markets supported:

  • 🇺🇸 US Market — FDA Drug Facts panels, OTC/Rx labels
  • 🇵🇱 Polish Market — EU-format labels, Polish TTS

Key features:

  • 📷 Photo scan or manual text input
  • 🔊 Automatic voice readout (Web Speech API, language-matched)
  • 📅 Color-coded expiry status (green / amber / red)
  • ⚠️ Warnings and drug interactions
  • 🏥 "See a doctor if..." — symptoms requiring medical attention
  • ♿ WCAG 2.1 AA accessible (large buttons, aria-live, keyboard nav)
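The color-coded expiry status can be sketched as a small pure function. This is a hypothetical illustration, not the actual MedScan Assistant source; the function name `expiryStatus` and the three-month amber window are assumptions.

```javascript
// Hypothetical sketch of the green / amber / red expiry logic.
// The 3-month amber threshold is an assumption for illustration.
function expiryStatus(expiryDate, today = new Date()) {
  const expiry = new Date(expiryDate);
  if (expiry < today) return "red";          // already expired
  const amberWindow = new Date(today);
  amberWindow.setMonth(amberWindow.getMonth() + 3);
  if (expiry <= amberWindow) return "amber"; // expires within ~3 months
  return "green";                            // safely in date
}
```

Keeping this check deterministic in app code, rather than asking the model to judge it, means the safety-critical traffic-light never depends on a model's arithmetic.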

GitHub: https://github.com/monsad/medscan-assistant
Live demo: https://monsad.github.io/medscan-assistant

How I Used Gemma 4

Model: google/gemma-4-31b-it:free via OpenRouter

I chose Gemma 4 31B Dense for three specific reasons:

1. Native multimodal input
Gemma 4 processes label photos directly — no separate OCR step. Medication labels have irregular layouts, rotated text, and small fonts. The model understands the full visual context, not just extracted characters.
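Sending the label photo directly can be sketched with the OpenAI-compatible chat format that OpenRouter exposes. The helper name `buildVisionRequest` and the user-prompt text are illustrative, not from the actual repo:

```javascript
// Sketch of a single multimodal request: the photo goes in as a
// base64 data URL alongside the text instruction — no OCR step.
// Helper name and prompt wording are assumptions for illustration.
function buildVisionRequest(imageBase64, systemPrompt) {
  return {
    model: "google/gemma-4-31b-it:free",
    messages: [
      { role: "system", content: systemPrompt },
      {
        role: "user",
        content: [
          { type: "text", text: "Read this medication label and return the JSON schema." },
          { type: "image_url", image_url: { url: `data:image/jpeg;base64,${imageBase64}` } }
        ]
      }
    ],
    temperature: 0.2
  };
}
```

The payload would then be POSTed to OpenRouter's chat completions endpoint with an `Authorization: Bearer` header.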

2. Structured medical reasoning
The model converts complex pharmaceutical terminology into plain-language JSON. Acidum acetylsalicylicum 500mg becomes "Aspirin — pain reliever and fever reducer." I prompt it to return a strict JSON schema with fields for brand_name, directions, expiry_status, warnings, see_doctor (symptoms requiring a doctor visit), and voice_text for TTS readout.
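A response following that schema might look like this. The field names come from the post above; the values are invented for this sketch and are not real medical guidance:

```javascript
// Illustrative example of the structured output the prompt asks for.
// Field names match the schema described above; values are made up.
const sampleResponse = {
  brand_name: "Aspirin",
  directions: "Take 1 tablet every 4 to 6 hours with water.",
  expiry_status: "green",
  warnings: ["Do not combine with other blood thinners"],
  see_doctor: ["Ringing in the ears", "Black or bloody stools"],
  voice_text: "This is Aspirin, a pain reliever. Take one tablet every four to six hours."
};
```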

3. Multilingual in a single model
The same model handles both FDA Drug Facts panels (English) and Polish EU-format labels — switching via market-aware system prompts. No translation APIs, no separate models.

Why 31B Dense over E4B:
For a patient safety application, accuracy matters more than latency. The 31B Dense model gives noticeably better results on complex dosing instructions, drug interaction identification, and expiry date parsing across formats (EXP 04/26 vs 04/2026 vs APR 2026).
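A deterministic fallback for the expiry formats mentioned above could look like the sketch below. The app relies on the model for this, so `parseExpiry` is purely a hypothetical cross-check, not code from the repo:

```javascript
// Hypothetical parser for the three expiry formats named above:
// "EXP 04/26", "04/2026", and "APR 2026". Illustration only.
const MONTHS = { JAN: 1, FEB: 2, MAR: 3, APR: 4, MAY: 5, JUN: 6,
                 JUL: 7, AUG: 8, SEP: 9, OCT: 10, NOV: 11, DEC: 12 };

function parseExpiry(raw) {
  const s = raw.trim().toUpperCase().replace(/^EXP\.?\s*/, "");
  let m;
  if ((m = s.match(/^(\d{1,2})\/(\d{2}|\d{4})$/))) {             // 04/26 or 04/2026
    const year = m[2].length === 2 ? 2000 + Number(m[2]) : Number(m[2]);
    return { month: Number(m[1]), year };
  }
  if ((m = s.match(/^([A-Z]{3})\s+(\d{4})$/)) && MONTHS[m[1]]) { // APR 2026
    return { month: MONTHS[m[1]], year: Number(m[2]) };
  }
  return null; // unrecognized format — defer to the model
}
```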

const MARKETS = {
  us: { systemPrompt: "You are a medication assistant for Americans...", ttsLang: "en-US" },
  pl: { systemPrompt: "Jesteś aptecznym asystentem dla seniorów...",   ttsLang: "pl-PL" }
};
// One model. Two prompts. Two languages.

What I Learned

The hardest part was making Gemma 4 reliable enough for medical use. I use temperature: 0.2 and a strict JSON schema. The model occasionally wraps JSON in backticks, so parseGemmaResponse() strips those automatically.
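The fence-stripping described above can be sketched in a few lines. The real `parseGemmaResponse()` may differ; this just shows the idea:

```javascript
// Sketch: strip an optional ```json ... ``` wrapper before parsing.
// The actual parseGemmaResponse() in the repo may handle more cases.
function parseGemmaResponse(raw) {
  const cleaned = raw
    .trim()
    .replace(/^```(?:json)?\s*/i, "") // drop an opening ``` or ```json fence
    .replace(/```\s*$/, "");          // drop a trailing closing fence
  return JSON.parse(cleaned);
}
```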

Adding a dedicated see_doctor field — symptoms requiring medical attention — was the most impactful UX improvement. Users shouldn't have to parse warning text to figure out when something is serious. Gemma 4 identifies these situations reliably when explicitly prompted.

Test suite: 49 unit tests, zero npm dependencies:

git clone https://github.com/monsad/medscan-assistant
cd medscan-assistant
node tests/app.test.js
# ✓ 49/49 passing
