Building a 3-second medicine verifier with Gemini vision + pg_trgm fuzzy matching

#opensource #webdev #ai #india

The problem

Indian households spend roughly ₹65,000 crore every year on out-of-pocket medicine costs. A meaningful slice of that is going toward branded drugs when the exact same molecule — same active ingredient, same dosage, same CDSCO approval — is sitting in a Jan Aushadhi store for a fraction of the price. Dolo-650 retails at ₹32 a strip; the Jan Aushadhi paracetamol is ₹4.90. The cheaper medicine isn't the issue. Nobody tells the patient it's there.

We built Agada to fix that. Snap a photo of the strip. Three seconds later you know if it's CDSCO-registered, what it actually does, and what the cheaper version costs. No login. No app install. Free.

Here's how the whole thing fits together.

The pipeline

[Phone camera] -> [client compression] -> [Gemini 1.5 Flash vision]
                                              |
                                              v (structured JSON: brand, molecule, dosage)
                                      [Supabase parallel queries]
                                      |--> CDSCO registry (300k+ drugs, pg_trgm GIN)
                                      |--> Jan Aushadhi product list
                                      |--> NPPA price ceiling
                                              |
                                              v
                                       [3 result cards]

Once the image hits the backend, two stages run back to back: Gemini extracts the structured fields, then Supabase fires three independent lookups in parallel. Wall-clock time is dominated by Gemini (~1.8s); the DB queries finish in 80-150ms.
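A minimal sketch of that hand-off, for illustration only. The function names, the shape of the `sources` object, and the choice of `Promise.allSettled` (so one failed lookup does not blank the other cards) are assumptions, not Agada's actual code.

```javascript
// Hypothetical sketch: names and shapes here are illustrative assumptions.

// Check that the vision model's reply carries the three fields from the diagram.
function parseExtraction(rawText) {
  const data = JSON.parse(rawText);
  for (const field of ["brand", "molecule", "dosage"]) {
    if (typeof data[field] !== "string" || data[field].trim() === "") {
      throw new Error(`extraction is missing "${field}"`);
    }
  }
  return {
    brand: data.brand.trim(),
    molecule: data.molecule.trim(),
    dosage: data.dosage.trim(),
  };
}

// Run the three lookups concurrently. `sources` holds three async functions
// standing in for the CDSCO, Jan Aushadhi and NPPA queries.
async function lookupAll(extraction, sources) {
  const [registry, generic, ceiling] = await Promise.allSettled([
    sources.cdsco(extraction.brand),
    sources.janAushadhi(extraction.molecule),
    sources.nppa(extraction.molecule),
  ]);
  // A rejected lookup becomes null instead of failing the whole request.
  const valueOrNull = (r) => (r.status === "fulfilled" ? r.value : null);
  return {
    registry: valueOrNull(registry),
    generic: valueOrNull(generic),
    ceiling: valueOrNull(ceiling),
  };
}
```

Because the lookups are independent, total DB time is roughly that of the slowest query rather than the sum of all three.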

Fuzzy matching with pg_trgm

Drug naming in India is chaos. The CDSCO registry has Crocin 500mg Tablet IP, CROCIN 500, and Crocin Advance as distinct rows. A photo of a strip rarely matches the canonical name exactly — paper gets crumpled, fonts vary, manufacturers abbreviate.

We use PostgreSQL's pg_trgm extension with GIN indexes:

CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE INDEX drugs_brand_trgm_idx
  ON drugs USING GIN (brand_name gin_trgm_ops);

-- lookup
SELECT brand_name, molecule, manufacturer
FROM drugs
WHERE brand_name ILIKE '%' || $1 || '%'
   OR brand_name % $1        -- pg_trgm similarity operator
ORDER BY similarity(brand_name, $1) DESC
LIMIT 5;

The % operator returns true when trigram similarity exceeds pg_trgm.similarity_threshold (0.3 by default). Combined with ILIKE for substring matches, we get recall on messy OCR without drowning in false positives. Similarity scores are exposed in the UI: if similarity is below 0.4, we mark the result as AI Estimated instead of CDSCO Verified.
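To make the score concrete, here is a rough JavaScript approximation of how trigram similarity is computed, plus the 0.4 badge rule. It is a sketch, not pg_trgm itself: it handles ASCII letters and digits only, and the function names are hypothetical.

```javascript
// Approximation of pg_trgm's trigram extraction: lowercase, split on
// non-alphanumerics, pad each word with two leading spaces and one trailing.
function trigrams(text) {
  const set = new Set();
  const words = text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
  for (const word of words) {
    const padded = `  ${word} `;
    for (let i = 0; i + 3 <= padded.length; i++) {
      set.add(padded.slice(i, i + 3));
    }
  }
  return set;
}

// Shared trigrams divided by the number of distinct trigrams in either string.
function similarity(a, b) {
  const ta = trigrams(a);
  const tb = trigrams(b);
  if (ta.size === 0 || tb.size === 0) return 0;
  let shared = 0;
  for (const t of ta) if (tb.has(t)) shared++;
  return shared / (ta.size + tb.size - shared);
}

// Badge rule from the text: below 0.4 the match is shown as AI Estimated.
function badgeFor(score) {
  return score < 0.4 ? "AI Estimated" : "CDSCO Verified";
}

const score = similarity("CROCIN 500", "Crocin 500mg Tablet IP");
console.log(score.toFixed(2), badgeFor(score)); // about 0.42 in this sketch
```

The extra words in the longer registry name pull the score down even though every word of the query matches, which is why the cutoff sits well below 1.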

Source transparency over false confidence

Every result card carries a badge: CDSCO Verified, Jan Aushadhi / BPPI, or AI Estimated. The AI badge always includes the line "verify with a pharmacist" — we'd rather say "not found" than give a counterfeit-looking strip false legitimacy.

We deliberately don't store scan history. A user photographing psychiatric or oncology medication is sharing sensitive health info. No account, no record, no analytics on what people scan.

The data sources

CDSCO Approved Drug List — 300k+ rows, refreshed quarterly from cdscoonline.gov.in
Jan Aushadhi — janaushadhi.gov.in
NPPA price ceilings — nppaindia.nic.in
A hand-picked dataset for the highest-traffic molecules, with a local fallback when Supabase isn't configured

Stack

Frontend: React 18 + Vite, client-side image compression before upload (cuts 4MB phone photos to ~250KB JPEG)
Vision + NLG: Gemini 1.5 Flash (one call for extraction, one for the plain-English explanation)
DB: Supabase / PostgreSQL 15 with pg_trgm
API layer: Vercel serverless functions, ESM (type: module)
i18n: 6 Indian languages (EN, HI, TA, BN, TE, MR)
Hosting: Vercel
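
The client-side compression step can be sketched with standard browser APIs. This is one plausible approach, not necessarily the one Agada uses: the helper names, the 1280px maximum edge, and the 0.7 JPEG quality are all assumptions.

```javascript
// Hypothetical helpers; the 1280px edge and 0.7 quality are assumed values.

// Scale (width, height) down so the longer edge is at most maxEdge.
// Images that are already small enough are left at their original size.
function fitWithin(width, height, maxEdge) {
  const scale = Math.min(1, maxEdge / Math.max(width, height));
  return {
    width: Math.round(width * scale),
    height: Math.round(height * scale),
  };
}

// Browser-only: redraw the photo on a smaller canvas and re-encode it as JPEG.
async function compressImage(file, maxEdge = 1280, quality = 0.7) {
  const bitmap = await createImageBitmap(file);
  const { width, height } = fitWithin(bitmap.width, bitmap.height, maxEdge);
  const canvas = document.createElement("canvas");
  canvas.width = width;
  canvas.height = height;
  canvas.getContext("2d").drawImage(bitmap, 0, 0, width, height);
  return new Promise((resolve, reject) => {
    canvas.toBlob(
      (blob) => (blob ? resolve(blob) : reject(new Error("JPEG encoding failed"))),
      "image/jpeg",
      quality
    );
  });
}
```

Shrinking before upload keeps the request small on slow mobile connections; the right edge length depends on how much detail the vision model needs to read the strip.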

The price engine falls back to a hardcoded JAN_AUSHADHI_DB if Supabase can't be reached — every result still has a price row.
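
A minimal sketch of that fallback, assuming a hypothetical `fetchRemote` function that wraps the Supabase query. The table contents are illustrative; only the paracetamol price comes from this post.

```javascript
// Illustrative contents: only the paracetamol price is taken from the article.
const JAN_AUSHADHI_DB = {
  paracetamol: { product: "Jan Aushadhi paracetamol", price: 4.9 }, // price in ₹
};

// `fetchRemote` is a hypothetical async function wrapping the Supabase lookup.
async function getGenericPrice(molecule, fetchRemote) {
  const key = molecule.trim().toLowerCase();
  try {
    const row = await fetchRemote(key);
    if (row) return { ...row, source: "supabase" };
  } catch (err) {
    // Network or configuration failure: fall through to the local table.
  }
  const local = JAN_AUSHADHI_DB[key];
  return local ? { ...local, source: "local" } : null;
}
```

This sketch returns null for molecules missing from the local table; covering every result with a price row depends on how complete the hardcoded table is.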

What's next

A WhatsApp bot. 500M+ Indians use WhatsApp; nobody wants to open a browser at the chemist counter. Sending a photo to a WhatsApp number and getting the answer back in the same chat could widen reach well beyond the web app. Then offline mode for Tier-3 Jan Aushadhi stores with poor connectivity, and a "find the nearest Kendra" map to close the last-mile gap.

Try it: agadahealth.vercel.app
Source: github.com/AmSach/agadahealth
Team: Aman Sachan, Siddharth Lalwani, Chetna Kalra, Syed Akbar — Team Agada, Open Innovation 2026.

Built with Gemini vision + a government drug database to fight pharma price gouging. No data retention, no paywalls, no accounts.