The problem
Indian households spend roughly ₹65,000 crore every year on out-of-pocket medicine costs. A meaningful slice of that is going toward branded drugs when the exact same molecule — same active ingredient, same dosage, same CDSCO approval — is sitting in a Jan Aushadhi store for a fraction of the price. Dolo-650 retails at ₹32 a strip; the Jan Aushadhi paracetamol is ₹4.90. The cheaper medicine isn't the issue. Nobody tells the patient it's there.
We built Agada to fix that. Snap a photo of the strip. Three seconds later you know if it's CDSCO-registered, what it actually does, and what the cheaper version costs. No login. No app install. Free.
Here's how the whole thing fits together.
The pipeline
[Phone camera] -> [client compression] -> [Gemini 1.5 Flash vision]
        |
        v  (structured JSON: brand, molecule, dosage)
[Supabase parallel queries]
        |--> CDSCO registry (300k+ drugs, pg_trgm GIN)
        |--> Jan Aushadhi product list
        |--> NPPA price ceiling
        |
        v
[3 result cards]
Once the image reaches the backend, Gemini extracts the structured fields, and Supabase then fires three independent lookups in parallel. Wall-clock time is dominated by the Gemini call (~1.8s); the DB queries finish in 80-150ms.
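The extract-then-fan-out flow can be sketched as a small orchestration helper. This is a minimal sketch, not the Agada source: `parseExtraction`, `runLookups`, and the `lookups` object with its `cdsco`, `janAushadhi`, and `nppa` functions are hypothetical names standing in for the real Gemini response handling and Supabase queries. Only the three field names come from the pipeline diagram.

```javascript
// Hypothetical sketch: validate the vision model's JSON, then fan out lookups.
// Field names (brand, molecule, dosage) come from the pipeline diagram;
// everything else here is an illustrative assumption.
function parseExtraction(rawJson) {
  const data = JSON.parse(rawJson);
  const fields = ["brand", "molecule", "dosage"];
  const clean = {};
  for (const field of fields) {
    if (typeof data[field] !== "string" || data[field].trim() === "") {
      throw new Error(`extraction is missing "${field}"`);
    }
    clean[field] = data[field].trim();
  }
  return clean;
}

// Starts the three independent lookups at the same time and waits for all of
// them. Promise.allSettled keeps one failed lookup from discarding the others.
async function runLookups(extracted, lookups) {
  const [registry, generic, ceiling] = await Promise.allSettled([
    lookups.cdsco(extracted),
    lookups.janAushadhi(extracted),
    lookups.nppa(extracted),
  ]);
  const valueOrNull = (result) =>
    result.status === "fulfilled" ? result.value : null;
  return {
    registry: valueOrNull(registry),
    generic: valueOrNull(generic),
    ceiling: valueOrNull(ceiling),
  };
}
```

Because the lookups start together, the database phase costs roughly as much as the slowest query rather than the sum of all three, which is consistent with the Gemini call dominating total latency.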
Fuzzy matching with pg_trgm
Drug naming in India is chaos. The CDSCO registry has Crocin 500mg Tablet IP, CROCIN 500, and Crocin Advance as distinct rows. A photo of a strip rarely matches the canonical name exactly — paper gets crumpled, fonts vary, manufacturers abbreviate.
We use PostgreSQL's pg_trgm extension with GIN indexes:
CREATE EXTENSION IF NOT EXISTS pg_trgm;

CREATE INDEX drugs_brand_trgm_idx
    ON drugs USING GIN (brand_name gin_trgm_ops);

-- lookup ($1 = brand text to search for)
SELECT brand_name, molecule, manufacturer
FROM drugs
WHERE brand_name ILIKE '%' || $1 || '%'   -- case-insensitive substring match
   OR brand_name % $1                     -- pg_trgm similarity operator
ORDER BY similarity(brand_name, $1) DESC  -- best trigram match first
LIMIT 5;
The % operator returns true when trigram similarity exceeds pg_trgm.similarity_threshold (0.3 by default). Combined with ILIKE for substring matches, we get recall on messy OCR without falling over on false positives. Confidence scores are exposed in the UI: if similarity is below 0.4, we mark the result as AI Estimated instead of CDSCO Verified.
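To make the similarity score less of a black box, here is a rough JavaScript approximation of what pg_trgm's similarity() computes. It is an illustration for ASCII input only, not the extension's actual code and not part of Agada; `trigrams` and `trigramSimilarity` are hypothetical names.

```javascript
// Approximation of pg_trgm's similarity() for ASCII input, for illustration only.
// pg_trgm lowercases the text, splits it into alphanumeric words, pads each word
// with two leading spaces and one trailing space, then collects every
// 3-character window as a trigram.
function trigrams(text) {
  const set = new Set();
  const words = text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
  for (const word of words) {
    const padded = `  ${word} `;
    for (let i = 0; i + 3 <= padded.length; i++) {
      set.add(padded.slice(i, i + 3));
    }
  }
  return set;
}

// similarity = shared trigrams / distinct trigrams across both strings
function trigramSimilarity(a, b) {
  const ta = trigrams(a);
  const tb = trigrams(b);
  if (ta.size === 0 || tb.size === 0) return 0;
  let shared = 0;
  for (const t of ta) {
    if (tb.has(t)) shared++;
  }
  return shared / (ta.size + tb.size - shared);
}
```

Under this definition "word" and "words" share 4 of 7 distinct trigrams, so they score 4/7 (about 0.57). Case differences such as "CROCIN" versus "Crocin" do not lower the score at all.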
Source transparency over false confidence
Every result card carries a badge: CDSCO Verified, Jan Aushadhi / BPPI, or AI Estimated. The AI badge always includes the line "verify with a pharmacist" — we'd rather say "not found" than give a counterfeit-looking strip false legitimacy.
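The badge rule can be written down as a small pure function. This is a sketch under stated assumptions: the badge names, the 0.4 cut-off, and the pharmacist note come from this post, while `badgeFor`, the `source` values, and the returned object shape are hypothetical.

```javascript
// Hypothetical helper: pick the badge text for a result card.
// The 0.4 cut-off and badge names come from the article; the function shape,
// the `source` values, and the return object are assumptions.
const VERIFIED_MIN_SIMILARITY = 0.4;

function badgeFor(match) {
  if (!match) {
    // No candidate at all: report that plainly instead of guessing.
    return { badge: null, note: "not found" };
  }
  if (match.source === "jan_aushadhi") {
    return { badge: "Jan Aushadhi / BPPI", note: null };
  }
  if (match.source === "cdsco" && match.similarity >= VERIFIED_MIN_SIMILARITY) {
    return { badge: "CDSCO Verified", note: null };
  }
  // Weak registry match or model-only answer: always attach the caveat.
  return { badge: "AI Estimated", note: "verify with a pharmacist" };
}
```

Keeping the decision in one function makes it easy to check that a low-similarity registry hit can never be shown as verified.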
We deliberately don't store scan history. A user photographing psychiatric or oncology medication is sharing sensitive health info. No account, no record, no analytics on what people scan.
The data sources
- CDSCO Approved Drug List — 300k+ rows, refreshed quarterly from cdscoonline.gov.in
- Jan Aushadhi — janaushadhi.gov.in
- NPPA price ceilings — nppaindia.nic.in
- A hand-picked dataset for the highest-traffic molecules, with a local fallback when Supabase isn't configured
Stack
- Frontend: React 18 + Vite, client-side image compression before upload (cuts 4MB phone photos to ~250KB JPEG)
- Vision + NLG: Gemini 1.5 Flash (one call for extraction, one for the plain-English explanation)
- DB: Supabase / PostgreSQL 15 with pg_trgm
- API layer: Vercel serverless functions, ESM (type: module)
- i18n: 6 Indian languages (EN, HI, TA, BN, TE, MR)
- Hosting: Vercel
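The client-side compression step in the frontend bullet mostly comes down to picking a smaller target size before re-encoding. The sketch below covers only that resize arithmetic, which runs anywhere. `fitWithin` and the 1280-pixel cap in the usage note are assumptions, not values from the project.

```javascript
// Hypothetical helper: scale an image's dimensions so the longer side is at
// most maxSide pixels, preserving aspect ratio and never upscaling.
function fitWithin(width, height, maxSide) {
  if (!(width > 0 && height > 0 && maxSide > 0)) {
    throw new RangeError("width, height and maxSide must be positive");
  }
  const scale = Math.min(1, maxSide / Math.max(width, height));
  return {
    width: Math.max(1, Math.round(width * scale)),
    height: Math.max(1, Math.round(height * scale)),
  };
}
```

For a 4000x3000 phone photo and a 1280-pixel cap this returns 1280x960. In a browser, the resized frame would typically be drawn to a canvas with `drawImage` and exported with `canvas.toBlob(callback, "image/jpeg", quality)`, where the quality setting accounts for the rest of the size reduction.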
The price engine falls back to a hardcoded JAN_AUSHADHI_DB if Supabase can't be reached — every result still has a price row.
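A fallback like this can be sketched as a try-remote-then-local function. Treat it as an assumption-laden illustration: only the name JAN_AUSHADHI_DB and the ₹4.90 paracetamol price come from this post, while the table's shape, `getGenericPrice`, and the `queryRemote` callback are hypothetical.

```javascript
// Hypothetical shape for the hardcoded fallback table. Only the paracetamol
// price is taken from the article; the key format and fields are assumptions.
const JAN_AUSHADHI_DB = {
  paracetamol: { name: "Paracetamol (Jan Aushadhi)", pricePerStrip: 4.9 },
};

// queryRemote is any async function that resolves to a price row or null,
// for example a wrapper around a Supabase query.
async function getGenericPrice(molecule, queryRemote) {
  const key = molecule.trim().toLowerCase();
  try {
    const row = await queryRemote(key);
    if (row) return { ...row, source: "supabase" };
  } catch (err) {
    // Network or configuration failure: fall through to the local table.
  }
  const hasLocal = Object.prototype.hasOwnProperty.call(JAN_AUSHADHI_DB, key);
  return hasLocal ? { ...JAN_AUSHADHI_DB[key], source: "local-fallback" } : null;
}
```

Tagging each row with its `source` lets the UI tell a live price from a bundled one. In this sketch a molecule outside the local table yields `null`; how the real engine handles that case is not covered here.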
What's next
A WhatsApp bot. 500M+ Indians use WhatsApp; nobody wants to open a browser at the chemist counter. Letting people send a photo of the strip to a WhatsApp number should reach far more of them than a web app can. After that: offline mode for Tier-3 Jan Aushadhi stores with poor connectivity, and a "find the nearest Kendra" map to close the last-mile gap.
Try it: agadahealth.vercel.app
Source: github.com/AmSach/agadahealth
Team: Aman Sachan, Siddharth Lalwani, Chetna Kalra, Syed Akbar — Team Agada, Open Innovation 2026.
Built with Gemini vision + a government drug database to fight pharma price gouging. No data retention, no paywalls, no accounts.