Salisu Adeboye

Solving the "Fridge Occlusion" Problem: Building a Multi-Modal Input for Metabolic Health

Why Computer Vision isn't enough for Medical-Grade Nutrition

I just pushed Glucoforager to the App Store and Google Play. It’s an AI engine designed to help diabetics turn random fridge ingredients into glycemic-safe meals in under 60 seconds.


While building the MVP, I hit a massive wall: The "Dark Fridge" Problem.

The Technical Dilemma: CV vs. NLP
My original goal was a pure Computer Vision (CV) experience. You snap a photo, the model identifies the ingredients, and the recipe generator does the rest.

The reality? Kitchens are messy. Spinach gets hidden behind milk cartons. Labels are turned away from the lens. In a health-critical app, an 85% confidence score on ingredient recognition isn't a "success"—it’s a safety risk.
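
To make that risk concrete, here is a rough back-of-the-envelope sketch. It assumes every ingredient is recognized independently with the same per-item confidence, which is a simplification (real detections in one photo are correlated), and the function name `prob_all_correct` is made up for illustration.

```python
def prob_all_correct(per_item_confidence: float, n_items: int) -> float:
    """Chance that all n_items are identified correctly.

    Assumption: each item is recognized independently with the same
    probability. Real photos will not behave this cleanly.
    """
    return per_item_confidence ** n_items


# How a per-item 85% compounds as the fridge gets fuller.
for n in (1, 3, 5):
    print(f"{n} item(s): {prob_all_correct(0.85, n):.1%} chance the whole list is right")
```

Under those assumptions, a five-item scan at 85% per item comes out fully correct only about 44% of the time, which is why a number that looks fine per ingredient can still be a problem for the full inventory.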

My Current Solution: The Multi-Modal Pipeline
To solve this, I’ve implemented a dual-pathway input layer:

Vision Pipeline (The Scan): Optimized for identifying bulk proteins and produce.

NLP Pipeline (The Text): A natural language fallback where users can type "I have half an onion and some leftover salmon."

The system merges these inputs into a single "Current Inventory" state before hitting the recipe generation API.
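
The merge step could look something like the sketch below. This is not the app's actual code: the `InventoryItem` type, the `merge_inventory` function, and the rule that typed items override scanned ones with full confidence are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass
class InventoryItem:
    name: str
    source: str        # "scan", "text", or "scan+text"
    confidence: float  # assumption: typed items are trusted at 1.0


def normalize(name: str) -> str:
    """Lowercase and trim so 'Salmon ' and 'salmon' collapse to one key."""
    return name.strip().lower()


def merge_inventory(
    scan_detections: Iterable[Tuple[str, float]],
    typed_items: Iterable[str],
) -> Dict[str, InventoryItem]:
    """Combine scanner output and typed ingredients into one inventory."""
    inventory: Dict[str, InventoryItem] = {}

    # Keep the highest-confidence detection for each scanned ingredient.
    for name, conf in scan_detections:
        key = normalize(name)
        prev = inventory.get(key)
        if prev is None or conf > prev.confidence:
            inventory[key] = InventoryItem(key, "scan", conf)

    # Typed items confirm a scanned entry or add a new one.
    for name in typed_items:
        key = normalize(name)
        prev = inventory.get(key)
        seen_in_scan = prev is not None and prev.source.startswith("scan")
        source = "scan+text" if seen_in_scan else "text"
        inventory[key] = InventoryItem(key, source, 1.0)

    return inventory


inventory = merge_inventory(
    scan_detections=[("Salmon", 0.91), ("milk", 0.62)],
    typed_items=["salmon", "onion"],
)
for item in inventory.values():
    print(item)
```

Keeping the `source` on each item means the recipe step can still tell a typed ingredient from one the scanner only guessed at.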

The Conflict: UX Friction vs. Clinical Accuracy
Here is where I need your perspective. If the CV model only identifies 3 out of 5 items in a photo, should the system:

Auto-complete based on "common ingredient pairings" (high magic, high risk)?

Interrupt the flow and force a manual text confirmation (high friction, high safety)?
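
One possible middle ground between those two options is to interrupt only when the scanner is unsure. The sketch below is a hypothetical gate, not the app's real logic: the `CONFIRM_THRESHOLD` value of 0.95 and the function names are placeholders, and it only covers low-confidence detections, since items the scanner missed entirely never show up in its output.

```python
from typing import List, Tuple

# Hypothetical cutoff; the right value would have to come from real test data.
CONFIRM_THRESHOLD = 0.95


def split_by_confidence(
    detections: List[Tuple[str, float]],
    threshold: float = CONFIRM_THRESHOLD,
) -> Tuple[List[str], List[str]]:
    """Separate detections into auto-accepted and needs-confirmation lists."""
    accepted = [name for name, conf in detections if conf >= threshold]
    pending = [name for name, conf in detections if conf < threshold]
    return accepted, pending


def next_step(
    detections: List[Tuple[str, float]],
    threshold: float = CONFIRM_THRESHOLD,
) -> str:
    """Ask the user only when at least one detection is below the threshold."""
    _, pending = split_by_confidence(detections, threshold)
    return "ask_user" if pending else "generate_recipe"


detections = [("salmon", 0.98), ("spinach", 0.85), ("onion", 0.97)]
accepted, pending = split_by_confidence(detections)
print("accepted:", accepted)
print("confirm with user:", pending)
print("next step:", next_step(detections))
```

Nothing is ever auto-completed here, so the friction scales with how uncertain the scan actually was instead of hitting every user on every photo.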

Help me "Break" the Beta
I am currently scaling to 10,000 users and I need fellow devs to stress-test the recognition logic.

How to help:

Download the app (iOS or Android): www.glucoforager.com

Try to "trick" the scanner with low light or overlapping items.

Drop a comment here: Which do you find yourself using more—the Scan or the Text Input?

I’m specifically looking for feedback on latency and the "confidence threshold" for ingredient identification.
