DEV Community

P. Sai Nieojitha
my new project

How I Stopped My Meal Agent From Inventing Ingredients

Most AI meal assistants feel impressive for about five minutes.

Then you realize they:

suggest recipes using ingredients you don’t have,
forget dietary restrictions,
ignore previous reactions,
and rerun expensive reasoning on every interaction as if nothing had happened.

I wanted to see what would happen if a household AI agent behaved like a constrained system rather than a creative autocomplete engine.

So I built ChefOS: a memory-aware meal planning agent with deterministic verification, runtime model routing, and long-term behavioral memory.

The interesting part wasn’t generating recipes.

It was making the agent trustworthy.

The Problem With “Creative” Meal Agents

Most LLM-based recipe systems optimize for fluency.

That’s great until the model confidently recommends:

paneer when none exists,
spicy meals after reflux incidents,
or recipes requiring ingredients the user explicitly removed from inventory.

The deeper issue is that most assistants treat constraints as soft suggestions instead of hard rules.

Inventory becomes context instead of a whitelist.

Health restrictions become hints instead of safety requirements.

I wanted to flip that architecture.

The Core Design

ChefOS runs on three ideas:

Persistent memory using Hindsight
Runtime routing + verifier gating using cascadeflow
Deterministic inventory validation after inference

The flow looks roughly like this:

User Request
↓
Memory Recall (Hindsight)
↓
Constraint-Aware Prompt
↓
cascadeflow Routing
↓
Primary Model Response
↓
Verifier Gate
↓
Inventory + Safety Validation
↓
Render Response

The biggest architectural decision was separating generation from verification.

That changed everything.
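A minimal sketch of that separation, assuming a stubbed `generate` function in place of the real model call and a pure `verify` function. Both names, and the `planMeal` wrapper, are hypothetical and are not taken from ChefOS, Hindsight, or cascadeflow:

```javascript
// Hypothetical sketch: generation proposes, verification decides.
// `generate` is a stub standing in for any model call so the example runs.
function generate(request) {
  // A real implementation would call an LLM. This stub deliberately
  // returns an ingredient that is not in the kitchen.
  return { name: "Egg Fried Rice", ingredients_used: ["eggs", "rice", "soy sauce"] };
}

// Pure function: no model involved, so the same input always gives the same verdict.
function verify(recipe, inventory) {
  const available = new Set(inventory.map(i => i.trim().toLowerCase()));
  const missing = recipe.ingredients_used.filter(
    i => !available.has(i.trim().toLowerCase())
  );
  return { ok: missing.length === 0, missing };
}

function planMeal(request, inventory) {
  const recipe = generate(request);          // creative step
  const verdict = verify(recipe, inventory); // deterministic step
  return { recipe, verdict };
}

const result = planMeal("quick dinner", ["Eggs", "Soy Sauce", "Butter"]);
console.log(result.verdict); // { ok: false, missing: [ 'rice' ] }
```

Because `verify` never calls a model, it can be unit-tested like any other function, which is what makes the inventory a whitelist rather than context.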

Memory Became More Useful Than I Expected

I initially added Hindsight to persist user preferences across sessions.

But the more interesting behavior came from storing negative outcomes.

Example:

Dad got acid reflux after spicy curry.

Instead of storing raw chat history, the agent reflects that interaction into structured behavioral memory:

{
  "type": "safety_constraint",
  "rule": "Avoid spicy meals for Dad",
  "confidence": 93
}

Over time, ChefOS builds “mental models” from repeated interactions.

That created noticeably different behavior after multiple sessions:

fewer unsafe suggestions,
less repetitive reasoning,
and more grounded recommendations.

The agent stopped feeling stateless.
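To make the idea of storing negative outcomes concrete, here is a rough sketch using a plain in-memory class. It is not the Hindsight API: the `BehavioralMemory` class, its method names, the starting confidence of 70, and the +10 reinforcement step are all assumptions chosen for illustration.

```javascript
// Hypothetical in-memory stand-in for a memory store; this is NOT the Hindsight API.
class BehavioralMemory {
  constructor() {
    this.rules = new Map(); // key: rule text, value: rule object
  }

  // Record a negative outcome as a structured rule instead of raw chat text.
  // Repeated observations of the same rule raise its confidence (capped at 99).
  recordNegativeOutcome(person, trigger) {
    const rule = `Avoid ${trigger} meals for ${person}`;
    const existing = this.rules.get(rule);
    if (existing) {
      existing.observations += 1;
      existing.confidence = Math.min(99, existing.confidence + 10);
    } else {
      this.rules.set(rule, {
        type: "safety_constraint",
        rule,
        person,
        trigger,
        confidence: 70, // assumed starting value, not taken from ChefOS
        observations: 1,
      });
    }
    return this.rules.get(rule);
  }

  // Return the rules for one person that are strong enough to enforce.
  recall(person, minConfidence = 60) {
    return [...this.rules.values()].filter(
      r => r.person === person && r.confidence >= minConfidence
    );
  }
}

const memory = new BehavioralMemory();
memory.recordNegativeOutcome("Dad", "spicy");
memory.recordNegativeOutcome("Dad", "spicy"); // second incident reinforces the rule
console.log(memory.recall("Dad").map(r => `${r.rule} (${r.confidence})`));
```

Keying on the rule text means a second reflux incident strengthens the existing rule instead of adding a duplicate entry.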

Why I Added a Verifier After Generation

This was probably the most important system change.

Even with strong prompting, the generation model still occasionally hallucinated ingredients.

So instead of trusting the output directly, I added a second verifier pass.

The verifier checks:

ingredient availability,
allergy/reflux constraints,
preference conflicts,
and inventory consistency.

A simplified version looks like this:

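// Normalize both sides so "Rice" in a recipe matches "rice" in the fridge.
const normalize = name => name.trim().toLowerCase();
const available = new Set(fridgeInventory.map(normalize));

// Any ingredient the recipe uses that is not in the fridge is invalid.
const invalidIngredients = recipe.ingredients_used.filter(
  item => !available.has(normalize(item))
);

if (invalidIngredients.length > 0) {
  blockRecipe();
}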

This sounds simple, but it dramatically improved reliability: a recipe that uses an ingredient missing from the inventory never reaches the user.
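The snippet above covers only ingredient availability. A fuller sketch that also applies remembered safety and preference rules might look like the following. The data shapes are assumptions: `recipe.tags`, `recipe.diners`, and the `person` and `trigger` fields on each rule are hypothetical, and the inventory consistency check is left out.

```javascript
// Hypothetical verifier combining inventory and rule checks.
// The recipe and rule shapes are assumptions made for this sketch.
function verifyRecipe(recipe, inventory, rules) {
  const normalize = s => s.trim().toLowerCase();
  const available = new Set(inventory.map(normalize));
  const issues = [];

  // 1. Ingredient availability: every ingredient must be on the whitelist.
  for (const item of recipe.ingredients_used) {
    if (!available.has(normalize(item))) {
      issues.push({ kind: "missing_ingredient", detail: normalize(item) });
    }
  }

  // 2. Safety and preference rules: a rule fires when the recipe carries
  //    its trigger tag and the person it protects is one of the diners.
  const tags = new Set((recipe.tags || []).map(normalize));
  const diners = recipe.diners || [];
  for (const rule of rules) {
    if (tags.has(normalize(rule.trigger)) && diners.includes(rule.person)) {
      issues.push({ kind: rule.type, detail: rule.rule });
    }
  }

  return { ok: issues.length === 0, issues };
}

const verdict = verifyRecipe(
  { name: "Spicy Paneer Curry", ingredients_used: ["Paneer", "Onion"], tags: ["spicy"], diners: ["Dad"] },
  ["onion", "tomato"],
  [{ type: "safety_constraint", rule: "Avoid spicy meals for Dad", person: "Dad", trigger: "spicy" }]
);
console.log(verdict.issues.map(i => i.kind)); // [ 'missing_ingredient', 'safety_constraint' ]
```

Returning a list of issues rather than a single boolean lets the UI explain exactly what to change.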

The interesting part is that the UX changed too.

Instead of showing:

“Recipe blocked”

the assistant now responds conversationally:

✨ Almost Ready

Egg Fried Rice would work well here, but rice isn’t currently available in your kitchen.

Scrambled Eggs is fully ready to make right now.

That tiny change made the system feel much more intentional.
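One way to produce that kind of message is to render it from the verifier's output rather than asking the model to phrase it. The `describeOutcome` helper and its argument shapes below are hypothetical, a sketch rather than the ChefOS implementation:

```javascript
// Hypothetical helper: turns a blocked recipe and an optional fallback
// into a friendly message instead of an error string.
function describeOutcome(blocked, fallback) {
  // blocked: { name, missing: string[] }, fallback: { name } or null
  const missing = blocked.missing.join(", ");
  const verb = blocked.missing.length === 1 ? "isn't" : "aren't";
  const lines = [
    "✨ Almost Ready",
    `${blocked.name} would work well here, but ${missing} ${verb} currently available in your kitchen.`,
  ];
  if (fallback) {
    lines.push(`${fallback.name} is fully ready to make right now.`);
  }
  return lines.join("\n\n");
}

console.log(
  describeOutcome({ name: "Egg Fried Rice", missing: ["rice"] }, { name: "Scrambled Eggs" })
);
```

Because the text is built deterministically from the missing-ingredient list, the message cannot mention an ingredient the verifier did not flag.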

cascadeflow Ended Up Solving Two Problems

I originally added cascadeflow for model routing.

Simple requests, such as inventory updates, short suggestions, and preference retrieval, run on smaller models.

More complex requests, such as multi-user dietary conflicts, safety reasoning, and verifier escalation, route to larger reasoning models.

The obvious benefit was lower cost and latency.

The less obvious benefit was explainability.

I exposed the routing decisions directly in the UI:

Llama 8B → Escalated → 70B → Verifier

That made the agent feel much less like a black box.
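The routing idea and the visible trace can be sketched without any library. The code below is not the cascadeflow API: `needsEscalation`, `route`, and the keyword heuristic are hypothetical, and the sketch assumes every response passes through the verifier. Only the model labels come from the trace shown above.

```javascript
// Hypothetical routing sketch. Model labels mirror the trace in the article;
// the functions and heuristics are assumptions, not the cascadeflow API.
const SMALL = "Llama 8B";
const LARGE = "70B";

// Cheap heuristic: escalate when several people are involved
// or the request mentions safety-related terms.
function needsEscalation(request) {
  const multiUser = request.diners.length > 1;
  const safety = /allerg|reflux|safe/i.test(request.text);
  return multiUser || safety;
}

function route(request) {
  const trace = [SMALL];
  let model = SMALL;
  if (needsEscalation(request)) {
    trace.push("Escalated", LARGE);
    model = LARGE;
  }
  trace.push("Verifier"); // assumption: every response passes the verifier gate
  return { model, trace: trace.join(" → ") };
}

console.log(route({ text: "add milk to inventory", diners: ["Dad"] }).trace);
console.log(route({ text: "dinner that is safe for reflux", diners: ["Dad", "Mom"] }).trace);
```

Returning the trace alongside the chosen model is what allows the UI to show the routing path without extra logging.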

The Most Interesting UX Decision

I removed almost every “error-style” message.

No:

validation failure,
blocked response,
constraint violation,
rejected recipe.

Instead:

“Almost Ready”
“Tailored for Comfort”
“Safety Checked & Ready”

This sounds cosmetic, but it changed how the system felt during real interactions.

The agent stopped feeling like a debugger and started feeling collaborative.

What Actually Improved After Adding Memory

The clearest improvement wasn’t recipe quality.

It was behavioral consistency.

Without memory:

recommendations felt generic,
health rules were forgotten,
and repeated context had to be re-entered constantly.

With memory:

reflux-safe meals persisted,
ingredient dislikes stayed enforced,
and recommendations adapted over time.

Interaction 1 felt like a chatbot.

Interaction 20 felt much more like a constrained assistant with continuity.

That gap ended up being the most convincing part of the system.

Things I’d Improve Next

A few areas still need work:

stronger inventory normalization,
confidence decay for stale preferences,
multi-user conflict resolution,
and better verifier parsing for ambiguous ingredients.
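As an illustration of what confidence decay could look like, here is a small sketch using an exponential half-life. Everything in it is an assumption: the `effectiveConfidence` name, the `lastSeen` field, the 90-day half-life, and the choice to exempt safety constraints from decay.

```javascript
// Hypothetical decay: a preference's confidence halves every `halfLifeDays`
// without reinforcement. Safety constraints are assumed never to decay.
const MS_PER_DAY = 24 * 60 * 60 * 1000;

function effectiveConfidence(rule, now, halfLifeDays = 90) {
  if (rule.type === "safety_constraint") {
    return rule.confidence; // assumption: health rules stay at full strength
  }
  const days = (now - rule.lastSeen) / MS_PER_DAY;
  return rule.confidence * Math.pow(0.5, days / halfLifeDays);
}

const lastSeen = new Date(0);
const ninetyDaysLater = new Date(90 * MS_PER_DAY);

console.log(effectiveConfidence({ type: "preference", confidence: 80, lastSeen }, ninetyDaysLater)); // 40
console.log(effectiveConfidence({ type: "safety_constraint", confidence: 93, lastSeen }, ninetyDaysLater)); // 93
```

Computing the decayed value at read time keeps the stored confidence untouched, so a later reinforcement can simply reset `lastSeen`.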

I also want to experiment with local models for low-cost background memory reflection.

Final Thought

The most useful change wasn’t making the agent smarter.

It was making the agent more constrained.

Once memory, routing, and verification worked together, the system stopped behaving like a generic assistant and started behaving more like software with accountability.

That distinction matters more than I expected.

Links:

Hindsight: Hindsight GitHub
cascadeflow: cascadeflow GitHub
