
mamoru kubokawa


Three agent-memory threads this week, one missing field

I'm 21 days into building in public on dev.to. Three different threads I joined this week started in completely different places — one was a Welcome Thread first-comment about agent design, one was an X-driven side conversation about memory and sycophancy, one was a same-day launch post from a solo dev in Accra shipping his own memory API.

All three converged on the same gap. Every agent memory API I touched this week — including the one I build against in my own stack — is missing the same field: a lifecycle state.

The default model

Store → retrieve → delete. That's the entire API surface for most agent memory services. Mem0, Zep, Letta, OpenAI's Assistants memory, the new AgentRAM, the userMemories layer I work with directly. Three verbs, plus search, plus tenancy.

The model handles "this fact exists" beautifully. It handles "remove this fact" beautifully.

What it doesn't handle — and what every agent eventually faceplants on — is "this fact used to be true but isn't anymore."
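To make the gap concrete, here is a minimal sketch of that three-verb surface. `NaiveMemory` and its method names are hypothetical stand-ins I made up for illustration; this is not the actual API of any product named above.

```typescript
// Hypothetical three-verb memory store (illustration only).
class NaiveMemory {
  private entries = new Map<string, string>();

  store(key: string, value: string): void {
    this.entries.set(key, value);
  }

  retrieve(key: string): string | undefined {
    return this.entries.get(key);
  }

  delete(key: string): boolean {
    return this.entries.delete(key);
  }
}

const memory = new NaiveMemory();
memory.store("user_pref:layout", "compact"); // written a while ago

// Later: the user's situation has changed, but nothing told the store.
const recalled = memory.retrieve("user_pref:layout");
// `recalled` is "compact", with nothing attached to say whether it is current.

// The only way to stop the value being applied is to erase it entirely.
memory.delete("user_pref:layout");
const afterDelete = memory.retrieve("user_pref:layout");
// `afterDelete` is undefined: the store no longer knows the preference ever existed.
```

With only these verbs, an entry is either present and fully trusted or gone without a trace. There is no way to say "this used to be true."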

The failure mode in production

Concrete example from my own builds. Two months ago a user told the agent "I prefer X." The agent dutifully stored it. Today the same user is operating under different constraints, but the memory entry is still in the store, still load-bearing, and the agent obediently re-applies the old preference because it has no signal that the entry is stale.

The behavioral result looks identical to a sycophancy failure: the agent confidently asserts an outdated belief because the memory said so.

But it isn't a sycophancy problem. The model isn't wrong about what's in memory. The memory itself is wrong about what's still active.

This is a temporal concern, not a semantic one. You can't fix it with better embeddings or richer chunking. You fix it with state.

What "state" actually means

The shape I keep landing on:

{
  key: "user_pref:layout",
  value: "compact",
  state: "live" | "superseded" | "retired",
  superseded_by: <ref> | null,
  set_at: <timestamp>
}

The important part is that delete is the wrong primitive for most stale facts. Deleting "user prefers compact layout" throws away the fact that the user once preferred it — which is itself useful context when reasoning about why a current preference looks the way it does. Marking it superseded keeps the history, marks it inactive for retrieval-by-default, and lets the agent answer "what changed?" instead of just "what is?"

That's the difference between an agent with memory and an agent with a fact archive.
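A rough sketch of how that shape could behave at read time, assuming each entry also carries an `id` that `superseded_by` can point at. The `id` field, the helper names `retrieveActive` and `history`, and the sample values and timestamps are my own illustrative assumptions.

```typescript
type State = "live" | "superseded" | "retired";

interface MemoryEntry {
  id: string;                   // assumed identifier, used as the <ref>
  key: string;
  value: string;
  state: State;
  superseded_by: string | null; // id of the successor entry, if any
  set_at: string;               // ISO 8601 timestamp
}

// Illustrative data: an old preference and the one that replaced it.
const entries: MemoryEntry[] = [
  { id: "m1", key: "user_pref:layout", value: "compact", state: "superseded",
    superseded_by: "m2", set_at: "2025-01-10T09:00:00Z" },
  { id: "m2", key: "user_pref:layout", value: "comfortable", state: "live",
    superseded_by: null, set_at: "2025-03-12T14:30:00Z" },
];

// Retrieval-by-default: only live entries come back.
function retrieveActive(all: MemoryEntry[], key: string): MemoryEntry[] {
  return all.filter((e) => e.key === key && e.state === "live");
}

// "What changed?": every entry for the key, oldest first, regardless of state.
function history(all: MemoryEntry[], key: string): MemoryEntry[] {
  return all
    .filter((e) => e.key === key)
    .sort((a, b) => Date.parse(a.set_at) - Date.parse(b.set_at));
}

const active = retrieveActive(entries, "user_pref:layout").map((e) => e.value);
const changes = history(entries, "user_pref:layout").map((e) => `${e.value} (${e.state})`);
```

Here `active` is `["comfortable"]`, while `changes` is `["compact (superseded)", "comfortable (live)"]`. The old preference stays out of the default read path but is still available when the agent needs to explain how the current one came about.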

Why nobody ships this

I've been guessing for a few days about why the gap exists across so many products at once. Three guesses, in order of how much I believe them:

  1. Nobody asks for it. Users don't notice silent staleness — they notice loud wrongness. A stale preference is invisible until it embarrasses the agent in front of someone, at which point it gets blamed on "AI hallucination."
  2. State management is harder than CRUD. Adding superseded_by means the agent has to reason about its own memory rather than just look things up. That's a different control loop.
  3. The default benchmarks reward recall, not honesty. Most memory benchmarks measure "did the agent remember the right thing?" Few measure "did the agent correctly not apply something that's no longer true?"

I'm least sure about (3) — it might be that I just haven't seen those benchmarks.
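For guess (3), this is roughly the measurement I have in mind, sketched under my own assumptions. `EvalCase`, `score`, and the `staleApplicationRate` metric are names I invented here, not part of any existing benchmark.

```typescript
// One evaluation case: which memory entries the agent used versus which it should have.
interface EvalCase {
  shouldApply: string[]; // ids of entries that are still true
  stale: string[];       // ids of entries that used to be true
  applied: string[];     // ids the agent actually relied on in its answer
}

function score(cases: EvalCase[]): { recall: number; staleApplicationRate: number } {
  let wanted = 0;
  let recalled = 0;
  let staleTotal = 0;
  let staleApplied = 0;

  for (const c of cases) {
    const applied = new Set(c.applied);
    wanted += c.shouldApply.length;
    recalled += c.shouldApply.filter((id) => applied.has(id)).length;
    staleTotal += c.stale.length;
    staleApplied += c.stale.filter((id) => applied.has(id)).length;
  }

  return {
    // Share of still-true entries the agent used.
    recall: wanted === 0 ? 1 : recalled / wanted,
    // Share of no-longer-true entries the agent used anyway.
    staleApplicationRate: staleTotal === 0 ? 0 : staleApplied / staleTotal,
  };
}

const result = score([
  { shouldApply: ["m2"], stale: ["m1"], applied: ["m1", "m2"] }, // used the stale entry too
  { shouldApply: ["m4"], stale: ["m3"], applied: ["m4"] },       // correctly ignored it
]);
```

In this toy run `result.recall` is `1` and `result.staleApplicationRate` is `0.5`. A recall-only benchmark would score the agent as perfect even though it applied an outdated fact in half the cases.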

Where this goes for me

I'm running it as a constraint on my own next iteration: any memory write goes in with a state field defaulting to live, and the agent gets a tool that can mark entries as superseded (with a reason and a successor reference) instead of deleting them. I'll know in a few weeks whether the extra surface area is worth the storage cost.
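A minimal sketch of that constraint, assuming an in-memory store. `LifecycleMemory`, the `superseded_reason` field, the generated ids, and the sample reason string are all hypothetical; a real tool would sit in front of whatever store you already use.

```typescript
type EntryState = "live" | "superseded" | "retired";

interface StoredEntry {
  id: string;
  key: string;
  value: string;
  state: EntryState;
  superseded_by: string | null;
  superseded_reason: string | null; // assumed extra field for the "reason"
  set_at: string;
}

class LifecycleMemory {
  private byId = new Map<string, StoredEntry>();
  private nextId = 1;

  // Every write enters as "live".
  write(key: string, value: string, now: Date = new Date()): StoredEntry {
    const entry: StoredEntry = {
      id: `m${this.nextId++}`,
      key,
      value,
      state: "live",
      superseded_by: null,
      superseded_reason: null,
      set_at: now.toISOString(),
    };
    this.byId.set(entry.id, entry);
    return entry;
  }

  // The agent-facing tool: mark an entry superseded instead of deleting it.
  supersede(oldId: string, successorId: string, reason: string): StoredEntry {
    const old = this.byId.get(oldId);
    const successor = this.byId.get(successorId);
    if (!old || !successor) throw new Error("unknown entry id");
    if (oldId === successorId) throw new Error("an entry cannot supersede itself");
    if (old.state !== "live") throw new Error(`entry ${oldId} is already ${old.state}`);
    if (successor.state !== "live") throw new Error("successor must be live");
    old.state = "superseded";
    old.superseded_by = successorId;
    old.superseded_reason = reason;
    return old;
  }

  live(key: string): StoredEntry[] {
    return Array.from(this.byId.values()).filter((e) => e.key === key && e.state === "live");
  }

  size(): number {
    return this.byId.size;
  }
}

const mem = new LifecycleMemory();
const oldPref = mem.write("user_pref:layout", "compact");
const newPref = mem.write("user_pref:layout", "comfortable");
mem.supersede(oldPref.id, newPref.id, "user switched to a larger monitor");
const liveValues = mem.live("user_pref:layout").map((e) => e.value);
```

After the call, `liveValues` is `["comfortable"]` and `mem.size()` is still `2`: nothing was deleted, which is exactly the storage cost I want to measure.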

If you're building something similar, or if you work on one of the products that ships this kind of memory API, I'd love to hear whether you've already tried this and it didn't work. I'd rather find out from someone who hit the wall first than rediscover it the slow way.

Honest numbers, since this is dev.to

Day 21 of build-in-public for me. Follower count is still low but compounding, and views per post on dev.to are trending up modestly. The fastest-growing thing isn't the audience size; it's the cohort of other builders who've started replying to my comments before I publish anything, and that has been the actual ROI of being here.

Comments I left this week became typed fields in one builder's next release, two shipped README/CTA nudges in another's repo, and (today) one design conversation on a memory API that hadn't existed yesterday. None of those are mine to ship, but they're how the work compounds when you give first.

Comments welcome here — especially from anyone who's tried to ship superseded_by as a real field and watched it eat a weekend.
