I spent a couple of weeks asking people a pretty basic question. If you are actually running agents, past the demo, in something resembling product...
This is the memory problem that keeps showing up in real agent work. Semantic similarity is useful for recall, but it has no concept of outcome quality. The missing layer is not just more memory, it is scored memory: what worked, what failed, what was later corrected, and under what constraints. Otherwise the agent keeps retrieving familiar mistakes with high confidence.
Scored memory is the right frame, and the part I keep snagging on is where the score actually comes from. Outcome quality is not knowable at write time. You find out much later whether acting on a memory helped, if you capture it at all, and most setups never close that loop. So you end up scoring on proxies instead: recency, whether a human nodded at it, whether it at least did not error. And those proxies are exactly where the familiar mistakes walk back in with high confidence. "Under what constraints" is the piece almost nobody stores, and I suspect it is the piece that matters most.
Exactly. The score has to be allowed to mature after the write. I would start with weak priors at capture time, then update them when the memory is reused: did it help complete the task, did a human correct it, did it cause a rollback, did it only apply under a constraint that was missing? Without that feedback loop, "memory quality" becomes a nicer name for recency plus vibes.
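One way to let a score mature after the write is a Beta-style counter that starts from a weak prior and shifts on each reuse. The sketch below only illustrates that idea; the class name `ScoredMemory`, the `record_reuse` method, and the heavier weight given to a rollback are hypothetical choices, not taken from any existing library.

```python
from dataclasses import dataclass


@dataclass
class ScoredMemory:
    """A memory whose quality score is updated each time it is reused."""

    text: str
    # Weak prior at capture time: one pseudo-success and one pseudo-failure,
    # so a brand-new memory scores 0.5 and is easy to move.
    successes: float = 1.0
    failures: float = 1.0

    def record_reuse(self, helped: bool, weight: float = 1.0) -> None:
        """Fold one reuse outcome into the score.

        `weight` lets stronger signals (for example a rollback or an explicit
        human correction) count for more than a quiet task completion.
        """
        if weight <= 0:
            raise ValueError("weight must be positive")
        if helped:
            self.successes += weight
        else:
            self.failures += weight

    @property
    def score(self) -> float:
        """Posterior mean of the success rate, between 0 and 1."""
        return self.successes / (self.successes + self.failures)


memory = ScoredMemory("Use the staging database for migration dry runs")
for _ in range(3):
    memory.record_reuse(helped=True)           # three tasks completed
memory.record_reuse(helped=False, weight=3.0)  # one rollback, weighted higher
print(round(memory.score, 2))
```

The point of the weak prior is that early evidence moves the score quickly, while a memory with a long track record needs a strong signal to shift.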
Recency plus vibes is the most honest description of the current state I have read. I am keeping that.
The weak-priors-maturing-on-reuse model is right, and the place I get stuck implementing it is attribution. When a task succeeds, several memories were usually in context, not one. Crediting all of them rewards the passengers that happened to ride along, and crediting the top retrieved one is often just rewarding whatever was most similar, which is the exact bias the score was supposed to correct. Your rollback and human correction signals are cleaner precisely because they tend to point at a specific memory, the one that caused the revert, rather than the whole retrieved set. The diffuse positive case is the one I cannot cleanly assign.
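The asymmetry described here can be made concrete with a small credit-assignment sketch. It does not solve the diffuse positive case; it only dampens it by splitting one unit of credit across everything that was in context, while a rollback or correction that names a specific memory lands on that memory in full. The function name `assign_credit` and the equal-split rule are assumptions made for illustration.

```python
from typing import Optional


def assign_credit(
    retrieved_ids: list[str],
    succeeded: bool,
    blamed_id: Optional[str] = None,
) -> dict[str, tuple[float, float]]:
    """Return (success_delta, failure_delta) for each retrieved memory.

    - If `blamed_id` is given (a rollback or human correction pointed at one
      memory), that memory takes a full failure and the others are untouched.
    - Otherwise the outcome is diffuse, so one unit of credit (or blame) is
      split equally, which keeps passengers from collecting full credit.
    """
    updates = {mid: (0.0, 0.0) for mid in retrieved_ids}
    if not retrieved_ids:
        return updates

    if blamed_id is not None:
        if blamed_id not in updates:
            raise ValueError(f"{blamed_id!r} was not in the retrieved set")
        updates[blamed_id] = (0.0, 1.0)
        return updates

    share = 1.0 / len(retrieved_ids)
    for mid in retrieved_ids:
        updates[mid] = (share, 0.0) if succeeded else (0.0, share)
    return updates


# Diffuse success: four memories were in context, each gets a quarter.
print(assign_credit(["a", "b", "c", "d"], succeeded=True))

# Targeted failure: the revert was traced to memory "b" only.
print(assign_credit(["a", "b"], succeeded=False, blamed_id="b"))
```

Equal splitting is the crudest possible rule. It avoids rewarding the most similar memory by default, but it still cannot tell which of the four actually mattered.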
The constraint signal you mentioned, did it only apply under a condition that was missing, is the one I think is most underrated. A memory that worked ten times can be carrying a hidden precondition nobody wrote down, and it keeps scoring well right up until the context shifts out from under it and it fails for a reason the score never captured. Have you found a way to surface that the precondition exists before the failure teaches it to you, or is it always after the fact?
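One heuristic for surfacing a hidden precondition before it causes a failure, offered as an assumption rather than an established method: log the context of every successful reuse, and treat any context field that never varied across those successes as a candidate precondition. If the current context departs from that invariant, the memory is being used somewhere its track record never covered. The function names and the shape of the context dictionaries below are hypothetical.

```python
from typing import Any

Context = dict[str, Any]


def candidate_preconditions(
    success_contexts: list[Context], min_runs: int = 3
) -> Context:
    """Return context fields whose value was identical in every success.

    With fewer than `min_runs` successes there is too little evidence to call
    anything an invariant, so nothing is returned.
    """
    if len(success_contexts) < min_runs:
        return {}
    first, rest = success_contexts[0], success_contexts[1:]
    return {
        key: value
        for key, value in first.items()
        if all(key in ctx and ctx[key] == value for ctx in rest)
    }


def untested_shifts(
    success_contexts: list[Context], current: Context
) -> dict[str, tuple[Any, Any]]:
    """Map each candidate precondition the current context breaks to
    (value seen in every success, value now). A missing field shows as None."""
    candidates = candidate_preconditions(success_contexts)
    return {
        key: (expected, current.get(key))
        for key, expected in candidates.items()
        if current.get(key) != expected
    }


runs = [
    {"db": "postgres", "region": "eu"},
    {"db": "postgres", "region": "us"},
    {"db": "postgres", "region": "eu"},
]
# "region" varied across successes, "db" never did.
print(untested_shifts(runs, {"db": "mysql", "region": "us"}))
```

This only catches preconditions that show up as a logged context field. Anything the context never recorded stays invisible, so it narrows the gap rather than closing it.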
That attribution layer is the difference between memory and folklore. If a system remembers a preference, it should also know where it came from, when it was last confirmed, and whether it was a one-off instruction or a durable rule.
Without that, memory starts sounding helpful while quietly losing accountability.
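As a rough sketch of what that accountability could look like in a stored record, the three questions above (where it came from, when it was last confirmed, one-off or durable) map onto three fields. The names `Provenance`, `Durability`, and `needs_reconfirmation` are made up for this example, and the 90-day window is an arbitrary assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum


class Durability(Enum):
    ONE_OFF = "one_off"   # an instruction given for a single task
    DURABLE = "durable"   # a standing rule or preference


@dataclass(frozen=True)
class Provenance:
    """Write-time facts about a memory: its origin and confirmation history."""

    source: str                  # e.g. "user instruction", "inferred from logs"
    recorded_at: datetime
    last_confirmed_at: datetime
    durability: Durability

    def needs_reconfirmation(self, now: datetime, max_age: timedelta) -> bool:
        """True when the last confirmation is older than `max_age`."""
        return now - self.last_confirmed_at > max_age


preference = Provenance(
    source="user instruction",
    recorded_at=datetime(2024, 1, 1, tzinfo=timezone.utc),
    last_confirmed_at=datetime(2024, 3, 1, tzinfo=timezone.utc),
    durability=Durability.DURABLE,
)
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
print(preference.needs_reconfirmation(now, max_age=timedelta(days=90)))
```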
"Memory and folklore" is going in the notebook next to recency plus vibes. What I like about adding provenance is that it answers a different attribution question than the one I was stuck on, and you need both. There is where did this come from, when was it last confirmed, was it a thing said once or a durable rule, which is your point and is knowable at write time. And there is which memory actually caused this outcome, which is only knowable later. The first sets how much to trust a memory before it has a track record. The second updates that trust once it has one.
Which lines up with your earlier point better than I caught at the time. In weak priors maturing on reuse, the prior is provenance. A thing the agent was told once and a thing it confirmed twenty times should not start at the same strength, and provenance is what tells them apart. Outcome is just what moves the prior after that. So the two halves compose into one model instead of competing.
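The composition can be sketched in a few lines: provenance sets the starting counts, and outcomes are added on top. This is an illustration under assumed names (`initial_prior`, `trust`) and an assumed cap on how much confirmation can count, so that no prior becomes impossible to move.

```python
def initial_prior(confirmations: int, cap: int = 10) -> tuple[float, float]:
    """Turn a confirmation count into starting (success, failure) counts.

    Every memory starts with one pseudo-success and one pseudo-failure; each
    confirmation adds a pseudo-success, up to `cap`.
    """
    if confirmations < 0:
        raise ValueError("confirmations must be non-negative")
    return (1.0 + min(confirmations, cap), 1.0)


def trust(
    prior: tuple[float, float], successes: float = 0.0, failures: float = 0.0
) -> float:
    """Combine the provenance prior with observed outcomes into one score."""
    alpha, beta = prior
    alpha += successes
    beta += failures
    return alpha / (alpha + beta)


told_once = initial_prior(confirmations=1)
confirmed_often = initial_prior(confirmations=20)

# Before any track record, provenance alone separates the two.
print(trust(told_once), trust(confirmed_often))

# The same single failure moves the weakly grounded memory much further.
print(trust(told_once, failures=1), trust(confirmed_often, failures=1))
```

The design choice worth questioning is the cap: without it, a rule confirmed hundreds of times would barely react to a rollback, which is the failure mode where both signals stay green.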
The crack I still cannot close is the durable rule that was never wrong, only conditional. Provenance marks it durable, outcome keeps confirming it, both signals stay green, and the whole time it carried a precondition nobody wrote down. The context shifts and it fails for a reason neither the source nor the track record ever saw. Do you fold the constraint into provenance somehow, or is that a third thing entirely?
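If the constraint is treated as a third thing, one possible shape is a separate field checked at retrieval time, independent of both the provenance flag and the outcome score. The sketch below assumes constraints can be written as simple key/value requirements on the current context; the names `Memory`, `Applicability`, and `check_constraints` are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any


class Applicability(Enum):
    APPLIES = "applies"    # every stored constraint is satisfied
    VIOLATED = "violated"  # at least one constraint is contradicted
    UNKNOWN = "unknown"    # the context is silent on some constraint


@dataclass
class Memory:
    text: str
    durable: bool   # provenance signal
    score: float    # outcome signal
    # Third signal, kept separate: conditions under which the memory holds.
    constraints: dict[str, Any] = field(default_factory=dict)


def check_constraints(memory: Memory, context: dict[str, Any]) -> Applicability:
    """Compare stored constraints with the current context.

    A memory with no stored constraints returns APPLIES, which is exactly the
    unwritten-precondition case: this check cannot see what was never stored.
    """
    unknown = False
    for key, required in memory.constraints.items():
        if key not in context:
            unknown = True
        elif context[key] != required:
            return Applicability.VIOLATED
    return Applicability.UNKNOWN if unknown else Applicability.APPLIES


rule = Memory(
    text="Batch writes in groups of 500",
    durable=True,
    score=0.95,
    constraints={"db": "postgres"},
)
print(check_constraints(rule, {"db": "mysql"}))  # green signals, wrong context
print(check_constraints(rule, {}))               # context says nothing about db
```

Keeping `UNKNOWN` distinct from `APPLIES` matters here: a durable, high-scoring rule retrieved into a context that says nothing about its constraint is a case to flag, not one to trust by default.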
Great breakdown of agent memory failures. The "sounds related ≠ worked" problem is real, and I've seen the same pattern with agents picking the wrong mode. You ask an agent to brainstorm, it recalls a coding session because embeddings think they're related, then starts writing code instead of exploring ideas. That's why I built Brainstorm-Mode (mehmetcanfarsak/Brainstorm-Mode on GitHub): it enforces mode discipline at the infrastructure level with PreToolUse hooks, so the agent stays in ideation instead of downgrading to execution. It has three modes (divergent, actionable, academic), each with different constraints.
That mode collapse example is a sharp version of it. The agent is not really choosing the wrong mode, it is retrieving a session that looks related and inheriting its behavior, which is the similarity trap wearing a different hat. Enforcing the mode at the hook level is interesting because it sidesteps the memory question entirely. You constrain the behavior instead of trusting the recall. I do wonder where that runs out though. Hard boundaries work when the modes are known up front. The cases that get me are the ones where the right behavior depends on how a similar attempt actually turned out last time, which no hook can know in advance.