J. Gravelle

Posted on Jun 11

Your AI's memory is quietly making it worse

#ai #mcp #programming #machinelearning

...and your CLAUDE.md is next

On June 10, TechCrunch ran a piece called "How memory tools can make AI models worse." I read it the night it dropped, and the next morning I shipped a fix to one of my MCP servers. So this one's personal.

Memory systems built to personalize your AI, the ones that remember your preferences so the model feels like it knows you, can make that model less accurate and more sycophantic as the stored context piles up. The research came out of Writer, led by Dan Bikel, and it tested real systems (Mem0 and Zep) under peer review. The feature sold as making your assistant smarter was, in their tests, making it dumber.

Here's the part most coverage skipped: This isn't a chatbot problem. If you run coding agents with a CLAUDE.md file, a memory MCP, learned routing weights, or any config that accumulates over time, you're exposed to the exact same failure mode. The mechanism doesn't care whether the persistent state lives in a vector store or a markdown file in your repo root.

The common thread

In every version of this failure, the cause is the same. You have persistent state with no notion of relevance and no notion of expiry. The system stores something true at the time you stored it, then keeps applying it forever, to everything, whether it fits or not.

The research has a clean example. Tell the model a user's favorite book is "Station Eleven." Later, ask a question that has nothing to do with reading preferences, and the model reaches for that stored fact anyway and names the book as a bestselling dystopian novel. The anchor leaked into a query it didn't belong in. In their financial-analysis tests it got worse: models with memory enabled started adopting the user's misconceptions instead of running the numbers independently. The memory didn't add knowledge. It added a bias the model felt obligated to honor.

Once you see the pattern, you start seeing it in your own tooling. Four places, specifically. Here's each one and the shape of the fix.

Problem 1: Stale anchors

Learned state goes stale and keeps steering anyway. Tuned weights, cached rankings, accumulated usage stats: if any of them trained on your project as it looked six months ago, they're still nudging behavior toward a codebase that no longer exists. The model isn't wrong about what it learned. It's wrong about when.

The fix is a recency window on anything that learns from a history log. Don't compute proposals from all of time. Compute them from the part of time that still resembles now.

from datetime import datetime, timedelta

def learn_weights(events, window_days=90, lifetime=False):
    if lifetime:                      # explicit escape hatch
        scoped = events               # for true lifetime reads
    else:
        cutoff = datetime.now() - timedelta(days=window_days)
        scoped = [e for e in events if e.timestamp >= cutoff]
    return compute_proposals(scoped)

The window is the whole point. The lifetime flag is there because sometimes you really do want the full ledger, and that should be a deliberate choice you can read in the call site, not the silent default. This is the fix I shipped the morning after the article. It was about nine lines.

Problem 2: Irrelevant anchors and preference leakage

This is the "Station Eleven" failure, generalized. A memory store retrieves candidates by similarity, similarity is loose, and a memory that's vaguely close to your query gets injected into the prompt for a query it has nothing to do with. The model treats it as relevant because you handed it over. So scope the injection.

def gather_memories(query, candidates, threshold=0.82):
    keep = []
    for m in candidates:
        if relevance(query, m) < threshold:
            continue                  # not close enough, drop it
        if not m.provenance_ok():     # where did this come from?
            continue
        keep.append(m)
    return keep

Two gates, not one. Relevance decides whether the memory belongs in this query, and a provenance check decides whether you trust where it came from in the first place. When a candidate is borderline, leave it out. A missing memory costs you one extra lookup. A wrong anchor costs you correctness, and you won't see the bill until the answer's already wrong.

Problem 3: Memory-file rot

Your CLAUDE.md, your AGENTS.md, your .cursorrules: these are memory too, just written in markdown instead of embeddings. And they rot. You rename a function and the file still references the old name. You delete a directory and the file still points at the path. You add a rule in March that contradicts a rule from January, and both are still in there. The agent reads all of it as gospel and obeys ghosts.

The fix is to audit the config against ground truth, meaning the actual code, not the config's memory of the code.

def audit_config(config, code_index):
    issues = []
    for symbol in config.referenced_symbols():
        if symbol not in code_index.symbols:
            issues.append(("stale_ref", symbol))
    for path in config.referenced_paths():
        if not code_index.exists(path):
            issues.append(("dead_path", path))
    issues += find_contradictions(config.rules)
    return issues   # flag for deletion, don't auto-delete

Every symbol the config names should exist in the index. Every path should resolve. Every rule should be consistent with its neighbors. Anything that fails gets flagged for a human to cut. You'd be surprised how much dead weight a six-month-old agent file is carrying.

Problem 4: Silent self-modification

This is the one that turns a small error into a spiral. Some systems write their own memory or rewrite their own config without a checkpoint. So the model makes a wrong inference, stores it, reads it back next turn as a fact it "already knows," and builds on it. That loop is how sycophancy snowballs. Nobody ever told it no, so it keeps agreeing with the wrong thing more confidently each round.

The rule is simple. Suggest, never write.

def propose_memory_update(current, proposed):
    diff = render_diff(current, proposed)
    print(diff)
    if not approval_step(diff):       # human or supervising agent
        return current                # rejected, nothing changes
    return apply(proposed)

The system can propose whatever it wants. It just can't commit. Show the diff, require an explicit yes, and the snowball never starts because there's always a point where someone can look at the change and say that's not right.

The actual fix isn't "turn memory off"

None of this argues for ripping memory out. Memory is useful. A model that remembers your stack and your conventions is a better collaborator than one that re-learns you every session.

The fix is memory with hygiene. Recency windows so old state expires. Relevance scoping so anchors only fire on the queries they fit. Ground-truth audits so your config can't drift away from your code. Human-in-the-loop writes so errors can't compound unsupervised.

And underneath all four guards there's one principle. Grounded retrieval beats accumulated recollection. Deriving your context from the source artifact at query time, the live code, the current files, the real index, will beat trusting what you wrote down about that artifact months ago. Recollection rots. The source doesn't.

You can wire up all four of these guards yourself. The sketches above are most of the shape, and none of them are long.

Or you can install jCodeMunch, a free MCP server (pip install jcodemunch-mcp) where this is already the architecture. Retrieval is grounded in a live index of your actual code instead of accumulated memory, weight learning is recency-windowed out of the box, and a built-in audit_agent_config tool finds the rot in your CLAUDE.md for you. It's here: https://github.com/jgravelle/jcodemunch-mcp

Top comments (3)

Max Quimby • Jun 13

This maps onto something I've been wrestling with running agents on persistent memory files, and I'd push on one distinction: recency and relevance are two different failures, and the recency window only fixes one. The Station Eleven leak isn't stale — it could've been stored an hour ago — it's a scoping failure: a true fact applied to a query it had no business touching. A 90-day window won't catch that; you need retrieval that scores applicability to the current task, not just freshness. The other half that bites hardest is provenance: memory that quietly promotes "the user said X" to "X is true" is how you get the sycophantic finance-analysis drift they found. We started tagging entries as claim-vs-fact so the agent weights them differently — a remembered opinion shouldn't override running the numbers. The expiry-as-deliberate-choice framing is exactly right, though. How are you deciding relevance at read time — embedding similarity, explicit scopes, both?

Theo Valmis • Jun 15

The diagnosis is right but it lumps two kinds of persistent state that need opposite treatments, which is why "add expiry" can't be the universal fix. Preferences (favorite book, tone, the user's pet theory) should decay and be relevance-gated exactly as you say; the Station Eleven leak and the "adopts your misconceptions" failure are both this type bleeding into queries it doesn't belong in. But decisions and constraints are the other type: "this module never calls the DB directly," "we picked Postgres because X." Those must not decay, because a load-bearing rule that expires is worse than no rule, the agent forgets it and confidently violates it. Apply preference-style decay to a constraint and you get the mirror image of the failure you're describing. So the fix isn't expiry, it's typing the memory: preferences get relevance and decay, decisions get permanence and always-applied. CLAUDE.md is dangerous precisely because it's a flat file that mixes both. The preference lines should fade and the rule lines should be load-bearing, and markdown treats them identically.

J. Gravelle • Jun 16 • Edited

I disagree with none of that.

I think you’ve named the principle the post was circling without stating outright: type the memory. That’s cleaner than the four separate guardrails I listed, and I’m taking it.

One clarification though, because it actually supports your point: the post doesn’t prescribe expiry as the universal fix. The recency window is scoped to “anything that learns from a history log,” which is the learned-weights case, not the constraint case. Constraints get a different knob in the piece: the ground-truth audit, where every symbol the config names has to still exist, plus suggest-never-write for any change.

So decay was already kept off the load-bearing rules. What your comment does, and does well, is give that split a name instead of leaving it implicit across four mechanisms. Fair hit that the summary line “old state expires” blurs it.

Two things your framing made me want to say out loud:

A constraint doesn’t want permanence. It wants an explicit lifecycle. The failure you describe, a rule decays and gets violated, has a twin: a rule that persists past its validity.

“We picked Postgres because X” is load-bearing right up until the migration. Then the un-decaying rule is a confidently applied lie. So constraints persist until something revokes them, never on a clock. The config audit catches the dead ones, and supersession is how the live ones retire.

And decay and relevance are orthogonal, which “always-applied” quietly merges. A constraint about one module shouldn’t fire into every unrelated query either, so it’s a 2x2:

preferences are decay plus scoped
constraints are persist plus scoped
nothing wants global-always-on

That’s the real reason a flat markdown file can’t carry it. It collapses both axes to “present in the file”...

-jjg