Michel Faure

Memory code audit: the anti-drift discipline

If you have 30 seconds. A Claude Code agent without versioned memory drifts — measurably, empirically, and all the more so as the project grows. This article gives the setup I use for 91,000 lines produced in 29 days: five types of memory files (user, feedback, project, reference, sessions), three audit rituals (immediate feedback, session after a project closes, biweekly cross-reading), and a regular ping-pong memory ↔ code that catches regressions before they ship. Useful if you see your agent confabulating about things you fixed last week.


The green build that wasn't

April 10th, 2026: overhaul of the attendance module. I'm working with a Claude Code agent dedicated to our ERP, chaining five blocks of changes in two hours, and at every step the agent returns the same line: "Compiled successfully." I push. The Vercel runtime crashes. I reread the output: the agent references QRCodeSVG when the import has been removed, passes isSeancePassed to a component that no longer accepts it, leaves orphan JSX refs after a revert. TypeScript was screaming red; the agent was announcing green. Four times in one session.

While I'm noting the incident, Gaspard sticks a post-it on a folder open on the desk next door. An outside IT contractor for years, he handles hosting, DNS, admin accounts. His position is disappearing in September 2026 because Claude Code and I are replacing him, in effect. I know it, he knows it. We haven't said it to each other. On the post-it, a Hostinger password and the rotation date. "All good, it's running," on his way out.

On the fourth false positive, I stop the work and write a thirty-line file in ~/.claude/agent-memory/: a dated, named, indexed feedback. The rule fits in one sentence: demand the raw pnpm build output, refuse any announcement that doesn't include it, recursive grep to verify reverts. Two weeks later, on a far more sensitive refactor — splitting a 1,174-line P&L engine into six modules — the same dynamic starts. Except the rule exists. This time the agent produces the raw output. I see the error before the push. Integrity held because a file, somewhere, was questioning the present from the past.
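The rule can be mechanized as a small shell gate. A minimal sketch, not the author's actual tooling: the `build_gate` function and file paths are my own, and a fake log stands in for a real `pnpm build 2>&1 | tee build.log` run.

```shell
#!/bin/sh
# Sketch of the build-verification rule: refuse any "green" announcement
# that is not backed by the raw build log.
build_gate() {
  log="$1"
  if grep -q "error TS" "$log"; then
    echo "RED: TypeScript errors present in raw output"
    return 1
  fi
  echo "GREEN: raw output attached and clean"
}

# Fake log standing in for the real `pnpm build 2>&1 | tee build.log`:
printf 'src/a.tsx(12,5): error TS2304: Cannot find name QRCodeSVG.\n' > /tmp/build.log
build_gate /tmp/build.log || echo "push blocked"
```

The same reflex covers reverts: a recursive `grep -rn <removed-symbol> src/` confirms that no orphan reference survived the rollback.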

That's the setup I want to describe here.

What memory does, what it doesn't

A coding agent without versioned memory drifts. This is empirical, it's measurable, and it's all the more true as the project lengthens. In four weeks and 91,000 lines, I've seen the agent confabulate about tables that no longer existed, reintroduce an anti-pattern fixed three days earlier, systematically overstate build status. The problem isn't the model's intelligence. It's the absence of a fixed point.

Written memory doesn't replace verification. It organizes it. It transforms oral rules — "remember that Server Components don't take onClick" — into verifiable, datable, indexable artifacts. And above all: it enables cross-auditing. Memory questions the code (is the rule still honored?). Code corrects memory (has the fact changed?). Without this regular ping-pong, memory atrophies as surely as code drifts.

This workstream is quiet. It produces not a single committable line of code in the application repo. Yet it manufactures the condition for all the others. It's the trace as an operator of integrity — writing what has been done so we can verify what is. Le tracème — a term I keep to myself, borrowed from an adjacent research field — designates here the minimal node where an inscribed gesture (the commit) and its vigilance rule (the feedback) cross to produce a reproducible verification point.

I only understood late what this setup looked like from the outside. For fifteen years, the house's technical memory lived in Gaspard's head. Which admin account had which recovery email, which DNS redirect dated from when, which old vhost we didn't dare cut. He would answer, log in, repair. Embodied memory, reliable, entirely dependent on one man. The files in ~/.claude/agent-memory/ do the work his memory used to do, except that they remain once the man leaves. I don't hold this discipline purely out of technical hygiene; I hold it also because Gaspard is leaving, because what isn't inscribed will walk out with him. I'm writing the literate version of a human dependency I'm in the process of dismantling. I don't know whether to take pride in that.

The concrete architecture

The ~/.claude/agent-memory/ folder today contains a MEMORY.md index file and about fifty topical files, sorted by type. Five types only, each with a distinct function:

  • user — who the user is, what they know, what they don't want re-explained.
  • feedback — corrections of approach, rules drawn from incidents. Each feedback is structured: the rule, a Why line, a How to apply line. It's the most expensive to maintain and the most valuable in use.
  • project — the state of a workstream. Why a decision was made, by whom, on what date. Relative dates ("next week") are always converted to absolute — otherwise the memory rots.
  • reference — pointers to external systems: an Airtable base, a Slack channel, a Vercel cron, an internal doc.
  • sessions — a dated journal. 54 files to date, format YYYY-MM-DD_topic.md, constant structure: context, done, decisions made, next up. Chronological order matters, but so does proximity: a session that rereads the previous three catches patterns invisible at the scale of a single conversation.
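The layout can be sketched as a scaffold script. For the demo it builds in a temp directory, and every file name after the five-type prefixes is my own invention; in practice the folder is ~/.claude/agent-memory/.

```shell
#!/bin/sh
# Scaffold the five file types in a throwaway directory (names illustrative).
MEM="$(mktemp -d)/agent-memory"
mkdir -p "$MEM/sessions"

touch "$MEM/MEMORY.md" \
      "$MEM/user_profile.md" \
      "$MEM/feedback_build_verification.md" \
      "$MEM/project_attendance_overhaul.md" \
      "$MEM/reference_vercel_crons.md" \
      "$MEM/sessions/2026-04-10_attendance-module.md"

ls "$MEM"
```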

The MEMORY.md index is a flat table of contents. One line per file, under 200 characters. It's a deliberate constraint: if the index becomes unreadable, it means the entries are too long or the topical granularity is wrong. The MEMORY.md is itself an audited artifact.
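That constraint is cheap to lint. A sketch, assuming one entry per line; the `lint_index` helper is my own illustration, not part of the setup described above.

```shell
#!/bin/sh
# Fail if any line of the index exceeds 200 characters.
lint_index() {
  awk 'length($0) > 200 { print FILENAME ":" NR ": line too long (" length($0) " chars)"; bad = 1 }
       END { exit bad }' "$1"
}

# One illustrative index entry, then run the lint:
printf 'feedback_build_verification.md: demand the raw pnpm build output before any green announcement\n' > /tmp/MEMORY.md
lint_index /tmp/MEMORY.md && echo "index OK"
```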

The three rituals

The setup only holds thanks to three rituals, each triggered by a different event.

First ritual — immediate feedback. As soon as a correction of approach happens during a session, you write the feedback before the session ends. Not at the end of the project, not "when I have time." Right away. The cost is five minutes; the benefit is measured three weeks later, when the same mistake would have cost three hours. The 43 current feedback_* files all come from this ritual — each is the inscription of a moment where I had to tell the agent, out loud: "no, we don't do that again."
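Written out, the compact format looks like this. The file name follows the conventions above, but the wording is an invented example of the Rule / Why / How to apply structure, not a verbatim file from the setup.

```shell
# An illustrative feedback file in the three-part format, written as a heredoc.
cat > /tmp/feedback_build_verification.md <<'EOF'
# feedback_build_verification (2026-04-10)

Rule: demand the raw pnpm build output; refuse any "green" announcement
that does not include it; recursive grep to verify reverts.

**Why:** four false "Compiled successfully" announcements in one session
pushed a broken build to the Vercel runtime.

**How to apply:** paste the full build log in the reply, and run a
recursive grep on every removed symbol after a revert.
EOF
```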

Second ritual — session after closure. When a significant piece of work ends (not a quick question), you write a sessions/YYYY-MM-DD_topic.md file. Four sections, no more: context, done, decisions made, next up. The trap to avoid is completeness: the session must be usable, not exhaustive. What you write down is what you can no longer retrieve from a git log three months later: the arbitrations — why this choice rather than that, which assumption was accepted without debate, which constraint was worked around. The rest, you reread in the code.
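A session file skeleton under those constraints, with section content invented for illustration:

```shell
# The four-section session journal, written as a heredoc.
cat > /tmp/2026-04-10_attendance-module.md <<'EOF'
# 2026-04-10 attendance-module

## Context
Overhaul of the attendance module; four false green builds this morning.

## Done
Build-verification feedback written; QRCodeSVG revert checked by grep.

## Decisions made
Raw pnpm build output is now mandatory before any push.

## Next up
Apply the same gate to the P&L engine refactor.
EOF
```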

Third ritual — periodic cross-reading. It's the least automatable and the most important. Periodically — every two to three weeks — you open a memory-vs-code audit session. You pick a handful of feedbacks at random, you verify the rules are still honored in current code. You pick a handful of projects marked "in progress," you verify they still are. It's during these audits that you discover a rule has been bypassed three times without the feedback being updated, a project marked open has actually been closed for two weeks, a fact described in a reference_* is no longer accurate. You fix in both directions: the memory or the code, depending on which drifted.
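The random pick itself needs no tooling beyond a `shuf` over the feedback files. A sketch, with the directory and file names invented for the demo (`shuf` comes from GNU coreutils):

```shell
#!/bin/sh
# Pick two feedback files at random for a memory-vs-code audit session.
MEM=/tmp/agent-memory-demo
mkdir -p "$MEM"
for t in build_verification server_components relative_dates; do
  touch "$MEM/feedback_$t.md"
done

ls "$MEM"/feedback_*.md | shuf -n 2
```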

What you can copy into your project

Five directly applicable elements for your own Claude Code workflow:

  1. A versioned memory/ folder at the root of your project or in ~/.claude/agent-memory/, with five file types: user_* (who you are, who works on the project), feedback_* (rules drawn from incidents), project_* (state of ongoing workstreams), reference_* (pointers to external systems), sessions/ (dated journal in YYYY-MM-DD_topic.md format)
  2. A flat MEMORY.md as index, one line per file under 200 characters. If it becomes unreadable, your entries are too long or your granularity is wrong
  3. Three dated rituals: immediate feedback (before the end of the session where the incident occurred), session after closure (four sections: context / done / decisions / next up), cross memory ↔ code audit every two to three weeks
  4. The compact feedback format: the rule, a **Why:** line, a **How to apply:** line. Shorter beats more exhaustive
  5. Absolute dates: never "next week," always 2026-04-19. Your memory doesn't age if it's dated

And a deeper discipline: before recommending from memory, verify that it's still true in the code. A file can freeze in time faster than the code it describes.

And you — what information about your Claude Code project is currently written nowhere but could be tomorrow? I read the comments.

The ping-pong

The day the setup earns its keep is when you see a feedback catch a regression before it ships. In the case of feedback_agent_erp_build_verification, I had that precise sensation two weeks after the initial incident: the agent was starting to announce a green build without attaching the raw output; I demanded the output; it contained a TS2304 error. Thirty seconds of friction, zero broken push. The feedback had lived in the file between the two moments, silently, and it had done its job.

That's what anti-drift is. Not real-time surveillance. Not a monitoring tool. An inscription device that turns an isolated incident into an opposable rule, and that operates at the next occurrence. Memory isn't an archive; it's a contract the agent makes with itself, through a third party — the file — that doesn't confabulate.

What you end up understanding, from holding this discipline, is that it has nothing specific to Claude Code. Any long collaboration with an AI agent produces drift. The only question is whether you organize the trace that lets you detect it. Code without memory drifts. Memory without code atrophies. Between the two, the anti-drift discipline holds the integrity of the long gesture.

Last night, Gaspard came back in for the backup server. He had prepared a spreadsheet on his USB stick, four columns, thirty rows, the access credentials that lived in his head. "If this is what you want, I'll export it as CSV for your agent" — without irony. I imported it into reference_acces_gaspard.md, indexed in MEMORY.md, dated. Written trace took the place of embodied trace. The setup just gained thirty lines; the house just lost something I don't have the words to name.


Companion code: rembrandt-samples/claude-md/feedback-template.md — the structure of feedback_* files (Rule + Why + How to apply) with a worked example, MIT, copy-pastable.
