If you've built anything non-trivial with Claude Code (or Cursor, or Cline), you know the shape of this. Session 1 is magic. Code just flows. By session 3 the agent proposes an architecture different from the one you chose. By session 7 you've got two implementations of the same feature and no one remembers which is right.
I always blamed myself for it, or half-blamed the model. Then I read Anthropic's founder playbook and found they have a name for exactly this: agentic technical debt. What stuck with me is the distinction they draw. Ordinary technical debt sits still. You can clear it in a dedicated sprint. Agentic debt compounds. Without specs and decisions written somewhere the agent can read, every session re-derives the foundational choices from scratch, and they drift. You end up with a codebase that has no coherent mental model behind it. Not because any single piece is bad, but because the pieces were never designed to fit together.
That reframed it for me. The problem was never my skill, or the model being weak. It's the opposite. The agent is strong enough to rebuild your architecture autonomously, and nothing is holding it to the one you chose. It reads the repo, writes the feature, runs the tests on its own. And because it moves at full speed, it drifts at full speed. By session 3 you've got a Frankenstein, built faster than ever. So the fix isn't a better prompt, and it isn't more memory. It's a method that holds the agent to the architecture you decided, session after session.
And before anyone asks: yes, there are memory tools for this now (ECC, memory MCPs). They help, but they solve recall, not direction. An agent can remember every past session and still drift. What was missing wasn't memory; it was something holding the agent to the architecture I'd already decided.
Here's the boring version that actually worked for me, after months of breaking and fixing it:
Document before you code. Before the first session, I write the decisions down: business model, PRD with explicit in-scope / out-of-scope, DB schema in plain language, system architecture, and a CLAUDE.md with the hard rules the agent must read every session. The out-of-scope list matters more than it looks. It's what stops the agent from "helpfully" rebuilding something next week.
Run each session in three phases. Phase 0: the agent reads the docs. Phase 1: it validates the scope and flags conflicts with what's already decided. Phase 2: it implements the declared scope, and I review at the end by reading the branch's commits. Skip the order and it drifts again.
Record each real decision as an ADR. Context, decision, rationale, rejected alternatives. Next session the agent reads the ADR instead of re-litigating the choice. This is the single highest-leverage habit. It's what keeps the debt from compounding.
I'd like to hear how others handle this. Do you keep an ADR log? Does your CLAUDE.md survive past a week, or does it rot? What's the session ritual that holds for you?
Full disclosure: I packaged the exact templates, the session ritual, and a production codebase I built solo with this method into a kit. Link's in my profile if you want it. But the three steps above are the whole idea, and they cost nothing to try tonight.
Top comments (0)