History still matters. I just don't want it to be the default context for agents editing the present system.
A coding agent will read whatever you put in front of it.
That sounds like a strength. Most of the time it is the problem.
For a while my answer was "more Markdown" — CLAUDE.md, AGENTS.md, specs, ADRs, planning docs. It felt responsible: agents need context, so give them context.
Then I watched an agent do something confidently wrong because it had faithfully trusted a document that described a system we stopped having months ago. It wasn't a hallucination. It was obedience to history.
That's when the question shifted for me. Not "what prompt should I write?" but the quieter one underneath it: what should the agent read? I wrote recently that token efficiency is really a "what do you hand over" problem — each token should buy a verified fact, not a guess. This post is the same idea, pointed at the repo itself.
History is for humans. The present source of truth is for agents.
This is not an argument against keeping history.
I still think ADRs, migration logs, old specs, and planning docs are valuable. They explain why the system took its current shape — the tradeoffs, the constraints, the abandoned paths, the organizational memory. Humans need that. When you're reviewing a decision, onboarding to a domain, debugging a strange constraint, or trying to understand why an obvious-looking change was deliberately avoided, history is exactly the right thing to read.
But Claude Code, Codex, and similar coding agents are usually doing something different. They're about to edit the present system.
For that task, historical prose shouldn't be the primary context. The agent needs the current source of truth: current schema, current module boundaries, current public APIs, current tests, current state machines, current configuration, current dependency rules.
A human may read migration history to understand how the database evolved. But I don't want Claude Code to learn the current database by replaying every migration — I want it to read the current schema. A human may read ADRs to follow the reasoning behind a decision. But I don't want an editing agent to infer the current architecture by replaying every ADR — I want it to read the current module graph.
History explains why the system became this shape. The current source of truth explains what shape it has now. Both matter. They just serve different readers.
| Reader | What they should read |
|---|---|
| Human — trying to understand why | ADRs, old specs, migration history, design notes |
| Coding agent — trying to change what exists now | current schema, tests, types, module graph, config, public APIs |
So the rule was never "delete history." It's:
Keep history for humans.
Give agents the present source of truth.
A well-shaped repository should let an agent understand the current system without replaying its past. And the larger the system gets, the more this matters — not because history becomes useless, but because in prose it gets mixed in with obsolete truth, and the two become indistinguishable. The danger isn't the old fact. It's the old fact wearing the same clothes as the current one.
Drift is the failure mode, and agents are loyal to it
Here's the chain I keep tripping over:
duplication → drift → context pollution
The moment a fact lives in two places — once in the code, once in a Markdown description of the code — they begin to diverge. Code changes under pressure; prose changes when someone remembers. So the doc quietly goes stale, and now you have two answers to the same question with no marker for which is current.
A human reading a slightly-wrong doc squints and cross-checks. An agent doesn't squint. It's very good at trusting polluted context and acting on it decisively. The thing that makes agents fast — taking the given facts at face value — is exactly what makes drift dangerous.
So the most useful move isn't writing better docs. It's having fewer places where a fact can rot.
A boundary in CLAUDE.md is advice. A boundary in the build is architecture.
This is the line I keep coming back to.
If CLAUDE.md says "the billing module must not import from checkout," that's a polite request. Nothing enforces it; the next edit can ignore it and nobody — human or agent — gets stopped.
If that same boundary lives in ESLint config, in package.json exports, or in TypeScript project references, it's no longer advice. It's a wall. The agent doesn't have to remember the rule, because the rule is a property of the system it's editing.
That reframes "AI readability" for me. It was never about writing longer instructions. It's about making the repository itself say the true thing — structurally, where it can't drift:
- directory structure carries the architecture
- naming carries intent
- package exports carry the public API surface
- lint rules carry dependency boundaries
- schemas carry data contracts
- state machines carry workflows
- tests carry expected behavior
- config files carry the actual stack
- generated-file markers carry "do not touch by hand"
Each of these is closer to the system than any sentence about the system can be. That proximity is the whole point — the closer a fact sits to the thing it describes, the less room there is for it to lie.
Markdown should be a map, not the territory
I haven't thrown out CLAUDE.md and AGENTS.md. I've shrunk them.
Their job isn't to restate the architecture. It's to point at where the current architecture actually lives.
## Sources of truth
- Database schema: packages/database/schema
- API contract: packages/api/openapi.yaml
- Checkout workflow: src/features/checkout/checkout.machine.ts
- Public package APIs: package.json "exports"
- Module boundaries: eslint.config.js
- Commands: package.json scripts
That's a map. It survives change, because it points at the territory instead of copying it.
The trap is the opposite: a section in CLAUDE.md that restates the behavior — the checkout states, what billing depends on, what the API accepts. The day the code moves, that paragraph becomes a confident lie. And the agent will read it with exactly the same trust it gives the truth.
Specs are scaffolding — give them an exit strategy
I still like a short spec at the start of a feature. It gives Claude Code or Codex a sharper target than vague intent, and it helps me think. That part is genuinely useful.
The mistake is letting the spec stay as a second source of truth after the code exists. Once the feature ships, the durable parts of the spec should migrate into the repo's present tense. This is where the repo starts becoming readable without being explained:
| The spec captured… | …so move it into |
|---|---|
| screen states | state machine / Storybook stories |
| API behavior | OpenAPI / GraphQL / typed router |
| validation rules | Zod / Valibot / JSON Schema |
| domain states | discriminated unions / domain model |
| permissions | policy code / authorization tests |
| database rules | schema / constraints |
| done-ness | tests |
| infrastructure | Terraform / OpenTofu / Pulumi |
The spec was scaffolding around the building. You don't leave scaffolding up and then ask people to trust it over the walls.
Naming is prompt engineering, run inside the repo
This is the part I find quietly fun. Agents infer intent from names — which means names aren't only for humans anymore.
createUser()
updateUser()
deleteUser()
tells an agent almost nothing about the domain. This:
registerMember()
inviteMember()
deactivateMember()
transferWorkspaceOwnership()
tells it where it is. The same goes for structure — a layout an agent can read as a map of where behavior belongs:
| Path | What lives there |
|---|---|
domains/billing/commands/ |
state-changing operations |
domains/billing/queries/ |
reads |
domains/billing/policies/ |
authorization rules |
domains/billing/repositories/ |
data access |
A layout like that answers "where does this change belong?" before the agent has to guess. And it will guess — if the repo doesn't tell an agent where a thing lives, it'll happily invent a home for it, somewhere plausible and wrong.
So naming and layout are a kind of standing prompt: written once, into the structure, read on every single run.
The same "what do you hand over" question, one layer up
In the graphs post, the facts I wanted to hand an agent were computed — dominance, reachability, ordering — and the discipline was labeling each one verified or estimated by where it came from.
This is the same question, just earlier in the pipeline. A current schema is a verified fact about the data. A migration log you have to replay is, at best, estimated truth about the present. A lint-enforced boundary is verified; a sentence in CLAUDE.md is a hope. The repo, when it's shaped well, is a source of facts that carry their own freshness — because they are the system, not a description trailing behind it.
Present schema over migration history. Current module graph over ADRs. Live tests over old acceptance criteria. Real config over README prose. State machines over paragraphs about workflows.
None of that throws the history away. It just stops asking the editing agent to reconstruct the present from it.
Last year I tuned the instructions. This year I shaped the repo.
I want to be honest about where this comes from, because it's experience, not a benchmark.
A year ago I was fixated on the instruction layer — .claude/, .codex/, AGENTS.md, CLAUDE.md. I tuned rules, wrote skills, kept the files growing, treating the prompt around the repo as the thing to perfect.
Somewhere along the way the obsession moved. Instead of tuning the instructions, I started tuning the repo itself: pulling facts into a single source of truth, refactoring toward structure an agent could read, writing new code the same way from the start.
What I notice now is that the instruction files matter less than I expected. On a repo in the hundreds of thousands of lines, with a short AGENTS.md and rules and skills that are nowhere near perfectly tuned, I rarely feel Claude Code drift from what I intended, or stall when it's time to actually implement. The friction I used to paper over with longer instructions mostly just isn't there.
The honest caveat: the models and the agents themselves got dramatically better over the same stretch, and I can't cleanly separate "my repo got more legible" from "the tools got smarter."
But the direction of the feeling has been consistent: the more the present lived in the repo instead of in prose wrapped around it, the less I had to say to get good work out of the agent.
Where this is going
These days I keep adding defensive lines to prompts: "don't trust stale docs," "check the actual schema," "don't guess where this goes."
The better answer is not to write better warnings.
It is to shape a repo where those warnings are unnecessary.
Keep history for people. Shape the present for agents.
Human-readable history explains why the system became this way.
AI-readable repositories expose what the system is now.
Don't make agents reconstruct the present from the past.
Make the current source of truth impossible to misunderstand.
Top comments (1)
The line doing the most work here is "facts that carry their own freshness" — but I'd push on where that guarantee actually holds. Repo-as-truth fixes human-time drift (the doc rots between commits while the code moves). It doesn't fix within-a-run drift: on a multi-step change the agent reads the current schema/module graph at step 1, its own edits at steps 2–4 move them, and by step 5 it's acting on a structural snapshot as stale as any Markdown — except now the stale fact came from the repo, so it wears the trusted clothes. Dedup removes the second place a fact can rot; it doesn't make the agent re-read the place after it changed it. So the discipline isn't only "fewer spots a fact can rot," it's "re-derive structure after each edit that could have moved it."
Second push: "a boundary in the build is architecture" only holds for the boundary classes your toolchain can express — import graph (eslint), public surface (exports), data shape (schema). A lot of what actually ends up in CLAUDE.md are semantic invariants no linter has a rule shape for: "never write the ledger outside a settlement event," "this counter is monotonic," "this id is only minted in one place." There's no wall to move those into. So the advice/architecture table needs a middle row: what you can't prevent structurally, you make detectable structurally — a runtime assertion, a property test, an audit query that fails loudly. Otherwise "move it into the build" silently drops every rule the type system can't encode back into hope.
And the map is itself a fact that rots. A "Sources of truth" block pointing at packages/database/schema becomes a confident lie the day someone moves the schema — it survives change only if CI fails when any path in it doesn't resolve. An unchecked map hasn't escaped drift; it's just relocated it one level up, from stale description to stale address.