Coding agents have no memory of why your codebase is the way it is. They
re-derive settled decisions every session, confidently redo things your team
already ruled out, and burn tokens doing it. I tried the three obvious fixes (a
CLAUDE.md file, RAG over our docs, pasting context manually) and each broke in
a specific way. What finally worked: stop treating decisions as a doc to
maintain and start serving them to the agent over MCP, scoped to the file it's
touching. Disclosure at the bottom: I built one of the tools I mention.
The actual problem
We're a small backend team living in Cursor and Claude Code. The failure mode
shows up every few days: an agent ships something clean-looking that violates a
rule nobody wrote down. The library we banned after it bit us in prod. The
migration you cannot touch. The pattern we already tried, that failed for a real
reason, that the agent happily reintroduces.
The knowledge exists. It's just scattered across Slack threads, old PR
discussions, and Jira comments. None of those are places an agent looks, and
honestly none are places the next human looks either.
The three fixes everyone tries first
1. A CLAUDE.md / .cursorrules file
Fine for a handful of stable conventions. Breaks on staleness (wrong the moment
a decision changes, nobody updates it) and budget (you can't fit real decision
history without eating your whole context window every turn).
2. RAG over Notion / your docs
Semantic is not relevant. Vector search pulls things that sound similar to the
file, not the decision that governs it, and noise in the window degrades
output. Also, half your real decisions were never written down; they happened in
a thread.
3. Pasting context by hand
Works until I get lazy, which is turn two. Scales to exactly one person.
The shift that worked: serve decisions, don't store them
A decision is not a document you maintain. It's something the agent fetches on
demand. Instead of a static rules file in every call, the agent makes a tool
call over MCP and pulls only the decisions relevant to the file it's editing.
Touch the auth code, get the auth decisions. Scoped, current, not a stale blob.
{
"mcpServers": {
"decispher": {
"command": "npx",
"args": ["-y", "@decispher/mcp-server"],
"env": {
"DECISPHER_API_KEY": "dcp_xxx",
"DECISPHER_COMPANY_ID": "your_company_id"
}
}
}
}
The other half is capture: a rules file you write by hand stays empty, so the
decisions get extracted automatically from Slack, GitHub, and Jira instead.
The part I didn't expect to care about: receipts
Every retrieval is measured. Each tool call logs the tokens it saved versus
making the agent rediscover that context the slow way.
get_context_for_file auth 480 tk served ↓ 3,840 saved
check_conventions database 212 tk served ↓ 1,696 saved
tokens_saved_net · 6,976 · ≈ $0.061 this session
Most "AI that knows your context" claims are unmeasurable vibes. A per-call
token number means I can actually tell if it's pulling its weight.
What I'm still unsure about
- Is per-file the right retrieval granularity, or per-symbol / per-topic?
- How do you keep captured decisions from going stale like a wiki does?
- Where's the line between serving everything relevant and blowing the budget?
Disclosure
I built the tool the examples come from (Decispher, in beta, free cohort open).
I left it to the end because the pattern stands on its own. decispher.com if you
want to try it. If you solve this differently, tell me in the comments.
Top comments (0)