Something I see constantly is people trying to solve the "my AI forgets everything" problem by making their instruction file bigger. 500 lines, 1,000 lines, 2,000 lines of CLAUDE.md (or .cursorrules, or whatever your tool uses).
It doesn't work. Research backs this up — AI accuracy drops when context gets too long, and instructions in the middle of large files get ignored entirely. You end up with a bloated file that eats your context window before you've even asked a question.
What actually works is the opposite: small, targeted files loaded only when relevant.
After roughly 1,500 sessions across 60+ projects, here's the structure I settled on:
Tier 1 — Constitution (~200 lines, always loaded)
Your standing orders. Preferences, hard rules, and a routing table pointing to everything else. "Always use TypeScript strict mode." "Never mock the database in tests." That's it. If your global file is over 200 lines, you're putting things in the wrong place.
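As a sketch of what I mean (the section names, paths, and entries here are mine, purely illustrative), a Tier 1 file is mostly rules plus pointers:

```markdown
# CLAUDE.md (Tier 1, ~200-line budget)

## Hard rules
- Always use TypeScript strict mode.
- Never mock the database in tests.

## Routing table
| Need                         | Go to                                   |
|------------------------------|-----------------------------------------|
| Known gotchas / corrections  | corrections.md (Tier 2)                 |
| Project-specific context     | <project dir>/PROJECT.md (Tier 3)       |
| Schemas, API docs, reference | search knowledge.db (Tier 4)            |
| What happened last session   | query the session-memory DB             |
```

Notice there's no actual knowledge in the routing table, only addresses.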
Tier 2 — Living Memory (~50 lines, always loaded)
A short list of corrections and gotchas — things the AI keeps getting wrong. "This table stores deltas, not cumulative values." "The VS Code extension doesn't fire CLI hooks." Every entry directly prevents a repeated mistake. This is the tier that shows value fastest.
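Concretely, Tier 2 is just a flat list of one-line hedges against repeat mistakes. A sketch (entries illustrative):

```markdown
# corrections.md (Tier 2, ~50-line budget)

- The usage table stores deltas, not cumulative values; sum before reporting.
- The VS Code extension doesn't fire CLI hooks; test hook changes from the terminal.
- Prune entries that haven't prevented a mistake in a month.
```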
Tier 3 — Project Brains (loaded per-project)
One file per project with deep context: business rules, schemas, key files, decision log, changelog. Only loads when you're working in that directory. If you have 5 projects, 80% of knowledge is only relevant to one of them — why load it all every time?
Tier 4 — Knowledge Store (queried on demand)
A searchable database (SQLite + FTS5 or just markdown files) for reference data: full schemas, API docs, terminology. The AI searches it when it needs something specific, instead of having it crammed into the instruction file.
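A minimal sketch of the SQLite + FTS5 variant, assuming your Python build ships SQLite with FTS5 compiled in (most modern ones do); table and column names are my own, not prescriptive:

```python
import sqlite3

# Build a small knowledge store backed by SQLite's FTS5 full-text index.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE knowledge USING fts5(topic, body)")
conn.executemany(
    "INSERT INTO knowledge (topic, body) VALUES (?, ?)",
    [
        ("billing schema", "invoice rows store line-item deltas keyed by period"),
        ("api auth", "all endpoints require a signed JWT in the Authorization header"),
    ],
)

# The AI pulls only what it needs, when it needs it, instead of
# carrying every schema and doc in the always-loaded context.
rows = conn.execute(
    "SELECT topic FROM knowledge WHERE knowledge MATCH ? ORDER BY rank",
    ("deltas",),
).fetchall()
print(rows)  # → [('billing schema',)]
```

The point isn't the storage engine; plain markdown files plus grep work too. It's that reference data lives outside the instruction file and is fetched on demand.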
Session Memory (the continuity layer)
A SQLite database that logs what happened in each conversation. At the start of a new session, the AI queries it for a project briefing — recent work, decisions, open issues. No more "where were we?" dance.
The key insight is the routing table in Tier 1. Instead of stuffing everything into one file, Tier 1 just says "here's where to find X" and the AI loads the right context at the right time.
A couple of things I learned the hard way:
Budget every tier strictly. 200 lines for Tier 1, 50 for Tier 2. Constraints force quality. When you hit the limit, you're forced to move things to the right tier instead of dumping them in the always-loaded file.
Don't store what the AI can derive. File structure, code patterns visible in the source, git history — the AI can read all of that. Only store what it would get wrong without the instruction.
If you use AI to summarize sessions, add safeguards. We had a summarizer with no batch limits that tried to process 50 sessions at once, hit an API error, retried the full batch in a loop, and burned through a third of a week's token budget. The fix was a batch cap of 5, a processed flag so it never retouches completed sessions, and a lock file to prevent concurrent runs. Learned that one the expensive way.
This works with Claude Code, Cursor, Copilot, Codex, Aider — anything that reads instruction files. The filenames differ but the architecture is the same.
I wrote up the full system with templates and an automated setup script and put it on GitHub if anyone wants the details: https://github.com/sms021/SuperContext
Happy to go deeper on any of this — the architecture, session memory, how to migrate an existing giant instruction file, whatever.