My AI Agent Keeps Forgetting Everything. So Do I...
I have multiple sclerosis. Some days are better than others, but one thing is constant: repeating myself is expensive. Cognitive fatigue means every wasted explanation costs me something I can't get back. So when the AI coding agent started each session from scratch, forgetting every architecture decision, every constraint, every piece of context I'd painstakingly built up, it wasn't just annoying. It was a genuine problem.
AA-MA Forge
The context wall
If you've used Claude Code (or Cursor, or Copilot) for anything longer than a single session, you know the feeling. Monday morning, you open a new conversation. The agent has no memory of Friday's work. You re-explain the architecture. You re-state the constraints. You watch it drift from the plan you agreed on two days ago. Three sessions in, you've spent more time re-establishing context than writing code.
For small tasks, this is tolerable. For multi-week projects with dependencies, milestones, and real stakes, it's a dealbreaker.
What I tried first
Big instruction files. Massive CLAUDE.md documents stuffed with architecture summaries, coding standards, and project history. They helped, but they mixed things that change (execution state, what's done, what's next) with things that don't (API endpoints, file paths, schema definitions). The agent couldn't tell the difference. It would hallucinate facts that were sitting right there in the doc, or re-litigate decisions I'd already made.
Conversation summaries were worse. Lossy compression of context meant the important details evaporated first.
The spark
At 3am one night, scrolling Reddit because my brain wouldn't shut up and the MS had "tingled" me awake, I found Diet-Coder's post about a "Dev Docs System": three files per task that give the agent structured memory. Plan, context, tasks.
That was the seed. I took those three files and turned them into five.
Why five, not three
Three files tangle different kinds of knowledge together. Strategy sits next to execution state. Facts mix with decisions. When the agent loads context, it can't prioritise. It reads everything, weighs nothing.
Five files separate knowledge by how it behaves:
- Things that don't change (API endpoints, file paths, constants) go in one place.
- Things that explain why (decisions, trade-offs, gate approvals) go in another.
- Where you are right now (task status, what's done, what's next) gets its own file.
- Strategy (the plan, milestones, acceptance criteria) stays separate from execution.
- What happened (commits, session checkpoints, audit trail) goes in an append-only log.
When the agent picks up a new session, it loads the facts and the task state first. It only pulls in the decision history when it needs to make a choice. The plan stays available but doesn't clutter working memory.
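As a rough illustration of that loading order, here is a minimal Python sketch. The file names are placeholders invented for this example (the post doesn't name the five files), and `load_session_context` is a hypothetical helper, not one of the actual commands.

```python
from pathlib import Path

# Hypothetical file names: the five kinds of knowledge come from the list
# above, but the names themselves are placeholders for illustration.
FILES = {
    "facts": "facts.md",          # things that don't change
    "decisions": "decisions.md",  # why: decisions, trade-offs, gate approvals
    "tasks": "tasks.md",          # where you are right now
    "plan": "plan.md",            # strategy, milestones, acceptance criteria
    "log": "log.md",              # append-only record of what happened
}


def load_session_context(task_dir, needs_decision=False):
    """Return the text to load at session start, keyed by kind of knowledge."""
    task_dir = Path(task_dir)
    # Facts and task state always come first.
    kinds = ["facts", "tasks"]
    # Decision history is pulled in only when a choice has to be made.
    if needs_decision:
        kinds.append("decisions")
    context = {}
    for kind in kinds:
        path = task_dir / FILES[kind]
        if path.exists():
            context[kind] = path.read_text(encoding="utf-8")
    return context
```

The plan and the log stay on disk in this sketch. They are available if something asks for them, but they never enter the default context.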
The separation sounds obvious in hindsight. It took months of trial and error, battle-tested against real projects and deliverables, to get right. Or at least to get it working well enough to stop me screaming at the machine and freaking out my kid and the neighbours.
What it looks like
I built this into a set of Claude Code commands. The workflow is three steps:
```text
# Plan: brainstorm with the agent, then generate structured artifacts
/aa-ma-plan "build a REST API for user authentication"

# Execute: work through each milestone, sync the files, commit
/execute-aa-ma-milestone

# Archive: move completed work to the done pile
/archive-aa-ma auth-api
```
Between planning and archiving, the agent reads the five files at the start of every session, updates them as it works, and commits after every task. Context survives across sessions. Decisions don't get re-litigated. The audit trail is there if you need it.
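The audit trail part is the simplest to picture. A minimal Python sketch of an append-only checkpoint, assuming a plain-text log with one line per entry (the entry format and the `append_checkpoint` name are made up for this example, not taken from the repo):

```python
from datetime import datetime, timezone
from pathlib import Path


def append_checkpoint(log_path, task_id, summary):
    """Append one timestamped line to the log and return it."""
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    # Collapse whitespace so each entry stays on a single line.
    one_line = " ".join(summary.split())
    entry = f"{stamp} [{task_id}] {one_line}\n"
    # Mode "a" only ever writes at the end of the file, so earlier
    # entries are never rewritten.
    with Path(log_path).open("a", encoding="utf-8") as log:
        log.write(entry)
    return entry
```

Calling this once per finished task, right before the commit, would give a log that only grows, which is what makes it usable as a record of what actually happened.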
What this isn't
It's not a polished product. It's not a library you pip install (though there's a skeleton for that). It's a one-person project built around my own workflows, shared because it might save someone else the same frustration.
The overhead is real. Five files is more than zero files. But the alternative is re-explaining your architecture every Monday morning, and I can't afford that.
Credits
Diet-Coder planted the seed with those three files. Matt Pocock's skills repo helped shape how I organised the commands. Helix.ml informed the gate classification system. Full provenance is in the repo.
Take what's useful
The whole thing is on GitHub: aa-ma-forge. Clone it, try it, fork it, make it your own. There's an installer that deploys everything into your Claude Code setup with one command, and an uninstaller that reverses it cleanly.
Fair warning: maintenance will be sporadic. If I've gone quiet, I'm either deep in client work, arguing with an API, or the MS is having a louder day than usual. Pull requests welcome, but don't hold your breath on response times.
If it saves you time or sanity, consider donating to an MS charity. Small acts, big ripples.
PS. If you want cross-session memory retrieval rather than task execution structure, The 5th Element has a Git repo: https://github.com/milla-jovovich/mempalace
