My AI Agent Keeps Forgetting Everything; So do I...
I have multiple sclerosis. Some days are better than others, but one thing is constant: repe...
Shout out to @diet-code103!!
This is a step forward, but there will always be a challenge in making AI work consistently in the long run. Almost everything in software engineering involves subjective decisions, and these hallucinations and inconsistencies prove it.
A compressed knowledge graph, particularly on MS, at the .md level... I would be happy to build that for you to help with your memory issue.
I think you need to explain that a bit more clearly for the rest of us to understand - are you proposing a different (or "better", even) approach than what the author proposed?
Well, I don't need to do a thing. However, was that a kind request for further explanation?
No you don't have to do anything, but you could ;-)
My point basically is that the author already seems to have a pretty good grasp of the issue, and how to tackle it :-)
Fair point — let me explain.
The author's five-file structure is excellent execution tracking. What I was gesturing at is a different layer: instead of storing project context as flat markdown files, you compress it into a knowledge graph — nodes and edges representing concepts, decisions, and relationships, serialized as .md.
The practical difference: flat files grow linearly. A knowledge graph stays compact because relationships replace repetition. The agent doesn't re-read "we use Postgres" buried in a decisions log — it traverses a typed edge from DatabaseChoice → Postgres with the rationale attached. Context retrieval becomes a graph query, not a document scan.
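To make that concrete, here is a minimal sketch of a typed-edge graph in Python. Everything here (class name, edge types, the `DatabaseChoice` node) is illustrative, not any existing library or the structure I'd actually ship:

```python
# Minimal sketch of a typed knowledge graph (all names illustrative).
# Nodes are concepts/decisions; edges carry a type plus a rationale,
# so "we use Postgres" is one traversal instead of a buried log line.
from collections import defaultdict

class KnowledgeGraph:
    def __init__(self):
        # node -> list of (edge_type, target, rationale)
        self.edges = defaultdict(list)

    def add(self, source, edge_type, target, rationale=""):
        self.edges[source].append((edge_type, target, rationale))

    def query(self, source, edge_type):
        """Return (target, rationale) pairs for one typed edge:
        a graph lookup, not a document scan."""
        return [(t, r) for et, t, r in self.edges[source] if et == edge_type]

kg = KnowledgeGraph()
kg.add("DatabaseChoice", "resolved_as", "Postgres",
       rationale="JSONB support; team familiarity")

print(kg.query("DatabaseChoice", "resolved_as"))
# -> [('Postgres', 'JSONB support; team familiarity')]
```

The point of the serialization to .md is that the agent can re-hydrate the graph cheaply each session; the flat files stay as the human-readable source of record.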
So not a better approach — a different abstraction built on a similar idea. Stephen's five-file structure could sit underneath a KG layer: the files feed the graph, the graph feeds the agent.
The MS angle was specific: for someone managing cognitive fatigue, a compressed, queryable knowledge graph reduces the mental overhead of re-orienting the agent each session. Less to re-explain, because the structure carries more of the context automatically.
Thank you, that makes a lot of sense:
"A knowledge graph stays compact because relationships replace repetition"
Impressive, both Diet-Coder's effort and yours ...
With all of these separate efforts going on, I start wondering if it's time for Anthropic to pull together some sort of "standard" and bake it into CC? Because right now everyone seems to be scrambling to reinvent this wheel, with different approaches and different ambition levels ...
This hits way too close.
My biggest frustration isn’t even “new session = no memory” — I’m used to that.
It’s when the agent forgets things inside the same session / project flow.
I’ll explain architecture, constraints, decisions — everything looks aligned.
Then 20–30 messages later it starts drifting, ignores earlier decisions, or straight up contradicts them.
That’s where it becomes painful, because it’s not just context loss — it’s trust loss.
And I’ve tried the usual fixes:
• long system prompts
• “single source of truth” docs
• summaries
But like you said — they mix static knowledge with dynamic state, and the agent just can’t prioritize what matters.
The idea of separating memory by type instead of just “more context” makes a lot of sense.
Curious — have you noticed this helping with in-session drift, or mostly across sessions?
This resonates — we hit the exact same primitive from a different angle.
Your AA-MA solves "how does a single agent keep its own memory across
sessions." We hit the same wall (Markdown + structure + separation by
behavior type) trying to solve a different problem: how do N agents
coordinate without a broker.
The core insight we converged on independently:
messages are files (named sender-to-recipient) — the directory encodes status. Both exploit the same fact: the filesystem is already a state machine. `rename` is atomic (POSIX). `ls` is a full diagnostic. You get visibility + atomicity + zero infra, if you stop trying to mediate everything through a chat context.
And your "None of this was designed upfront — each piece was bolted on
after a failure made it obvious" is the exact pattern we observed. After
48 hours of 4 Cursor agents running on a minimal rulebook, they had
invented 6 coordination patterns we hadn't written (broadcast addressing,
anonymous role slots, traceability frontmatter, subtask sub-folders…).
All of them surfaced as new filenames in a shared folder. None of this
is designable. It emerges.
Field report + MIT protocol: github.com/joinwell52-AI/FCoP
Genuinely curious what happens if AA-MA's per-task 5-file memory sits
underneath FCoP's routing layer. Feels like they compose, not conflict.
Re @leob's "time for a standard?" — I suspect this won't come from
Anthropic, because the whole point is tool-neutral. If it works across
Claude Code, Cursor, and Codex, it has to come from users. Which is
what we're both doing :)
The distinction you're drawing here — separating knowledge by behavioral type (what changes vs. what doesn't) — is the insight that most "just use CLAUDE.md" advice misses. Treating a single instruction file as both strategy and execution state creates the hallucination problem you described: the agent can't tell the difference between a settled architectural decision and current task state.
The five-file structure maps well to how working memory actually functions: long-term facts, deliberate decisions, current focus, planning, and audit trail. What strikes me is that this is really typed memory — you're enforcing contracts between information types so the agent can't confuse "we always use postgres" with "this PR is still in review."
One thing I've found useful on a similar structure: a versioned decisions log where you append rather than overwrite. If an agent re-litigates a settled decision, you can trace exactly when and why it was resolved — helpful during post-mortems when you're not sure whether the agent worked from stale context or genuinely hit an edge case.
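That append-only log might look like the sketch below. The file name, heading format, and fields are my own assumptions, not the author's actual layout:

```python
# Sketch of a versioned, append-only decisions log (format illustrative).
# Entries are never overwritten; a re-litigated decision gets a new entry
# noting the earlier one, so the history stays traceable in post-mortems.
import os
import tempfile
from datetime import datetime, timezone

def append_decision(path, topic, decision, rationale):
    """Append one timestamped entry; never rewrite earlier ones."""
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    with open(path, "a") as f:
        f.write(f"## {stamp} :: {topic}\n")
        f.write(f"- decision: {decision}\n")
        f.write(f"- rationale: {rationale}\n\n")

log = os.path.join(tempfile.mkdtemp(), "DECISIONS.md")
append_decision(log, "database", "Postgres",
                "JSONB support; team familiarity")
append_decision(log, "database", "Postgres (confirmed)",
                "re-litigated in review; original rationale still holds")

with open(log) as f:
    print(f.read())
```

Because entries only accumulate, `grep`-ing the log for a topic shows every time the decision was touched, which is exactly the trace you want when deciding whether the agent had stale context.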
The part about this emerging from real regulated-industry failures rather than theoretical design resonates — these patterns always look obvious in retrospect.