I use Claude Code a lot.
One thing kept annoying me: not the mistakes, not the occasional wrong assumption, not even the weird confidence.
The annoying part was having to re-explain the same project context in new sessions.
Things like:
- this module looks legacy but still supports a critical flow
- this query already caused performance issues
- this test failed before because the hook returned formatted text, not an array
- this architecture decision looks strange, but it exists for a reason
- this project separates bugs from improvements in release notes
- do not touch this config unless you understand the install flow
Some of that belongs in CLAUDE.md.
But not all of it.
The problem with putting everything in CLAUDE.md
CLAUDE.md is great for stable project instructions:
- how to run the project
- how to run tests
- coding conventions
- architecture guidelines
- commands the agent should know
- repo-specific workflows
That kind of context is stable and broadly useful.
The problem starts when CLAUDE.md becomes the place for every pitfall, debugging note, warning, workaround, decision, and preference.
At that point, it stops being onboarding context and becomes a giant context dump.
That creates two problems.
First, every new session pays for it in tokens, even when the current task only needs one small detail.
Second, the more context you throw in, the easier it is for the important bit to get ignored.
More context is not always better context.
Sometimes it is just a bigger haystack with the same needle inside it.
The split that made more sense to me
I started thinking about project context as two different things:
CLAUDE.md = stable onboarding instructions
working memory = retrieved project-specific notes
CLAUDE.md should explain how the project works.
Working memory should remember what happened while working on the project.
That includes things like:
- decisions
- facts
- patterns
- pitfalls
- architecture notes
- project preferences
- session summaries
The key difference is that working memory should not be dumped into every prompt.
It should be searched, ranked, and injected only when relevant.
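To make the categories above concrete, here is a minimal sketch of how a typed memory record could be modeled and serialized. The names `MemoryKind`, `Memory`, `to_json`, and `from_json` are hypothetical and chosen for illustration; they are not taken from Memento MCP's actual schema.

```python
import json
from dataclasses import asdict, dataclass
from enum import Enum


class MemoryKind(str, Enum):
    """Categories from the list above; the identifiers are illustrative."""
    DECISION = "decision"
    FACT = "fact"
    PATTERN = "pattern"
    PITFALL = "pitfall"
    ARCHITECTURE = "architecture"
    PREFERENCE = "preference"
    SESSION_SUMMARY = "session_summary"


@dataclass(frozen=True)
class Memory:
    """One project note with a type and optional tags (hypothetical shape)."""
    kind: MemoryKind
    text: str
    tags: tuple[str, ...] = ()


def to_json(memory: Memory) -> str:
    """Serialize a memory to a JSON string, storing the kind as plain text."""
    record = asdict(memory)
    record["kind"] = memory.kind.value
    return json.dumps(record)


def from_json(raw: str) -> Memory:
    """Rebuild a memory from JSON; an unknown kind raises ValueError."""
    record = json.loads(raw)
    return Memory(MemoryKind(record["kind"]), record["text"], tuple(record["tags"]))
```

Typing each note is what lets a retrieval step filter or weight by category later, for example preferring pitfalls when the task touches a fragile module.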
So I built Memento MCP
Memento MCP is a local-first MCP server that gives Claude Code and other stdio-MCP clients persistent project memory.
The basic idea is simple:
- Store useful project knowledge as typed memories.
- Search and rank memories for the current task.
- Inject only the relevant memories into the agent's context.
- Avoid turning every new session into a giant repeated context paste.
Default setup:
- local SQLite
- FTS5 search
- no mandatory cloud account
- no hosted vector DB
It also supports:
- optional embeddings
- team memory sync through git
- Obsidian vault indexing
- privacy controls
- local web inspector
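The default setup above (local SQLite with FTS5 search) can be sketched with Python's standard `sqlite3` module. This is an illustration of the general technique, not Memento MCP's actual schema or code: the table name `memories`, its columns, and the helper functions are assumptions, and it requires a SQLite build with FTS5 enabled.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open a local store backed by an FTS5 full-text index."""
    conn = sqlite3.connect(path)
    # `kind` is stored but not indexed; only `body` is searchable.
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS memories "
        "USING fts5(kind UNINDEXED, body)"
    )
    return conn


def remember(conn: sqlite3.Connection, kind: str, body: str) -> None:
    """Store one note."""
    conn.execute("INSERT INTO memories (kind, body) VALUES (?, ?)", (kind, body))


def search(conn: sqlite3.Connection, terms: list[str], limit: int = 3):
    """Return up to `limit` notes matching any term, best match first."""
    # Quote each term so punctuation is not parsed as FTS5 query syntax.
    quoted = ['"' + term.replace('"', '""') + '"' for term in terms]
    rows = conn.execute(
        "SELECT kind, body FROM memories "
        "WHERE memories MATCH ? ORDER BY rank LIMIT ?",
        (" OR ".join(quoted), limit),
    )
    return rows.fetchall()
```

Ordering by FTS5's built-in `rank` column sorts results by BM25 relevance, which gives a usable search-and-rank step without embeddings or a hosted service.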
Example
Instead of adding this to CLAUDE.md forever:
The scheduling module looks legacy but still supports a critical production flow.
Do not rewrite it casually.
The pagination query caused performance issues before.
The release notes must separate bugs from improvements.
Those notes can live as typed memories.
Then, when the agent is working on the scheduling module or release notes, the relevant memory is retrieved.
When the agent is working on something unrelated, that context stays out of the prompt.
That is the part I care about most: reducing repeated context without losing important project knowledge.
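The retrieval behavior described in this example can be sketched in a few lines of plain Python. This is a deliberately naive keyword-overlap version, assuming the three notes above as data; the memory types assigned to them, the stopword list, and the function names are all hypothetical, and a real retriever would rank far more carefully.

```python
import re

# Small illustrative stopword list; not exhaustive.
STOPWORDS = {
    "the", "a", "an", "to", "in", "of", "for", "and", "from", "is", "it",
    "this", "that", "but", "do", "not", "before", "must", "still",
}

# The notes from the example, with assumed memory types.
NOTES = [
    ("architecture", "The scheduling module looks legacy but still supports "
                     "a critical production flow. Do not rewrite it casually."),
    ("pitfall", "The pagination query caused performance issues before."),
    ("preference", "The release notes must separate bugs from improvements."),
]


def keywords(text: str) -> set[str]:
    """Lowercased words in `text`, minus stopwords."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}


def relevant_notes(task: str, notes=NOTES, min_overlap: int = 1, limit: int = 2):
    """Notes sharing at least `min_overlap` keywords with the task, best first."""
    task_words = keywords(task)
    scored = []
    for kind, text in notes:
        overlap = len(task_words & keywords(text))
        if overlap >= min_overlap:
            scored.append((overlap, kind, text))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(kind, text) for _, kind, text in scored[:limit]]


def build_context(task: str) -> str:
    """Text to inject for this task; empty when nothing is relevant."""
    notes = relevant_notes(task)
    if not notes:
        return ""
    lines = ["Relevant project memory:"]
    lines += [f"- [{kind}] {text}" for kind, text in notes]
    return "\n".join(lines)
```

With these notes, a task like "Refactor the scheduling module" pulls in only the scheduling note, while "Fix the login button color" produces an empty string, so nothing extra is added to the prompt.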
GitHub:
https://github.com/lfrmonteiro99/memento-mcp
Docs:
https://lfrmonteiro99.github.io/memento-mcp
What I want feedback on
I am mainly trying to validate the workflow.
The questions I care about:
- Does the CLAUDE.md vs working memory split make sense?
- Would you trust an MCP server to inject memory into Claude Code?
- What kind of project memory would you actually want an agent to remember?
- What would make this annoying, unsafe, or too noisy?
- Should memory be mostly explicit/manual, or should the agent be allowed to suggest memories automatically?
I built this because I got tired of re-explaining the same project context over and over again.
Not because agents need more magic.
They mostly need better memory, fewer repeated instructions, and less context shoved into every prompt like we are packing for the apocalypse.