Good point, Iurii — and yeah, I'm aware of Anthropic's memory MCP server reference implementation.
The RAG/vector approach is tempting, especially for the semantic retrieval you're describing (ask about "ORM" and get "database" + "migrations" back). In theory, it's cleaner than flat files.
But here's what we found in practice: for most Claude Code workflows, the overhead of maintaining a vector DB (embeddings, indexing, retrieval pipeline) doesn't pay off. Here's why:
Claude already does semantic matching. When the agent reads the MEMORY.md index and sees a topic file called database.md, it knows to pull it when you ask about ORM or migrations. The model's own understanding of semantic relationships handles 90%+ of the routing without any embeddings.
The bottleneck isn't retrieval, it's curation. The hard part isn't finding the right memory, it's keeping memory clean and useful over time. A vector DB with 2000 noisy entries retrieves noisy results. Our curated 200-line index and focused topic files stay sharp because the agent is told exactly what to save and what to skip.
Zero infrastructure. No embedding model, no vector store, no indexing step. Just markdown files in a git repo. It works offline, syncs with git, and is readable by humans. For a solo dev or small team, that simplicity matters a lot.
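To make the curation point concrete, here is a minimal sketch of a check you could run before committing memory changes. It assumes a layout that is only implied above: one folder holding MEMORY.md plus topic files, with the index mentioning each topic file by name (for example database.md). The function name `check_memory` and the regex are my own illustration, not part of Claude Code.

```python
import re
from pathlib import Path

MAX_INDEX_LINES = 200  # the "200-line index" mentioned above; the exact cap is a choice
TOPIC_REF = re.compile(r"\b([\w-]+\.md)\b")  # assumption: topics are referenced by file name


def check_memory(memory_dir: Path) -> list[str]:
    """Return human-readable problems found in a flat-file memory folder."""
    index_lines = (memory_dir / "MEMORY.md").read_text(encoding="utf-8").splitlines()
    problems = []

    # 1. Keep the index small enough to load in full every session.
    if len(index_lines) > MAX_INDEX_LINES:
        problems.append(f"MEMORY.md has {len(index_lines)} lines (cap is {MAX_INDEX_LINES})")

    # 2. Every topic file named in the index should exist on disk.
    referenced = {name for line in index_lines for name in TOPIC_REF.findall(line)}
    referenced.discard("MEMORY.md")
    for name in sorted(referenced):
        if not (memory_dir / name).is_file():
            problems.append(f"index mentions {name}, but the file is missing")

    # 3. Every topic file on disk should be reachable from the index.
    on_disk = {p.name for p in memory_dir.glob("*.md")} - {"MEMORY.md"}
    for name in sorted(on_disk - referenced):
        problems.append(f"{name} exists but is not listed in the index")

    return problems
```

Calling `check_memory(Path("memory"))` returns an empty list when the index and the topic files agree. It does nothing about content quality; that part still depends on what the agent is told to save and skip.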
That said — for large codebases where you need to search across thousands of files semantically, a vector layer absolutely makes sense. It's more of a "what scale are you at" question. For project-level memory (decisions, patterns, preferences), flat files win. For codebase-level search across 100K+ lines, yeah, embeddings would help.
Would be cool to see someone build a hybrid — flat file memory for project context + vector search for codebase navigation. Best of both worlds.
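A rough sketch of what that hybrid could look like, with heavy caveats: nobody in this thread has built it, every name here (`search_code`, `build_context`) is hypothetical, and the "vector search" is a toy term-frequency cosine similarity standing in for a real embedding model. A word-count vector only matches shared words, so it would not connect "ORM" to "database" the way real embeddings might.

```python
import math
import re
from collections import Counter


def tokens(text: str) -> Counter:
    """Toy stand-in for an embedding: a bag of lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search_code(query: str, files: dict[str, str], top_k: int = 3) -> list[str]:
    """Codebase side: rank source files by similarity to the query."""
    q = tokens(query)
    scored = [(cosine(q, tokens(body)), path) for path, body in files.items()]
    ranked = sorted(scored, key=lambda s: (-s[0], s[1]))
    return [path for score, path in ranked[:top_k] if score > 0]


def build_context(query: str, memory_index: str, files: dict[str, str]) -> dict:
    """Project side: always pass the whole curated index; add only the top code hits."""
    return {"memory_index": memory_index, "code_hits": search_code(query, files)}
```

The split is the point of the design: the curated index is small enough to pass in whole and let the model route itself, while the code search only has to narrow thousands of files down to a handful.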
Claude already does semantic matching. Of course, in theory a vector DB would make better use of the context window. I’m curious about real-world results, but not curious enough to build it myself 😀
This is a very valid topic. I’ve observed how quickly CLAUDE.md can “rot” under rapid changes (something like upgrading a framework to a new major version is one prompt away). A mismatch is easier to spot when the memory is human-readable.
I don’t see the infrastructure as the issue, but plain Markdown actually benefits teams more (everyone has the same files vs. everyone having their own unique binary database).
Claude has a reference implementation of a memory MCP server.
I would very much like to see a full RAG vector memory (kind of what AnythingLLM does but for code specifically).
That way I could ask about "ORM" and Claude would retrieve the "database" and "migrations" topics from memory.
Thanks for the answer — makes total sense to me.