Scott Crawford

Posted on • Originally published at hifriendbot.com

5 Ways to Add Memory to Claude Code (Compared)

If you use Claude Code for anything beyond one-off scripts, you've hit the memory wall. Every session starts from zero. Context compaction destroys your working state. MEMORY.md caps out at 200 lines.

The good news: the community has built real solutions. The bad news: there are enough options that choosing one is its own time sink.

I've tested all of the major approaches. Here's an honest comparison — what works, what breaks, and which one fits your situation.

1. CLAUDE.md + MEMORY.md (Built-In)

What it is: Claude Code's native memory system. CLAUDE.md files hold project instructions; MEMORY.md (in ~/.claude/projects/) stores notes Claude writes to itself. Both load into the system prompt at session start.

Setup: Nothing. It's already there.

How it works:

  • CLAUDE.md at your project root — team-shared instructions, conventions, architecture notes
  • CLAUDE.local.md — personal notes, auto-gitignored
  • ~/.claude/CLAUDE.md — global preferences across all projects
  • MEMORY.md — auto-generated by Claude, loaded at session start
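As a concrete (and entirely invented) illustration, a project-root CLAUDE.md is just a markdown file of instructions — something like:

```markdown
# Project: payments-api

## Conventions
- TypeScript strict mode; no `any`
- Tests live next to source files as `*.test.ts`

## Architecture
- `src/api/` — HTTP handlers
- `src/core/` — domain logic, no I/O
```

Everything in this file is re-sent with the system prompt, so keep it terse.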

The good:

  • Zero setup, zero dependencies, zero cost
  • Works offline
  • CLAUDE.md is version-controlled with your project
  • Simple enough to understand in 5 minutes

The bad:

  • MEMORY.md has a hard 200-line cap. Content beyond line 200 is silently dropped. No warning. (Issue #25006)
  • No search. Claude reads the entire file every session. With 200 lines of notes, it has no way to find specific context by meaning.
  • Post-compaction amnesia. Multiple bug reports document Claude ignoring CLAUDE.md after context compaction. (Issue #4017, 20 upvotes)
  • No automatic extraction. Claude has to decide what to write down. Important context slips through constantly.
  • No cross-device sync. Each machine has its own disconnected MEMORY.md.
  • Hidden token cost. Every message re-sends the full CLAUDE.md. One developer found cache reads consuming 99.93% of total token usage.

Best for: Small projects, quick tasks, developers who don't want to install anything.

Verdict: Fine for getting started. Inadequate for serious, multi-session development.

2. Local Vector Database Solutions (~29,700 GitHub Stars)

What it is: The most popular third-party memory approach. It automatically captures session context, compresses it with AI, and stores it in a local database with vector search.

Setup: Typically a single init command, but requires local dependencies.

How it works:

  • Hooks into Claude Code's session lifecycle
  • Captures conversation context and compresses it into summaries
  • Stores summaries in local databases with vector embeddings
  • Injects relevant context at session start
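The loop above — summarize, embed, retrieve by similarity — can be sketched in a few lines of Python. This is an illustration of the pattern, not any particular tool's implementation; a real system would use a proper embedding model instead of the toy bag-of-words vectors here, and the class and session text are invented.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy stand-in for a real embedding model: a bag-of-words vector.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class SessionStore:
    """Stores per-session summaries; recall returns the best matches."""
    def __init__(self):
        self.sessions = []  # list of (summary, vector)

    def save(self, summary: str):
        self.sessions.append((summary, embed(summary)))

    def recall(self, query: str, k: int = 2):
        qv = embed(query)
        ranked = sorted(self.sessions, key=lambda s: cosine(qv, s[1]), reverse=True)
        return [summary for summary, _ in ranked[:k]]

store = SessionStore()
store.save("Refactored auth middleware to use JWT refresh tokens")
store.save("Fixed flaky integration test in the billing module")
print(store.recall("how does auth token refresh work", k=1))
# → ['Refactored auth middleware to use JWT refresh tokens']
```

Note the limitation called out below: retrieval happens at session granularity, so you get back whole summaries, not individual facts.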

The good:

  • Massive community — tens of thousands of stars, actively maintained
  • Battle-tested across thousands of developers
  • Open source
  • Session summaries are automatic
  • Vector search finds relevant context

The bad:

  • Local dependencies. Requires multiple runtimes and databases running on your machine.
  • Resource consumption. Significant memory usage has been reported during long sessions, largely due to in-process vector stores.
  • No cross-device sync. Your memories live on one machine. Work from a laptop and desktop? Two separate memory stores.
  • Session-level granularity. Captures session summaries, not individual facts. You can't search for a specific architecture decision — you search for sessions that might have mentioned it.

Best for: Developers who want the most popular, proven approach and work from a single machine with plenty of RAM.

Verdict: The community standard. Solid choice if you don't mind local resource usage and single-machine limitations.

3. Other MCP Memory Servers (~1,200 GitHub Stars)

What it is: MCP servers providing persistent memory with knowledge graph features, semantic search, and autonomous memory consolidation.

Setup: Typically requires Python and multiple dependencies, with several configuration steps.

How it works:

  • Runs as an MCP server alongside Claude Code
  • Stores memories locally with vector embeddings
  • Provides tools for saving, searching, and managing memories
  • Includes knowledge graph relationships between memories
  • Autonomous consolidation merges related memories over time
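The knowledge-graph-plus-consolidation idea can be sketched like this. It is a minimal illustration, not any specific server's code: real implementations compare embeddings rather than raw string similarity for the duplicate check, and all names and memories below are invented.

```python
from difflib import SequenceMatcher

class GraphMemory:
    """Toy sketch: memories as nodes, explicit relations as edges."""
    def __init__(self, merge_threshold: float = 0.9):
        self.nodes: dict[str, str] = {}                # id -> memory text
        self.edges: set[tuple[str, str, str]] = set()  # (src, relation, dst)
        self.merge_threshold = merge_threshold

    def add(self, mem_id: str, text: str) -> str:
        # Consolidation: if a near-duplicate exists, merge instead of adding.
        for existing_id, existing in self.nodes.items():
            if SequenceMatcher(None, text, existing).ratio() >= self.merge_threshold:
                self.edges.add((mem_id, "merged_into", existing_id))
                return existing_id
        self.nodes[mem_id] = text
        return mem_id

    def relate(self, src: str, relation: str, dst: str):
        self.edges.add((src, relation, dst))

    def neighbors(self, mem_id: str):
        return [(rel, dst) for src, rel, dst in self.edges if src == mem_id]

g = GraphMemory()
g.add("m1", "API uses PostgreSQL 16 for persistence")
g.add("m2", "Auth service issues JWTs")
g.relate("m2", "depends_on", "m1")
# A near-duplicate of m1 gets consolidated rather than stored twice.
kept = g.add("m3", "API uses PostgreSQL 16 for persistence!")
print(kept, len(g.nodes))  # → m1 2
```

The relationship edges are what distinguish this approach from flat vector stores: a recall for the auth memory can also surface what it depends on.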

The good:

  • Knowledge graph structure adds relationship context
  • Semantic search with local embeddings
  • Autonomous consolidation reduces memory bloat
  • MCP-native — works through the standard protocol

The bad:

  • Complex setup. Requires multiple local dependencies and configuration steps.
  • Stability varies. Some projects have experienced significant version churn and reliability issues.
  • Local-only. Same single-machine limitation as local vector database solutions.
  • Smaller community. Fewer people testing edge cases compared to the most popular solutions.
  • Heavy dependencies. Embedding model downloads can be hundreds of megabytes and may fail on some platforms.

Best for: Developers who want knowledge graph features and don't mind a more complex setup process.

Verdict: Ambitious architecture, but stability varies across projects. Test thoroughly before committing.

4. CogmemAi (Cloud-Based)

What it is: A cloud-first MCP server that moves all memory intelligence server-side. The local MCP server is a thin HTTP client — no databases, no vector stores, no heavy dependencies. Full disclosure: I built this one.

Setup:

```
npx cogmemai-mcp setup
```

How it works:

  • 18 MCP tools: save, recall (semantic search), extract (AI-powered), project context, analytics, knowledge graph, import/export, and more
  • Memories stored with high-dimensional semantic embeddings server-side
  • AI extraction identifies important facts from conversations automatically
  • Smart deduplication detects duplicate and conflicting memories
  • Automatic project scoping + global preferences that follow you everywhere
  • Automatic compaction recovery — context is preserved before compaction and seamlessly restored afterward

The good:

  • Zero local setup. No databases, no Python, no Docker, no vector stores. One command.
  • Zero RAM issues. Nothing running locally except a thin HTTP client.
  • Cross-device sync. Memories are in the cloud. Work from any machine.
  • Compaction recovery. Automatically saves context before compaction and restores it after.
  • Semantic search. Find memories by meaning, not keywords.
  • AI extraction. Automatically identifies facts worth remembering.
  • Document ingestion. Feed in READMEs or docs to quickly build project context.
  • Free tier: 1,000 memories, 500 extractions/month, 5 projects.

The bad:

  • Requires internet. No network, no memories. Not usable offline.
  • Data in the cloud. Your memories (short factual sentences, not source code) are stored on HiFriendbot's servers. If that's a dealbreaker, go local.
  • Newer project. Smaller community than the most popular local solutions. Fewer people have battle-tested it.
  • Paid tiers for heavy use. Free tier is generous (1,000 memories), but Pro is $14.99/mo for 2,000 memories.

Best for: Developers who want zero-config setup, cross-device sync, and compaction recovery without managing local infrastructure.

Verdict: The trade-off is cloud dependency for zero maintenance. If you're comfortable with that, it's the fastest path to persistent memory.

5. Roll Your Own

What it is: Build a custom memory system tailored to your exact needs. Popular approaches include markdown file collections, SQLite databases with FTS5, or even Neo4j knowledge graphs.

Setup: However long it takes you to build it.

Common approaches:

  • Markdown files + grep. Maintain a /memory/ directory with topic-based markdown files. Simple, version-controlled, human-readable. No semantic search.
  • SQLite + FTS5. Store memories in SQLite with full-text search. Good for keyword matching, misses semantic similarity.
  • Custom MCP server. Build an MCP server that wraps your preferred storage backend. Full control, full responsibility.
  • Obsidian vault. Some developers use Obsidian's knowledge graph as a project memory, connected via MCP servers like easy-obsidian-mcp.
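A minimal sketch of the SQLite + FTS5 route, in Python's standard library. The table layout and sample memories are invented for illustration; note that this gives you ranked keyword search only, with none of the semantic matching the other approaches provide.

```python
import sqlite3

# In-memory DB for the example; a real setup would use a file path.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE memory USING fts5(topic, note)")
conn.executemany(
    "INSERT INTO memory (topic, note) VALUES (?, ?)",
    [
        ("architecture", "Background jobs run through a Redis-backed queue"),
        ("conventions", "All timestamps are stored in UTC"),
        ("decisions", "We chose gRPC over REST for internal services"),
    ],
)

# FTS5 MATCH does tokenized keyword search; ORDER BY rank sorts by relevance.
rows = conn.execute(
    "SELECT topic, note FROM memory WHERE memory MATCH ? ORDER BY rank",
    ("grpc",),
).fetchall()
print(rows)
# → [('decisions', 'We chose gRPC over REST for internal services')]
```

Hook this up to Claude Code via a custom slash command or a small MCP server, and you have a searchable, file-backed memory in under an hour — the catch, as noted above, is that a query for "message bus" will never find the Redis queue note.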

The good:

  • Complete control over storage, format, and retrieval
  • No vendor dependency
  • Can be exactly what you need and nothing more
  • Educational — you learn how memory systems work

The bad:

  • Time investment. Building a good memory system is a project in itself. Semantic search alone requires embedding models, vector storage, and retrieval logic.
  • Maintenance burden. You own every bug, every upgrade, every edge case.
  • No AI extraction. Unless you build it, you're manually deciding what to remember.
  • No compaction recovery. You'd need to build that system yourself.

Best for: Developers with specific requirements that no existing tool meets, or those who want to learn by building.

Verdict: Maximum flexibility, maximum effort. Only worth it if the existing tools genuinely don't fit.

The Comparison Table

| Feature | CLAUDE.md | Local Vector DB | Other MCP Servers | CogmemAi | DIY |
|---|---|---|---|---|---|
| Setup time | 0 min | ~5 min | ~15 min | ~1 min | Hours/days |
| Local dependencies | None | Multiple (databases, runtimes) | Multiple (Python, embeddings) | None | Varies |
| Semantic search | No | Yes (local) | Yes (local) | Yes (cloud) | If you build it |
| AI extraction | No | Session summaries | No | Yes | If you build it |
| Compaction recovery | No | Partial | No | Yes | If you build it |
| Cross-device sync | No | No | No | Yes | If you build it |
| Works offline | Yes | Yes | Yes | No | Varies |
| RAM usage | None | Significant | Moderate | None | Varies |
| Memory capacity | 200 lines | Unlimited (local disk) | Unlimited (local disk) | 1,000 free / 50K enterprise | Unlimited |
| Project scoping | Per-directory | Per-project | Tags | Git remote + global | If you build it |
| Cost | Free | Free | Free | Free / $14.99+ | Your time |

My Recommendation

There's no universally "best" option. It depends on what you value:

  • "I don't want to install anything." → Stick with CLAUDE.md. Maximize those 200 lines. Use .claude/rules/*.md for topic-scoped instructions.
  • "I want the most proven solution." → Local vector database solutions. Huge community, active development. Accept the resource trade-off.
  • "I want zero maintenance." → CogmemAi. One command, nothing local to break, memories follow you across machines.
  • "I need knowledge graphs." → Other MCP memory servers, but test the current version first.
  • "I have specific requirements." → Roll your own. Start with SQLite + FTS5 and add complexity as needed.

The worst option is no memory at all. If you're still re-explaining your architecture every session, pick any solution from this list and set it up today. The 5–15 minutes of setup will save you hours every week.


I'm Scott, a network and systems engineer with 30+ years in the industry. I built CogmemAi after testing every approach on this list and wanting something with zero local infrastructure. Try whichever fits your workflow — the important thing is to stop losing context.
