CyborgNinja1

I built a brain for Claude Code because it keeps forgetting everything

The frustration that started this

If you use Claude Code for real work, you've hit this wall: you're deep in a session, you've made architectural decisions, debugged tricky issues, established patterns — and then context compaction happens. Claude summarizes your conversation to free up tokens, and suddenly it's forgotten that you switched from MongoDB to PostgreSQL three hours ago.

You explain it again. It forgets again. Repeat.

I got tired of re-explaining my own codebase to my AI assistant. So I built Claude Cortex — a memory system that works like a brain, not a notepad.

What it actually does

Claude Cortex is an MCP server that gives Claude Code three types of memory:

  • Short-term memory (STM) — session-level, high detail, decays within hours
  • Long-term memory (LTM) — cross-session, consolidated from STM, persists for weeks
  • Episodic memory — specific events: "when I tried X, Y happened"

The key insight: not everything is worth remembering. The system scores every piece of information for salience — how important it actually is:

"Remember that we're using PostgreSQL" → architecture decision → 0.9 salience
"Fixed the auth bug by clearing the token cache" → error resolution → 0.8 salience  
"The current file has 200 lines" → temporary context → 0.2 salience (won't persist)

Memories also decay over time, just like human memory:

score = base_salience × (0.995 ^ hours_since_access)

But every time a memory is accessed, it gets reinforced by 1.2×. Frequently useful memories survive. One-off details fade away. This isn't a key-value store — it's a system that learns what matters.
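
Here's a simplified sketch of how decay and reinforcement fit together (illustrative types and names, not the exact code in the package):

interface Memory {
  content: string;
  baseSalience: number;  // 0..1, assigned when the memory is stored
  lastAccessed: number;  // epoch milliseconds
}

// Effective score decays exponentially with hours since last access.
function effectiveScore(m: Memory, now = Date.now()): number {
  const hours = (now - m.lastAccessed) / 3_600_000;
  return m.baseSalience * Math.pow(0.995, hours);
}

// Every access reinforces the memory by 1.2x (capped so scores stay bounded).
function reinforce(m: Memory, now = Date.now()): void {
  m.baseSalience = Math.min(1, m.baseSalience * 1.2);
  m.lastAccessed = now;
}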

The compaction problem, solved

Here's the specific workflow that used to drive me nuts:

Before Cortex:

Session starts → Work for 2 hours → Compaction happens → 
Claude: "What database are you using?" → You: *screams internally*

After Cortex:

Session starts → Work for 2 hours → Compaction happens →
PreCompact hook auto-extracts 3-5 important memories →
Claude: "Let me check my memory..." → 
Recalls: PostgreSQL, JWT auth, React frontend, modular architecture →
Continues working seamlessly

The PreCompact hook is the secret weapon. It runs automatically before every compaction event, scanning the conversation for decisions, error fixes, learnings, and architecture notes. No manual intervention needed.
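
The real extraction logic is more involved, but the gist can be sketched as a keyword scan over the transcript. The patterns, categories, and cap of five below are illustrative, not the actual heuristics:

type Category = "decision" | "error-fix" | "learning" | "architecture";

// Keyword heuristics standing in for whatever scoring the real hook uses.
const patterns: [RegExp, Category][] = [
  [/\b(decided|let's use|switch(ed)? to)\b/i, "decision"],
  [/\b(fixed|resolved|root cause)\b/i, "error-fix"],
  [/\b(turns out|learned)\b/i, "learning"],
  [/\b(architecture|schema|module|endpoint)\b/i, "architecture"],
];

function extractCandidates(transcript: string[]): { text: string; category: Category }[] {
  const hits: { text: string; category: Category }[] = [];
  for (const line of transcript) {
    for (const [re, category] of patterns) {
      if (re.test(line)) {
        hits.push({ text: line, category });
        break;
      }
    }
  }
  return hits.slice(0, 5); // cap at five, in line with the 3-5 memories the hook stores
}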

v1.6.0: The intelligence overhaul

The first version was essentially CRUD-with-decay. It worked, but the subsystems were isolated: search didn't improve linking, linking didn't improve search, and salience was set once and never evolved.

v1.6.0 was a seven-task overhaul to make everything feed back into everything else:

1. Semantic linking via embeddings

Previously, memories only linked if they shared tags. Now, two memories about PostgreSQL with completely different tags will still link — the system computes embedding similarity and creates connections at ≥0.6 cosine similarity.
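
The linking rule itself is small; a simplified sketch using that 0.6 threshold (function names are illustrative):

// Cosine similarity between two embedding vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Two memories get linked when their embeddings clear the similarity bar.
function shouldLink(embA: number[], embB: number[], threshold = 0.6): boolean {
  return cosineSimilarity(embA, embB) >= threshold;
}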

2. Search feedback loops

Every search now does three things:

  • Returns results (obviously)
  • Reinforces salience of returned memories (with diminishing returns)
  • Creates links between co-returned results

Your search patterns literally shape the knowledge graph.
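
A simplified sketch of the feedback step (the reinforcement constant and diminishing-returns formula here are illustrative, not the shipped values):

interface MemoryNode {
  id: number;
  salience: number;
  hitCount: number;
}

// After a search returns `results`, reinforce each hit (less each time it
// reappears) and link every pair of co-returned memories.
function applySearchFeedback(results: MemoryNode[], link: (a: number, b: number) => void): void {
  for (const m of results) {
    m.salience = Math.min(1, m.salience + 0.05 / (1 + m.hitCount));
    m.hitCount++;
  }
  for (let i = 0; i < results.length; i++) {
    for (let j = i + 1; j < results.length; j++) {
      link(results[i].id, results[j].id); // co-retrieval implies relatedness
    }
  }
}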

3. Dynamic salience evolution

Salience isn't static anymore. During consolidation:

  • Hub memories (lots of links) get a logarithmic bonus
  • Contradicted memories get a small penalty
  • The system learns which memories are structurally important
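
A simplified sketch of that evolution pass (the bonus and penalty constants are illustrative, not the real tuning):

// Hub bonus grows logarithmically with link count; contradicted memories
// take a small penalty. Result is clamped to the 0..1 salience range.
function evolveSalience(salience: number, linkCount: number, isContradicted: boolean): number {
  const hubBonus = 0.05 * Math.log1p(linkCount);
  const penalty = isContradicted ? 0.1 : 0;
  return Math.min(1, Math.max(0, salience + hubBonus - penalty));
}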

4. Contradiction surfacing

If you told Claude "use PostgreSQL" in January and "use MongoDB" in March, the system detects this and flags it:

⚠️ WARNING: Contradicts "Use PostgreSQL" (Memory #42)

No more silently holding conflicting information.
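
Roughly, the detection can be sketched as flagging two semantically similar memories in the same category whose contents differ. The types and threshold below are illustrative; the warning format mirrors the example above:

interface StoredMemory {
  id: number;
  content: string;
  category: string;
  embedding: number[];
}

// Returns a warning string if an existing memory looks like it conflicts
// with the incoming one, or null if nothing is flagged.
function findContradiction(
  incoming: StoredMemory,
  existing: StoredMemory[],
  similarity: (a: number[], b: number[]) => number
): string | null {
  for (const m of existing) {
    if (
      m.category === incoming.category &&
      m.content !== incoming.content &&
      similarity(m.embedding, incoming.embedding) >= 0.6
    ) {
      return `⚠️ WARNING: Contradicts "${m.content}" (Memory #${m.id})`;
    }
  }
  return null;
}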

5. Memory enrichment

Memories accumulate context over time. If you search for "JWT auth" and the query contains details the stored memory doesn't already have, those details get appended to it. Memories grow richer through use.
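
A toy sketch of the enrichment step (illustrative only; the real merge is smarter than word-diffing):

// Append query terms the memory doesn't already mention, so a memory about
// "JWT auth" can pick up details surfaced by later searches.
function enrichMemory(content: string, query: string): string {
  const known = new Set(content.toLowerCase().split(/\W+/));
  const novel = query.split(/\W+/).filter((w) => w.length > 3 && !known.has(w.toLowerCase()));
  return novel.length > 0 ? `${content} (related: ${novel.join(", ")})` : content;
}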

6. Real consolidation

The old system just deduplicated exact matches. Now it clusters related STM memories and merges them into coherent LTM entries:

STM: "Set up JWT tokens with RS256 signing"
STM: "JWT tokens expire after 24 hours"
STM: "Added JWT verification middleware"

→ Consolidated LTM: "JWT authentication system using RS256 signing.
   Tokens expire after 24 hours with 7-day refresh tokens.
   Verification middleware on all protected routes."

Three noisy short-term memories become one structured long-term memory.
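
A simplified sketch of the clustering half of that pipeline, assuming greedy grouping by embedding similarity (the threshold and the merge step are illustrative):

interface StmEntry {
  content: string;
  embedding: number[];
}

// Greedily assign each STM entry to the first cluster whose seed it resembles.
function clusterStm(
  entries: StmEntry[],
  similarity: (a: number[], b: number[]) => number,
  threshold = 0.6
): StmEntry[][] {
  const clusters: StmEntry[][] = [];
  for (const entry of entries) {
    const home = clusters.find((c) => similarity(c[0].embedding, entry.embedding) >= threshold);
    if (home) home.push(entry);
    else clusters.push([entry]);
  }
  return clusters;
}
// Each cluster then gets merged (summarized) into a single consolidated LTM entry.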

7. Activation weight tuning

Recently activated memories get a meaningful boost in search results. If you just looked at something, it's more likely to be relevant again.
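
A simplified sketch of how the recency boost blends into ranking (the weights and decay constant are illustrative):

// Blend base relevance (text/embedding match) with an activation boost that
// fades over roughly a day, so just-touched memories rank a bit higher.
function rankScore(relevance: number, hoursSinceActivation: number): number {
  const recencyBoost = Math.exp(-hoursSinceActivation / 24);
  return 0.8 * relevance + 0.2 * recencyBoost;
}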

Getting started

Install

npm install -g claude-cortex

Configure Claude Code

Create .mcp.json in your project (or ~/.claude/.mcp.json for global):

{
  "mcpServers": {
    "memory": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "claude-cortex"]
    }
  }
}

Set up the PreCompact hook

Add to ~/.claude/settings.json:

{
  "hooks": {
    "PreCompact": [
      {
        "matcher": "",
        "hooks": [
          {
            "type": "command",
            "command": "npx -y claude-cortex-hook pre-compact"
          }
        ]
      }
    ]
  }
}

Restart Claude Code, approve the MCP server, and you're done. Claude will start remembering things automatically.

Use it naturally

You don't need to learn new commands. Just talk to Claude:

"Remember that we're using PostgreSQL for the database"
"What do you know about our auth setup?"
"Get the context for this project"

The system handles categorization, salience scoring, and storage behind the scenes.

The dashboard

There's also an optional 3D brain visualization dashboard — because honestly, watching memories form as glowing nodes in a neural network is just cool.

npx claude-cortex service install  # auto-start on login

It shows your memory graph in real-time via WebSocket, with search, filters, stats, and even a SQL console for poking at the database directly. Memories are color-coded: blue for architecture, purple for patterns, green for preferences, red for errors, yellow for learnings.

How it compares

Most MCP memory tools are flat key-value stores. You manually save and manually retrieve. Claude Cortex is different in a few ways:

  • Salience detection — it decides what's worth remembering, not you
  • Temporal decay — old irrelevant stuff fades naturally
  • STM → LTM consolidation — short-term memories get merged into long-term ones
  • Semantic linking — memories form a knowledge graph, not a list
  • PreCompact hook — survives Claude Code's context compaction automatically

It's not perfect. Embeddings add some latency. The consolidation heuristics are tuned for my workflows and might need adjustment for yours. The dashboard is a nice-to-have, not a must-have. But for the core problem — Claude forgetting things it shouldn't forget — it works really well.

The stack

  • TypeScript, compiled to ESM
  • SQLite with FTS5 for full-text search
  • @huggingface/transformers for local embeddings (v1.6.1 fixed ARM64 support)
  • MCP protocol for Claude Code integration
  • React + Three.js for the dashboard
  • 56 passing tests, MIT licensed

Try it out

npm install -g claude-cortex

If you're using Claude Code for anything beyond quick one-offs, give it a shot. The difference between an AI that remembers your project and one that doesn't is night and day.

Stars and feedback welcome — this is a solo project and I'm iterating fast.
