Claude Code forgets everything between sessions. Here's how I fixed it with one command and a local database that never leaves my machine.
The Problem
If you use Claude Code daily, you've felt this: every session starts from zero. You re-explain your codebase architecture. You remind it which patterns you prefer. You tell it, yet again, that you use uv, not pip. Every. Single. Session.
Claude Code is exceptional at reasoning within a session. Across sessions, it has no memory at all.
I built SuperLocalMemory to fix this. Here's the 60-second setup.
Setup (One Command)
npm install -g superlocalmemory
slm setup
That's it. slm setup downloads the embedding model (~275MB, one-time), initializes the local database, and registers the MCP server. Everything runs on your machine — no API keys, no cloud, no accounts.
Connect to Claude Code
Add to your Claude Code MCP config (~/.claude/settings.json or via the UI):
{
  "mcpServers": {
    "superlocalmemory": {
      "command": "slm",
      "args": ["mcp"]
    }
  }
}
Restart Claude Code. The SuperLocalMemory tools will now be available to the assistant.
Start Using It
The workflow is simple: tell Claude what to remember, ask it to recall later.
You: slm remember "This project uses uv not pip. Always use uv run python."
Claude: ✓ Stored memory about package manager preference.
[Next session, next week]
You: What package manager does this project use?
Claude: [calls slm recall] Based on your stored preferences: this project uses uv, not pip.
Or use the CLI directly:
# Remember something
slm remember "Auth is handled in middleware/auth.py — JWT, 24h expiry"
# Recall anything
slm recall "where is auth handled"
# See all memories
slm list
# Open the 17-tab dashboard
slm dashboard
What Actually Gets Stored
Claude Code can call the memory tools autonomously as it works. Common patterns that work well:
Project context:
slm remember "Monorepo: packages/api (FastAPI), packages/web (Next.js 15), packages/worker (Celery)"
slm remember "Production: Railway (API + worker), Vercel (web), Upstash Redis"
slm remember "Database: Supabase Postgres. Migrations with Alembic."
Preferences:
slm remember "Always use type hints. Prefer dataclasses over plain dicts."
slm remember "Test with pytest. Use pytest-asyncio for async tests. 80% coverage minimum."
slm remember "No print statements in production code — use structlog."
Decisions:
slm remember "Decided against GraphQL — REST is simpler for this use case (March 2026)"
slm remember "Payment processing uses Stripe not Paddle — existing contracts"
Bugs and fixes:
slm remember "Fixed: celery worker crashes if Redis connection drops during task — added retry with exponential backoff"
The Dashboard
slm dashboard
Opens a 17-tab web UI at localhost:8765. The tabs I use most:
- Memories — browse all stored facts, edit or delete inline
- Recall Lab — test queries before using in code sessions
- Knowledge Graph — visual map of entities and relationships
- Trust Dashboard — Bayesian trust scores per memory source
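To make the Trust Dashboard concrete: a standard way to maintain a Bayesian trust score per source is a Beta-Bernoulli model, where confirmations and contradictions update the two pseudo-counts. This is a minimal sketch of that idea; the class and field names here are mine, and SuperLocalMemory's actual scoring may differ.

```python
from dataclasses import dataclass

@dataclass
class TrustScore:
    """Beta-Bernoulli trust model: alpha counts confirmations, beta contradictions."""
    alpha: float = 1.0  # prior pseudo-count of "this source was right"
    beta: float = 1.0   # prior pseudo-count of "this source was wrong"

    def update(self, confirmed: bool) -> None:
        # Conjugate update: each observation bumps one pseudo-count by 1.
        if confirmed:
            self.alpha += 1
        else:
            self.beta += 1

    @property
    def mean(self) -> float:
        # Posterior mean of the Beta distribution = expected reliability.
        return self.alpha / (self.alpha + self.beta)

source = TrustScore()            # starts at an uninformative 0.5
for outcome in [True, True, True, False]:
    source.update(outcome)
print(round(source.mean, 2))     # (1+3)/(1+3+1+1) -> 0.67
```

The uniform Beta(1, 1) prior means a brand-new source sits at 0.5 trust and moves only as evidence accumulates.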
How It Actually Works (For the Curious)
Standard memory systems use cosine similarity over embeddings — it works but degrades at scale. SuperLocalMemory uses three mathematical techniques from our research paper:
Fisher-Rao geodesic distance — models each memory as a Gaussian distribution, not a point. Frequently accessed memories become more precise (variance shrinks via Bayesian updates). The result: the system gets better at finding things the more you use it.
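The Gaussian-as-memory idea above can be sketched in a few lines. For univariate Gaussians the Fisher-Rao geodesic distance has a closed form (the family with the Fisher metric is a scaled hyperbolic half-plane), and a conjugate Gaussian update shrinks the variance with every observation. Function names are mine; this is an illustration of the math, not SuperLocalMemory's implementation.

```python
import math

def fisher_rao(mu1, sigma1, mu2, sigma2):
    """Closed-form Fisher-Rao geodesic distance between two univariate Gaussians."""
    num = (mu1 - mu2) ** 2 / 2 + (sigma1 - sigma2) ** 2
    return math.sqrt(2) * math.acosh(1 + num / (2 * sigma1 * sigma2))

def bayes_sharpen(mu, sigma, obs, obs_sigma):
    """Gaussian conjugate update: precisions add, so variance shrinks per access."""
    prec, obs_prec = 1 / sigma ** 2, 1 / obs_sigma ** 2
    post_prec = prec + obs_prec
    post_mu = (mu * prec + obs * obs_prec) / post_prec
    return post_mu, math.sqrt(1 / post_prec)

# A memory "accessed" five times: its distribution tightens around its value.
mu, sigma = 0.0, 1.0
for _ in range(5):
    mu, sigma = bayes_sharpen(mu, sigma, obs=0.0, obs_sigma=1.0)
print(round(sigma, 3))  # shrinks from 1.0 to 1/sqrt(6) -> 0.408
```

A tighter distribution occupies less "volume" under the Fisher-Rao metric, which is exactly why frequently accessed memories become easier to retrieve precisely.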
Sheaf cohomology — detects contradictory memories globally, not just pairwise. If you've stored conflicting facts ("Auth uses JWT" and "Auth uses sessions"), the system surfaces the conflict.
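To see why "globally, not just pairwise" matters, here is a deliberately tiny stand-in (union-find over equivalence and incompatibility claims, not actual sheaf machinery): each individual claim below is fine on its own, and the contradiction only appears when the claims are chained together. That chained obstruction is the flavor of inconsistency a cohomological check detects.

```python
def find(parent, x):
    # Path-compressing find for a dict-backed union-find.
    parent.setdefault(x, x)
    if parent[x] != x:
        parent[x] = find(parent, parent[x])
    return parent[x]

def globally_consistent(same, different):
    """Toy global-consistency check over stored claims.

    `same` holds pairs of facts claimed equivalent, `different` pairs claimed
    incompatible. Equivalences are merged first; a `different` pair landing in
    one merged class is a contradiction no single pair reveals on its own.
    """
    parent = {}
    for a, b in same:
        parent[find(parent, a)] = find(parent, b)
    return all(find(parent, a) != find(parent, b) for a, b in different)

same = [("auth", "JWT"), ("JWT", "stateless")]
different = [("auth", "stateless")]  # each claim is locally plausible
print(globally_consistent(same, different))  # -> False: the chain contradicts
```

Checking every pair in isolation would pass here; only gluing the local claims into a global picture exposes the conflict.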
Langevin dynamics — self-organizes memory lifecycle based on actual usage. Frequently accessed memories stay active; stale ones archive automatically.
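A discretized overdamped Langevin update makes the lifecycle idea tangible: drift pulls each memory's activation toward its usage signal, while a noise term lets rarely used memories occasionally resurface instead of decaying deterministically. This is a toy sketch under those assumptions, not the tool's actual update rule.

```python
import random

def langevin_step(score, usage, dt=0.1, noise=0.05, rng=random):
    """One overdamped Langevin step on a memory's activation score.

    Drift is -dE/dx for the quadratic potential E = (score - usage)^2 / 2,
    so the score relaxes toward the usage signal; the sqrt(dt)-scaled
    Gaussian noise keeps the dynamics stochastic.
    """
    drift = -(score - usage)
    return score + drift * dt + noise * rng.gauss(0.0, dt ** 0.5)

rng = random.Random(0)
hot, cold = 1.0, 1.0
for _ in range(200):
    hot = langevin_step(hot, usage=1.0, rng=rng)   # accessed constantly
    cold = langevin_step(cold, usage=0.0, rng=rng) # never accessed
print(hot > 0.5 > cold)  # -> True: hot stays active, cold drifts toward archive
```

An archiver would then simply move anything below a threshold (say 0.5) out of the active set, which is the "stale ones archive automatically" behavior described above.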
On the LoCoMo benchmark, this approach achieves 74.8% accuracy with data staying fully local — higher than Mem0's 58-66% while requiring zero cloud dependency.
Three Modes
By default you get Mode A (fully local, no cloud):
slm mode a # Default — zero cloud, 74.8% on LoCoMo
slm mode b # + local Ollama LLM for synthesis (still private)
slm mode c # + cloud LLM for synthesis (87.7% on LoCoMo)
I use Mode A. It's fast (sub-millisecond retrieval), private, and works offline. I switch to Mode B when I want the Ollama-powered cluster summaries in the dashboard.
Works With More Than Claude Code
The same memory layer works with every AI tool that supports MCP:
- Cursor — add the same MCP config to Cursor settings
- VS Code — via the Continue.dev extension
- Windsurf, Zed, JetBrains AI — same config
- ChatGPT Desktop — via MCP bridge
One memory store, all your tools.
MIT License, Fully Open Source
npm install -g superlocalmemory # npm
pip install superlocalmemory # Python
GitHub: github.com/qualixar/superlocalmemory
Website: superlocalmemory.com
Paper: arXiv:2603.14588
1400+ tests. No telemetry. No accounts. Data never leaves your machine in Mode A.
Varun Pratap Bhardwaj — Independent Researcher
A Qualixar Research Initiative
