DEV Community

michielinksee

I Built an MCP Server to Stop Claude Code from Forgetting Everything Between Sessions

TL;DR

  • Problem: Claude Code starts every session cold. It re-suggests libraries I already rejected, asks "should we use X?" about decisions made weeks ago, and re-reads unchanged files from scratch (paying full tokens each time).
  • Solution: I built linksee-memory — an MCP server with 6 structured memory layers and chunk-level file diff caching. Local SQLite, MIT, no cloud. Works across Claude Code / Cursor / Codex / Gemini CLI from the same database.
  • The feature that actually changed my workflow: the caveat layer, which holds forget-protected entries for "never do this again". For the one pattern I tracked, repeat suggestions dropped from 4 in 4 sessions to zero over the following 3 weeks.
  • Install: npm install -g linksee-memory

The problem: my agent kept making the same mistake

Three Mondays in a row last month, I explained the same deploy failure to Claude Code. The root cause was identical each time: our production Cloudflare Workers had a memory-cache incoherence issue between distributed instances. Each session, I walked through the same investigation — same files, same logs, same 30-minute trail to the same conclusion.

Claude Code doesn't remember previous sessions. So every Monday morning, I was re-discovering the same bug with my agent from scratch.

I tried the available solutions; none fit:

| Tool | What's good | Where it fell short for me |
| --- | --- | --- |
| CLAUDE.md | Zero setup, official | Flat structure, model ignores parts of it |
| Mem0 | Hosted, easy install | Cloud-only, no "pain memory" concept |
| Letta | Built into an agent framework | Can't share memory across MCP clients |
| Zep | Graph-based, strong relationships | Single-client; I also use Cursor and Codex |

All good products optimizing for different things. But the one feature I wanted most — a guarantee that I'll never repeat the same mistake — wasn't in any of them.

So I built linksee-memory.

The design: 6 layers, with caveat as the hero

linksee-memory organizes memory into 6 explicit layers:

+-- goal           Why are we doing this?
+-- context        Current situation & constraints
+-- emotion        User's mood, tone of the relationship
+-- implementation How we built it (success or failure)
+-- caveat         Pain lessons (forget-protected)
+-- learning       Growth log & realizations

The key design principle is WHY-first. Other tools store facts ("we use PostgreSQL"). linksee-memory separates the WHY ("we chose PostgreSQL because the workload is OLTP with strict consistency needs") from the WHAT ("connection pool: 20, timeout: 30s").
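To make the WHY/WHAT split concrete, here is a minimal sketch of how the two PostgreSQL entries above could be typed and grouped. The layer names come from the list above; the `MemoryEntry` shape, the `byLayer` helper, and the choice of which layer holds each entry are assumptions for illustration, not linksee-memory's actual schema.

```typescript
// The six layer names are taken from the list above.
const LAYERS = ["goal", "context", "emotion", "implementation", "caveat", "learning"] as const;
type Layer = (typeof LAYERS)[number];

// Hypothetical entry shape; the real linksee-memory schema may differ.
interface MemoryEntry {
  entity_name: string;
  layer: Layer;
  content: string;
}

// WHY: the reasoning behind the decision (stored under "goal" here by assumption).
const why: MemoryEntry = {
  entity_name: "MyProject",
  layer: "goal",
  content: "We chose PostgreSQL because the workload is OLTP with strict consistency needs.",
};

// WHAT: the concrete settings that follow from that decision.
const what: MemoryEntry = {
  entity_name: "MyProject",
  layer: "implementation",
  content: "connection pool: 20, timeout: 30s",
};

// Group entries by layer so the reasoning can be recalled separately from the settings.
function byLayer(entries: MemoryEntry[]): Map<Layer, MemoryEntry[]> {
  const grouped = new Map<Layer, MemoryEntry[]>();
  for (const e of entries) {
    grouped.set(e.layer, [...(grouped.get(e.layer) ?? []), e]);
  }
  return grouped;
}

const grouped = byLayer([why, what]);
```

Keeping the layer as a closed union means a typo such as "caveats" fails at compile time instead of silently creating a seventh layer.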

The caveat layer is special. Entries there are permanently protected from auto-forgetting — even when old memories decay and get consolidated, caveats stay forever. This is how I enforce "never make this mistake again" as a structural property, not prompt discipline.
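One way to picture "structural, not prompt discipline" is a decay pass that filters by layer before it considers age. The sketch below is an assumption about how such a rule could look; `StoredMemory`, `lastUsedDaysAgo`, `MAX_AGE_DAYS`, and `selectForgettable` are hypothetical names, and the real decay and consolidation logic may work differently.

```typescript
type Layer = "goal" | "context" | "emotion" | "implementation" | "caveat" | "learning";

interface StoredMemory {
  id: number;
  layer: Layer;
  lastUsedDaysAgo: number; // hypothetical staleness measure
}

// Hypothetical threshold: memories unused for longer than this may be forgotten.
const MAX_AGE_DAYS = 90;

// Returns the ids that a decay pass is allowed to forget.
// Caveats are excluded by layer, regardless of how stale they are.
function selectForgettable(memories: StoredMemory[]): number[] {
  return memories
    .filter((m) => m.layer !== "caveat" && m.lastUsedDaysAgo > MAX_AGE_DAYS)
    .map((m) => m.id);
}

const forgettable = selectForgettable([
  { id: 1, layer: "context", lastUsedDaysAgo: 200 },
  { id: 2, layer: "caveat", lastUsedDaysAgo: 400 },
  { id: 3, layer: "learning", lastUsedDaysAgo: 10 },
]);
// forgettable -> [1]
```

The stale caveat (id 2) survives even though it is older than the stale context entry, because the layer check runs independently of the age check.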

3 tools, not 8

Early versions had 8 separate tools (remember, update_memory, forget, recall, recall_file, list_entities, consolidate, read_smart). This worked fine in Claude Code where I could teach tool selection via SKILL.md.

But Cursor couldn't tell recall from recall_file. Codex mixed up remember and update_memory. Too many tools = LLM tool selection breaks down.

v0.7 unified everything into 3 tools:

| Tool | What it does |
| --- | --- |
| remember | Create, update, or delete a memory. Mode auto-detected from params. |
| recall | Search memories, get file history, or list entities. Mode auto-detected from params. |
| read_smart | Read a file with chunk-level diff caching. |

// Create
remember({ entity_name: "MyProject", layer: "caveat", content: "..." })

// Update (was: update_memory)
remember({ memory_id: 42, content: "updated content" })

// Delete (was: forget)
remember({ forget: true, memory_id: 42 })

// Search (was: recall)
recall({ query: "RLS policy", layer: "caveat" })

// File history (was: recall_file)
recall({ path: "server.ts" })

// Entity overview (was: list_entities)
recall({})

One tool per kind of intent: write memory, read memory, read a file. In my sessions, Cursor and Codex no longer mix them up.
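"Mode auto-detected from params" can be sketched as a small dispatcher that inspects which fields are present. The parameter names below mirror the calls shown above; the detection order, the mode names, and the error cases are assumptions, not the server's actual rules.

```typescript
// Parameter names mirror the example calls above; everything else is assumed.
interface RememberParams {
  memory_id?: number;
  forget?: boolean;
  entity_name?: string;
  layer?: string;
  content?: string;
}

type RememberMode = "create" | "update" | "delete";

function detectRememberMode(p: RememberParams): RememberMode {
  if (p.forget === true) {
    if (p.memory_id === undefined) throw new Error("delete requires memory_id");
    return "delete";
  }
  if (p.memory_id !== undefined) return "update";
  if (p.entity_name && p.content) return "create";
  throw new Error("cannot infer mode from params");
}

interface RecallParams {
  query?: string;
  path?: string;
  layer?: string;
}

type RecallMode = "search" | "file_history" | "list_entities";

function detectRecallMode(p: RecallParams): RecallMode {
  if (p.path !== undefined) return "file_history";
  if (p.query !== undefined) return "search";
  return "list_entities"; // no params: entity overview
}
```

Checking the most destructive mode first (delete) and failing loudly on ambiguous input keeps a confused client from silently creating a memory when it meant to remove one.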

The part that surprised me: read_smart()

Structured memory is half the tool. The other half is file-diff caching.

Think about what your agent does at the start of every session:

  1. Re-reads package.json
  2. Re-reads your main entry file
  3. Re-reads the config
  4. Re-reads the tests

Each file is maybe 500-2000 lines. Each session, you pay full tokens to re-read all of them. But in most sessions, most of these files haven't changed.

read_smart() fixes this with chunk-level caching:

// First read: full file, chunked and cached
const r1 = read_smart({ path: "src/http-server.ts" });
// -> full content, ~3400 tokens

// Same session, no change: cache hit
const r2 = read_smart({ path: "src/http-server.ts" });
// -> { status: "unchanged", tokens_used: ~50 }

// After editing 2 functions: only changed chunks returned
const r3 = read_smart({ path: "src/http-server.ts" });
// -> only the 2 changed function chunks, ~340 tokens

Chunk boundaries are language-aware:

  • Code (TS / JS / Python / etc.): AST-based, one chunk per function or class
  • Markdown: one chunk per h2 / h3 section
  • JSON / YAML: one chunk per top-level key

Cache keys are sha256(chunk_content). In practice, I see ~86% token reduction on file re-reads.
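The idea of hashing each chunk and returning only chunks with unseen hashes fits in a few lines. This is a minimal sketch, not the real `read_smart` implementation: it uses a simplified Markdown chunker (split at h2/h3 headings) instead of an AST, keeps the cache in memory rather than SQLite, and the names `chunkKey`, `chunkMarkdown`, and `readSmartSketch` are hypothetical.

```typescript
import { createHash } from "node:crypto";

// Cache key for a chunk: sha256 of its content, as described above.
function chunkKey(content: string): string {
  return createHash("sha256").update(content, "utf8").digest("hex");
}

// Split Markdown into chunks at h2/h3 headings.
// Text before the first such heading becomes its own chunk.
function chunkMarkdown(text: string): string[] {
  const chunks: string[] = [];
  let current: string[] = [];
  for (const line of text.split("\n")) {
    if (/^#{2,3} /.test(line) && current.length > 0) {
      chunks.push(current.join("\n"));
      current = [];
    }
    current.push(line);
  }
  if (current.length > 0) chunks.push(current.join("\n"));
  return chunks;
}

// Hypothetical in-memory cache: path -> chunk hashes seen on the previous read.
const seen = new Map<string, Set<string>>();

// Returns only the chunks whose hash was not present on the previous read.
function readSmartSketch(path: string, text: string): string[] {
  const previous = seen.get(path) ?? new Set<string>();
  const chunks = chunkMarkdown(text);
  seen.set(path, new Set(chunks.map(chunkKey)));
  return chunks.filter((c) => !previous.has(chunkKey(c)));
}

const v1 = "# Title\n\n## Install\nnpm i\n\n## Usage\nrun it";
const v2 = "# Title\n\n## Install\nnpm i\n\n## Usage\nrun it twice";

const first = readSmartSketch("README.md", v1);  // all 3 chunks
const second = readSmartSketch("README.md", v1); // nothing changed: 0 chunks
const third = readSmartSketch("README.md", v2);  // only the "## Usage" chunk
```

Because the key is derived from content alone, a chunk that merely moves within the file still counts as seen; only edited or new chunks are returned.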

Concrete example: the caveat that stopped Claude from repeating

Here's a real example of caveat working.

A few weeks ago, for a one-off data migration task, Claude suggested "let's set up a cron job". One-off tasks shouldn't use cron (auth rotation overhead, monitoring cost, retry logic that doesn't match one-off semantics).

I stored one caveat entry:

Don't propose cron for one-off tasks. Alternatives: GitHub Actions workflow_dispatch, or a manual script with a completion notification.
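For reference, an MCP client stores an entry like this by sending a `tools/call` request to the server. The sketch below builds that JSON-RPC message. The envelope follows the MCP specification and the argument names match the `remember` examples earlier in the post; the `entity_name` value, the request `id`, and the `buildRememberCaveat` helper are placeholders for illustration.

```typescript
// JSON-RPC 2.0 envelope that MCP uses for tool invocations.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: wraps a caveat entry in a tools/call request for `remember`.
function buildRememberCaveat(id: number, entityName: string, content: string): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: {
      name: "remember",
      arguments: { entity_name: entityName, layer: "caveat", content },
    },
  };
}

const request = buildRememberCaveat(
  1,
  "MyProject", // placeholder entity name
  "Don't propose cron for one-off tasks. Alternatives: GitHub Actions workflow_dispatch, " +
    "or a manual script with a completion notification.",
);

console.log(JSON.stringify(request, null, 2));
```

In practice the agent issues this call itself once you ask it to remember the lesson; the message is shown only to make clear what ends up in the caveat layer.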

In the 4 sessions before I added that caveat, Claude suggested cron 4 times for one-off tasks.

In the 3 weeks since? Zero.

And because all my LLM clients share the same SQLite file, the caveat also works in Cursor, Codex, and Gemini CLI — not just Claude Code.

Install (2 minutes)

Claude Code

npm install -g linksee-memory
claude mcp add -s user linksee -- npx -y linksee-memory

Cursor

Settings -> Features -> "Model Context Protocol" -> Edit:

{
  "mcpServers": {
    "linksee": {
      "command": "npx",
      "args": ["-y", "linksee-memory"]
    }
  }
}

OpenAI Codex

codex mcp add linksee-memory -- npx -y linksee-memory

Gemini CLI

~/.gemini/settings.json:

{
  "mcpServers": {
    "linksee": {
      "command": "npx",
      "args": ["-y", "linksee-memory"]
    }
  }
}

Everything runs locally. The SQLite file lives at ~/.linksee-memory/memory.db. No cloud, no API key, no telemetry. MIT licensed.

Because memory lives in a single SQLite file, it's shared across all MCP clients on your machine. My Claude Code sessions and my Cursor sessions see the same memory.

I'd genuinely love feedback

I've been using this daily for 3+ months, and my results are biased by the fact that I designed it for my own workflow. I'd like to know:

  • Does the caveat layer actually prevent your agent from repeating mistakes, or am I pattern-matching on one dataset (me)?
  • How does the chunk cache behave on your codebase? Monorepos, generated code, notebooks — I'd love bug reports.
  • Is 6 layers too many? Too few? Are there memory types you want that none of these cover?

Open an issue on GitHub, or ping me on X at @ELLECraftsinga1. I respond to everything.

