Don't summarize your memory — search it

#ai #claude #opensource #tooling

Every long session with an AI coding agent eventually hits the same wall: the context window fills up, the conversation gets compacted, and a summary takes the place of what actually happened. Summaries are lossy by design. The decision you made three sessions ago, the reason you ruled out approach B, the exact path you fixed last Tuesday — quietly gone, because something decided they weren't important enough to keep.

I got tired of re-explaining my own project to my own assistant. So I built brethof-mind: long-term memory for Claude Code (and Claude Desktop), built on SurrealDB. The core idea is in the title — instead of summarizing your history down to fit, keep all of it and search it.

It's open source (MIT), runs 100% on your machine, and talks to no external API.

🔗 https://github.com/BrethofAI/brethof-mind

Two memories, not one

Most "memory" tools give you a single bucket of notes. brethof-mind keeps two layers, because they answer different questions:

Curated memory: the things you decide are worth pinning, such as architecture decisions, locked rules, project status, and bugs with their fixes. Small, high-signal, and curated either by hand or by the agent.
Full chat archive — every session you've ever had, stored verbatim and searchable. This is the safety net: when a summary would have dropped a detail, the raw exchange is still there to retrieve.

The curated layer answers "what did we decide?" The archive answers "what did we actually say back in March?" Together they mean a compaction is no longer a memory wipe — it's just the working context shrinking while the real record stays intact.
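To make the split concrete, here is a minimal in-memory sketch of the two layers. It is not brethof-mind's actual schema: the `Memory` class, the `pin`, `log_turn`, and `compact` names, and the dict fields are all hypothetical, chosen only to show that compaction trims the working context while both stores stay intact.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:
    # Hypothetical stand-ins for the two layers, not the real SurrealDB tables.
    curated: list = field(default_factory=list)  # small, pinned records
    archive: list = field(default_factory=list)  # every turn, verbatim

    def pin(self, kind, text):
        """Store a high-signal record such as a decision or a locked rule."""
        self.curated.append({"kind": kind, "text": text})

    def log_turn(self, session_id, role, text):
        """Append one chat turn to the archive exactly as it was said."""
        self.archive.append({"session": session_id, "role": role, "text": text})


def compact(working_context, keep_last=2):
    """Shrink only the working context; the Memory object is never touched."""
    return working_context[-keep_last:]


memory = Memory()
working_context = []
for text in ["Try approach A", "Approach B is too slow", "Go with A", "Ship it"]:
    memory.log_turn("session-1", "user", text)
    working_context.append(text)
memory.pin("decision", "Ruled out approach B: too slow")

working_context = compact(working_context)
print(len(working_context), len(memory.archive), len(memory.curated))  # prints: 2 4 1
```

After compaction the context holds two turns, but all four turns and the pinned decision are still there to search.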

Three ways to search

Different questions want different retrieval. brethof-mind exposes all three over MCP:

Full-text (BM25) — when you know the words. SurrealDB's full-text index, lowercased + stemmed.
Vector similarity (HNSW) — when you know the meaning but not the words. Embeddings come from fastembed (all-MiniLM-L6-v2, 384-dim) — local, fast, no API key.
Graph traversal — records link to each other (decision → supersedes → decision; episode → covers → topic), so you can walk relationships, not just match text.

There are seven MCP tools in total: semantic_search, search_memory, search_chat, query_raw, save_memory, save_commit, and load_project. The agent can pick the right retrieval for the question instead of being stuck with one.
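The three retrieval styles above can be illustrated with a toy, pure-Python sketch. This is not how brethof-mind implements them: in the real stack SurrealDB's indexes do the work. Here the BM25 scorer has no stemming, the vector search is brute-force cosine similarity rather than an HNSW index, the vectors are hand-made stand-ins for real embeddings, and every function name is hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and strip basic punctuation (no stemming in this toy)."""
    words = (w.strip(".,?!:;").lower() for w in text.split())
    return [w for w in words if w]


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document against the query with the standard BM25 formula."""
    tokenized = [tokenize(d) for d in docs]
    avg_len = sum(len(t) for t in tokenized) / len(tokenized)
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))  # document frequency per term
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (len(docs) - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def nearest(query_vec, vectors):
    """Brute-force nearest neighbour; an HNSW index approximates this at scale."""
    return max(range(len(vectors)), key=lambda i: cosine(query_vec, vectors[i]))


def walk(edges, start, relation):
    """Follow one relation type from a record, e.g. a chain of 'supersedes' links."""
    chain, seen, current = [start], {start}, start
    while (current, relation) in edges:
        current = edges[(current, relation)]
        if current in seen:  # guard against cycles
            break
        chain.append(current)
        seen.add(current)
    return chain


docs = [
    "We chose SurrealDB for storage",
    "Ruled out approach B because of latency",
    "Fixed the path bug in the init script",
]
scores = bm25_scores("approach latency", docs)
best = scores.index(max(scores))

edges = {
    ("decision:3", "supersedes"): "decision:2",
    ("decision:2", "supersedes"): "decision:1",
}
history = walk(edges, "decision:3", "supersedes")
```

Full-text search finds the second document because it shares the query's words, `nearest` would find a document whose vector points the same way even with no shared words, and `walk` recovers the chain of decisions that replaced one another.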

100% local stack

No cloud, no telemetry, no keys leaving your box:

SurrealDB for storage (vector + full-text + graph in one engine).
fastembed for embeddings, on CPU.
FastMCP over stdio for the server.
Credentials via env vars; projects configured in a simple projects.json.
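As a rough illustration of the configuration split described above, the sketch below reads credentials from the environment and project definitions from a JSON file. The variable names `SURREAL_URL`, `SURREAL_USER`, and `SURREAL_PASS`, the default URL, and the shape of the JSON are assumptions made for this example; the real names and schema are whatever `.env.example` and `projects.example.json` in the repository define.

```python
import json
import os
from pathlib import Path


def load_settings(env=os.environ, projects_path="projects.json"):
    """Read DB credentials from env vars and projects from a JSON file.

    All key names here are hypothetical; check .env.example for the real ones.
    """
    settings = {
        "url": env.get("SURREAL_URL", "ws://localhost:8000/rpc"),  # assumed default
        "user": env.get("SURREAL_USER"),
        "password": env.get("SURREAL_PASS"),
    }
    missing = [key for key in ("user", "password") if not settings[key]]
    if missing:
        # Fail early so secrets are never silently replaced by defaults.
        raise RuntimeError(f"missing credentials: {', '.join(missing)}")
    projects = json.loads(Path(projects_path).read_text(encoding="utf-8"))
    return settings, projects
```

Keeping secrets in the environment and project definitions in a plain file means `projects.json` can be shared or versioned without exposing credentials.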

Install

git clone https://github.com/BrethofAI/brethof-mind
cd brethof-mind

# 1. Bring up SurrealDB
docker compose up -d

# 2. Configure
cp .env.example .env                    # set DB creds
cp projects.example.json projects.json
python mcp-server/scripts/init_db.py    # create namespace + schema + indexes

# 3. Register the MCP server with Claude Code (claude mcp add ...)
# 4. Drop the hooks into your Claude settings (see settings.example.json)

Full steps are in the README.

The hooks are where it gets nice

The MCP tools are useful on demand, but the hooks make memory ambient — you don't have to remember to remember:

SessionStart loads the relevant project memory into context the moment you open a session.
UserPromptSubmit nudges the agent to search memory before answering questions about past decisions.
Stop archives the session into the searchable chat history when you're done.
A commit hook records each commit as a memory record, so your project history and your conversation history live in the same searchable place.

The result: start a fresh session and your agent already knows where the project stands — no re-briefing.
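For a sense of what the Stop step described above involves, here is a minimal sketch of a hook handler. It is not the project's actual hook script. It assumes the hook receives a JSON payload on stdin with `session_id` and `transcript_path` fields and that the transcript is a JSONL file, which matches my reading of the Claude Code hooks documentation but should be checked against the current docs. The `archive_session` function and the list used as a store are hypothetical stand-ins for writing to SurrealDB.

```python
import json
import sys
from pathlib import Path


def archive_session(payload, store):
    """Copy every transcript entry into the store verbatim, tagged by session.

    `payload` is the hook input (assumed fields: session_id, transcript_path).
    `store` is any list-like sink; a real hook would write to the database.
    """
    session_id = payload["session_id"]
    transcript = Path(payload["transcript_path"])
    count = 0
    for line in transcript.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # skip blank lines in the JSONL file
        store.append({"session": session_id, "raw": json.loads(line)})
        count += 1
    return count


def main(stdin=sys.stdin):
    """Entry point a hook command could call; reads the payload from stdin."""
    payload = json.load(stdin)
    archive = []  # hypothetical in-memory sink for this sketch
    archived = archive_session(payload, archive)
    print(f"archived {archived} entries", file=sys.stderr)
```

Storing each entry as-is, rather than a digest of it, is what keeps the archive lossless: nothing decides at write time which turns were important.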

Works with

Claude Code and Claude Desktop today (Desktop runs the Claude Code engine under the hood, so it gets the full hooks experience). OpenClaw and Hermes integrations are next.

Why it's free

brethof-mind is MIT-licensed and will stay free. It comes from the team behind Brethof Voice Pro (local, offline voice-to-text) and follows the same principle: your data stays on your machine. This is the tooling we use ourselves; we're sharing it because the "summarize-your-memory" default deserves a better answer.

If you try it, I'd genuinely like feedback on the hook design — that's the part with the most room to get smarter.

https://github.com/BrethofAI/brethof-mind