Valentyn Solomko

Posted on Jul 4

session-indexer: giving Claude Code a memory that doesn't die with the project next door

#claudecode #go #developertools #ai

I come back to a project after a week off, and the first ten minutes always go the same way: scrolling through old sessions, trying to remember what we actually decided about X.

A simple rolling log — "here's what we did yesterday" — solves half of that. It doesn't solve "what have we discussed about this topic, across every session, ever?"

session-indexer is my attempt at the second half.

It's a small Go tool that indexes Claude Code session transcripts for each project into a local SQLite database and retrieves relevant chunks by semantic similarity when a new session starts.

Why not just use a centralized memory tool?

Tools like mempalace or agentmemory keep one shared store across every project and every agent. That architecture has one fatal flaw: if the central store dies, everything dies. ** A corrupted vector index or a crashed MCP server takes down memory for every project you have, simultaneously, and recovery is not trivial.

session-indexer does the opposite. .claude/sessions.db lives inside the project's own .claude/ directory. Append-only. If one project's DB gets corrupted, the fix is to delete it and rerun mine on the JSONL transcripts already on disk. Nothing else is affected.

How it works

Two hooks, wired into Claude Code's lifecycle:

Stop hook (session-index.sh) — when a session ends, mines the JSONL transcript into .claude/sessions.db, embedding each chunk via bge-m3 (Ollama) if it's available.
SessionStart hook (session-recall.sh) — when a new session opens, derives a search query from the current git branch name and recent commit messages, searches the index, and injects the top matching chunks as context. No manual query needed — it just knows roughly what you're about to work on.

Manual search works too, any time:

session-indexer search "config validation approach" --db .claude/sessions.db
# or from inside Claude Code:
/recall config validation approach

The CLI

session-indexer mine   <jsonl-path> --db .claude/sessions.db
session-indexer search <query>      --db .claude/sessions.db [--limit N] [--json]
session-indexer embed               --db .claude/sessions.db
session-indexer stats               --db .claude/sessions.db

mine runs with a 50-second deadline — storage is fast and unconditional; embedding respects the deadline, so it never blows past Claude Code's Stop-hook timeout. Chunks that miss the deadline are stored but flagged for backfill via embed — nothing silently disappears.

Search quality

When Ollama + bge-m3 are available, search ranks by cosine similarity over 1024-dimension multilingual embeddings — English and Ukrainian both score well, which matters when half my sessions are in one language and half in the other.

When Ollama isn't running, or the store has zero embeddings yet, search automatically falls back to FTS5 BM25 keyword matching. No configuration, no hard dependency on an external service that might not be running.

Stack

Go 1.26 — single static binary
SQLite (modernc.org/sqlite, pure Go, no CGO) — one file per project
Ollama + bge-m3 — optional vector embeddings
Cobra — CLI
FTS5 — automatic fallback

Setup

go install ./cmd/session-indexer

# wire the hooks (one-time, per project)
# copies session-index.sh + session-recall.sh into .claude/hooks/
# updates .claude/settings.local.json

Full hook-wiring steps are in the README — it's a few JSON entries in settings.local.json, nothing exotic.

Why this matters to me

Most AI coding assistants either forget everything the moment the session ends, or bolt on a centralized memory service that becomes a single point of failure for your entire workflow. I wanted the boring option: per-project, append-only SQLite. The same architectural instinct that makes Git itself reliable — if it breaks, it breaks alone, and it heals by re-mining transcripts that are already on disk.

Open source, MIT licensed.

GitHub: https://github.com/valpere/session-indexer

If you're already piling up weeks of Claude Code sessions per project and re-deriving the same decisions from scratch every time you come back, this might save you the ten minutes of scrolling.

Valentyn Solomko — Ukrainian software engineer

Top comments (1)

Mike Czerwinski • Jul 4

Per-project SQLite with re-mine as recovery is the right blast radius, and deriving the SessionStart query from branch plus recent commits is the move worth copying: query-at-load instead of read-everything.

One axis the index will eventually need: transcripts record what was said, not what still governs. "What did we decide about X" usually has two answers a few weeks apart, and cosine similarity returns both, ranked by wording rather than by which decision survived. Relevance is not authority. The cheap version inside your architecture: when several hits share a topic, sort by session date and label the older ones as earlier, possibly superseded. The expensive version is a supersedes edge between chunks, which is a different tool. But without at least the cheap version, the ten minutes of scrolling come back wearing a ranking.