I've been using Claude Code for months. It's incredible — until you start a new session.
"What architecture decisions did we make yesterday?"
"I don't have information about previous conversations."
Every. Single. Time.
I'd re-explain my project structure. Re-describe my conventions. Re-contextualize bugs I already fixed. I estimated I was spending 20-30% of my Claude Code time just re-establishing context that the AI had already processed.
So I built UltraBrain.
## What is UltraBrain?
UltraBrain is an open-source plugin for Claude Code that gives your AI persistent memory across sessions. It runs silently in the background through 5 lifecycle hooks:
```
SessionStart → UserPromptSubmit → PostToolUse → Stop → SessionEnd
```
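For illustration, Claude Code hooks are registered in a settings file that maps lifecycle events to shell commands. A sketch of what the SessionStart registration might look like (the script path and filename here are hypothetical, not UltraBrain's actual layout):

```json
{
  "hooks": {
    "SessionStart": [
      {
        "hooks": [
          { "type": "command", "command": "bun ~/.ultrabrain/hooks/session-start.ts" }
        ]
      }
    ]
  }
}
```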
Every session, it:
- Captures tool usage, file changes, and Claude's reasoning
- Compresses raw data into semantic observations (via AI — free with Groq)
- Embeds observations into LanceDB for vector similarity search
- Injects the most relevant context when your next session starts
Claude sees a concise summary of your project history at the start of every session. No manual intervention required.
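As a rough sketch of the last step, the injection boils down to ranking retrieved observations and rendering them as a short block Claude reads first. The types and function names below are assumptions for illustration, not UltraBrain's actual API:

```typescript
// Hypothetical shape of a compressed observation retrieved from the store.
interface Observation {
  text: string;   // compressed semantic observation
  tag: string;    // e.g. "bug", "todo", "learning"
  score: number;  // similarity to the current project state
}

// Render the top-ranked observations as the summary injected at SessionStart.
function buildInjectedContext(observations: Observation[], maxItems = 5): string {
  const top = [...observations]
    .sort((a, b) => b.score - a.score)
    .slice(0, maxItems);
  const lines = top.map((o) => `- [${o.tag}] ${o.text}`);
  return ["## Project memory (from previous sessions)", ...lines].join("\n");
}

console.log(
  buildInjectedContext([
    { text: "Auth bug fixed by pinning the JWT lib", tag: "learning", score: 0.91 },
    { text: "Refactor the config loader", tag: "todo", score: 0.62 },
  ]),
);
```

Capping the item count is what keeps the injected block in the ~300-token range regardless of how much history has accumulated.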
## The numbers
| Metric | Result |
|---|---|
| Vector search latency | 1.3ms |
| Context injection | <50ms |
| Token savings | ~80% vs raw context |
| AI processing cost | $0 (Groq free tier) |
| Setup time | 2 commands |
## It's more than memory
What started as a memory layer turned into a full development command center:
- Project Management — Bugs, todos, ideas, and learnings automatically extracted from your sessions
- Kanban Board — Drag-and-drop tasks auto-created from AI observations
- CLAUDE.md Manager — Browse and edit all 7 tiers of CLAUDE.md files
- Ralph Loop — Autonomous coding iterations launched from the dashboard
- Auto-Tagging — AI classifies every observation (bug, todo, idea, learning, etc.)
- Mission Control — Web terminal, automation engine, analytics, session recording, knowledge graph
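The actual auto-tagger uses an AI model, but a keyword-based stand-in conveys the idea. The categories come from the feature list above; the rules themselves are invented here for illustration:

```typescript
// Naive stand-in for the AI auto-tagger: keyword rules instead of an LLM call.
type Tag = "bug" | "todo" | "idea" | "learning" | "other";

const RULES: Array<[RegExp, Tag]> = [
  [/\b(fix(ed)?|crash|error|regression)\b/i, "bug"],
  [/\b(todo|should|need to|remaining)\b/i, "todo"],
  [/\b(what if|could|idea|explore)\b/i, "idea"],
  [/\b(learned|turns out|root cause)\b/i, "learning"],
];

// Return the first matching tag, or "other" if nothing fires.
function tagObservation(text: string): Tag {
  for (const [pattern, tag] of RULES) {
    if (pattern.test(text)) return tag;
  }
  return "other";
}
```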
## How it works under the hood
The core insight: you don't need to store everything. You need to store the right things in the right format.
UltraBrain uses a compression pipeline:
```
Raw tool calls (thousands per session)
    ↓ AI compression (Groq, free)
Semantic observations (~5-15 per session)
    ↓ Embedding (all-MiniLM-L6-v2, local ONNX)
384-dimensional vectors in LanceDB
    ↓ Similarity search (<2ms)
Top-k relevant observations
    ↓ Progressive disclosure
Injected context (~300 tokens)
```
The AI never sees raw data. It sees compressed, relevant, ranked observations. This saves ~80% of tokens compared to naive context loading.
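The similarity-search step can be illustrated with a brute-force cosine top-k over in-memory vectors. LanceDB does this natively over its own index (and at 384 dimensions, matching all-MiniLM-L6-v2), so treat this as a stand-in for what the query computes, not the plugin's implementation:

```typescript
// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

interface StoredObservation {
  id: string;
  vector: number[]; // 384-dim in the real system
}

// Brute-force top-k: score every row against the query, keep the best k.
function topK(query: number[], rows: StoredObservation[], k: number): StoredObservation[] {
  return [...rows]
    .sort((a, b) => cosine(query, b.vector) - cosine(query, a.vector))
    .slice(0, k);
}
```

A vector database replaces the linear scan with an approximate-nearest-neighbor index, which is how sub-2ms lookups stay feasible as observations accumulate.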
## Stack
- TypeScript — Hooks, worker, everything
- LanceDB — Native Rust vector engine, runs in-process
- Bun SQLite — Local database, zero external dependencies
- ONNX Runtime — all-MiniLM-L6-v2 embeddings, in-process
- React — Dashboard UI (built to single HTML file)
- No Python — Zero. None. Nada.
## Try it
```
/plugin marketplace add EconLab-AI/Ultrabrain
/plugin install ultrabrain
```
That's it. Two commands. Your AI never forgets again.
GitHub: https://github.com/EconLab-AI/Ultrabrain
License: MIT — free to use, modify, and distribute.
Contributions welcome — check the good first issues.