I built a free self-hosted memory system for Claude CLI — replaces Supermemory/Mem0/Zep

#vscode #ai

## What it does

- Hybrid search: FTS5 full-text + vector similarity + importance + recency + access frequency
- 4 memory types (static, dynamic, episodic, semantic)
- 4 importance levels (critical → low)
- Memory deduplication & auto-merge
- Memory graph: link related memories together
- Time-decay scoring (recent memories rank higher)
- Auto semantic chunking for long texts
- LLM-based fact extraction from conversations
- Soft delete + restore
- Export/import, bulk ops
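
To give a feel for how the hybrid ranking could combine those signals, here is a minimal sketch. It is not the code from the repo: the `Candidate` shape, the weights, the 30-day half-life, and the two middle importance names (`high`, `medium`) are all assumptions picked for illustration.

```typescript
// Hypothetical candidate shape; field names are assumptions, not the repo's schema.
interface Candidate {
  id: string;
  ftsScore: number;       // full-text relevance, assumed normalized to 0..1
  vectorScore: number;    // vector similarity, assumed normalized to 0..1
  importance: "critical" | "high" | "medium" | "low";
  lastAccessedMs: number; // epoch milliseconds
  accessCount: number;
}

// Assumed weights for the four importance levels.
const IMPORTANCE_WEIGHT: Record<Candidate["importance"], number> = {
  critical: 1.0,
  high: 0.75,
  medium: 0.5,
  low: 0.25,
};

const MS_PER_DAY = 24 * 60 * 60 * 1000;
const HALF_LIFE_DAYS = 30; // assumed: a memory's recency score halves every 30 days

// Exponential time decay: 1 for "just now", 0.5 after one half-life.
function recencyScore(lastAccessedMs: number, nowMs: number): number {
  const ageDays = Math.max(0, nowMs - lastAccessedMs) / MS_PER_DAY;
  return Math.pow(0.5, ageDays / HALF_LIFE_DAYS);
}

// Log-scaled access frequency, capped at 1 once a memory has 100+ accesses.
function frequencyScore(accessCount: number): number {
  return Math.min(1, Math.log1p(accessCount) / Math.log1p(100));
}

// Weighted sum of the five signals; the weights are illustrative and sum to 1.
function hybridScore(c: Candidate, nowMs: number): number {
  return (
    0.35 * c.ftsScore +
    0.35 * c.vectorScore +
    0.15 * IMPORTANCE_WEIGHT[c.importance] +
    0.1 * recencyScore(c.lastAccessedMs, nowMs) +
    0.05 * frequencyScore(c.accessCount)
  );
}

// Returns a new array sorted by descending hybrid score.
function rank(candidates: Candidate[], nowMs: number): Candidate[] {
  return [...candidates].sort(
    (a, b) => hybridScore(b, nowMs) - hybridScore(a, nowMs)
  );
}
```

With equal text and vector scores, the more recently accessed memory ranks first, which is the time-decay behavior described in the list above.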

## Claude CLI integration

- MCP server with 8 tools (search, add, update, delete, profile, list, link, stats)
- Claude automatically loads your profile at session start
- Memories persist across sessions in SQLite
- Drop-in .mcp.json: just restart Claude Code

## Also works with any AI app

- Vercel AI SDK middleware (6 tools)
- OpenAI middleware (auto-inject memories)
- Framework-agnostic wrapper (Anthropic, Gemini, Ollama, anything)
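
The auto-inject idea is roughly: look up memories relevant to the latest user message and prepend them as a system message before calling the model. A minimal sketch of that pattern follows. `ChatMessage`, `SearchFn`, and `injectMemories` are hypothetical names for illustration, not the package's actual API, and the search is synchronous here only to keep the example short.

```typescript
// Hypothetical message shape shared by most chat-style APIs.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Assumed signature: takes a query and a limit, returns memory texts.
type SearchFn = (query: string, limit: number) => string[];

// Prepends a system message listing memories relevant to the latest user turn.
// Returns the input unchanged when there is no user message or no match.
function injectMemories(
  messages: ChatMessage[],
  search: SearchFn,
  limit = 5
): ChatMessage[] {
  const lastUser = [...messages].reverse().find((m) => m.role === "user");
  if (!lastUser) return messages;

  const memories = search(lastUser.content, limit);
  if (memories.length === 0) return messages;

  const block =
    "Relevant memories:\n" + memories.map((m) => `- ${m}`).join("\n");
  return [{ role: "system", content: block }, ...messages];
}
```

Because the function only touches a plain message array, the result can be handed to any provider's client, which is the point of a framework-agnostic wrapper.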

## Tech

- SQLite + FTS5 + better-sqlite3
- 5 embedding options (including local BM25, OpenAI, Ollama, and hybrid)
- Zero cloud dependencies
- 36 tests, all passing

It's a single-folder drop-in. No Docker, no cloud, and no API keys needed when you use the local embedding option.

GitHub: https://github.com/blockmandev/ai-memory

Would love feedback from the community. What features would you add?

Top comments (1)

Ned C • Feb 13

The hybrid search approach is smart. Most memory systems I've seen go all-in on either vector similarity or keyword search, and both have blind spots on their own. FTS5 for exact recall + vectors for semantic similarity covers the gaps nicely.

Curious about the deduplication logic. When two memories are semantically similar but phrased differently (like "use named exports" vs "avoid default exports"), does the merge preserve both framings or pick one? In my experience with .cursorrules files, the negative framing ("never do X") and positive framing ("always do Y") produce different model behavior even when they mean the same thing to a human.

The zero-cloud-dependency angle is a big deal too. Running memory through an external API defeats the purpose when you're trying to keep project context local.