DEV Community

Ravikash Gupta

I built a free self-hosted memory system for Claude CLI — replaces Supermemory/Mem0/Zep

## What it does

  • Hybrid search: FTS5 full-text + vector similarity + importance + recency + access frequency
  • 4 memory types (static, dynamic, episodic, semantic)
  • 4 importance levels (critical → low)
  • Memory deduplication & auto-merge
  • Memory graph — link related memories together
  • Time-decay scoring (recent memories rank higher)
  • Auto semantic chunking for long texts
  • LLM-based fact extraction from conversations
  • Soft delete + restore
  • Export/import, bulk ops
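
To make the hybrid ranking and time-decay ideas above concrete, here is a minimal sketch of how the five signals could be blended into one score. It is not the project's actual formula: the weights, the 30-day half-life, the two middle importance names (`high`, `medium`), and every type and function name below are assumptions for illustration.

```typescript
// Hypothetical types and weights; the real project may score differently.
type Importance = "critical" | "high" | "medium" | "low";

interface Candidate {
  id: string;
  ftsScore: number;       // keyword relevance, assumed normalized to [0, 1]
  vectorScore: number;    // cosine similarity, assumed normalized to [0, 1]
  importance: Importance;
  lastAccessedMs: number; // epoch milliseconds
  accessCount: number;
}

const IMPORTANCE_WEIGHT: Record<Importance, number> = {
  critical: 1.0,
  high: 0.75,
  medium: 0.5,
  low: 0.25,
};

const HALF_LIFE_DAYS = 30; // assumed: a memory's recency score halves every 30 days
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Exponential time decay: 1.0 for a memory touched just now, 0.5 after one half-life.
function recencyScore(lastAccessedMs: number, nowMs: number): number {
  const ageDays = Math.max(0, nowMs - lastAccessedMs) / MS_PER_DAY;
  return Math.pow(0.5, ageDays / HALF_LIFE_DAYS);
}

// Log-scaled access frequency, capped at 1.0 once a memory reaches 100 accesses.
function frequencyScore(accessCount: number): number {
  return Math.min(1, Math.log1p(accessCount) / Math.log1p(100));
}

// Weighted sum of the five signals; the weights add up to 1.0.
function hybridScore(c: Candidate, nowMs: number): number {
  return (
    0.35 * c.ftsScore +
    0.35 * c.vectorScore +
    0.15 * IMPORTANCE_WEIGHT[c.importance] +
    0.10 * recencyScore(c.lastAccessedMs, nowMs) +
    0.05 * frequencyScore(c.accessCount)
  );
}

// Returns a new array sorted from highest to lowest score.
function rank(candidates: Candidate[], nowMs: number): Candidate[] {
  return [...candidates].sort(
    (a, b) => hybridScore(b, nowMs) - hybridScore(a, nowMs)
  );
}

const exampleNow = Date.now();
const ranked = rank(
  [
    { id: "a", ftsScore: 0.9, vectorScore: 0.2, importance: "low",
      lastAccessedMs: exampleNow - 90 * MS_PER_DAY, accessCount: 1 },
    { id: "b", ftsScore: 0.4, vectorScore: 0.6, importance: "critical",
      lastAccessedMs: exampleNow, accessCount: 20 },
  ],
  exampleNow
);
console.log(ranked.map((c) => c.id)); // "b" ranks ahead of "a"
```

With these weights a strong keyword hit on a stale, low-importance memory can still lose to a fresher, critical one, which is the behaviour the feature list describes.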

## Claude CLI integration

  • MCP server with 8 tools (search, add, update, delete, profile, list, link, stats)
  • Claude automatically loads your profile at session start
  • Memories persist across sessions in SQLite
  • Drop-in .mcp.json — just restart Claude Code
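
For readers who have not set up an MCP server before, the sketch below generates the kind of project-level `.mcp.json` that Claude Code reads. The `mcpServers` / `command` / `args` layout follows Claude Code's project configuration format, but the server key `ai-memory` and the entry path `./ai-memory/mcp-server.js` are placeholders I made up; check the repository for the real file.

```typescript
// Shape of a project-scoped .mcp.json entry as Claude Code expects it.
interface McpServerEntry {
  command: string;
  args: string[];
  env?: Record<string, string>;
}

interface McpConfig {
  mcpServers: Record<string, McpServerEntry>;
}

// Builds a config that launches the memory server with Node.
// The "ai-memory" key is a hypothetical server name.
function buildMcpConfig(serverEntryPath: string): McpConfig {
  return {
    mcpServers: {
      "ai-memory": {
        command: "node",
        args: [serverEntryPath],
      },
    },
  };
}

// Hypothetical path; replace it with the actual server script from the repo.
const mcpConfig = buildMcpConfig("./ai-memory/mcp-server.js");
const mcpJson = JSON.stringify(mcpConfig, null, 2);
console.log(mcpJson);
```

Saving the printed JSON as `.mcp.json` in the project root and restarting Claude Code should be enough for the tools to show up, assuming the path points at a working server.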

## Also works with any AI app

  • Vercel AI SDK middleware (6 tools)
  • OpenAI middleware (auto-inject memories)
  • Framework-agnostic wrapper (Anthropic, Gemini, Ollama, anything)
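
The "auto-inject memories" idea is independent of any one SDK, so here is a rough sketch of what a framework-agnostic wrapper might look like. All names (`ChatMessage`, `MemoryHit`, `MemorySearch`, `injectMemories`) are hypothetical and not taken from the project. The search callback is synchronous for brevity, as a better-sqlite3 lookup would be; a remote embedding call would make it async.

```typescript
// Hypothetical message and memory shapes, for illustration only.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface MemoryHit {
  content: string;
  score: number;
}

// Any search backend can be plugged in here.
type MemorySearch = (query: string, limit: number) => MemoryHit[];

// Looks up memories relevant to the latest user message and prepends them
// as a system message. Returns the input unchanged if nothing is found.
function injectMemories(
  messages: ChatMessage[],
  search: MemorySearch,
  limit: number = 5
): ChatMessage[] {
  const lastUser = [...messages].reverse().find((m) => m.role === "user");
  if (!lastUser) return messages;

  const hits = search(lastUser.content, limit);
  if (hits.length === 0) return messages;

  const bulletList = hits.map((h) => `- ${h.content}`).join("\n");
  const memoryMessage: ChatMessage = {
    role: "system",
    content: `Relevant memories:\n${bulletList}`,
  };
  return [memoryMessage, ...messages];
}
```

The returned array can be passed to whichever client you use (OpenAI, Anthropic, Gemini, Ollama), since it only relies on the common role/content message shape.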

## Tech

  • SQLite + FTS5 + better-sqlite3
  • 5 embedding options (including local BM25, OpenAI, Ollama, and hybrid)
  • Zero cloud dependencies
  • 36 tests, all passing

It's a single-folder drop-in. No Docker, no cloud, no API keys needed.

GitHub: https://github.com/blockmandev/ai-memory

Would love feedback from the community. What features would you add?

Top comments (1)

Ned C

The hybrid search approach is smart. Most memory systems I've seen go all-in on either vector similarity or keyword search, and both have blind spots on their own. FTS5 for exact recall + vectors for semantic similarity covers the gaps nicely.

Curious about the deduplication logic. When two memories are semantically similar but phrased differently (like "use named exports" vs "avoid default exports"), does the merge preserve both framings or pick one? In my experience with .cursorrules files, the negative framing ("never do X") and positive framing ("always do Y") produce different model behavior even when they mean the same thing to a human.

The zero-cloud-dependency angle is a big deal too. Running memory through an external API defeats the purpose when you're trying to keep project context local.