Every AI agent memory system I've used (Mem0, Honcho, Hindsight) has the same problem: they accumulate forever. Old facts pollute retrieval. More tokens → worse results. Your agent gets slower and dumber over time.
So I built recall-sqlite — a memory system that actually forgets.
The core idea: tiered storage. Memories are automatically promoted or demoted based on how often they're accessed.
- Hot tier (~500): ANN + keywords + FTS5 — fast full retrieval
- Warm tier (~5K): Keywords + FTS5 only — 66-99% less compute
- Cold tier (unlimited): Zero compute, auto-promoted when relevant
Key design decisions:
- Zero LLM at query time (embedding model only, 150MB local)
- No vector database (just SQLite + sqlite-vec)
- Graceful degradation (keyword+FTS5 fallback when offline)
- Automatic schema migration (no manual steps)
- Single pip install, no API keys, no Docker
After 6 months of daily use with 1469 memories, latency stays at ~80ms and memory is fixed at ~1.5MB.
Just got PR'd into the Hermes Agent ecosystem.
pip install recall-sqlite
github.com/Jnocode/recall-memory
Top comments (0)