Every LLM app has the same problem: the model forgets everything between
conversations. Cloud solutions like Mem0 exist, but they send your data
to their servers. I built mnemo to solve this locally.
What it does
mnemo runs as a sidecar process next to your app. You POST text to it.
It then extracts named entities and relationships using a local LLM
(served by Ollama), builds a persistent knowledge graph, and injects
relevant context back into your prompts automatically.
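To make that loop concrete, here is a minimal sketch of ingest-then-retrieve in plain Python. It is not mnemo's code: a small regex stands in for the LLM extraction step, and `ToyMemory`, `PATTERN`, and `context_for` are hypothetical names made up for illustration.

```python
import re
from collections import defaultdict

# Hypothetical stand-in for the LLM extraction step: it only understands
# sentences shaped like "<subject> <relation> <object>" for three relations.
PATTERN = re.compile(
    r"^(?P<subj>.+?) (?P<rel>uses|builds|depends on) (?P<obj>.+?)\.?$"
)


class ToyMemory:
    """Illustrative in-process memory; not mnemo's actual implementation."""

    def __init__(self):
        # Lowercased entity name -> list of (subject, relation, object) facts.
        self.facts_by_entity = defaultdict(list)

    def ingest(self, text):
        """Extract one triple from the text and index it under both entities."""
        match = PATTERN.match(text.strip())
        if match is None:
            return None
        triple = (match["subj"], match["rel"], match["obj"])
        for entity in (triple[0], triple[2]):
            self.facts_by_entity[entity.lower()].append(triple)
        return triple

    def context_for(self, query):
        """Return the facts whose entities are mentioned in the query."""
        lowered = query.lower()
        seen, lines = set(), []
        for entity, triples in self.facts_by_entity.items():
            if entity in lowered:  # naive substring match, toy only
                for triple in triples:
                    if triple not in seen:
                        seen.add(triple)
                        lines.append(" ".join(triple))
        return lines


memory = ToyMemory()
first = memory.ingest("vecdb uses Rust")
memory.ingest("vecdb depends on petgraph.")
context = memory.context_for("What does vecdb need?")
```

Here `context` holds the two facts that mention `vecdb`. In the real system the extraction is done by the LLM and retrieval goes through the knowledge graph rather than a substring match.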
The stack
- Rust: core engine, four crates (mnemo-core, mnemo-api, mnemo-cli, mnemo-bench)
- SQLite + WAL mode: persistent storage, survives restarts
- petgraph: in-memory knowledge graph with BFS traversal
- Axum: REST API sidecar any app can call
- Ollama: fully local LLM runtime, zero API costs
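The storage and traversal pieces can be sketched with the Python standard library. This is an analogy, not mnemo's Rust code: the `edges` table, the function names, and the choice to treat edges as undirected are all assumptions made for the example, with a dict-of-lists playing the role petgraph plays in the real engine.

```python
import os
import sqlite3
import tempfile
from collections import defaultdict, deque


def open_store(path):
    """Open (or create) an on-disk triple store with write-ahead logging on."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS edges (src TEXT, rel TEXT, dst TEXT)"
    )
    return conn


def load_graph(conn):
    """Rebuild the in-memory adjacency list from the persisted edges."""
    graph = defaultdict(list)
    rows = conn.execute("SELECT src, rel, dst FROM edges ORDER BY rowid")
    for src, rel, dst in rows:
        graph[src].append((rel, dst))
        graph[dst].append((rel, src))  # assumption: edges are undirected
    return graph


def bfs_neighbourhood(graph, start, max_hops):
    """Return (entity, hops) pairs reachable within max_hops, nearest first."""
    seen = {start}
    order = []
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand past the hop limit
        for _rel, neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                order.append((neighbour, depth + 1))
                queue.append((neighbour, depth + 1))
    return order


db_path = os.path.join(tempfile.mkdtemp(), "memory.db")
conn = open_store(db_path)
conn.executemany(
    "INSERT INTO edges VALUES (?, ?, ?)",
    [
        ("user", "builds", "vecdb"),
        ("vecdb", "written_in", "Rust"),
        ("Rust", "has_crate", "petgraph"),
    ],
)
conn.commit()
conn.close()

# Simulate a restart: reopen the file and rebuild the graph from disk.
conn = open_store(db_path)
graph = load_graph(conn)
nearby = bfs_neighbourhood(graph, "user", max_hops=2)
conn.close()
```

With a two-hop limit, `nearby` contains `vecdb` at one hop and `Rust` at two hops, while `petgraph` at three hops is left out. Capping the hop count is one simple way to keep retrieved context small; how mnemo actually bounds and scores its traversal is not covered here.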
Fully free by default
docker compose up -d
docker exec mnemo-ollama ollama pull llama3
curl http://localhost:8080/health
Works with OpenAI or Anthropic too if you bring your own key.
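If you script the setup, the containers may need a moment before the health check passes. Below is a small polling sketch using only the standard library. It assumes the `/health` endpoint answers with HTTP 200 once the sidecar is ready, which the source does not spell out; `http_ok` and `wait_until_healthy` are hypothetical helper names.

```python
import time
import urllib.error
import urllib.request

HEALTH_URL = "http://localhost:8080/health"  # same URL as the curl check


def http_ok(url, timeout=2.0):
    """Return True if a GET to `url` answers with HTTP 200 (assumed 'ready')."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status: not ready yet.
        return False


def wait_until_healthy(url=HEALTH_URL, attempts=30, delay=1.0,
                       probe=http_ok, sleep=time.sleep):
    """Poll until the probe succeeds; return the attempt number, or None."""
    for attempt in range(1, attempts + 1):
        if probe(url):
            return attempt
        if attempt < attempts:
            sleep(delay)
    return None
```

Calling `wait_until_healthy()` with no arguments polls the URL above about once a second for up to 30 attempts. The `probe` and `sleep` parameters exist so the loop can be exercised without a running server.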
Python SDK
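from mnemo import MnemoClient
client = MnemoClient()
# Store a statement; mnemo extracts its entities and relationships.
client.ingest("I'm building a Rust vector database called vecdb")
# Ask for the stored context relevant to a question.
print(client.get_context("what am I working on?"))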
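If you assemble prompts yourself, the returned context can be prepended to the user's question. The sketch below assumes `get_context` returns a plain string, which the `print` call above suggests but does not guarantee. `build_prompt` and `FakeClient` are hypothetical names; the fake client only exists so the example runs without a mnemo server.

```python
def build_prompt(client, question):
    """Prepend whatever context the memory returns to the user's question."""
    context = client.get_context(question)
    if not context:
        return question
    return f"Context from memory:\n{context}\n\nQuestion: {question}"


class FakeClient:
    """Hypothetical stand-in exposing the two SDK methods shown above."""

    def __init__(self):
        self.notes = []

    def ingest(self, text):
        self.notes.append(text)

    def get_context(self, query):
        # Assumption: context comes back as one string. This fake ignores
        # the query and simply returns every stored note.
        return "\n".join(self.notes)


client = FakeClient()
client.ingest("I'm building a Rust vector database called vecdb")
prompt = build_prompt(client, "what am I working on?")
```

Swapping `FakeClient()` for `MnemoClient()` should work as long as the real method returns text. If it returns a structured object instead, format it into a string before building the prompt.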
Numbers
- 122 Rust tests, 21 Python SDK tests
- Sub-millisecond entity lookup
- ~4 ms for the full retrieval pipeline (debug build)
Links
GitHub: https://github.com/zaydmulani09/mnemo
Would love feedback, especially on the retrieval scoring and graph
traversal approach.