DEV Community

Zayd Mulani

I built a local-first AI memory layer for LLMs in Rust (no cloud, no API keys)

Every LLM app has the same problem: the model forgets everything between
conversations. Cloud solutions like Mem0 exist, but they send your data
to their servers. I built mnemo to solve this locally.

What it does

mnemo runs as a sidecar process next to your app. You POST text to it,
and it extracts named entities and relationships using a local LLM
(served by Ollama), builds a persistent knowledge graph, and injects
relevant context back into your prompts automatically.
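
To make the extraction step concrete, here is a minimal sketch of turning a model's response into typed relationships. It assumes the LLM is asked to answer with a JSON array of subject/relation/object records; that format, the `Triple` type, and `parse_triples` are illustrative names and not mnemo's actual schema or API.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One extracted relationship: subject -[relation]-> obj."""
    subject: str
    relation: str
    obj: str


def parse_triples(llm_output: str) -> list[Triple]:
    """Parse a JSON array of {subject, relation, object} records.

    Malformed items are skipped instead of raising, because local models
    do not always follow the requested output format exactly.
    """
    try:
        items = json.loads(llm_output)
    except json.JSONDecodeError:
        return []
    if not isinstance(items, list):
        return []

    triples = []
    for item in items:
        if not isinstance(item, dict):
            continue
        fields = [item.get("subject"), item.get("relation"), item.get("object")]
        # Keep only records where all three fields are non-empty strings.
        if all(isinstance(f, str) and f.strip() for f in fields):
            triples.append(Triple(*(f.strip() for f in fields)))
    return triples


# Hypothetical model response for the sentence
# "I'm building a Rust vector database called vecdb".
# The third record is deliberately malformed and gets dropped.
sample_output = """[
  {"subject": "user", "relation": "builds", "object": "vecdb"},
  {"subject": "vecdb", "relation": "written_in", "object": "Rust"},
  {"subject": "vecdb", "relation": 42}
]"""

triples = parse_triples(sample_output)
for t in triples:
    print(f"{t.subject} -[{t.relation}]-> {t.obj}")
```

Skipping bad records rather than failing the whole ingest is one possible design choice; a stricter pipeline might retry the extraction instead.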

The stack

  • Rust — core engine, 4 crates (mnemo-core, mnemo-api, mnemo-cli, mnemo-bench)
  • SQLite + WAL mode — persistent storage, survives restarts
  • petgraph — in-memory knowledge graph with BFS traversal
  • Axum — REST API sidecar any app can call
  • Ollama — fully local LLM, zero API costs
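
The storage and graph pieces of the stack can be illustrated with a small sketch: edges are persisted in SQLite with WAL mode enabled, reloaded into an in-memory adjacency map, and explored with a depth-limited BFS. This uses Python's standard library purely for illustration; the table layout, function names, and the choice to treat edges as undirected are assumptions, not how mnemo or petgraph are actually wired together.

```python
import os
import sqlite3
import tempfile
from collections import deque


def open_store(path: str) -> sqlite3.Connection:
    """Open (or create) the edge store and switch it to WAL journaling."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS edges ("
        "src TEXT NOT NULL, relation TEXT NOT NULL, dst TEXT NOT NULL, "
        "UNIQUE(src, relation, dst))"
    )
    return conn


def add_edge(conn: sqlite3.Connection, src: str, relation: str, dst: str) -> None:
    """Insert one edge; duplicates are ignored via the UNIQUE constraint."""
    with conn:  # commits on success, rolls back on error
        conn.execute("INSERT OR IGNORE INTO edges VALUES (?, ?, ?)", (src, relation, dst))


def load_graph(conn: sqlite3.Connection) -> dict[str, set[str]]:
    """Rebuild an undirected adjacency map from the persisted edges."""
    graph: dict[str, set[str]] = {}
    for src, _relation, dst in conn.execute("SELECT src, relation, dst FROM edges"):
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set()).add(src)
    return graph


def bfs_neighborhood(graph: dict[str, set[str]], start: str, max_depth: int) -> dict[str, int]:
    """Return every node within max_depth hops of start, with its hop count."""
    if start not in graph:
        return {}
    depth = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depth[node] == max_depth:
            continue  # do not expand past the depth limit
        for nxt in sorted(graph[node]):
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                queue.append(nxt)
    return depth


with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "memory.db")

    conn = open_store(db_path)
    add_edge(conn, "user", "builds", "vecdb")
    add_edge(conn, "vecdb", "written_in", "Rust")
    add_edge(conn, "Rust", "has_crate", "petgraph")
    conn.close()

    # Reopen to mimic a restart: the graph is rebuilt from disk.
    conn = open_store(db_path)
    graph = load_graph(conn)
    nearby = bfs_neighborhood(graph, "user", max_depth=2)
    conn.close()

print(nearby)
```

With a depth limit of 2, the traversal from `user` reaches `vecdb` and `Rust` but stops before the node three hops away, which is one simple way to keep retrieved context focused.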

Fully free by default

docker compose up -d
docker exec mnemo-ollama ollama pull llama3
curl http://localhost:8080/health

It also works with OpenAI or Anthropic if you bring your own API key.
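
Containers can take a moment to come up, so an app that depends on the sidecar may want to poll the health endpoint before sending requests. The sketch below does that with Python's standard library. Only the `/health` URL comes from the commands above; the helper name, the timeouts, and the assumption that a healthy service answers with HTTP 200 are mine.

```python
import http.client
import time
import urllib.request


def wait_for_health(url: str, timeout_s: float = 30.0, interval_s: float = 0.5) -> bool:
    """Poll url until it answers with HTTP 200 or timeout_s elapses.

    Returns True once the service responds with 200, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with urllib.request.urlopen(url, timeout=2.0) as resp:
                if resp.status == 200:
                    return True
        except (OSError, http.client.HTTPException):
            # Connection refused, timeouts, and HTTP error statuses all land
            # here (urllib.error.URLError is a subclass of OSError).
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Calling `wait_for_health("http://localhost:8080/health")` right after `docker compose up -d` would block until the sidecar is reachable or the timeout expires.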

Python SDK

from mnemo import MnemoClient

client = MnemoClient()
client.ingest("I'm building a Rust vector database called vecdb")
print(client.get_context("what am I working on?"))
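
One way to use the retrieved context is to prepend it to the question before calling a model. The sketch below assumes `get_context` returns a plain string, which the snippet above suggests but does not confirm. `build_prompt`, the prompt wording, and the `FakeMemory` stand-in (used so the example runs without a mnemo server) are all hypothetical.

```python
from typing import Protocol


class ContextSource(Protocol):
    """Anything with a get_context(query) -> str method, e.g. an SDK client."""

    def get_context(self, query: str) -> str: ...


def build_prompt(memory: ContextSource, question: str, max_chars: int = 2000) -> str:
    """Prepend remembered context to the question, capped at max_chars."""
    context = memory.get_context(question).strip()
    if not context:
        return question  # nothing remembered: send the question unchanged
    if len(context) > max_chars:
        context = context[:max_chars].rstrip() + " [truncated]"
    return (
        "Use the following remembered facts if they are relevant.\n"
        f"--- memory ---\n{context}\n--- end memory ---\n\n"
        f"Question: {question}"
    )


class FakeMemory:
    """Stand-in for a real client so the example is self-contained."""

    def __init__(self, facts: list[str]) -> None:
        self.facts = facts

    def get_context(self, query: str) -> str:
        return "\n".join(self.facts)


memory = FakeMemory(["user builds vecdb", "vecdb is written in Rust"])
prompt = build_prompt(memory, "what am I working on?")
print(prompt)
```

Capping the context length is a simple guard against crowding out the question in small context windows; a real integration might rank or summarize instead.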

Numbers

  • 122 Rust tests, 21 Python SDK tests
  • Sub-millisecond entity lookup
  • ~4 ms for the full retrieval pipeline (measured on a debug build)

Links

GitHub: https://github.com/zaydmulani09/mnemo

Would love feedback, especially on the retrieval scoring and graph
traversal approach.
