A factual comparison of the five most-referenced AI agent memory systems on architecture, LoCoMo benchmark scores, and EU AI Act compliance.
Why This Post Exists
Every comparison post I've read either reads like marketing for one system or compares features without benchmark data. This one is different: I'm the author of one of the systems (SuperLocalMemory), which gives me a strong incentive to be honest. If I misrepresent the others, I undermine my own credibility.
All scores are from published papers or official documentation. I've noted where scores vary across sources.
The Five Systems
| System | Architecture | Creator | License | Status |
|---|---|---|---|---|
| Mem0 | Cloud-hosted | Mem0 AI ($24M funded) | Open core | Production |
| Zep | Cloud-hosted + self-host | Getzep | Apache 2.0 + Commercial | Production |
| Letta (MemGPT) | Agent framework + LLM memory | Letta AI | Apache 2.0 | Production |
| Supermemory | Cloud-hosted | Open source project | MIT | Production |
| SuperLocalMemory | Local-first mathematical | Independent research | MIT | Production |
LoCoMo Benchmark Results
The LoCoMo benchmark (Long Conversation Memory) is the most widely cited evaluation for this space — 81 question-answer pairs across long multi-session conversations.
| System | Score | Cloud LLM Required | Open Source |
|---|---|---|---|
| EverMemOS | 92.3% | Yes | No |
| MemMachine | 91.7% | Yes | No |
| Hindsight | 89.6% | Yes | No |
| SLM (SuperLocalMemory) V3 Mode C | 87.7% | Yes (synthesis) | Yes (MIT) |
| Zep | ~85% | Yes | Partial |
| Letta / MemGPT | ~83.2% | Yes | Yes (Apache) |
| SLM V3 Mode A | 74.8% | No | Yes (MIT) |
| Supermemory | ~70%* | Yes | Yes (MIT) |
| Mem0 (self-reported) | ~66% | Yes | Partial |
| SLM V3 Zero-LLM | 60.4% | No LLM at all | Yes (MIT) |
| Mem0 (independent) | ~58% | Yes | Partial |
*Supermemory score estimated from limited published data.
Key takeaway: every system that requires a cloud LLM clusters between 83% and 92%. SuperLocalMemory Mode A achieves 74.8% with zero cloud dependency, demonstrating that mathematical retrieval captures most of the benchmark value without cloud compute. Mode C reaches 87.7%, competitive with the top tier.
Architecture Comparison
Mem0
- Model: Cloud-first, API-based. Memories stored on Mem0's servers.
- Retrieval: Vector similarity over cloud embeddings (typically OpenAI).
- Best for: Teams needing shared memory, managed infrastructure, cross-device access.
- Limitation: Data sovereignty, offline use, and EU AI Act compliance all require additional work.
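The retrieval pattern described above (vector similarity over embeddings) can be sketched in a few lines. This is a toy illustration of the general technique, not Mem0's actual API; the embeddings are hand-written vectors, whereas a real system would call an embedding model.

```python
# Hedged sketch of vector-similarity retrieval: embed the query,
# rank stored memories by cosine similarity, return the best match.
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy memory store: text -> pretend embedding
memories = {
    "user prefers dark mode": [0.9, 0.1, 0.0],
    "meeting moved to Friday": [0.1, 0.8, 0.3],
}

# Pretend embedding of "what theme does the user like?"
query_vec = [0.85, 0.15, 0.05]

best = max(memories, key=lambda text: cosine(query_vec, memories[text]))
print(best)  # "user prefers dark mode"
```

The same shape underlies all the cloud-first systems in this table; the differences lie in where the embeddings are computed and stored.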
Zep
- Model: Temporal knowledge graph hosted in cloud (or self-hosted Community Edition).
- Retrieval: Graph-based temporal reasoning + semantic similarity.
- Best for: Complex agent workflows requiring temporal entity relationships.
- Limitation: Self-hosting requires infrastructure management; cloud version has same data locality issues as Mem0.
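The temporal-knowledge-graph idea can be made concrete with a tiny sketch: edges carry validity intervals, so a query can ask what was true "as of" a given time. The schema and function names here are illustrative assumptions, not Zep's actual data model.

```python
# Toy temporal knowledge graph: facts are edges with validity intervals,
# so retrieval can reason about when a relationship held.
from datetime import date

# (subject, relation, object, valid_from, valid_to); None = still valid
edges = [
    ("alice", "works_at", "AcmeCo",  date(2022, 1, 1), date(2024, 6, 1)),
    ("alice", "works_at", "BetaInc", date(2024, 6, 2), None),
]

def facts_as_of(subject, relation, when):
    """Return objects whose validity interval contains `when`."""
    return [
        obj for s, r, obj, start, end in edges
        if s == subject and r == relation
        and start <= when and (end is None or when <= end)
    ]

print(facts_as_of("alice", "works_at", date(2023, 3, 1)))  # ['AcmeCo']
print(facts_as_of("alice", "works_at", date(2025, 1, 1)))  # ['BetaInc']
```

This is what "temporal reasoning" buys over plain vector search: a flat similarity index would happily return both employers with no notion of which fact is current.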
Letta (MemGPT)
- Model: OS-inspired agent framework. LLM manages memory tiers (core context, recall, archival).
- Retrieval: LLM-driven — the model decides what to retrieve and when.
- Best for: Building agents where memory management logic needs to be customizable by the LLM.
- Limitation: Requires LLM for all memory operations. Memory decisions inherit LLM opacity.
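The OS-inspired tiering idea can be sketched without any LLM in the loop: a bounded "core" context, with overflow paged out to an archival store and searched on a miss. Class and tier names are assumptions for illustration; MemGPT's real design has the LLM itself issue the paging and recall operations.

```python
# Minimal sketch of MemGPT-style memory tiers: bounded core context,
# eviction to archival storage, recall falls back to archival on a miss.
class TieredMemory:
    def __init__(self, core_size=3):
        self.core_size = core_size
        self.core = []      # in-context working memory (bounded)
        self.archival = []  # long-term store, searched on core miss

    def remember(self, item):
        self.core.append(item)
        while len(self.core) > self.core_size:
            # Page the oldest item out of context, like OS memory eviction.
            self.archival.append(self.core.pop(0))

    def recall(self, keyword):
        # Check the core context first, then fall back to archival search.
        hits = [m for m in self.core if keyword in m]
        return hits or [m for m in self.archival if keyword in m]

mem = TieredMemory(core_size=2)
for note in ["likes tea", "born in May", "works remotely"]:
    mem.remember(note)

print(mem.recall("tea"))  # ['likes tea'] -- evicted to archival, still found
```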
Supermemory
- Model: Cloud-hosted with importable sources (tweets, web pages, documents).
- Retrieval: Vector similarity + semantic search.
- Best for: Personal knowledge management with multi-source ingestion.
- Limitation: Cloud dependency; primarily designed for personal knowledge, not agent memory.
SuperLocalMemory V3
- Model: Local-first with three mathematical retrieval layers.
- Retrieval: 4-channel RRF fusion: Fisher-Rao geometric + BM25 lexical + entity graph + temporal.
- Best for: Privacy-required workloads, EU AI Act compliance, individual developer memory, zero-cloud operation.
- Limitation: Single-device by default; no native team sharing.
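The 4-channel RRF fusion mentioned above can be sketched generically. Reciprocal Rank Fusion scores each document by summing 1/(k + rank) across the ranked lists that contain it. The channel names and the constant k=60 are assumptions for illustration, not SuperLocalMemory's actual parameters.

```python
# Illustrative sketch of Reciprocal Rank Fusion (RRF) over multiple
# retrieval channels: score(d) = sum over channels of 1 / (k + rank(d)).
def rrf_fuse(channel_rankings, k=60):
    """Fuse ranked lists of doc IDs (best first) into one ranking."""
    scores = {}
    for ranking in channel_rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Four hypothetical channels returning memory IDs, best first:
geometric = ["m3", "m1", "m7"]   # geometric similarity channel
lexical   = ["m1", "m3", "m9"]   # BM25 lexical channel
entities  = ["m7", "m1"]         # entity-graph channel
temporal  = ["m1", "m2"]         # temporal/recency channel

fused = rrf_fuse([geometric, lexical, entities, temporal])
print(fused[0])  # "m1" wins: it ranks highly in all four channels
```

RRF's appeal here is that it needs only rank positions, not comparable scores, so heterogeneous channels (geometric distance, BM25, graph hits, recency) can be fused without calibrating their score scales against each other.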
EU AI Act Compliance (Takes Effect August 2, 2026)
This dimension is increasingly important for enterprise deployment in the EU.
| System | Mode A Compliance | Notes |
|---|---|---|
| SuperLocalMemory Mode A | ✅ By architecture | Data never leaves device. Zero cloud calls. |
| All others | ❌ Requires work | DPA required. Data sent to cloud providers. |
SuperLocalMemory is the only system in this table that claims compliance-by-architecture. The others can achieve compliance, but only through additional legal and technical measures.
The Right Tool for the Job
None of these systems is "best." The right choice depends on your requirements:
Need team memory? → Mem0 or Zep. Both are designed for shared memory.
Need LLM to manage memory logic? → Letta. It's designed for LLM-driven memory management.
Need data sovereignty or EU AI Act compliance? → SuperLocalMemory Mode A. Only local-first provides this by architecture.
Need the highest benchmark score? → None of the open systems. EverMemOS/MemMachine/Hindsight score higher, but aren't open source.
Need open source + high score? → SuperLocalMemory Mode C (87.7%) or Letta (~83.2%).
Need zero cloud costs forever? → SuperLocalMemory Mode A. No API costs, no subscription.
My System (Full Disclosure)
I'm the author of SuperLocalMemory V3. I've tried to be factually accurate about all five systems. If I've gotten something wrong, open an issue on the repo or comment below.
Paper: arXiv:2603.14588
Code: github.com/qualixar/superlocalmemory
Website: superlocalmemory.com
Varun Pratap Bhardwaj — Independent Researcher
A Qualixar Research Initiative