
Varun Pratap Bhardwaj


5 AI Agent Memory Systems Compared: Mem0, Zep, Letta, Supermemory, SuperLocalMemory (2026 Benchmark Data)

A factual comparison of the five most-referenced AI agent memory systems on architecture, LoCoMo benchmark scores, and EU AI Act compliance.

Why This Post Exists

Every comparison post I've read either reads like marketing for one system, or compares feature lists without benchmark data. This one is different: I'm the author of one of the systems (SuperLocalMemory), which gives me a strong incentive to be accurate — I can't misrepresent the others without undermining my own credibility.

All scores are from published papers or official documentation. I've noted where scores vary across sources.


The Five Systems

| System | Architecture | Creator | License | Status |
|---|---|---|---|---|
| Mem0 | Cloud-hosted | Mem0 AI ($24M funded) | Open core | Production |
| Zep | Cloud-hosted + self-host | Getzep | Apache 2.0 + Commercial | Production |
| Letta (MemGPT) | Agent framework + LLM memory | Letta AI | Apache 2.0 | Production |
| Supermemory | Cloud-hosted | Open source project | MIT | Production |
| SuperLocalMemory | Local-first mathematical | Independent research | MIT | Production |

LoCoMo Benchmark Results

The LoCoMo benchmark (Long Conversation Memory) is the most widely cited evaluation for this space — 81 question-answer pairs across long multi-session conversations.

| System | Score | Cloud LLM Required | Open Source |
|---|---|---|---|
| EverMemOS | 92.3% | Yes | No |
| MemMachine | 91.7% | Yes | No |
| Hindsight | 89.6% | Yes | No |
| SuperLocalMemory (SLM) V3, Mode C | 87.7% | Yes (synthesis) | Yes (MIT) |
| Zep | ~85% | Yes | Partial |
| Letta / MemGPT | ~83.2% | Yes | Yes (Apache) |
| SLM V3, Mode A | 74.8% | No | Yes (MIT) |
| Supermemory | ~70%* | Yes | Yes (MIT) |
| Mem0 (self-reported) | ~66% | Yes | Partial |
| SLM V3, Zero-LLM | 60.4% | No (no LLM at all) | Yes (MIT) |
| Mem0 (independent) | ~58% | Yes | Partial |

*Supermemory score estimated from limited published data.

Key takeaway: Every system that requires cloud LLMs clusters between 83% and 92%. SuperLocalMemory Mode A achieves 74.8% with zero cloud dependency — demonstrating that mathematical retrieval captures most of the benchmark value without cloud compute. Mode C reaches 87.7%, competitive with the top tier.


Architecture Comparison

Mem0

  • Model: Cloud-first, API-based. Memories stored on Mem0's servers.
  • Retrieval: Vector similarity over cloud embeddings (typically OpenAI).
  • Best for: Teams needing shared memory, managed infrastructure, cross-device access.
  • Limitation: Data sovereignty, offline use, and EU AI Act compliance all require additional work.
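Vector-similarity retrieval of the kind Mem0 uses can be sketched in a few lines. This is a generic illustration, not Mem0's actual API — the function names and toy 2-D embeddings are mine; a real deployment would use high-dimensional cloud embeddings and an approximate-nearest-neighbor index.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve(query_vec, memories, k=2):
    """Return the texts of the k memories most similar to the query.

    memories: list of (text, embedding) pairs.
    """
    ranked = sorted(memories, key=lambda m: cosine(query_vec, m[1]), reverse=True)
    return [text for text, _ in ranked[:k]]
```

The same pattern underlies most embedding-based memory systems; the differences lie in how embeddings are produced and where the index lives (cloud vs. local).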

Zep

  • Model: Temporal knowledge graph hosted in cloud (or self-hosted Community Edition).
  • Retrieval: Graph-based temporal reasoning + semantic similarity.
  • Best for: Complex agent workflows requiring temporal entity relationships.
  • Limitation: Self-hosting requires infrastructure management; cloud version has same data locality issues as Mem0.

Letta (MemGPT)

  • Model: OS-inspired agent framework. LLM manages memory tiers (core context, recall, archival).
  • Retrieval: LLM-driven — the model decides what to retrieve and when.
  • Best for: Building agents where memory management logic needs to be customizable by the LLM.
  • Limitation: Requires LLM for all memory operations. Memory decisions inherit LLM opacity.
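The OS-inspired tiering can be illustrated with a toy paging sketch: a small "core" that always fits in the LLM context window, with overflow evicted to an archival store that is searched on demand. Class and method names here are my own illustration, not Letta's actual API — and in Letta the eviction and retrieval decisions are made by the LLM itself, not by a fixed rule like this one.

```python
class TieredMemory:
    """Toy sketch of MemGPT-style memory tiers with rule-based paging."""

    def __init__(self, core_limit=3):
        self.core = []        # always inside the LLM context window
        self.archival = []    # out-of-context, searched on demand
        self.core_limit = core_limit

    def add(self, item):
        self.core.append(item)
        while len(self.core) > self.core_limit:
            # Evict the oldest core item to archival storage.
            self.archival.append(self.core.pop(0))

    def search_archival(self, keyword):
        """Naive keyword search over evicted memories."""
        return [m for m in self.archival if keyword in m]
```

The point of the real system is precisely that the paging policy is not hard-coded: the model issues memory-management calls itself, which is what makes the behavior flexible but also opaque.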

Supermemory

  • Model: Cloud-hosted with importable sources (tweets, web pages, documents).
  • Retrieval: Vector similarity + semantic search.
  • Best for: Personal knowledge management with multi-source ingestion.
  • Limitation: Cloud dependency; primarily designed for personal knowledge, not agent memory.

SuperLocalMemory V3

  • Model: Local-first with three mathematical retrieval layers.
  • Retrieval: 4-channel RRF fusion: Fisher-Rao geometric + BM25 lexical + entity graph + temporal.
  • Best for: Privacy-required workloads, EU AI Act compliance, individual developer memory, zero-cloud operation.
  • Limitation: Single-device by default; no native team sharing.
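Reciprocal Rank Fusion (RRF), the standard technique for combining ranked lists from multiple retrieval channels, is simple enough to sketch. This is the generic textbook formula (score = Σ 1/(k + rank) across channels, k conventionally 60), not SuperLocalMemory's exact implementation; the per-channel rankings (geometric, lexical, graph, temporal) would come from the respective retrievers.

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion over several ranked lists of document ids.

    Each document's fused score is the sum over channels of 1 / (k + rank),
    where rank is its 1-based position in that channel's ranking.
    Documents absent from a channel simply contribute nothing there.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

Because RRF only consumes ranks, not raw scores, it needs no score normalization across channels — which is why it works well for fusing heterogeneous signals like BM25 and geometric similarity.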

EU AI Act Compliance (Takes Effect August 2, 2026)

This dimension is increasingly important for enterprise deployment in the EU.

| System | Compliance | Notes |
|---|---|---|
| SuperLocalMemory Mode A | ✅ By architecture | Data never leaves the device. Zero cloud calls. |
| All others | ❌ Requires work | DPA required. Data sent to cloud providers. |

SuperLocalMemory is the only system in this table that claims compliance-by-architecture. All others can achieve compliance through additional legal and technical measures, but require active work.


The Right Tool for the Job

None of these systems is "best." The right choice depends on your requirements:

Need team memory? → Mem0 or Zep. Both are designed for shared memory.

Need LLM to manage memory logic? → Letta. It's designed for LLM-driven memory management.

Need data sovereignty or EU AI Act compliance? → SuperLocalMemory Mode A. Only local-first provides this by architecture.

Need the highest benchmark score? → None of the open systems. EverMemOS/MemMachine/Hindsight score higher, but aren't open source.

Need open source + high score? → SuperLocalMemory Mode C (87.7%) or Letta (~83.2%).

Need zero cloud costs forever? → SuperLocalMemory Mode A. No API costs, no subscription.


My System (Full Disclosure)

I'm the author of SuperLocalMemory V3. I've tried to be factually accurate about all five systems. If I've gotten something wrong, open an issue on the repo or comment below.

Paper: arXiv:2603.14588
Code: github.com/qualixar/superlocalmemory
Website: superlocalmemory.com


Varun Pratap Bhardwaj — Independent Researcher
A Qualixar Research Initiative

