Hermes has a lot of memory options. If you're new, the choices can be overwhelming — built-in memory, 8 external providers, different costs, different architectures. This guide breaks it all down so you can make the right call for your setup.
First: Built-In Memory (Always Active)
Before we talk providers, understand that built-in memory is always on. It doesn't cost anything, requires no setup, and works out of the box.
Two files in ~/.hermes/memories/:
| File | Purpose | Char Limit |
|---|---|---|
| MEMORY.md | Agent's notes — environment facts, project conventions, lessons learned | 2,200 chars (~800 tokens) |
| USER.md | User profile — your name, preferences, communication style | 1,375 chars (~500 tokens) |
Both are injected into the system prompt at the start of every session. The agent manages them automatically — it saves preferences you correct, environment facts it discovers, and conventions it learns.
Key details:
- Entries are separated by § delimiters
- The header shows usage % (e.g. MEMORY [67% — 1,474/2,200 chars])
- Above 80% capacity, the agent should consolidate before adding
- Duplicate entries are auto-rejected
- Entries are scanned for injection/exfiltration patterns for security
- Changes persist to disk immediately but appear in the system prompt at the next session (frozen snapshot — preserves LLM prefix cache)
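To make the rules above concrete, here is a minimal sketch of how entry parsing, the usage figure, duplicate rejection, and the 80% consolidation threshold could fit together. This is not Hermes's actual code: the function names and the exact on-disk layout around the § delimiter are assumptions made for illustration.

```python
DELIMITER = "§"          # entry separator named in this guide
CONSOLIDATE_AT = 0.80    # consolidation threshold named in this guide


def parse_entries(text: str) -> list[str]:
    """Split a memory file's text into trimmed, non-empty entries."""
    return [part.strip() for part in text.split(DELIMITER) if part.strip()]


def usage(text: str, limit: int) -> tuple[int, int]:
    """Return (chars_used, percent_of_limit), the numbers shown in the header."""
    used = len(text)
    return used, round(100 * used / limit)


def try_add(text: str, entry: str, limit: int) -> tuple[str, str]:
    """Return (new_text, status) for a proposed entry (hypothetical helper)."""
    entry = entry.strip()
    if entry in parse_entries(text):
        return text, "duplicate"          # exact duplicates are rejected
    # Assumed layout: entries joined by a delimiter on its own line.
    candidate = f"{text}\n{DELIMITER}\n{entry}" if text else entry
    if len(candidate) > limit:
        return text, "over_limit"         # never exceed the char limit
    if len(candidate) / limit > CONSOLIDATE_AT:
        return candidate, "consolidate_soon"
    return candidate, "added"
```

For example, `usage("x" * 1474, 2200)` gives `(1474, 67)`, which matches the 67% header shown above.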
For most new users, built-in memory is enough. It handles preferences, project facts, and daily workflow notes. You don't need an external provider for a personal assistant setup.
But you'll want one when:
- You have multiple Hermes profiles that should share knowledge
- You want the agent to learn and synthesize across sessions automatically
- You're running long conversations that exceed context limits
- You need structured knowledge retrieval (entities, relationships, not just text blobs)
The 8 External Memory Providers
All external providers are set up and managed through the same commands:
hermes memory setup # interactive picker
hermes memory status # check what's active
hermes memory off # disable
Or set manually in ~/.hermes/config.yaml:
memory:
  provider: hindsight # or any of the 8
Important: Only one external provider can be active at a time. All of them layer on top of built-in memory — they don't replace it.
Quick Comparison
| Provider | Storage | Cost | Unique Angle | Best For |
|---|---|---|---|---|
| Hindsight | Local/Cloud | Free (local) | Knowledge graph + reflect synthesis | Highest accuracy, privacy |
| Holographic | Local SQLite | Free | HRR algebra + trust scoring, zero deps | Air-gapped, zero-install |
| OpenViking | Self-hosted | Free (AGPL) | Tiered L0/L1/L2 loading, 80-90% token savings | Self-hosted teams, cost optimization |
| Mem0 | Cloud | Freemium | Server-side LLM extraction, dual memory scope | Fastest setup |
| Honcho | Cloud/Self | Paid (cloud) / Free (self-hosted) | Dialectic user modeling | Multi-agent, deep user understanding |
| ByteRover | Local/Cloud | Freemium | Knowledge tree in human-readable Markdown | Pre-compression knowledge capture |
| RetainDB | Cloud | Paid | Hybrid search: vector + BM25 + reranking | Production search quality |
| SuperMemory | Cloud | — | Web-focused memory with browser integration | Web research workflows |
Benchmark Snapshot
Only two providers have published LongMemEval scores:
| Provider | Score | Model |
|---|---|---|
| Hindsight | 91.4% | Gemini-3 |
| Hindsight | 89.0% | Open-source 120B |
| Mem0 | 67.6% | GPT-4o (LongMemEval-S variant) |
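Among the published scores, Hindsight leads on retrieval accuracy. Note that Mem0's figure comes from the LongMemEval-S variant with a different model, so the numbers are not strictly like-for-like. The other providers haven't published comparable benchmarks.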
Provider Deep Dives
🥇 Hindsight
The best all-around choice for most users who want local + accurate.
Stores structured knowledge — discrete facts, named entities, and relationships — not raw text chunks. Its unique hindsight_reflect tool periodically synthesizes higher-level insights across all memories. Think of it as the agent building a personal knowledge graph over time.
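To illustrate the difference between structured knowledge and raw text chunks, here is a toy fact store built on subject/relation/object triples. It is only a sketch of the general idea: `Fact`, `FactStore`, and their methods are hypothetical names, not Hindsight's data model or the behavior of its tools.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Fact:
    subject: str    # named entity
    relation: str   # relationship label
    obj: str        # another entity or a literal value


class FactStore:
    """Toy in-memory store of discrete facts (illustrative only)."""

    def __init__(self) -> None:
        self._facts: set[Fact] = set()

    def retain(self, subject: str, relation: str, obj: str) -> None:
        # A set means the same fact is stored only once.
        self._facts.add(Fact(subject, relation, obj))

    def recall(self, subject: Optional[str] = None,
               relation: Optional[str] = None) -> list[Fact]:
        """Return facts matching every field that was provided."""
        matches = (
            f for f in self._facts
            if (subject is None or f.subject == subject)
            and (relation is None or f.relation == relation)
        )
        return sorted(matches, key=lambda f: (f.subject, f.relation, f.obj))
```

With facts stored this way, a question like "what is api-service deployed on?" becomes a lookup by entity and relation rather than a fuzzy search over text blobs.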
Setup: hermes memory setup → select Hindsight
Leave blank for local daemon, or set HINDSIGHT_API_KEY for cloud
Tools: hindsight_recall, hindsight_retain, hindsight_reflect
Cost: Free (local PostgreSQL daemon) / Cloud available for teams
Best if: You want the highest retrieval accuracy, need structured knowledge, or handle privacy-sensitive data.
Holographic
Zero dependencies. Nothing leaves your machine. Literally two tools and done.
Uses Holographic Reduced Representations (HRR) — memories stored as superposed complex-valued vectors. Recall is algebraic, not similarity-based. A trust-scoring mechanism causes confirmed memories to gain weight and contradicted ones to decay over time.
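If HRR is new to you, the sketch below shows the core algebra using the frequency-domain form, where each vector is a list of unit-magnitude complex numbers and binding is element-wise multiplication. Whether Holographic uses this exact variant, this dimensionality, or this cleanup step is an assumption; the code only demonstrates the technique.

```python
import cmath
import random

DIM = 1024  # vector dimensionality (assumed for this sketch)


def random_vector(rng: random.Random) -> list[complex]:
    """Random unit-magnitude phasors, one per dimension."""
    return [cmath.exp(1j * rng.uniform(-cmath.pi, cmath.pi)) for _ in range(DIM)]


def bind(key: list[complex], value: list[complex]) -> list[complex]:
    """Bind a key to a value by element-wise multiplication."""
    return [k * v for k, v in zip(key, value)]


def superpose(*vectors: list[complex]) -> list[complex]:
    """Store several bound pairs in one vector by summing them."""
    return [sum(parts) for parts in zip(*vectors)]


def unbind(memory: list[complex], key: list[complex]) -> list[complex]:
    """Recover a noisy copy of the value bound to key (multiply by its conjugate)."""
    return [m * k.conjugate() for m, k in zip(memory, key)]


def similarity(a: list[complex], b: list[complex]) -> float:
    """Match score: close to 1 for the same vector, close to 0 for unrelated ones."""
    return sum((x * y.conjugate()).real for x, y in zip(a, b)) / DIM


# Three key/value pairs stored in a single superposed memory vector.
rng = random.Random(0)
keys = {name: random_vector(rng) for name in ("editor", "shell", "language")}
values = {name: random_vector(rng) for name in ("vim", "zsh", "rust")}
memory = superpose(
    bind(keys["editor"], values["vim"]),
    bind(keys["shell"], values["zsh"]),
    bind(keys["language"], values["rust"]),
)

# Algebraic recall: unbind with the key, then pick the closest known value.
noisy = unbind(memory, keys["editor"])
best = max(values, key=lambda name: similarity(noisy, values[name]))
```

Unbinding with the `editor` key cancels that key exactly and leaves the `vim` vector plus low-level noise from the other two pairs, so the final comparison only has to clean up the result.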
Setup: hermes memory setup → select Holographic. That's it. No API keys.
Tools: 2 tools (minimal by design)
Cost: Free. Local SQLite. Period.
Best if: You're in an air-gapped environment, hate external dependencies, or want self-correcting memory that learns what's trustworthy.
OpenViking
The token-saver. Self-hosted context database from ByteDance.
Its filesystem-style hierarchy with tiered loading is the standout feature:
- L0 (Abstract): ~100 tokens — loaded every turn
- L1 (Overview): ~2k tokens — loaded when planning
- L2 (Full): Complete content — loaded only when deep context needed
This means 80-90% token cost reduction vs. loading full context every turn. Auto-extracts memories into 6 categories: profile, preferences, entities, events, cases, patterns.
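A quick back-of-the-envelope calculation shows where a saving in that range can come from. The L0 and L1 sizes below are the figures quoted above; the number of turns, how many of them need L1 or L2, and the size of the full content are made-up inputs, and the sketch assumes each tier is simply added on the turns that need it.

```python
L0_TOKENS = 100     # abstract, loaded every turn (figure from this guide)
L1_TOKENS = 2_000   # overview, loaded when planning (figure from this guide)


def tiered_tokens(turns: int, planning_turns: int, deep_turns: int,
                  full_tokens: int) -> int:
    """Total context tokens when L1 and L2 load only on the turns that need them."""
    return (turns * L0_TOKENS
            + planning_turns * L1_TOKENS
            + deep_turns * full_tokens)


def savings(turns: int, planning_turns: int, deep_turns: int,
            full_tokens: int) -> float:
    """Fraction of tokens saved versus loading the full content every turn."""
    baseline = turns * full_tokens
    tiered = tiered_tokens(turns, planning_turns, deep_turns, full_tokens)
    return 1 - tiered / baseline
```

With 100 turns, 20 planning turns, 10 deep turns, and a 20,000-token full context (all hypothetical), the tiered total is 250,000 tokens against a 2,000,000-token baseline, an 87.5% reduction. The saving shrinks as more turns need L2.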
Setup: pip install openviking
openviking-server
hermes memory setup → select OpenViking
Set OPENVIKING_ENDPOINT=http://localhost:1933
Tools: viking_search, viking_read, viking_browse, viking_remember, viking_add_resource
Cost: Free (AGPL-3.0, self-hosted)
Best if: You're running at scale, want self-hosted infrastructure, or need to minimize token costs.
Mem0
The "just make it work" option. 30 seconds to running.
Server-side LLM extraction means Mem0's infrastructure decides what to keep. Includes a circuit breaker so memory failures don't block agent responses. Dual memory scope (session + user) means it separates short-term context from long-term facts.
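For readers unfamiliar with the circuit breaker pattern, here is a generic sketch: after a few consecutive failures the memory call is skipped for a cooldown period and a fallback is returned, so the agent keeps responding. This is the textbook pattern, not Mem0's or Hermes's implementation; the class name, thresholds, and parameters are assumptions.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


class CircuitBreaker:
    """Skip a failing dependency for a cooldown period after repeated errors."""

    def __init__(self, max_failures: int = 3, cooldown_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.cooldown_s = cooldown_s
        self.clock = clock              # injectable so tests can control time
        self.failures = 0
        self.opened_at: Optional[float] = None

    def call(self, fn: Callable[[], T], fallback: T) -> T:
        # While open, return the fallback without touching the dependency.
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.cooldown_s:
                return fallback
            self.opened_at = None       # cooldown over: allow one trial call
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback
        self.failures = 0               # a success resets the failure count
        return result
```

In an agent loop, a memory search would be wrapped as something like `breaker.call(lambda: search(query), fallback=[])`, where `search` stands in for whatever memory lookup is in use. A dead memory backend then costs a few failed calls rather than blocking every response.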
Setup: hermes memory setup → select Mem0
Set MEM0_API_KEY=your-key
Tools: mem0_add, mem0_search, mem0_get_all
Cost: Freemium (free tier available)
Best if: You want the fastest setup, don't want to self-host, and are okay with cloud storage. Good starting point — you can always migrate later.
Honcho
The philosopher. Builds a model of how you think, not just what you know.
Dialectic user modeling captures reasoning patterns, communication style, and decision-making tendencies over time. Two-layer context injection with configurable cadences for refreshes. Supports multi-agent setups with separate AI peers per Hermes profile.
Setup: hermes memory setup → select Honcho
Set HONCHO_API_KEY=your-key
Tools: honcho_profile, honcho_search, honcho_context, honcho_reasoning, honcho_conclude
Cost: Paid (cloud) / Free (self-hosted, AGPL-3.0)
⚠️ Licensing note: OSS is AGPL v3.0. If you modify Honcho and let users interact with it over a network, the AGPL requires you to offer those users the source of your modified version. Using managed cloud avoids this.
Best if: You're building a personal assistant that should deepen its model of you over time, or running multi-agent systems with shared user context.
ByteRover
Your knowledge, stored as readable Markdown. No black boxes.
Hierarchical knowledge tree stored in .brv/context-tree/ as human-readable Markdown files. Unique pre-compression extraction hook fires before Hermes compresses long conversations, capturing knowledge before context gets summarized away.
Setup: hermes memory setup → select ByteRover
Tools: byterover_search, byterover_list, byterover_forget
Cost: Freemium
Best if: You want full visibility into stored memory, or need to capture knowledge from long conversations before compression loses it.
RetainDB
Search nerd's pick. Hybrid vector + BM25 + reranking.
Combines multiple retrieval strategies for the highest-quality search results. Vector similarity catches semantic matches, BM25 catches exact keyword matches, and reranking puts the best results on top.
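One common way to merge a vector ranking and a BM25 ranking is reciprocal rank fusion (RRF), sketched below. RetainDB's actual fusion method isn't described here, so treat this as an illustration of hybrid search in general; the function name and the `k = 60` constant (a conventional default for RRF) are choices made for this example. A reranking stage would then rescore the top fused candidates, which is not shown.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge ranked lists of document ids; ids ranked high in several lists win."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending), then by id for a stable order on ties.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector_hits = ["a", "b", "c"]   # hypothetical semantic-search ranking
bm25_hits = ["b", "d", "a"]     # hypothetical keyword-search ranking
fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
```

Here `b` and `a` appear in both lists, so they rise above `d` and `c`, which each appear in only one.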
Setup: hermes memory setup → select RetainDB
Tools: retaindb_search, retaindb_store
Cost: Paid
Best if: Retrieval quality is your top priority and you're willing to pay for it.
SuperMemory
Web research workflows. Browser-integrated memory.
Designed for memory that extends into the browser — captures and retrieves web content as part of your knowledge base.
Setup: hermes memory setup → select SuperMemory
Cost: See supermemory.ai pricing
Best if: Your workflow involves heavy web research and you want persistent memory of online content.
Cost Summary
| Tier | Providers | Notes |
|---|---|---|
| Free, local | Holographic, Hindsight (local), OpenViking | No API keys, no cloud. Holographic is the easiest pick. |
| Free tier / freemium | Mem0, ByteRover | Start free, pay for higher limits |
| Paid cloud | Honcho (cloud), RetainDB, SuperMemory (see supermemory.ai pricing) | Production features, team support. Honcho is free if self-hosted. |
| Always free (built-in) | MEMORY.md + USER.md | No setup, always active, 2,200 + 1,375 char limits |
My Recommendations
Just getting started?
Stick with built-in memory. It covers most everyday use cases. Add an external provider only when you hit its limits.
Want the best free local experience?
Hindsight (local daemon). Best benchmarks, nothing leaves your machine, structured knowledge graph.
Want zero config?
Holographic. Pick it in hermes memory setup and you're done. No API keys, no servers.
Want the easiest cloud setup?
Mem0. 30 seconds, free tier, hands-off extraction.
Running multi-agent or want deep user modeling?
Honcho. The dialectic reasoning is genuinely different from every other provider.
Care about token costs at scale?
OpenViking's tiered loading will save you 80-90% on tokens.
Migrating Between Providers
Switching is straightforward:
hermes memory setup # pick new provider
hermes memory status # confirm it's active
Your built-in memory (MEMORY.md, USER.md) stays intact regardless of which external provider you use. Note that external providers store data in their own backends — switching providers means starting fresh with the new one's knowledge base. There's no automated migration between providers yet.
Questions?
Drop them in the comments. I'm happy to help you pick the right setup for your use case.