If you've used OpenClaw before, you know the feeling all too well: once a conversation ends, every API key, technical decision, and piece of project background from previous chats is wiped clean. This isn't a bug in OpenClaw; it's a dilemma shared by all LLM Agents: limited context windows, and complete memory loss when the session ends.
The common community solution is a memory plugin like OpenViking, but have you ever wondered: is there a solution that can remember more while also saving a huge amount of Token cost?
The answer is a resounding yes. Cortex Memory scores 68.42% on the official LoCoMo benchmark, the highest result reported (surpassing OpenViking's 52.08%), while consuming 11 times fewer Tokens than OpenClaw+LanceDB and delivering 18 times the score per thousand Tokens.
This isn't magic—it's the power of architecture. Let's take a closer look.
Why Does OpenClaw Need "External Memory"?
If you're a heavy user of OpenClaw, these scenarios must be familiar:
Scenario 1: Repeatedly asking for API keys
User: Call Alibaba Cloud OSS to upload a file
Agent: What is your AccessKey?
User: xxx
(Next day, new session)
User: Upload another file for me
Agent: What is your AccessKey?
User: (frustrated) Didn't I tell you yesterday...
Scenario 2: "Amnesia" after long conversations
User: My project goal is to build a B2B sales tool
(After 50 rounds of conversation, discussing various technical details)
User: Based on my goal mentioned earlier, help me design the core architecture
Agent: What was the goal you mentioned earlier?
Scenario 3: Repeating the same mistakes
User: Call sales-db-query skill, incorrect parameter format
Agent: (error)
User: Correct format is {...}
(New session)
User: Call this skill again
Agent: (same error again)
The root cause of these issues: OpenClaw's native memory system has "goldfish memory"—once the context window is full, earlier content gets squeezed out; when the session ends, all states reset to zero.
OpenViking's Solution vs Cortex Memory's Overwhelming Advantage
OpenViking does address this problem, giving Agents long-term memory through a "virtual file system + vector search" design. But Cortex Memory goes well beyond it:
Benchmark data speaks for itself
| System | LoCoMo Benchmark Score | Avg Tokens/Question | Score per 1K Tokens |
|---|---|---|---|
| Cortex Memory v5 | 68.42% | ~2,900 | 23.6 |
| OpenViking + OpenClaw | 52.08% | ~2,769 | 18.8 |
| OpenClaw + LanceDB | 44.55% | ~33,490 | 1.3 |
| OpenClaw Native Memory | 35.65% | ~15,982 | 2.2 |
Key insight: Cortex Memory not only achieves the highest score but also dominates in Token efficiency—compared to OpenClaw+LanceDB, Token savings reach 91%, with efficiency improved by 18 times.
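The last column and both headline claims follow directly from the table's own numbers (score divided by thousands of Tokens); a quick arithmetic check:

```python
# Reproduce the "Score per 1K Tokens" column from the benchmark table above.
systems = {
    "Cortex Memory v5":       (68.42, 2_900),
    "OpenViking + OpenClaw":  (52.08, 2_769),
    "OpenClaw + LanceDB":     (44.55, 33_490),
    "OpenClaw Native Memory": (35.65, 15_982),
}

for name, (score, tokens) in systems.items():
    per_1k = score / (tokens / 1000)
    print(f"{name}: {per_1k:.1f}")  # matches the table: 23.6, 18.8, 1.3, 2.2

# The two headline claims, vs OpenClaw+LanceDB:
savings = 1 - 2_900 / 33_490                  # ~0.91, i.e. "91% Token savings"
ratio = (68.42 / 2.9) / (44.55 / 33.49)       # ~17.7, i.e. "~18x efficiency"
print(round(savings, 2), round(ratio))
```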
Why Can Cortex Memory Achieve This?
The secret lies in its three-layer memory architecture:
The problem with traditional approaches: either load everything (Token explosion) or only store summaries (loss of details).
Cortex Memory's solution: Progressive layered retrieval—first quickly filter with 100-Token summaries, then refine with 2,000-Token overviews, and finally load only the truly needed full content.
Result: to retrieve 100 memories, a traditional approach loads 100 × full content; Cortex Memory loads only 100 × 100 Tokens of L0 summaries, plus a handful of L1/L2 entries.
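The token budget of that progressive retrieval can be sketched with the layer sizes from the article; the average full-content size and the number of L1/L2 refinements are illustrative assumptions, not figures from the benchmark:

```python
# Sketch of the progressive layered-retrieval token budget described above.
# L0/L1 sizes come from the article; FULL_TOKENS is an assumed average.
L0_TOKENS = 100      # quick-filter summary per memory
L1_TOKENS = 2_000    # overview per memory
FULL_TOKENS = 8_000  # assumed average full-content size (illustrative)

def budget(n_candidates: int, n_l1: int, n_full: int) -> int:
    """Tokens loaded: every candidate's L0 summary, plus a few L1/L2 loads."""
    return n_candidates * L0_TOKENS + n_l1 * L1_TOKENS + n_full * FULL_TOKENS

naive = 100 * FULL_TOKENS                 # load 100 full memories up front
layered = budget(100, n_l1=5, n_full=2)   # 100 summaries, 5 overviews, 2 full
print(naive, layered)  # 800000 vs 36000 under these assumptions
```

Even with generous refinement counts, the layered budget stays an order of magnitude below the naive one, which is where the Token savings in the benchmark come from.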
MemClaw: One-Click Upgrade for OpenClaw
Cortex Memory provides an out-of-the-box OpenClaw plugin—MemClaw.
Installation with just one command
openclaw plugins install @memclaw/memclaw
Extremely simple configuration
Add to openclaw.json:
{
"plugins": {
"entries": {
"memclaw": {
"enabled": true,
"config": {
"llmApiKey": "your-api-key",
"embeddingApiKey": "your-api-key"
}
}
}
},
"agents": {
"defaults": {
"memorySearch": { "enabled": false } // Disable native memory
}
}
}
Core tools at a glance
| Tool | Purpose |
|---|---|
| cortex_search | Layered semantic search with a controllable return layer |
| cortex_recall | Retrieve memories with full context |
| cortex_add_memory | Store important information for future retrieval |
| cortex_commit_session | Commit a session and trigger memory extraction |
| cortex_migrate | One-click migration of OpenClaw native memories |
| cortex_maintenance | Routine maintenance (cleanup, index rebuild) |
Real-World Results: From "Forgetful" to "Elephant Memory"
Case 1: Skill invocation experience accumulation
Problem: Calling a certain skill always results in parameter errors; need to retry from scratch in every new session.
MemClaw solution:
User: Call sales-db-query skill, query national sales data
Agent: (call succeeds, MemClaw automatically records correct parameter format)
(Three days later, new session)
User: Query East China region data again
Agent: (MemClaw retrieves previous successful case, uses correct format directly)
Case 2: Long conversation goals don't get lost
Problem: After 50 rounds of conversation, the Agent forgets the originally set project goal.
MemClaw solution:
User: My project goal is to build a B2B sales tool, focusing on stability
(After 100 rounds of conversation)
User: Based on my earlier goal, design the core architecture
Agent: (MemClaw retrieves original goal, outputs solution that meets constraints)
Case 3: Cross-session memory reuse
Problem: Need to re-enter API keys in every new session.
MemClaw solution:
Session A: User enters OSS key, MemClaw stores to cortex://user/preferences/
Session B: User requests file upload, MemClaw automatically retrieves key, no need to repeat input
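The cross-session flow above can be sketched with an in-memory stand-in; the two helpers only mimic the roles of cortex_add_memory and cortex_recall and are not the plugin's real API (the key value is a placeholder):

```python
# In-memory stand-in for the cross-session key reuse described above.
# These helpers are illustrative; they are NOT MemClaw's actual functions.
memory: dict[str, str] = {}

def cortex_add_memory(uri: str, content: str) -> None:
    """Session A: persist a fact under a memory URI."""
    memory[uri] = content

def cortex_recall(uri: str):
    """Session B: retrieve it instead of asking the user again."""
    return memory.get(uri)

cortex_add_memory("cortex://user/preferences/oss-access-key", "example-key")
print(cortex_recall("cortex://user/preferences/oss-access-key"))
```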
Technical Highlights: Why Choose Cortex Memory?
1. Rust Implementation, Maximum Performance
Compared to Node.js-based memory solutions, Cortex Memory is written in Rust:
- Memory safety: No GC pauses, no memory leaks
- Concurrent processing: Tokio async runtime, excellent performance under high concurrency
- Low resource usage: 60%+ less memory under the same load
2. Fully Localized, Data Privacy
- All memories stored locally in the cortex-data/ directory
- Vector database uses local Qdrant (or a remote instance)
- Zero cloud dependencies, suitable for sensitive data scenarios
3. Multi-Tenant Isolation
Supports multiple isolated memory spaces:
cortex-data/
├── tenants/
│ ├── project-a/ # Memories for Project A
│ ├── project-b/ # Memories for Project B
│ └── personal/ # Personal memories
4. Rich Interface
- CLI tool: cortex-mem command-line management
- REST API: /api/v2/* endpoints
- MCP Protocol: supports Claude Desktop, Cursor, etc.
- Web Dashboard: Svelte 5 visual management
Quick Start
1. Install dependencies
# Install Qdrant vector database
docker run -p 6333:6333 -p 6334:6334 qdrant/qdrant
# Install MemClaw plugin
openclaw plugins install @memclaw/memclaw
2. Configure API keys
{
"plugins": {
"entries": {
"memclaw": {
"enabled": true,
"config": {
"llmApiBaseUrl": "https://api.openai.com/v1",
"llmApiKey": "sk-xxx",
"llmModel": "gpt-5-mini",
"embeddingApiBaseUrl": "https://api.openai.com/v1",
"embeddingApiKey": "sk-xxx",
"embeddingModel": "text-embedding-3-small"
}
}
}
}
}
3. Start using
Restart the OpenClaw Gateway and MemClaw will automatically start its background services. Your Agent now has a "super brain".
Final Thoughts
Cortex Memory is not just simple memory storage; it's cognitive infrastructure for AI Agents.
It solves the paradox of "memory precision vs Token cost" with its three-layer architecture, guarantees performance and stability with Rust implementation, and achieves seamless integration with OpenClaw through the MemClaw plugin.
If you're tired of your Agent's "goldfish memory", if you wince at Token consumption, if you need a production-grade long-term memory solution, then Cortex Memory is worth trying.
Project URL: https://github.com/sopaco/cortex-mem
MemClaw Plugin: examples/@memclaw/plugin
Full Documentation: litho.docs/zh