Sopaco
Cortex Memory: Give OpenClaw a 'Super Brain', Token Cost Slashed by 91%

If you've used OpenClaw before, you know the feeling all too well: once a conversation ends, the API keys, technical decisions, and project background from previous chats are all wiped away as if by an eraser. This isn't a bug in OpenClaw; it's a dilemma shared by all LLM Agents: limited context windows, and complete memory loss when the session ends.

The community's usual answer is a memory plugin like OpenViking. But have you ever wondered: is there a solution that remembers more while also slashing Token costs?

The answer is a resounding yes. Cortex Memory tops the official LoCoMo benchmark at 68.42% (surpassing OpenViking's 52.08%), while consuming roughly 11 times fewer Tokens than OpenClaw + LanceDB and delivering 18 times the score per thousand Tokens.

This isn't magic—it's the power of architecture. Let's take a closer look.


Why Does OpenClaw Need "External Memory"?

If you're a heavy user of OpenClaw, these scenarios will feel familiar:

Scenario 1: Repeatedly asking for API keys

User: Call Alibaba Cloud OSS to upload a file
Agent: What is your AccessKey?
User: xxx
(Next day, new session)
User: Upload another file for me
Agent: What is your AccessKey?
User: (frustrated) Didn't I tell you yesterday...

Scenario 2: "Amnesia" after long conversations

User: My project goal is to build a B2B sales tool
(After 50 rounds of conversation, discussing various technical details)
User: Based on my goal mentioned earlier, help me design the core architecture
Agent: What was the goal you mentioned earlier?

Scenario 3: Repeating the same mistakes

User: Call sales-db-query skill, incorrect parameter format
Agent: (error)
User: Correct format is {...}
(New session)
User: Call this skill again
Agent: (same error again)

The root cause of these issues: OpenClaw's native memory system has "goldfish memory"—once the context window is full, earlier content gets squeezed out; when the session ends, all states reset to zero.


OpenViking's Solution vs Cortex Memory's Overwhelming Advantage

OpenViking does solve this problem, giving Agents long-term memory through a "virtual file system + vector search". But Cortex Memory goes decisively further:

Benchmark data speaks for itself

| System | LoCoMo Benchmark Score | Avg Tokens/Question | Score per 1K Tokens |
|---|---|---|---|
| Cortex Memory v5 | 68.42% | ~2,900 | 23.6 |
| OpenViking + OpenClaw | 52.08% | ~2,769 | 18.8 |
| OpenClaw + LanceDB | 44.55% | ~33,490 | 1.3 |
| OpenClaw Native Memory | 35.65% | ~15,982 | 2.2 |

Key insight: Cortex Memory not only achieves the highest score but also dominates in Token efficiency—compared to OpenClaw+LanceDB, Token savings reach 91%, with efficiency improved by 18 times.
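The efficiency column is simple arithmetic over the first two, which makes the benchmark table easy to sanity-check yourself:

```python
# Reproduce the efficiency figures from the table's raw columns.
systems = {  # name: (LoCoMo score %, avg Tokens per question)
    "Cortex Memory v5": (68.42, 2_900),
    "OpenViking + OpenClaw": (52.08, 2_769),
    "OpenClaw + LanceDB": (44.55, 33_490),
    "OpenClaw Native Memory": (35.65, 15_982),
}

for name, (score, tokens) in systems.items():
    per_1k = score / (tokens / 1_000)  # score points per 1K Tokens
    print(f"{name}: {per_1k:.1f} points / 1K Tokens")

# Cortex Memory vs OpenClaw + LanceDB:
savings = 1 - 2_900 / 33_490
print(f"Token savings: {savings:.0%}")          # ≈ 91%
print(f"Efficiency gain: {23.6 / 1.3:.0f}x")    # ≈ 18x
```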

Why Can Cortex Memory Achieve This?

The secret lies in its three-layer memory architecture:

The problem with traditional approaches: either load everything (Token explosion) or only store summaries (loss of details).

Cortex Memory's solution: Progressive layered retrieval—first quickly filter with 100-Token summaries, then refine with 2,000-Token overviews, and finally load only the truly needed full content.

Result: to retrieve from 100 memories, traditional approaches must load 100 full documents; Cortex Memory scans only 100 × 100 Tokens of L0 summaries, plus a handful of L1/L2 layers.
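As a rough sketch, the progressive retrieval described above might look like this (class and function names are illustrative, not Cortex Memory's actual API; a real implementation would use vector similarity rather than a plain scoring callback):

```python
# Illustrative three-layer progressive retrieval: filter on cheap L0
# summaries, refine on L1 overviews, load L2 full content only for hits.
from dataclasses import dataclass

@dataclass
class Memory:
    l0_summary: str   # ~100 Tokens, always cheap to scan
    l1_overview: str  # ~2,000 Tokens, loaded for promising candidates
    l2_full: str      # full content, loaded only for the final hits

def retrieve(score, memories, k_l1=10, k_l2=3):
    """score(text) -> relevance; higher is better."""
    # Stage 1: rank all memories by the cheap L0 summaries.
    ranked = sorted(memories, key=lambda m: score(m.l0_summary), reverse=True)
    # Stage 2: re-rank only the top candidates using the richer L1 overviews.
    refined = sorted(ranked[:k_l1], key=lambda m: score(m.l1_overview), reverse=True)
    # Stage 3: only the final few memories pay full-content Token cost.
    return [m.l2_full for m in refined[:k_l2]]
```

With 100 stored memories and the defaults above, stage 1 scans roughly 100 × 100 Tokens of summaries, then loads only 10 overviews and 3 full documents instead of 100 full documents.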


MemClaw: One-Click Upgrade for OpenClaw

Cortex Memory provides an out-of-the-box OpenClaw plugin—MemClaw.

Installation with just one command

openclaw plugins install @memclaw/memclaw

Extremely simple configuration

Add the plugin to openclaw.json, and disable OpenClaw's native memory search so the two systems don't conflict:

{
  "plugins": {
    "entries": {
      "memclaw": {
        "enabled": true,
        "config": {
          "llmApiKey": "your-api-key",
          "embeddingApiKey": "your-api-key"
        }
      }
    }
  },
  "agents": {
    "defaults": {
      "memorySearch": { "enabled": false }
    }
  }
}

Core tools at a glance

| Tool | Purpose |
|---|---|
| cortex_search | Layered semantic search with a controllable return layer |
| cortex_recall | Retrieve memories with full context |
| cortex_add_memory | Store important information for future retrieval |
| cortex_commit_session | Commit a session and trigger memory extraction |
| cortex_migrate | One-click migration of OpenClaw native memories |
| cortex_maintenance | Regular maintenance (cleanup, index rebuild) |
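To make the "controllable return layer" idea concrete, here is a hypothetical cortex_search invocation; the actual parameter names may differ, so treat this as an illustration and check the MemClaw documentation:

```python
# Hypothetical shape of a cortex_search tool call (parameter names are
# illustrative, not necessarily MemClaw's real schema).
def make_search_call(query: str, layer: str = "L0", limit: int = 5) -> dict:
    """Build a cortex_search invocation; layer picks L0/L1/L2 detail."""
    if layer not in {"L0", "L1", "L2"}:
        raise ValueError("layer must be L0, L1, or L2")
    return {
        "tool": "cortex_search",
        "arguments": {"query": query, "layer": layer, "limit": limit},
    }

call = make_search_call("correct parameter format for sales-db-query")
print(call["arguments"]["layer"])  # "L0": cheap summaries first
```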

Real-World Results: From "Forgetful" to "Elephant Memory"

Case 1: Skill invocation experience accumulation

Problem: calling a certain skill keeps failing with parameter errors, and every new session has to rediscover the correct format from scratch.

MemClaw solution:

User: Call sales-db-query skill, query national sales data
Agent: (call succeeds, MemClaw automatically records correct parameter format)
(Three days later, new session)
User: Query East China region data again
Agent: (MemClaw retrieves previous successful case, uses correct format directly)

Case 2: Long conversation goals don't get lost

Problem: After 50 rounds of conversation, the Agent forgets the originally set project goal.

MemClaw solution:

User: My project goal is to build a B2B sales tool, focusing on stability
(After 100 rounds of conversation)
User: Based on my earlier goal, design the core architecture
Agent: (MemClaw retrieves original goal, outputs solution that meets constraints)

Case 3: Cross-session memory reuse

Problem: Need to re-enter API keys in every new session.

MemClaw solution:

Session A: User enters OSS key, MemClaw stores to cortex://user/preferences/
Session B: User requests file upload, MemClaw automatically retrieves key, no need to repeat input

Technical Highlights: Why Choose Cortex Memory?

1. Rust Implementation, Maximum Performance

Unlike Node.js-based memory solutions, Cortex Memory is written in Rust:

  • Memory safety without a garbage collector: no GC pauses, no leaks
  • Concurrent processing: Tokio async runtime with excellent throughput under high concurrency
  • Low resource usage: 60%+ less memory under the same load

2. Fully Localized, Data Privacy

  • All memories stored locally in cortex-data/ directory
  • Vector search uses a local Qdrant instance (a remote one is also supported)
  • Zero cloud dependencies, suitable for sensitive data scenarios

3. Multi-Tenant Isolation

Supports multiple isolated memory spaces:

cortex-data/
├── tenants/
│   ├── project-a/     # Memories for Project A
│   ├── project-b/     # Memories for Project B
│   └── personal/      # Personal memories
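One way such isolation can be enforced is by resolving each tenant to its own subdirectory and rejecting names that would escape it. A minimal sketch, assuming the directory layout above (the helper name and escape check are illustrative, not Cortex Memory's actual code):

```python
# Sketch of tenant isolation on disk: each tenant gets its own directory
# under cortex-data/tenants/, and path traversal out of it is rejected.
from pathlib import Path

def tenant_dir(root: str, tenant: str) -> Path:
    """Resolve a tenant's memory directory, refusing escaping paths."""
    base = (Path(root) / "tenants").resolve()
    path = (base / tenant).resolve()
    # Catches "", "..", "../other", and absolute paths like "/etc".
    if base not in path.parents:
        raise ValueError(f"invalid tenant name: {tenant!r}")
    return path

print(tenant_dir("cortex-data", "project-a"))  # .../cortex-data/tenants/project-a
```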

4. Rich Interface

  • CLI tool: cortex-mem command-line management
  • REST API: /api/v2/* endpoints
  • MCP Protocol: Supports Claude Desktop, Cursor, etc.
  • Web Dashboard: Svelte 5 visual management

Quick Start

1. Install dependencies

# Install Qdrant vector database
docker run -p 6333:6333 -p 6334:6334 qdrant/qdrant

# Install MemClaw plugin
openclaw plugins install @memclaw/memclaw

2. Configure API keys

{
  "plugins": {
    "entries": {
      "memclaw": {
        "enabled": true,
        "config": {
          "llmApiBaseUrl": "https://api.openai.com/v1",
          "llmApiKey": "sk-xxx",
          "llmModel": "gpt-5-mini",
          "embeddingApiBaseUrl": "https://api.openai.com/v1",
          "embeddingApiKey": "sk-xxx",
          "embeddingModel": "text-embedding-3-small"
        }
      }
    }
  }
}

3. Start using

Restart the OpenClaw Gateway and MemClaw will automatically start its background services. Your Agent now possesses a "super brain".


Final Thoughts

Cortex Memory is not just simple memory storage; it is cognitive infrastructure for AI Agents.

It solves the paradox of "memory precision vs Token cost" with its three-layer architecture, guarantees performance and stability with Rust implementation, and achieves seamless integration with OpenClaw through the MemClaw plugin.

If you're tired of Agents' "goldfish memory", if Token bills make you wince, and if you need a production-grade long-term memory solution, Cortex Memory is worth a try.


Project URL: https://github.com/sopaco/cortex-mem

MemClaw Plugin: examples/@memclaw/plugin

Full Documentation: litho.docs/zh
