ClawSetup

Posted on • Originally published at clawsetup.co.uk

Hybrid Local Memory in OpenClaw (SetupClaw Basic Setup): BM25 + Vectors + sqlite-vec + Local Embeddings

OpenClaw “memory” isn’t a mysterious model feature—it’s a set of Markdown files you own (MEMORY.md plus daily notes in memory/YYYY-MM-DD.md) plus a retrieval layer that helps the agent find the right snippet at the right time. SetupClaw’s Basic Setup configures this so recall stays reliable as your notes grow: exact keyword matches (config keys, IDs, error strings) still work, semantic matches still work, and—where feasible—embeddings can be generated locally on your VPS.

What “memory” means in OpenClaw (source of truth matters)

In SetupClaw terms, the biggest operational win is that OpenClaw’s long-term memory is plain text you can read, edit, and back up:

  • MEMORY.md is curated “long-term” context (preferences, decisions, stable runbooks)
  • memory/YYYY-MM-DD.md is daily scratchpad/logs

That means the agent can reboot, models can change, and context windows can reset—your memory still exists as normal files. The retrieval/index is an acceleration layer, not the source of truth.

Why “hybrid” retrieval (BM25 + vector) beats “vector-only” for real ops work

In home-lab / VPS operations, a lot of the queries you want to answer are not poetic—they’re literal:

  • “Where did we set memorySearch.provider?”
  • “What did we decide about PR-only patterns?”
  • “What’s the Hetzner firewall rule name?”
  • “What was that cron job id?”

These queries contain exact tokens. Vector similarity is great for paraphrases (“how do we harden SSH?”), but it can be weaker at “find the line containing this exact key.” That’s why OpenClaw uses two signals:

  • BM25 (keyword relevance) for exact tokens and sparse terms
  • Vectors (semantic similarity) for paraphrases and fuzzy matches

The blending model (concrete detail)

OpenClaw converts BM25 “rank” into a score and blends it with the vector score.

  • BM25 rank is turned into a normalized score:

```
textScore = 1 / (1 + max(0, bm25Rank))
```

  • Then a weighted blend is computed:

```
finalScore = vectorWeight * vectorScore + textWeight * textScore
```

Practical implication: a strong keyword hit (low BM25 rank) can outrank a weaker semantic match, which is exactly what you want when you’re looking for a config key, filename, or error string.
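The two formulas can be sketched in a few lines of Python. The 0.7/0.3 weights and the scores below are illustrative only, not OpenClaw's actual defaults:

```python
def text_score(bm25_rank: int) -> float:
    """Convert a BM25 rank (0 = best keyword hit) into a score in (0, 1]."""
    return 1.0 / (1.0 + max(0, bm25_rank))

def final_score(vector_score: float, bm25_rank: int,
                vector_weight: float = 0.7, text_weight: float = 0.3) -> float:
    """Weighted blend of the semantic and keyword signals.
    The 0.7/0.3 split is an assumed example, not OpenClaw's shipped config."""
    return vector_weight * vector_score + text_weight * text_score(bm25_rank)

# An exact keyword hit (rank 0) with a mediocre vector score...
exact_hit = final_score(vector_score=0.55, bm25_rank=0)      # 0.7*0.55 + 0.3*1.0 = 0.685
# ...outranks a decent semantic match with no keyword support (rank 9).
semantic_only = final_score(vector_score=0.80, bm25_rank=9)  # 0.7*0.80 + 0.3*0.1 = 0.59
print(exact_hit > semantic_only)  # True
```

This is the mechanism behind the "strong keyword hit wins" behavior: rank 0 pins textScore at 1.0, so even a modest text weight is enough to lift a literal match past a fuzzier semantic neighbor.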

Where sqlite-vec fits (and why SetupClaw mentions it)

Vector search can be done in a few ways. The sqlite-vec extension lets OpenClaw store embeddings in SQLite and run vector distance queries inside the database (using a vec0 virtual table), rather than loading everything into memory and filtering in JavaScript.
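To make that concrete, here is a sketch of what loading the extension and running an in-database KNN query looks like from Python. The table name and 384-dimension size are hypothetical; the `vec0` DDL and `MATCH`/`distance` query shape follow sqlite-vec's documented virtual-table interface, and the code degrades gracefully when the extension isn't present:

```python
import sqlite3

# Hypothetical table name and dimension; vec0 syntax per sqlite-vec's docs.
DDL = "CREATE VIRTUAL TABLE memory_vectors USING vec0(embedding float[384])"
KNN = ("SELECT rowid, distance FROM memory_vectors "
       "WHERE embedding MATCH ? ORDER BY distance LIMIT 5")

db = sqlite3.connect(":memory:")
try:
    db.enable_load_extension(True)
    db.load_extension("vec0")  # extension file name/path varies by install
    db.execute(DDL)
    print("sqlite-vec loaded: KNN runs inside SQLite")
except (sqlite3.OperationalError, AttributeError):
    # Extension missing, or this Python build disallows loadable extensions:
    # degrade instead of hard-failing, mirroring OpenClaw's behavior.
    print("sqlite-vec not loadable here; falling back")
```

The point of the `vec0` path is that the `ORDER BY distance LIMIT 5` happens inside the database engine, so only the top candidates cross into JavaScript.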

In Basic Setup, the goal isn’t to turn your VPS into an ML server—it’s to keep memory lookup:

  • fast enough for day-to-day chat operations
  • predictable as your notes grow
  • simple to back up (SQLite + Markdown)

If sqlite-vec isn’t available on a given box, OpenClaw can fall back without hard failure; SetupClaw documents which mode you’re running.
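The fallback mode amounts to storing embeddings in a plain table and scoring them in application code. A minimal stdlib-only sketch of that path, with a hypothetical table layout and toy 3-dimensional vectors:

```python
import math
import sqlite3
import struct

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# Plain SQLite table: embeddings packed as float32 BLOBs alongside the text.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE chunks (id INTEGER PRIMARY KEY, text TEXT, embedding BLOB)")
for text, vec in [("fail2ban sshd jail", [0.9, 0.1, 0.0]),
                  ("telegram group allowlist", [0.1, 0.9, 0.2])]:
    db.execute("INSERT INTO chunks (text, embedding) VALUES (?, ?)",
               (text, struct.pack(f"{len(vec)}f", *vec)))

def search(query_vec, k=2):
    # Brute-force scan: unpack every BLOB and rank by cosine similarity.
    rows = db.execute("SELECT text, embedding FROM chunks").fetchall()
    scored = [(cosine(query_vec, struct.unpack(f"{len(blob) // 4}f", blob)), text)
              for text, blob in rows]
    return sorted(scored, reverse=True)[:k]

print(search([1.0, 0.0, 0.0])[0][1])  # best match: "fail2ban sshd jail"
```

This works fine at small scale; the trade-off versus `vec0` is that every query unpacks and scores every row, which is exactly the cost that grows with your notes.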

Local embeddings: what “local” means (privacy + cost control)

“Local embeddings” means the text chunks used for indexing are embedded on your VPS, rather than being sent to a third-party API.
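As an illustration of that pipeline (not OpenClaw's actual chunker), indexing roughly means splitting the Markdown into chunks and feeding each chunk to an embedding model running on the same machine. The chunker below is a toy paragraph-packer with an assumed size limit:

```python
def chunk_markdown(text: str, max_chars: int = 400):
    """Toy chunker: split on blank lines, then greedily pack paragraphs
    up to max_chars per chunk. Real chunking strategies vary."""
    chunks, current = [], ""
    for para in text.split("\n\n"):
        if current and len(current) + len(para) > max_chars:
            chunks.append(current.strip())
            current = ""
        current += para + "\n\n"
    if current.strip():
        chunks.append(current.strip())
    return chunks

notes = ("## SSH hardening\n\nWe use fail2ban with the sshd jail.\n\n"
         "## Telegram\n\nGroups are allowlisted.")
for chunk in chunk_markdown(notes):
    # In the real pipeline, each chunk would be sent to the local embedding
    # runtime (e.g. node-llama-cpp) instead of printed.
    print(repr(chunk))
```

The key property of "local" is that these chunks never leave the box: the only artifacts are the Markdown files, the model on disk, and the resulting vectors in SQLite.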

In practice, SetupClaw can configure:

  • memorySearch.provider = "local" (where feasible)
  • a local embedding runtime (via node-llama-cpp), with the embedding model downloaded to the server

This reduces external dependencies and makes costs more predictable. The trade-off is that local embeddings use CPU/RAM/disk; the handoff should include sizing expectations and what to do if you later choose a remote provider.

Practical setup & verification steps (what to actually run)

These commands are the “does it work end-to-end?” checks that matter in a handoff.

1) Confirm the memory subsystem and embedding mode:

```bash
openclaw memory status --deep
```

2) Force an index build (useful after big edits or first setup):

```bash
openclaw memory index
```

3) Validate hybrid retrieval with both keyword and semantic queries:

```bash
openclaw memory search "memorySearch.provider"
openclaw memory search "how do we restrict telegram groups"
openclaw memory search "fail2ban sshd"
```

4) Make sure memory remains “file-first” operationally:

  • edit MEMORY.md with your normal editor
  • re-run openclaw memory search "<a line you added>"
  • if you want immediate consistency, run openclaw memory index

What SetupClaw leaves behind for memory (deliverables)

A good Basic Setup handoff should include:

  • A short “Memory architecture” page: Markdown source-of-truth + hybrid retrieval
  • The chosen provider (local vs remote) + model identifier and config location
  • Where the SQLite store lives (so it can be backed up or safely rebuilt)
  • A tiny “How to verify memory” section with:
    • openclaw memory status --deep
    • openclaw memory index
    • openclaw memory search "…"
