h-wata

Posted on Jun 5

kioku-mesh: Why I put Zenoh under my AI's long-term memory

#mcp #ai #zenoh #claudecode

This article was written with help from Claude (an AI). I reviewed and edited it before publishing.

The gap between Claude Code and the web app

If you've lived in Claude Code for a while and then go back to the web version of an AI assistant, the gap is jarring. In the web app you end up explaining everything from scratch, to the point where you start wondering whether it's really the same model underneath. As a lot of you already know, the difference is context.

Claude Code keeps its memory across sessions. But only on a single machine. The context I build up on my home box is a blank slate on my office box, and when several agents are running at once, each one "learns" on its own.

The obvious first fix is to have the agent scribble notes into SQLite. Plenty of MCP memory apps stop right there.

So why does kioku-mesh put a distributed messaging system (Zenoh) on top of that?

Why SQLite alone isn't enough

SQLite is tied to one process and one file. For agents spread across several machines reading and writing the same memory, it's just the wrong shape.

The naive workaround is to rsync the SQLite file between machines. That falls apart quickly:

Conflicts: if home and office write at the same time, there's no way to decide which one wins.
Consistency: depending on when the copy happens, one side keeps running on stale data.
Offline: lose the network and you can't sync at all, so conflicts just pile up when you reconnect.

The moment you have multiple machines and multiple agents writing concurrently, copying a SQLite file means writing your own conflict-resolution logic. I didn't want to write that.

Why Zenoh

kioku-mesh leans on Zenoh exactly so I don't have to solve that conflict problem myself.

What Zenoh is

Zenoh is an open-source distributed communication middleware from the Eclipse Foundation. It was built for IoT and robotics, and the part I care about is that it runs entirely on your own machines with no cloud service in the loop. It gives you pub/sub plus a key-value store. (A KV store is the simple kind of database that reads and writes name/value pairs. Redis and DynamoDB are the famous examples.) Zenoh wraps networked pub/sub and replication around that.

HLC (Hybrid Logical Clock) timestamps

An HLC is a timestamp that combines a physical clock (wall-clock time) with a logical clock (causal order). It came out of distributed-systems research and shows up in systems like CockroachDB.

With plain wall-clock time, every machine's clock drifts a little (NTP skew), so you can't reliably tell which write came later. An HLC absorbs that drift while still recording causality correctly, i.e. that B happened after A.

Zenoh's advantage here is that HLC is built in. You implement nothing; a timestamp is attached automatically on every write. In my own testing, causality held even when NTP skew went past 12 seconds, so conflict resolution needs no extra code.
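
To make the ordering argument concrete, here is a minimal sketch of a generic hybrid logical clock. It is not Zenoh's implementation: the `HLC` class, its method names, and the fixed clock readings are assumptions made up for illustration. The home clock runs 12 seconds fast, and the office write happens after the office node has seen the home write.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

Timestamp = Tuple[float, int]  # (physical part, logical counter)


@dataclass
class HLC:
    """Generic hybrid logical clock (illustrative, not Zenoh's code)."""
    now: Callable[[], float]   # this node's wall clock, possibly skewed
    l: float = 0.0             # highest physical time seen so far
    c: int = 0                 # logical counter to break ties

    def tick(self) -> Timestamp:
        """Timestamp a local write."""
        pt = self.now()
        if pt > self.l:
            self.l, self.c = pt, 0
        else:
            self.c += 1        # wall clock is behind: advance logically
        return (self.l, self.c)

    def update(self, remote: Timestamp) -> Timestamp:
        """Merge a timestamp received from another node."""
        pt = self.now()
        rl, rc = remote
        new_l = max(self.l, rl, pt)
        if new_l == self.l and new_l == rl:
            self.c = max(self.c, rc) + 1
        elif new_l == self.l:
            self.c += 1
        elif new_l == rl:
            self.c = rc + 1
        else:
            self.c = 0
        self.l = new_l
        return (self.l, self.c)


home = HLC(now=lambda: 112.0)    # wall clock 12 s fast
office = HLC(now=lambda: 100.0)  # accurate wall clock

first = home.tick()              # home writes
office.update(first)             # office receives the replicated write
second = office.tick()           # office overwrites it afterwards

wall_clock_says_second_is_later = 100.0 > 112.0   # raw clocks get it wrong
hlc_says_second_is_later = second > first         # tuple comparison
```

Comparing raw wall clocks would order the office write before the home write it actually replaced. The HLC timestamps compare the other way around, which is the property conflict resolution depends on.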

The replication plugin's digest comparison

To keep periodic syncing cheap while two nodes are connected, Zenoh's replication plugin compares digests at hot/warm/cold "era" granularity. My config sets interval: 10.0, so nodes check for differences every 10 seconds and propagate anything that doesn't match. After a split-brain, re-sync converges in about 5 seconds.
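
The idea behind digest comparison can be sketched in a few lines. This is a simplification, not the replication plugin's actual algorithm: the `digest` and `diverging_buckets` helpers, the SHA-256 hash, the flat 10-second buckets, and the sample keys are all assumptions for illustration. Each node hashes its entries per time bucket, and only buckets whose hashes differ need to be exchanged.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def digest(entries: Dict[str, float], interval: float = 10.0) -> Dict[int, str]:
    """Map each time bucket to a hash of the (key, timestamp) pairs inside it."""
    buckets = defaultdict(list)
    for key, ts in entries.items():
        buckets[int(ts // interval)].append(f"{key}@{ts}")
    return {
        bucket: hashlib.sha256("\n".join(sorted(items)).encode()).hexdigest()
        for bucket, items in buckets.items()
    }


def diverging_buckets(local: Dict[int, str], remote: Dict[int, str]) -> List[int]:
    """Buckets that are missing on one side or hash differently."""
    return sorted(b for b in local.keys() | remote.keys()
                  if local.get(b) != remote.get(b))


# Hypothetical stores: the office node has one write the home node lacks.
home_entries = {"mem/obs/1": 1.0, "mem/obs/2": 12.0}
office_entries = {"mem/obs/1": 1.0, "mem/obs/2": 12.0, "mem/obs/3": 25.0}

to_sync = diverging_buckets(digest(home_entries), digest(office_entries))
```

Only the bucket holding the missing write shows up in `to_sync`, so the two nodes exchange small hashes for everything else instead of the data itself.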

An eventually-consistent KV

Writes don't have to reach every node instantly. Even on a flaky network, things sync automatically once a connection comes back. That's where the offline tolerance comes from.
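
Here is a rough sketch of why that converges, assuming a last-writer-wins rule keyed on the timestamp. The `merge` function, the replica layout, and the sample keys and values are hypothetical, and the real replication logic is more involved than this.

```python
from typing import Dict, Tuple

Timestamp = Tuple[float, int]
Replica = Dict[str, Tuple[Timestamp, str]]  # key -> (timestamp, value)


def merge(local: Replica, remote: Replica) -> Replica:
    """Keep, for every key, the entry with the highest timestamp."""
    merged = dict(local)
    for key, entry in remote.items():
        if key not in merged or entry[0] > merged[key][0]:
            merged[key] = entry
    return merged


# Two nodes that wrote independently while disconnected.
home = {
    "mem/obs/a": ((112.0, 0), "use pytest"),
    "mem/obs/b": ((90.0, 0), "home only"),
}
office = {
    "mem/obs/a": ((112.0, 2), "use unittest"),
    "mem/obs/c": ((95.0, 0), "office only"),
}

# After reconnecting, each side merges what the other one has.
home_after = merge(home, office)
office_after = merge(office, home)
```

Both sides end up with the same three keys and the same winner for the contested one, whichever direction the sync runs first. That is all "eventually consistent" has to mean for this use case.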

Architecture: the big picture

The write path

When an agent calls save_observation, this is what happens:

store.put_observation() issues a PUT mem/obs/... to Zenoh.
zenohd's storage_manager persists it to RocksDB.
The replication plugin propagates it to the zenohd on other nodes.
On success, the calling process upserts into its own SQLite right away.
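
The steps above can be sketched as a write-through function. This is not the kioku-mesh source: `open_index`, the table layout, and the `put` callable that stands in for the Zenoh PUT are assumptions, so the example runs without a Zenoh session.

```python
import json
import sqlite3
from typing import Callable


def open_index(path: str = ":memory:") -> sqlite3.Connection:
    """Open the local index (hypothetical schema)."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS observations "
        "(id TEXT PRIMARY KEY, body TEXT NOT NULL)"
    )
    return db


def put_observation(put: Callable[[str, str], None], db: sqlite3.Connection,
                    obs_id: str, body: dict) -> str:
    """Publish to the mesh first, then mirror the write into SQLite."""
    key = f"mem/obs/{obs_id}"
    payload = json.dumps(body)
    put(key, payload)          # stand-in for the Zenoh PUT; raises on failure
    with db:                   # reached only if the PUT succeeded
        db.execute(
            "INSERT OR REPLACE INTO observations (id, body) VALUES (?, ?)",
            (key, payload),
        )
    return key


sent = []                      # records what would have gone to the mesh
db = open_index()
put_observation(lambda k, v: sent.append((k, v)), db,
                "demo-1", {"text": "prefers ruff"})
put_observation(lambda k, v: sent.append((k, v)), db,
                "demo-1", {"text": "prefers ruff and mypy"})
rows = db.execute("SELECT id, body FROM observations").fetchall()
```

Ordering the two writes this way means a failed PUT never leaves a row in the local index that the rest of the mesh has not seen.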

The read path

Search is served from the local SQLite index. It doesn't ask Zenoh unless the index is disabled, in which case it falls back to a full Zenoh scan.

# store.search_observations() (sketch)
def search_observations(query, ...):
    idx = get_index()
    if not idx.disabled:
        return idx.search(...)          # served from SQLite
    return _search_via_zenoh(...)       # fallback: full Zenoh scan
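
For a feel of what the SQLite branch buys you, here is a small self-contained sketch. The schema, the `search` helper, and the sample rows are hypothetical, not kioku-mesh's real index. The point is that both the filter and the `limit` are handled inside SQLite, so only matching rows ever reach Python.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE observations "
    "(id TEXT PRIMARY KEY, body TEXT NOT NULL, created REAL NOT NULL)"
)
db.executemany(
    "INSERT INTO observations VALUES (?, ?, ?)",
    [
        ("mem/obs/1", "run pytest -q before commit", 1.0),
        ("mem/obs/2", "office box uses Python 3.12", 2.0),
        ("mem/obs/3", "pytest fixtures live in conftest.py", 3.0),
    ],
)


def search(db: sqlite3.Connection, query: str, limit: int = 10):
    """Substring search, newest first, with the limit applied by SQLite."""
    cursor = db.execute(
        "SELECT id, body FROM observations "
        "WHERE body LIKE ? ORDER BY created DESC LIMIT ?",
        (f"%{query}%", limit),
    )
    return cursor.fetchall()


newest_match = search(db, "pytest", limit=1)
all_matches = search(db, "pytest")
```

A production index would more likely use full-text search than `LIKE`, but the shape of the call is the same.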

Keeping SQLite fresh

So how does SQLite stay current? Two mechanisms.

1. Zenoh subscriber (real-time)

At startup, each process subscribes to mem/obs/** and mem/tomb/**. Every time a PUT arrives from another node via replication, the subscriber callback upserts it into SQLite immediately.

def on_obs(sample):
    obs = Observation.from_json(sample.payload.to_string())
    idx.upsert(obs)  # write straight into SQLite

sub_obs = session.declare_subscriber('mem/obs/**', on_obs)

2. Rebuild on startup (consistency)

A long-running process (the MCP server) runs rebuild_from_zenoh() once at startup, doing a full scan of Zenoh's mem/obs/** to rebuild SQLite. This picks up anything that changed while the process was down, such as replication from other nodes.

In practice that's about 0.4 seconds for 50k records. It only happens once at MCP server startup, so you don't notice it.
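
A rebuild along these lines could look like the sketch below. It is an assumption about the shape, not the real rebuild_from_zenoh(): `rebuild_index` takes any iterable of (key, JSON payload) pairs in place of the Zenoh scan results, and the table layout and the `text` field are hypothetical.

```python
import json
import sqlite3
from typing import Iterable, Tuple


def rebuild_index(db: sqlite3.Connection,
                  samples: Iterable[Tuple[str, str]]) -> int:
    """Drop the local index and refill it from a full scan in one transaction."""
    with db:
        db.execute("DROP TABLE IF EXISTS observations")
        db.execute(
            "CREATE TABLE observations "
            "(id TEXT PRIMARY KEY, body TEXT NOT NULL)"
        )
        db.executemany(
            "INSERT OR REPLACE INTO observations (id, body) VALUES (?, ?)",
            ((key, json.loads(payload)["text"]) for key, payload in samples),
        )
    return db.execute("SELECT COUNT(*) FROM observations").fetchone()[0]


db = sqlite3.connect(":memory:")

# A stale index left over from the previous run.
rebuild_index(db, [("mem/obs/stale", json.dumps({"text": "old note"}))])

# What the full scan returns at the next startup (hypothetical data).
scan = [(f"mem/obs/{i}", json.dumps({"text": f"note {i}"})) for i in range(1000)]
count = rebuild_index(db, scan)
stale_rows = db.execute(
    "SELECT COUNT(*) FROM observations WHERE id = 'mem/obs/stale'"
).fetchone()[0]
```

Rebuilding from scratch rather than patching means entries that disappeared from the mesh while the process was down disappear from the index too.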

Performance

What sold me on SQLite was the measurements. I started out running full scans against Zenoh, but fetching was too slow, so I moved to SQLite as a cache.

Full scan over Zenoh (the old path)

Pull everything over the network, then filter in Python.

Store size	Latency
16k records	2.2 s (latency scales with the number of records scanned, regardless of `limit`)
36MB store	timeout (hits the 10-second limit)

SQLite local index (the current path)

Just throw a SQL query at the local SQLite.

Records	Measurement	Result
50k	rebuild time	~0.4 s
50k	query latency (p99)	~0.04 ms
50k	Tier-4, real HW	sub-200 ms confirmed
50k	file size	~49 MB

The gap between a full Zenoh scan and SQLite is roughly 50,000x. On a 36MB store, searching through Zenoh times out, while SQLite returns in under a millisecond even at 50k records.
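
The "roughly 50,000x" figure comes from dividing the two latencies in the tables above. The variable names here are just labels for that arithmetic.

```python
scan_latency_s = 2.2        # full Zenoh scan, 16k records
query_latency_s = 0.04e-3   # SQLite query p99 (0.04 ms), 50k records

speedup = scan_latency_s / query_latency_s   # about 55,000
```

The two measurements use different record counts, so treat the ratio as an order-of-magnitude comparison rather than a strict benchmark. If anything it understates the gap, since the SQLite number is for the larger store.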

Wrapping up

The short answer to "why does kioku-mesh use Zenoh" is: to build something that doesn't break when several machines write at the same time.

SQLite is fast and easy, but it lives inside a single machine. Adding Zenoh syncs data across machines and resolves write conflicts automatically through HLC. Reads stay fast because they come straight from SQLite. Zenoh does the "write and sync" job, SQLite does the "read fast" job, and that split is what makes the distributed long-term memory work.

kioku-mesh is open source.

h-wata / kioku-mesh

Shared memory for AI coding agents, across tools and machines. Local-first SQLite, optional Zenoh+RocksDB mesh, MCP-native.

kioku (記憶) means memory.

kioku-mesh gives coding agents a shared memory store. Claude Code, Codex CLI, Gemini CLI, and other MCP clients can save and search the same observations from one machine or from several machines on a trusted LAN/VPN mesh.

The default setup is local and needs no daemon. Mesh mode is available when you want the same memory pool replicated between hosts.

Why kioku-mesh

Coding-agent context gets fragmented across machines: which laptop did that work, what did the agent on the other host decide, and why does a secondary agent have to re-read everything from scratch just to give a quick second opinion? kioku-mesh keeps that memory in one shared pool so any agent, on any of your machines, can recall it.

Unlike long-term memory tools that store everything in one place, the shared pool is a peer-to-peer…
