What building a multi-agent runtime taught me about isolation and data leaks

Marcel Wege — Fri, 05 Jun 2026 08:00:05 +0000

The model was the easy part.

Prompting, tool-calling, getting usable output back: mostly solved, mostly boring. What cost me weeks was everything around the model. Where memory lives. What a tool is allowed to hand back. Whether a non-engineer can build any of it without me.

This is from building omadia, an agent runtime we run in production and ship as open source. The point of it isn't "more AI magic." It's the control layer under the magic. Your data stays on your own infrastructure, you can audit what the agents do, and you decide what they are allowed to touch.

Memory between agents bleeds, and you notice too late

One agent, no problem. Several agents sharing one memory store, and context starts crossing wires: agent A "knows" something only agent B was ever told. It is intermittent, it looks like a hallucination, and it quietly burns trust once different agents serve different people.

What fixed it was access-aware memory. Each agent and each orchestrator gets its own namespace and its own slice of the graph, and the routing decision happens where the channel binding resolves, not five layers deep in the agent. An agent reads its own context or nothing.

// scope every read/write to the agent that owns the turn
const mem = memoryFor(agent.id);          // isolated namespace + graph slice
await mem.write(turn);                    // lands only in this agent's namespace
const context = await mem.recall(query);  // sees this agent's own context or nothing

The cost: sharing on purpose is now the harder path. That is the right trade. Isolation should be the default you opt out of, not the thing you remember to add.
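
To make the trade concrete, here is a minimal in-memory sketch of a memoryFor-style handle. It is not omadia's code: MemoryStore, grantRead, the substring match in recall, and the plain-string entries are all assumptions made for illustration, and it is synchronous where the snippet above is async. What it tries to show is isolation as the default, with sharing as a separate, explicit call.

```typescript
// Illustrative sketch only: MemoryStore, grantRead and string entries are assumptions.
type AgentId = string;

interface MemoryHandle {
  write(entry: string): void;
  recall(query: string): string[];
}

class MemoryStore {
  // One private list of entries per agent namespace.
  private namespaces = new Map<AgentId, string[]>();
  // Explicit read grants: reader -> owners whose namespace the reader may also see.
  private grants = new Map<AgentId, Set<AgentId>>();

  // Sharing is a separate, deliberate call; nothing is shared by default.
  grantRead(owner: AgentId, reader: AgentId): void {
    const owners = this.grants.get(reader) ?? new Set<AgentId>();
    owners.add(owner);
    this.grants.set(reader, owners);
  }

  // The handle is bound to one agent and has no parameter for naming another namespace.
  memoryFor(agentId: AgentId): MemoryHandle {
    return {
      write: (entry: string): void => {
        const own = this.namespaces.get(agentId) ?? [];
        own.push(entry);
        this.namespaces.set(agentId, own);
      },
      recall: (query: string): string[] => {
        const readable: AgentId[] = [agentId];
        const granted = this.grants.get(agentId);
        if (granted) granted.forEach((owner) => readable.push(owner));
        const needle = query.toLowerCase();
        const hits: string[] = [];
        for (const owner of readable) {
          for (const entry of this.namespaces.get(owner) ?? []) {
            if (entry.toLowerCase().includes(needle)) hits.push(entry);
          }
        }
        return hits;
      },
    };
  }
}

const store = new MemoryStore();
const agentA = store.memoryFor("agent-a");
const agentB = store.memoryFor("agent-b");

agentB.write("customer prefers email");
const beforeGrant = agentA.recall("customer"); // agent A cannot see agent B's entry

store.grantRead("agent-b", "agent-a");         // deliberate opt-in
const afterGrant = agentA.recall("customer");  // agent B's entry is now readable by A
```

In this sketch the only way for agent A to read agent B's entries is the grantRead call, so a cross-agent read shows up as a line someone wrote on purpose rather than a side effect of a shared store.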

The leak is in the tool output, not the prompt

Everybody guards the prompt. The leak that actually got me came out the other side, in tool results.

A tool returns a tidy "summary plus details" object. Buried in the details is a field, sometimes PII, that the current agent has no business seeing. Nobody injected anything. The data just rode along in the response and the model repeated it.

So tool output became an untrusted boundary, the same way user input is. omadia redacts results per record before the model sees them and restores the full data only where the agent's access actually allows it. If you build around Claude or ChatGPT, this is the piece people skip and then regret.
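
A rough sketch of what per-record redaction with restore-on-access can look like. The field names, the scope strings, the "[redacted]" placeholder and both function names are assumptions for illustration, not omadia's implementation. The sketch withholds every field that is not explicitly marked visible, so a new field in a tool response is hidden until someone decides otherwise.

```typescript
// Illustrative sketch only: field names, scopes and the placeholder are assumptions.
type ToolRecord = Record<string, string>;
type AccessScope = string;

// Fields any agent may see. Everything else is withheld by default.
const alwaysVisible: string[] = ["id", "summary"];
// Withheld fields that can be restored, and the scope required to restore them.
const restorable = new Map<string, AccessScope>([
  ["email", "pii:read"],
  ["iban", "finance:read"],
]);

interface RedactedRecord {
  safe: ToolRecord;     // what goes into the model's context
  withheld: ToolRecord; // original values, kept outside the model's context
}

function redactRecord(record: ToolRecord): RedactedRecord {
  const safe: ToolRecord = {};
  const withheld: ToolRecord = {};
  for (const field of Object.keys(record)) {
    if (alwaysVisible.indexOf(field) !== -1) {
      safe[field] = record[field];
    } else {
      safe[field] = "[redacted]";
      withheld[field] = record[field];
    }
  }
  return { safe, withheld };
}

// Restore only the fields whose required scope the agent actually holds.
function restoreFor(redacted: RedactedRecord, scopes: AccessScope[]): ToolRecord {
  const view: ToolRecord = { ...redacted.safe };
  for (const field of Object.keys(redacted.withheld)) {
    const required = restorable.get(field);
    if (required !== undefined && scopes.indexOf(required) !== -1) {
      view[field] = redacted.withheld[field];
    }
  }
  return view;
}

const toolResult: ToolRecord = {
  id: "c-17",
  summary: "Renewal due in May",
  email: "ana@example.com",
  internalNote: "escalated twice",
};

const redacted = redactRecord(toolResult);
const modelView = redacted.safe;                        // email and internalNote are placeholders
const supportView = restoreFor(redacted, ["pii:read"]); // email restored, internalNote still withheld
```

The design choice worth noting is the direction of the default: a field with no entry in restorable, like internalNote here, stays redacted for everyone, so forgetting to classify a field fails closed.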

The builder I was proud of was the one people abandoned

This is the part I got wrong. I built a full agent builder, every option exposed, and handed it to the non-engineers it was meant for. They opened it once and left.

The version that worked was smaller. Describe the agent in plain language, connect a couple of tools, watch it run in a live preview next to the description. Same engine underneath. The power-user surface still exists for people who want it. It just stopped being the front door.

For a builder, the default screen is a product decision, not a config flag. Aim it at the least technical person you actually want to keep.

The plugin seams decide whether any of this scales

omadia is built from plugins, and the seam that earns its keep is the channel: where the agent actually talks. A conversation starts in Telegram, moves to MS Teams, and the agent keeps its context across the jump. In an enterprise that handoff matters more than any single feature, because Teams is where the work already happens. Integrations and capability providers like graph memory plug into the same seams.
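
One way to picture the channel seam is a binding table that resolves a channel identity to an agent and a conversation before any agent code runs. This is a sketch under assumptions: ChannelBindings, ConversationLog, handleInbound and the example user ids are hypothetical names, and how two channel identities get verified as the same person is left out entirely.

```typescript
// Illustrative sketch only: types and method names are assumptions, not omadia's plugin API.
type ChannelName = "telegram" | "teams";

interface ChannelBinding {
  agentId: string;
  conversationId: string;
}

class ChannelBindings {
  private byIdentity = new Map<string, ChannelBinding>();

  private key(channel: ChannelName, externalUserId: string): string {
    return `${channel}:${externalUserId}`;
  }

  // Attach a channel identity to a conversation (first contact or an explicit, verified link).
  bind(channel: ChannelName, externalUserId: string, binding: ChannelBinding): void {
    this.byIdentity.set(this.key(channel, externalUserId), binding);
  }

  // Resolve an inbound sender; undefined means the identity is not bound to anything.
  resolve(channel: ChannelName, externalUserId: string): ChannelBinding | undefined {
    return this.byIdentity.get(this.key(channel, externalUserId));
  }
}

class ConversationLog {
  private turns = new Map<string, string[]>();

  append(conversationId: string, text: string): void {
    const existing = this.turns.get(conversationId) ?? [];
    existing.push(text);
    this.turns.set(conversationId, existing);
  }

  history(conversationId: string): string[] {
    return (this.turns.get(conversationId) ?? []).slice();
  }
}

// Routing is decided here, at the binding, before any agent code runs.
function handleInbound(
  bindings: ChannelBindings,
  log: ConversationLog,
  channel: ChannelName,
  externalUserId: string,
  text: string,
): string[] | undefined {
  const binding = bindings.resolve(channel, externalUserId);
  if (!binding) return undefined; // unknown sender: no context is handed out
  log.append(binding.conversationId, text);
  return log.history(binding.conversationId);
}

const bindings = new ChannelBindings();
const log = new ConversationLog();
const conversation: ChannelBinding = { agentId: "support-agent", conversationId: "conv-1" };
bindings.bind("telegram", "tg-981", conversation);
bindings.bind("teams", "aad-55", conversation);

handleInbound(bindings, log, "telegram", "tg-981", "My invoice is wrong");
const seenFromTeams = handleInbound(bindings, log, "teams", "aad-55", "Following up from Teams");
const seenFromStranger = handleInbound(bindings, log, "teams", "aad-99", "hello");
```

Because both identities resolve to the same conversationId, the Teams message arrives with the Telegram turn already in its history, while an unbound identity gets no history at all.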

Get those seams wrong early and everything downstream is rework. I know, because I did.

Two questions I am still arguing with myself about

How would you scope memory between orchestrators? Strict per-orchestrator namespaces stay clean but make deliberate sharing the awkward case.
Do you redact tool output before or after expanding the record? I expand first, then gate per field. The ordering has real consequences.
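
One consequence of the ordering in the second question can be shown in a few lines. This is a sketch, not omadia's pipeline: the flat record shape, the dotted field paths, the customers lookup and the allowlist are assumptions chosen to keep the example small. It runs the same two functions in both orders.

```typescript
// Illustrative sketch only: the record shape, dotted paths and allowlist are assumptions.
type FlatRecord = Record<string, string>;

interface Customer {
  name: string;
  email: string;
}

const customers = new Map<string, Customer>([
  ["c-17", { name: "Ana", email: "ana@example.com" }],
]);

// Expansion: resolve customerRef and inline the referenced fields as dotted paths.
function expand(record: FlatRecord): FlatRecord {
  const out: FlatRecord = { ...record };
  const ref = record["customerRef"];
  const customer = ref !== undefined ? customers.get(ref) : undefined;
  if (customer) {
    out["customer.name"] = customer.name;
    out["customer.email"] = customer.email;
  }
  return out;
}

// Gate: keep only the field paths this agent is allowed to see.
function gate(record: FlatRecord, allowedPaths: string[]): FlatRecord {
  const out: FlatRecord = {};
  for (const path of Object.keys(record)) {
    if (allowedPaths.indexOf(path) !== -1) out[path] = record[path];
  }
  return out;
}

const ticket: FlatRecord = { title: "Invoice mismatch", customerRef: "c-17" };
const allowedPaths = ["title", "customerRef", "customer.name"];

// Expand first, then gate per field: every expanded path passes the allowlist.
const expandThenGate = gate(expand(ticket), allowedPaths);

// Gate first, then expand: fields created by expansion are never checked.
const gateThenExpand = expand(gate(ticket, allowedPaths));
```

With these inputs, expandThenGate has no customer.email, while gateThenExpand carries it through, because the gate ran before that field existed. Gating first can still work if expansion is itself access-aware, but then the check lives in two places.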