Lars Winstand

Posted on • Originally published at standardcompute.com

I thought multi-agent meant more prompts until I saw 3 ways OpenClaw users are actually splitting the work

I went into a bunch of OpenClaw discussions expecting the usual advice about subagents: better prompts, cleaner folders, maybe some heroic config.

What I found was more interesting.

The OpenClaw setups that actually seem to hold up are not just "one agent with more prompts." They are separate services with separate trust zones.

The pattern that keeps showing up looks like this:

  • a librarian agent
  • an executor agent
  • a company-facing agent

Usually connected over A2A.

That sounds like a small implementation detail. It is not.

A separate prompt inside one workspace is still one workspace:

  • one context blob
  • one tool surface
  • one security boundary
  • one place for bloat to accumulate

A separate OpenClaw instance is different. Now you have real boundaries:

  • different runtimes
  • different API keys
  • different networks
  • different memory policies
  • explicit handoffs

That is where multi-agent starts being architecture instead of roleplay.

The Reddit pattern is ahead of most blog posts

One of the clearest examples was an r/openclaw thread about an A2A plugin:

https://reddit.com/r/openclaw/comments/1t1yf86/i_made_an_openclaw_a2a_plugin_connect_your/

The post itself was small, but the use cases were sharp:

  1. a sandboxed local OpenClaw talking to a full-access cloud OpenClaw
  2. a personal OpenClaw talking to a company-wide OpenClaw for internal services
  3. teammate agents syncing plans over the internet to avoid stepping on each other

That is not prompt organization. That is system design.

And it answers the question I keep seeing from people trying to force multi-agent into one workspace:

Why not just keep everything in one OpenClaw workspace?

Because the boundary is the point.

If your librarian, executor, and company-facing assistant all live in the same workspace, a lot of the specialization is fake.

The librarian can still see too much.

The executor still inherits too much context.

The company-facing assistant is still one bad tool call away from touching something it should not.

Here is the tradeoff in plain terms:

| Approach | What actually happens |
| --- | --- |
| Separate A2A services | Clear trust boundary; can run on different machines or networks; but setup and security overhead are real |
| Subagents inside one OpenClaw workspace | Fast and simple, lower latency; but weaker isolation of tools and context, and easier to bloat |
| n8n for orchestration plus agents for reasoning | Great for deterministic triggers and data movement; but glue code gets messy fast |

My opinionated take: multi-agent is only worth the complexity when the boundary is real.

If the split is just:

  • this prompt is the researcher
  • this prompt is the coder

then you probably do not have multiple agents. You have one agent wearing name tags.

The librarian pattern is better than it sounds

A commenter in that A2A thread described a pattern I think more teams should steal:

I need an agent that acts as a librarian and gatekeeper for a RAG implementation.

That is a strong design choice because it forces a question most agent stacks avoid:

Who is allowed to touch memory, and why?

A librarian agent can own retrieval and document selection.

It can decide:

  • which sources are valid
  • how much context to return
  • whether a query deserves a deep search
  • what gets filtered before it reaches the executor

Then your executor agent can stay focused on doing work instead of dragging your entire RAG stack into every session.

When a separate librarian makes sense

Use a dedicated librarian when:

  • retrieval needs its own rules
  • memory access should be restricted
  • different agents need different knowledge slices
  • you want to keep executor context small

When direct memory access is better

Keep it simple when:

  • everything is local
  • latency matters more than isolation
  • the same agent already owns the knowledge domain
  • you are adding A2A mostly because it sounds advanced

That tradeoff matters more than the label.

Not every boundary should become a network boundary.

But the useful ones usually should.

A practical split: one agent per trust boundary

The cleanest rule I found is this:

  • one agent per trust boundary
  • one agent per memory policy
  • one agent per tool class

That usually gives you something like this:

1. Librarian

Owns:

  • retrieval
  • indexing rules
  • memory access
  • document selection

2. Executor

Owns:

  • actions
  • code changes
  • task completion
  • narrow operational tools

3. Company-facing interface

Owns:

  • internal service access
  • approvals
  • policy enforcement
  • boring but critical guardrails

If two of those share the same tools, same memory, same runtime, and same risk profile, they probably should not be separate yet.

If they differ on any of those, split them.

What this looks like in practice

Here is a simple mental model:

[user/app]
   |
   v
[company-facing OpenClaw]
   |
   +--> [librarian OpenClaw] --> [docs/vector store]
   |
   +--> [executor OpenClaw] --> [repo/tools/shell]


And here is the kind of split I would actually implement.

Company-facing agent

This is the only agent that talks to the outside world.

Responsibilities:

  • receive requests
  • check policy
  • decide whether work needs retrieval, execution, or both
  • redact or reshape requests before forwarding
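
Those responsibilities can be sketched as a single routing function. The policy list, redaction pattern, and keyword heuristics below are made-up placeholders; a real gateway would use something sturdier than keyword matching.

```python
import re

# Company-facing agent sketch: the only component that sees raw requests.

BLOCKED_TOPICS = {"payroll", "credentials"}           # illustrative policy
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")     # crude PII pattern

def route_request(text):
    # 1. Check policy before anything is forwarded.
    if any(topic in text.lower() for topic in BLOCKED_TOPICS):
        return {"action": "reject", "reason": "policy"}
    # 2. Redact before the request crosses a trust boundary.
    clean = EMAIL_RE.sub("[redacted]", text)
    # 3. Decide which downstream agent should see it.
    needs_retrieval = any(w in clean.lower() for w in ("docs", "how", "what"))
    needs_execution = any(w in clean.lower() for w in ("run", "fix", "deploy"))
    targets = [name for name, needed in
               (("librarian", needs_retrieval), ("executor", needs_execution)) if needed]
    return {"action": "forward", "targets": targets or ["librarian"], "request": clean}
```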

Librarian agent

This agent gets read-only access to your knowledge systems.

Responsibilities:

  • search docs
  • fetch relevant chunks
  • summarize long context
  • return only what downstream agents need

Executor agent

This one gets the dangerous tools.

Responsibilities:

  • write code
  • run commands
  • modify files
  • execute workflows

That split avoids the worst anti-pattern: giving the same agent broad memory access and broad tool access and then hoping the prompt keeps it safe.

Security is where the fantasy ends

This is the first serious objection in every good A2A discussion, and it should be.

In that same A2A thread, someone pointed out the obvious risk: inbound calls can trigger OpenClaw tools.

That is not paranoia. That is basic engineering.

The plugin author responded with a few practical details:

  • secure-by-default posture
  • per-agent API keys
  • sender IDs
  • new conversation threads for each inbound message
  • Tailscale for receiving messages

They also suggested using a separate profile for experiments:

openclaw --profile gateway

That is the right mindset.
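
To illustrate that posture, here is a sketch of an inbound check combining per-agent API keys, sender IDs, and a fresh thread per message. The key registry and function shape are assumptions for illustration, not the plugin's actual API.

```python
import hashlib
import hmac
import uuid

# Per-agent shared secrets (illustrative -- load from a secret store in practice).
AGENT_KEYS = {"librarian": b"key-a", "executor": b"key-b"}

def accept_message(sender_id, body, signature):
    key = AGENT_KEYS.get(sender_id)
    if key is None:
        return None  # unknown sender: secure-by-default means drop it
    expected = hmac.new(key, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):
        return None  # bad signature: drop it
    # New conversation thread per inbound message, so a caller
    # cannot poison an existing context.
    return {"thread_id": uuid.uuid4().hex, "sender": sender_id, "body": body}
```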

A2A is not magic. It is distributed systems with LLMs attached.

Which means you inherit the normal taxes:

  • security tax
  • ops tax
  • debugging tax
  • latency tax

If you are not getting a real boundary in return, do not pay those taxes.

Add n8n carefully or you will build glue-code soup

Another useful OpenClaw thread described a setup with:

  • a shared VPS
  • multiple OpenClaw agents
  • n8n
  • local users connecting through Antigravity

Source:

https://reddit.com/r/openclaw/comments/1t0nnkz/am_i_overengineering_this_openclaw_n8n/

That architecture is not crazy.

But it gets messy fast if every system co-owns the workflow.

My rule of thumb:

  • let n8n handle deterministic flows, triggers, schedules, and integrations
  • let OpenClaw handle reasoning, exception handling, and ambiguous tasks
  • keep cross-service handoffs lower than your first instinct

A simple split looks like this:

n8n:
  owns:
    - cron jobs
    - webhooks
    - API integrations
    - retries

openclaw:
  owns:
    - planning
    - reasoning
    - ambiguous decisions
    - code generation
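
In code, that rule of thumb amounts to deterministic-first dispatch: known task types run as plain workflow handlers, and only the leftovers escalate to an agent. The handlers and task names below are made up.

```python
# Deterministic work stays in the workflow engine; only ambiguous
# tasks cross over to an agent. Handlers are stubs.

DETERMINISTIC = {
    "send_report": lambda payload: f"report sent to {payload}",
    "sync_backup": lambda payload: f"backup synced: {payload}",
}

def dispatch(task, payload):
    handler = DETERMINISTIC.get(task)
    if handler:
        # n8n territory: trigger, run, done. No model call needed.
        return {"handled_by": "workflow", "result": handler(payload)}
    # Ambiguous or novel task: this is where an agent call would go.
    return {"handled_by": "agent", "result": f"escalated: {task}"}
```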

If you make n8n, OpenClaw, and your local client all coordinate state, debugging gets ugly.

You end up tracing things like:

  1. OpenClaw A calls OpenClaw B
  2. OpenClaw B triggers n8n
  3. n8n writes state
  4. OpenClaw A no longer trusts the state it originally requested

That is not a model problem. That is orchestration debt.

The expensive part is often not the model

One of the most useful OpenClaw cost posts I found came from a user who spent about $850 in a month, including around $350 in one day:

https://reddit.com/r/openclaw/comments/1t2fd8o/spent_850_on_openclaw_in_a_month_350_in_one_day/

The key line was this:

At first I thought it was model cost. It wasn’t. It was bad system design.

That should be printed on a sticker and attached to every agent dashboard.

The fixes were not exotic:

  • strict context pruning
  • short sessions
  • n8n for repeat tasks
  • workspace cleanup

They reported 70 to 90 percent savings after redesigning the stack.

That matches what a lot of teams eventually learn:

The bill is not just about which model you picked.

It is about:

  • how much useless context you drag around
  • how often the wrong agent gets invoked
  • how many handoffs you created
  • how much deterministic work you let an LLM do
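
Of those, useless context is the one you can attack directly in code. A minimal pruning sketch, assuming a word-count stand-in for a real tokenizer and a made-up budget:

```python
MAX_TOKENS = 2000  # illustrative budget

def estimate_tokens(message):
    # Crude stand-in for a real tokenizer.
    return len(message["text"].split())

def prune_context(messages):
    # Keep pinned messages (system rules etc.), then the most recent
    # turns that still fit the budget. Everything else is dropped.
    pinned = [m for m in messages if m.get("pinned")]
    budget = MAX_TOKENS - sum(estimate_tokens(m) for m in pinned)
    kept = []
    for m in reversed([m for m in messages if not m.get("pinned")]):
        cost = estimate_tokens(m)
        if cost > budget:
            break
        kept.append(m)
        budget -= cost
    return pinned + list(reversed(kept))
```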

This is exactly why real boundaries matter.

A librarian agent can stay small.

An executor can stay sharp.

A company-facing agent can stay boring.

That is not architecture purity. That is cost control.

A minimal implementation sketch

If I were building this today, I would start with something like this.

1. Create isolated runtimes

openclaw --profile company
openclaw --profile librarian
openclaw --profile executor

2. Give each runtime only the tools it needs

{
  "company": ["policy-check", "request-router"],
  "librarian": ["vector-search", "doc-fetch", "rerank"],
  "executor": ["git", "shell", "test-runner"]
}

3. Keep the message contract small

{
  "task": "summarize auth flow docs relevant to OAuth token refresh bugs",
  "constraints": ["read-only", "max 10 chunks"],
  "request_id": "req_123"
}

4. Return only what the next agent needs

{
  "request_id": "req_123",
  "summary": "Token refresh logic lives in auth-service and mobile-sdk",
  "sources": [
    "docs/auth/refresh-flow.md",
    "docs/mobile/oauth.md"
  ]
}

That one habit alone prevents a lot of context bloat.
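
Enforcing that habit can be as simple as validating both sides of the contract. The field names follow the JSON examples above; the strict-rejection behavior is my own suggestion, not anything OpenClaw mandates.

```python
# Enforce the small message contract at each handoff.

REQUEST_FIELDS = {"task", "constraints", "request_id"}
RESPONSE_FIELDS = {"request_id", "summary", "sources"}

def validate(message, required):
    missing = required - message.keys()
    extra = message.keys() - required
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if extra:
        # Rejecting extras is what keeps context bloat from creeping in.
        raise ValueError(f"unexpected fields: {sorted(extra)}")
    return message
```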

How I would decide whether to split an agent

Before creating a new agent, ask:

  1. should this component have different tool permissions?
  2. should this component have different memory access?
  3. should this component run in a different network or trust zone?
  4. would this split reduce context size in a meaningful way?
  5. can I explain the boundary without using the phrase "it feels cleaner"?

If the answer to most of those is no, keep one agent.

If the answer is yes, split it.
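
The five questions can even live as a checklist in code. This is a hypothetical helper, just the decision rule made explicit:

```python
# The split-or-keep checklist as a majority vote.

QUESTIONS = (
    "different tool permissions",
    "different memory access",
    "different network or trust zone",
    "meaningful context reduction",
    "boundary explainable without 'it feels cleaner'",
)

def should_split(answers):
    # `answers` maps each question to True/False.
    yes = sum(1 for q in QUESTIONS if answers.get(q))
    return yes > len(QUESTIONS) // 2  # most answers yes -> split
```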

Where Standard Compute fits

There is one more practical issue here: once you start doing multi-agent properly, request volume goes up fast.

Not because you are being wasteful. Because clean architecture creates more small calls:

  • routing calls
  • retrieval calls
  • execution calls
  • retries
  • background automations

That is exactly where per-token pricing becomes annoying.

You stop optimizing for quality and start optimizing for what will not surprise you on the invoice.

For OpenClaw users running always-on agents, that is backwards.

Standard Compute is built for this exact situation:

  • unlimited AI compute for OpenClaw at a flat monthly price
  • no per-token billing
  • works with existing OpenClaw setups using a custom prompt
  • dynamic routing across GPT-5.4, Claude Opus 4.6, and Grok 4.20
  • plans from $9 to $399 per month

If your stack is moving from "one giant workspace" to actual multi-agent services, predictable cost matters a lot more than people admit.

Because the fastest way to ruin a good architecture is to make developers afraid to let agents run.

The boring takeaway that will save you later

If you are building with OpenClaw, do not start with:

  • how many agents should I have?

Start with:

  1. which agent should know this?
  2. which agent should be allowed to do this?
  3. which agent should pay the context cost for this?

If all three answers point to the same place, keep it in one workspace.

If they do not, stop stuffing more prompts into one bot and calling it architecture.

That is the shift I keep seeing in OpenClaw discussions.

Not more agents for the sake of it.

Better boundaries.

Less context bloat.

Fewer surprise bills.

And systems that still make sense when they are running under pressure.
