Lars Winstand

Posted on • Originally published at standardcompute.com

I thought multi-agent meant more prompts until I saw 3 ways OpenClaw users are actually splitting the work

I went into a bunch of OpenClaw discussions expecting the usual advice about subagents: better prompts, cleaner folders, maybe some heroic config.

What I found was more interesting.

The OpenClaw setups that actually seem to hold up are not just "one agent with more prompts." They are separate services with separate trust zones.

The pattern that keeps showing up looks like this:

  • a librarian agent
  • an executor agent
  • a company-facing agent

Usually connected over A2A.

That sounds like a small implementation detail. It is not.

A separate prompt inside one workspace is still one workspace:

  • one context blob
  • one tool surface
  • one security boundary
  • one place for bloat to accumulate

A separate OpenClaw instance is different. Now you have real boundaries:

  • different runtimes
  • different API keys
  • different networks
  • different memory policies
  • explicit handoffs

That is where multi-agent starts being architecture instead of roleplay.

The Reddit pattern is ahead of most blog posts

One of the clearest examples was an r/openclaw thread about an A2A plugin:

https://reddit.com/r/openclaw/comments/1t1yf86/i_made_an_openclaw_a2a_plugin_connect_your/

The post itself was small, but the use cases were sharp:

  1. a sandboxed local OpenClaw talking to a full-access cloud OpenClaw
  2. a personal OpenClaw talking to a company-wide OpenClaw for internal services
  3. teammate agents syncing plans over the internet to avoid stepping on each other

That is not prompt organization. That is system design.

And it answers the question I keep seeing from people trying to force multi-agent into one workspace:

Why not just keep everything in one OpenClaw workspace?

Because the boundary is the point.

If your librarian, executor, and company-facing assistant all live in the same workspace, a lot of the specialization is fake.

The librarian can still see too much.

The executor still inherits too much context.

The company-facing assistant is still one bad tool call away from touching something it should not.

Here is the tradeoff in plain terms:

| Approach | What actually happens |
| --- | --- |
| Separate A2A services | Clear trust boundary; can run on different machines or networks; but setup and security overhead are real |
| Subagents inside one OpenClaw workspace | Fast and simple, lower latency; but weaker isolation of tools and context, and easier to bloat |
| n8n for orchestration plus agents for reasoning | Great for deterministic triggers and data movement; but glue code gets messy fast |

My opinionated take: multi-agent is only worth the complexity when the boundary is real.

If the split is just:

  • this prompt is the researcher
  • this prompt is the coder

then you probably do not have multiple agents. You have one agent wearing name tags.

The librarian pattern is better than it sounds

A commenter in that A2A thread described a pattern I think more teams should steal:

I need an agent that acts as a librarian and gatekeeper for a RAG implementation.

That is a strong design choice because it forces a question most agent stacks avoid:

Who is allowed to touch memory, and why?

A librarian agent can own retrieval and document selection.

It can decide:

  • which sources are valid
  • how much context to return
  • whether a query deserves a deep search
  • what gets filtered before it reaches the executor

Then your executor agent can stay focused on doing work instead of dragging your entire RAG stack into every session.

When a separate librarian makes sense

Use a dedicated librarian when:

  • retrieval needs its own rules
  • memory access should be restricted
  • different agents need different knowledge slices
  • you want to keep executor context small

When direct memory access is better

Keep it simple when:

  • everything is local
  • latency matters more than isolation
  • the same agent already owns the knowledge domain
  • you are adding A2A mostly because it sounds advanced

That tradeoff matters more than the label.

Not every boundary should become a network boundary.

But the useful ones usually should.

A practical split: one agent per trust boundary

The cleanest rule I found is this:

  • one agent per trust boundary
  • one agent per memory policy
  • one agent per tool class

That usually gives you something like this:

1. Librarian

Owns:

  • retrieval
  • indexing rules
  • memory access
  • document selection

2. Executor

Owns:

  • actions
  • code changes
  • task completion
  • narrow operational tools

3. Company-facing interface

Owns:

  • internal service access
  • approvals
  • policy enforcement
  • boring but critical guardrails

If two of those share the same tools, same memory, same runtime, and same risk profile, they probably should not be separate yet.

If they differ on any of those, split them.

What this looks like in practice

Here is a simple mental model:

[user/app]
   |
   v
[company-facing OpenClaw]
   |
   +--> [librarian OpenClaw] --> [docs/vector store]
   |
   +--> [executor OpenClaw] --> [repo/tools/shell]


And here is the kind of split I would actually implement.

Company-facing agent

This is the only agent that talks to the outside world.

Responsibilities:

  • receive requests
  • check policy
  • decide whether work needs retrieval, execution, or both
  • redact or reshape requests before forwarding
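
Those responsibilities can be sketched as a single routing function. The policy list, redaction pattern, and keyword heuristics below are made-up placeholders; a real gateway would use something sturdier than keyword matching.

```python
import re

# Company-facing agent sketch: the only component that sees raw requests.

BLOCKED_TOPICS = {"payroll", "credentials"}           # illustrative policy
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")     # crude PII pattern

def route_request(text):
    # 1. Check policy before anything is forwarded.
    if any(topic in text.lower() for topic in BLOCKED_TOPICS):
        return {"action": "reject", "reason": "policy"}
    # 2. Redact before the request crosses a trust boundary.
    clean = EMAIL_RE.sub("[redacted]", text)
    # 3. Decide which downstream agent should see it.
    needs_retrieval = any(w in clean.lower() for w in ("docs", "how", "what"))
    needs_execution = any(w in clean.lower() for w in ("run", "fix", "deploy"))
    targets = [name for name, needed in
               (("librarian", needs_retrieval), ("executor", needs_execution)) if needed]
    return {"action": "forward", "targets": targets or ["librarian"], "request": clean}
```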

Librarian agent

This agent gets read-only access to your knowledge systems.

Responsibilities:

  • search docs
  • fetch relevant chunks
  • summarize long context
  • return only what downstream agents need

Executor agent

This one gets the dangerous tools.

Responsibilities:

  • write code
  • run commands
  • modify files
  • execute workflows

That split avoids the worst anti-pattern: giving the same agent broad memory access and broad tool access and then hoping the prompt keeps it safe.

Security is where the fantasy ends

This is the first serious objection in every good A2A discussion, and it should be.

In that same A2A thread, someone pointed out the obvious risk: inbound calls can trigger OpenClaw tools.

That is not paranoia. That is basic engineering.

The plugin author responded with a few practical details:

  • secure-by-default posture
  • per-agent API keys
  • sender IDs
  • new conversation threads for each inbound message
  • Tailscale for receiving messages

They also suggested using a separate profile for experiments:

openclaw --profile gateway

That is the right mindset.
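
To illustrate that posture, here is a sketch of an inbound check combining per-agent API keys, sender IDs, and a fresh thread per message. The key registry and function shape are assumptions for illustration, not the plugin's actual API.

```python
import hashlib
import hmac
import uuid

# Per-agent shared secrets (illustrative -- load from a secret store in practice).
AGENT_KEYS = {"librarian": b"key-a", "executor": b"key-b"}

def accept_message(sender_id, body, signature):
    key = AGENT_KEYS.get(sender_id)
    if key is None:
        return None  # unknown sender: secure-by-default means drop it
    expected = hmac.new(key, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):
        return None  # bad signature: drop it
    # New conversation thread per inbound message, so a caller
    # cannot poison an existing context.
    return {"thread_id": uuid.uuid4().hex, "sender": sender_id, "body": body}
```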

A2A is not magic. It is distributed systems with LLMs attached.

Which means you inherit the normal taxes:

  • security tax
  • ops tax
  • debugging tax
  • latency tax

If you are not getting a real boundary in return, do not pay those taxes.

Add n8n carefully or you will build glue-code soup

Another useful OpenClaw thread described a setup with:

  • a shared VPS
  • multiple OpenClaw agents
  • n8n
  • local users connecting through Antigravity

Source:

https://reddit.com/r/openclaw/comments/1t0nnkz/am_i_overengineering_this_openclaw_n8n/

That architecture is not crazy.

But it gets messy fast if every system co-owns the workflow.

My rule of thumb:

  • let n8n handle deterministic flows, triggers, schedules, and integrations
  • let OpenClaw handle reasoning, exception handling, and ambiguous tasks
  • keep cross-service handoffs lower than your first instinct

A simple split looks like this:

n8n:
  owns:
    - cron jobs
    - webhooks
    - API integrations
    - retries

openclaw:
  owns:
    - planning
    - reasoning
    - ambiguous decisions
    - code generation
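
In code, that rule of thumb amounts to deterministic-first dispatch: known task types run as plain workflow handlers, and only the leftovers escalate to an agent. The handlers and task names below are made up.

```python
# Deterministic work stays in the workflow engine; only ambiguous
# tasks cross over to an agent. Handlers are stubs.

DETERMINISTIC = {
    "send_report": lambda payload: f"report sent to {payload}",
    "sync_backup": lambda payload: f"backup synced: {payload}",
}

def dispatch(task, payload):
    handler = DETERMINISTIC.get(task)
    if handler:
        # n8n territory: trigger, run, done. No model call needed.
        return {"handled_by": "workflow", "result": handler(payload)}
    # Ambiguous or novel task: this is where an agent call would go.
    return {"handled_by": "agent", "result": f"escalated: {task}"}
```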

If you make n8n, OpenClaw, and your local client all coordinate state, debugging gets ugly.

You end up tracing things like:

  1. OpenClaw A calls OpenClaw B
  2. OpenClaw B triggers n8n
  3. n8n writes state
  4. OpenClaw A no longer trusts the state it originally requested

That is not a model problem. That is orchestration debt.

The expensive part is often not the model

One of the most useful OpenClaw cost posts I found came from a user who spent about $850 in a month, including around $350 in one day:

https://reddit.com/r/openclaw/comments/1t2fd8o/spent_850_on_openclaw_in_a_month_350_in_one_day/

The key line was this:

At first I thought it was model cost. It wasn’t. It was bad system design.

That should be printed on a sticker and attached to every agent dashboard.

The fixes were not exotic:

  • strict context pruning
  • short sessions
  • n8n for repeat tasks
  • workspace cleanup

They reported 70 to 90 percent savings after redesigning the stack.

That matches what a lot of teams eventually learn:

The bill is not just about which model you picked.

It is about:

  • how much useless context you drag around
  • how often the wrong agent gets invoked
  • how many handoffs you created
  • how much deterministic work you let an LLM do
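
Of those, useless context is the one you can attack directly in code. A minimal pruning sketch, assuming a word-count stand-in for a real tokenizer and a made-up budget:

```python
MAX_TOKENS = 2000  # illustrative budget

def estimate_tokens(message):
    # Crude stand-in for a real tokenizer.
    return len(message["text"].split())

def prune_context(messages):
    # Keep pinned messages (system rules etc.), then the most recent
    # turns that still fit the budget. Everything else is dropped.
    pinned = [m for m in messages if m.get("pinned")]
    budget = MAX_TOKENS - sum(estimate_tokens(m) for m in pinned)
    kept = []
    for m in reversed([m for m in messages if not m.get("pinned")]):
        cost = estimate_tokens(m)
        if cost > budget:
            break
        kept.append(m)
        budget -= cost
    return pinned + list(reversed(kept))
```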

This is exactly why real boundaries matter.

A librarian agent can stay small.

An executor can stay sharp.

A company-facing agent can stay boring.

That is not architecture purity. That is cost control.

A minimal implementation sketch

If I were building this today, I would start with something like this.

1. Create isolated runtimes

openclaw --profile company
openclaw --profile librarian
openclaw --profile executor

2. Give each runtime only the tools it needs

{
  "company": ["policy-check", "request-router"],
  "librarian": ["vector-search", "doc-fetch", "rerank"],
  "executor": ["git", "shell", "test-runner"]
}

3. Keep the message contract small

{
  "task": "summarize auth flow docs relevant to OAuth token refresh bugs",
  "constraints": ["read-only", "max 10 chunks"],
  "request_id": "req_123"
}

4. Return only what the next agent needs

{
  "request_id": "req_123",
  "summary": "Token refresh logic lives in auth-service and mobile-sdk",
  "sources": [
    "docs/auth/refresh-flow.md",
    "docs/mobile/oauth.md"
  ]
}

That one habit alone prevents a lot of context bloat.
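
Enforcing that habit can be as simple as validating both sides of the contract. The field names follow the JSON examples above; the strict-rejection behavior is my own suggestion, not anything OpenClaw mandates.

```python
# Enforce the small message contract at each handoff.

REQUEST_FIELDS = {"task", "constraints", "request_id"}
RESPONSE_FIELDS = {"request_id", "summary", "sources"}

def validate(message, required):
    missing = required - message.keys()
    extra = message.keys() - required
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if extra:
        # Rejecting extras is what keeps context bloat from creeping in.
        raise ValueError(f"unexpected fields: {sorted(extra)}")
    return message
```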

How I would decide whether to split an agent

Before creating a new agent, ask:

  1. should this component have different tool permissions?
  2. should this component have different memory access?
  3. should this component run in a different network or trust zone?
  4. would this split reduce context size in a meaningful way?
  5. can I explain the boundary without using the phrase "it feels cleaner"?

If the answer to most of those is no, keep one agent.

If the answer is yes, split it.
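
The five questions can even live as a checklist in code. This is a hypothetical helper, just the decision rule made explicit:

```python
# The split-or-keep checklist as a majority vote.

QUESTIONS = (
    "different tool permissions",
    "different memory access",
    "different network or trust zone",
    "meaningful context reduction",
    "boundary explainable without 'it feels cleaner'",
)

def should_split(answers):
    # `answers` maps each question to True/False.
    yes = sum(1 for q in QUESTIONS if answers.get(q))
    return yes > len(QUESTIONS) // 2  # most answers yes -> split
```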

Where Standard Compute fits

There is one more practical issue here: once you start doing multi-agent properly, request volume goes up fast.

Not because you are being wasteful. Because clean architecture creates more small calls:

  • routing calls
  • retrieval calls
  • execution calls
  • retries
  • background automations

That is exactly where per-token pricing becomes annoying.

You stop optimizing for quality and start optimizing for what will not surprise you on the invoice.

For OpenClaw users running always-on agents, that is backwards.

Standard Compute is built for this exact situation:

  • unlimited AI compute for OpenClaw at a flat monthly price
  • no per-token billing
  • works with existing OpenClaw setups using a custom prompt
  • dynamic routing across GPT-5.4, Claude Opus 4.6, and Grok 4.20
  • plans from $9 to $399 per month

If your stack is moving from "one giant workspace" to actual multi-agent services, predictable cost matters a lot more than people admit.

Because the fastest way to ruin a good architecture is to make developers afraid to let agents run.

The boring takeaway that will save you later

If you are building with OpenClaw, do not start with:

  • how many agents should I have?

Start with:

  1. which agent should know this?
  2. which agent should be allowed to do this?
  3. which agent should pay the context cost for this?

If all three answers point to the same place, keep it in one workspace.

If they do not, stop stuffing more prompts into one bot and calling it architecture.

That is the shift I keep seeing in OpenClaw discussions.

Not more agents for the sake of it.

Better boundaries.

Less context bloat.

Fewer surprise bills.

And systems that still make sense when they are running under pressure.
