Your Agent Called the Wrong Agent — On Purpose

#ai #security #multiagent #openclaw

You set up thirteen agents. You drew careful boundaries: coaching team over here, SaaS team over there, orchestrator bridges in between. Each agent has an allowAgents list — a whitelist of who it's allowed to talk to.

Then one of your agents just... called someone it wasn't supposed to. Not because of a bug in the routing. Because the model decided to.

The Setup

OpenClaw #63351 describes a multi-agent deployment with 13 agents organized into two teams. Agent vox is allowed to talk to sensei, maestro, and vigil. Agent wattson is not on vox's list.

What Happened

Vox was processing bug reports about a product called Wattson. Its model, Gemini 3 Pro, saw the name in the content and inferred that the agent wattson was the right sessions_send target. The call went through: no error, no warning. The allowAgents config was never consulted.

Two stacked failures:

The LLM inferred a target from content, not instructions
The gateway didn't enforce the boundary

Prompt-Based vs. Gateway-Enforced Security

The workaround was to add a blocklist to the agent's prompt. That works until it doesn't. Prompt-based security relies on the model complying, and compliance varies by model, shifts with context, and breaks under adversarial input.

Gateway-enforced security is deterministic. The check passes or it doesn't.
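To make "the check passes or it doesn't" concrete, here is a minimal sketch of a deny-by-default allowlist check that runs in the gateway, outside the model. Everything here is an assumption for illustration: check_send, gateway_send, AgentNotAllowedError, and the dictionary layout are hypothetical names, not OpenClaw's actual API or config schema. Only the agent names and the allowAgents idea come from the report.

```python
class AgentNotAllowedError(Exception):
    """Raised when a sender targets an agent outside its allowlist."""


# Hypothetical allowlist mirroring the setup above: vox may reach three agents.
ALLOW_AGENTS = {
    "vox": frozenset({"sensei", "maestro", "vigil"}),
}


def check_send(sender, target, allow_agents=ALLOW_AGENTS):
    """Deny by default: a sender with no entry may call no one."""
    allowed = allow_agents.get(sender, frozenset())
    if target not in allowed:
        raise AgentNotAllowedError(f"{sender} may not send to {target}")


def gateway_send(sender, target, message, deliver):
    """Run the check before delivery, whatever target the model chose."""
    check_send(sender, target)
    return deliver(target, message)
```

With this shape, a model that infers wattson from a bug report still gets an exception instead of a delivered message, and the outcome does not depend on what the prompt says.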

If your silos are prompt-enforced only, you don't have silos — you have suggestions.

Takeaways

Don't trust prompt-based access control as your only gate
Test your framework's boundary enforcement actively
Log unauthorized cross-agent attempts
Treat agent-to-agent communication like network traffic — firewalls, not polite requests
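The testing and logging takeaways can be sketched together. The snippet below is an illustration under stated assumptions, not a real framework's test harness: audited_check and find_leaks are hypothetical helpers, and is_delivered stands in for whatever call tells you a message actually reached its target in your setup.

```python
import logging

logger = logging.getLogger("agent_gateway")


def audited_check(sender, target, allow_agents):
    """Return True if the send is allowed; log a warning on every denial."""
    allowed = target in allow_agents.get(sender, ())
    if not allowed:
        logger.warning("denied cross-agent send: %s -> %s", sender, target)
    return allowed


def find_leaks(agents, allow_agents, is_delivered):
    """Probe every off-list pair; return the ones that still get through."""
    leaks = []
    for sender in agents:
        for target in agents:
            if target == sender or target in allow_agents.get(sender, ()):
                continue  # only probe pairs the config says must be blocked
            if is_delivered(sender, target):
                leaks.append((sender, target))
    return leaks
```

An empty result from find_leaks is the property you want. Any pair it returns is a boundary that exists only on paper.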

DEV Community
