Imran Siddique

Posted on Jun 4 • Originally published at Medium on May 10

[Part 2] 20 Hard Questions About AI Agent Governance That Nobody Is Asking

#agentos #aigovernance #agenticai #scalebysubtraction

Part 2: Governing the Flock, Not Just the Bird

Current agent governance, including the initial versions of AGT, was built for a world of single-agent actions. But the scariest risks we are seeing in 2026 are emergent: swarms of individually compliant agents producing collectively dangerous behavior.

Collusion. Feedback loops. Race dynamics.

These don’t happen in a sandbox; they happen when “safe” agents interact in production. New research, like the SWARM framework and the Multi-Agent System Safety Standard (MASSS), is finally catching up. But the industry still lacks a runtime enforcement mechanism for collective behavior.

The Solution: Two Critical Architectural Moves

1. Policies Must Become Multi-Agent Aware

Today, a policy evaluates a single action by a single agent. Tomorrow, it needs to evaluate across agents: “What are all agents doing collectively? Does this pattern, though individually permitted, look like a coordinated attack or market manipulation?”
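To make the difference concrete, here is a minimal sketch of a per-action check next to a collective check over a window of recent actions. Everything in it is an assumption for illustration: the `Action` shape, the function names, and the "too many distinct agents doing the same thing" threshold are hypothetical and are not part of AGT or Agent OS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    agent_id: str
    kind: str    # hypothetical action type, e.g. "buy" or "sell"
    target: str  # hypothetical resource, e.g. a ticker symbol


def single_agent_policy(action: Action) -> bool:
    """Per-action check: looks at one agent's one action in isolation."""
    return action.kind in {"buy", "sell"}


def collective_policy(
    window: list[Action], max_agents_per_pattern: int = 3
) -> list[tuple[str, str]]:
    """Flag (kind, target) patterns performed by too many distinct agents.

    The threshold is an illustrative assumption, not a recommended value.
    """
    agents_per_pattern: dict[tuple[str, str], set[str]] = {}
    for action in window:
        key = (action.kind, action.target)
        agents_per_pattern.setdefault(key, set()).add(action.agent_id)
    return sorted(
        pattern
        for pattern, agents in agents_per_pattern.items()
        if len(agents) > max_agents_per_pattern
    )


# Five agents each place an individually permitted sell order on one target.
window = [Action(f"agent-{i}", "sell", "ACME") for i in range(5)]
window.append(Action("agent-9", "buy", "XYZ"))

all_individually_ok = all(single_agent_policy(a) for a in window)
flagged = collective_policy(window)
```

Every action in the window passes the per-action check, yet the collective check still flags the sell pattern. That gap is the policy surface this section is arguing for.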

2. Agent Mesh Needs Its Own Policy Layer

Agent OS governs individual actions. Agent Mesh needs policies governing inter-agent behavior. This is a distinct policy surface that handles the “handshake” between agents, ensuring that the intent of the sender matches the capability of the receiver.
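A rough sketch of what that handshake check could look like follows. The `DelegationRequest` and `AgentCard` types, the capability strings, and the pairing allowlist are all hypothetical names invented for this example; they do not describe the actual Agent Mesh API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DelegationRequest:
    sender: str
    receiver: str
    intent: str  # what the sender wants done, e.g. "read:invoices"


@dataclass
class AgentCard:
    agent_id: str
    capabilities: set[str] = field(default_factory=set)


def handshake_allowed(
    request: DelegationRequest,
    receiver_card: AgentCard,
    allowed_pairs: set[tuple[str, str]],
) -> tuple[bool, str]:
    """Inter-agent check that runs before any individual action is evaluated."""
    if receiver_card.agent_id != request.receiver:
        return False, "capability card does not belong to the receiver"
    if request.intent not in receiver_card.capabilities:
        return False, "sender intent does not match receiver capability"
    if (request.sender, request.receiver) not in allowed_pairs:
        return False, "mesh policy does not permit this pairing"
    return True, "ok"


card = AgentCard("billing-agent", {"read:invoices"})
pairs = {("planner-agent", "billing-agent")}

ok = handshake_allowed(
    DelegationRequest("planner-agent", "billing-agent", "read:invoices"), card, pairs
)
mismatch = handshake_allowed(
    DelegationRequest("planner-agent", "billing-agent", "delete:invoices"), card, pairs
)
```

The point of the sketch is the placement, not the rules: this check sits between agents, so it can reject a delegation even when the eventual action would pass a single-agent policy.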

Who is accountable in a delegation chain?

If Agent A delegates to Agent B, who delegates to Agent C, and Agent C causes harm, who do you fire? More importantly, who does the governance layer penalize?

My Principle: Accountability flows upward.

Agent B is responsible because B trusted C.
Agent A is responsible because A trusted B.

While Accountability flows upward, Harm Impact flows downward: it’s highest at the point of execution (Agent C). To bridge this, delegating agents must inherit partial accountability for their delegates. If your delegate misbehaves, your trust score should decay alongside theirs.
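One way to picture partial inheritance is a penalty that starts at the executing agent and shrinks at each hop up the chain. The function name, the 0 to 1 trust scale, and the `inherit` factor of 0.5 are assumptions made for this sketch, not values from any real governance layer.

```python
def propagate_penalty(
    chain: list[str],
    trust: dict[str, float],
    penalty: float,
    inherit: float = 0.5,
) -> dict[str, float]:
    """Apply a trust penalty to the executor and a reduced share to each delegator.

    `chain` is ordered from the root delegator to the executor, e.g. ["A", "B", "C"].
    `inherit` is the fraction of the penalty passed up one hop (assumed value).
    """
    updated = dict(trust)
    current = penalty
    for agent in reversed(chain):  # start at the point of execution
        updated[agent] = max(0.0, updated[agent] - current)
        current *= inherit  # accountability flows upward, attenuated
    return updated


before = {"A": 0.9, "B": 0.8, "C": 0.7}
after = propagate_penalty(["A", "B", "C"], before, penalty=0.2)

# Size of the drop per agent: largest for C, smaller for B, smallest for A.
drops = {agent: before[agent] - after[agent] for agent in before}
```

Agent C takes the full penalty, B takes half of it for trusting C, and A takes a quarter for trusting B. Nobody in the chain walks away untouched.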

Can agents from different organizations ever truly trust each other?

No. And they shouldn’t.

Agents should never fully trust each other, just like humans don’t. We need to move away from binary “Trust/No-Trust” states toward a Human Interaction Model:

Verify Identity: Is this agent who it says it is?
Verify Capability: Does it actually do what it claims?
Experience-Based Trust: Trust is earned through repeated, successful interactions.
Bounded Autonomy: Never give full trust; give only enough to get the specific task done.

Cross-org trust is not a shared global score. It is experiential and local. Each organization must build and maintain its own trust profile for external agents.
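Here is a small sketch of how one organization might keep its own profile and apply the four steps above. The class names, the smoothed success-rate score, and the `min_score` threshold are illustrative assumptions; the post does not prescribe a scoring formula.

```python
from dataclasses import dataclass, field


@dataclass
class TrustRecord:
    successes: int = 0
    failures: int = 0

    @property
    def score(self) -> float:
        # Smoothed success rate: an unknown agent starts at 0.5, never at 0 or 1.
        return (self.successes + 1) / (self.successes + self.failures + 2)


@dataclass
class LocalTrustProfile:
    """Held by one organization; never shared as a global score."""

    records: dict[str, TrustRecord] = field(default_factory=dict)

    def record(self, agent_id: str, success: bool) -> None:
        rec = self.records.setdefault(agent_id, TrustRecord())
        if success:
            rec.successes += 1
        else:
            rec.failures += 1

    def grant(
        self,
        agent_id: str,
        identity_verified: bool,
        capability_verified: bool,
        requested: set[str],
        task_scopes: set[str],
        min_score: float = 0.6,  # assumed threshold
    ) -> set[str]:
        """Return only the scopes this specific task needs, or nothing."""
        if not (identity_verified and capability_verified):
            return set()
        score = self.records.get(agent_id, TrustRecord()).score
        if score < min_score:
            return set()
        return requested & task_scopes  # bounded autonomy


profile = LocalTrustProfile()
first_contact = profile.grant(
    "vendor-bot", True, True, {"read:orders", "write:orders"}, {"read:orders"}
)
for _ in range(3):
    profile.record("vendor-bot", success=True)
after_history = profile.grant(
    "vendor-bot", True, True, {"read:orders", "write:orders"}, {"read:orders"}
)
```

On first contact the external agent gets nothing, even with identity and capability verified. After three successful interactions it gets the one scope the task needs, and still not the write scope it asked for.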

The “Illusion Delta”

The biggest trap of 2026 is the Illusion Delta, the gap between how safe an agent looks in a short-horizon interaction and how unstable it actually is when replayed at scale. Governance that doesn’t account for this delta isn’t governance; it’s a false sense of security.
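The post does not define a formula for the Illusion Delta, so treat the following as one possible way to put a number on it: compare the pass rate from a handful of short evaluations with the pass rate from replaying the same agent many times. The function names and the sample figures are made up for illustration.

```python
def pass_rate(outcomes: list[bool]) -> float:
    """Fraction of runs that stayed within policy."""
    if not outcomes:
        raise ValueError("need at least one outcome")
    return sum(outcomes) / len(outcomes)


def illusion_delta(short_horizon: list[bool], at_scale: list[bool]) -> float:
    """Assumed metric: apparent safety minus observed safety at scale."""
    return pass_rate(short_horizon) - pass_rate(at_scale)


# Hypothetical data: 10 clean demo runs versus 1,000 replayed runs with 100 failures.
short_runs = [True] * 10
scaled_runs = [True] * 900 + [False] * 100

delta = illusion_delta(short_runs, scaled_runs)
```

A delta near zero means the demo was representative. A large positive delta means the agent only looked safe because nobody ran it long enough.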

What’s Next?

In Part 3, we move into the “Money” phase: Financial Governance. Your agents might be building a pricing cartel or spending $50K while you sleep. How do we govern the wallet?

Originally published at https://www.linkedin.com.

DEV Community