Imagine you walk into a room full of strangers. Everyone is wearing a mask. Someone hands you a document and says "sign this — it's from the CEO." How do you know it's actually from the CEO? You don't. You have no way to verify identity.
This is the trust problem in multi-agent AI systems. And it's more serious than most teams realize.
## Why Agents Need Identity
When a single AI agent operates in isolation, trust is simple — you trust it or you don't. But modern AI systems are increasingly multi-agent. A research agent delegates to a writing agent. A planning agent spawns execution sub-agents. An orchestrator coordinates dozens of specialized agents in parallel.
In these systems, every interaction between agents is a potential attack surface.
Consider this scenario: Agent A receives a message claiming to be from Agent B, instructing it to delete a set of records. How does Agent A verify that:
- The message actually came from Agent B
- Agent B is authorized to make that request
- Agent B hasn't been compromised and is acting within its intended scope
Without cryptographic identity, the answer to all three is the same: it can't.
## The Solution: Cryptographic Agent Identity
The `agent-governance-toolkit` solves this with a trust mesh — a framework where every agent has a verifiable cryptographic identity, and every inter-agent interaction is validated before execution.
Here's how it works.
### Ed25519 Key Pairs
Each agent gets an Ed25519 key pair at creation time:
```python
from agent_os.identity import AgentIdentity

# Each agent gets a unique cryptographic identity
identity = AgentIdentity.create(
    agent_id="research-agent-001",
    role="researcher",
    capabilities=["web_search", "read_file"],
)

print(identity.did)
# did:key:z6MkhaXgBZDvotDkL5257faiztiGiC2QtKLGpbnnEGta2doK

print(identity.public_key_hex)
# ed25519 public key (hex-encoded)
```
The DID (Decentralized Identifier) is a self-describing, globally unique identifier derived from the public key. No central registry required.
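To make the "derived from the public key" step concrete, here is a from-scratch sketch of the `did:key` construction for Ed25519: prepend the multicodec prefix bytes `0xed 0x01` to the 32-byte public key, then multibase-encode with base58btc (the leading `z`). The 32-byte input below is a placeholder, not a real key.

```python
BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def base58btc_encode(data: bytes) -> str:
    # Treat the bytes as one big integer and convert it to base 58
    num = int.from_bytes(data, "big")
    encoded = ""
    while num > 0:
        num, rem = divmod(num, 58)
        encoded = BASE58_ALPHABET[rem] + encoded
    # Each leading zero byte is preserved as a leading '1'
    for byte in data:
        if byte != 0:
            break
        encoded = "1" + encoded
    return encoded

def ed25519_did_key(public_key: bytes) -> str:
    # did:key = "did:key:" + multibase 'z' + base58btc(0xed01 || public_key)
    assert len(public_key) == 32, "Ed25519 public keys are 32 bytes"
    return "did:key:z" + base58btc_encode(b"\xed\x01" + public_key)

# Placeholder 32-byte value standing in for a real public key
did = ed25519_did_key(bytes(range(32)))
print(did)  # all Ed25519 did:key identifiers share the z6Mk... prefix
```

Because the `0xed01` prefix dominates the high bits of the encoded integer, every Ed25519 `did:key` starts with the same recognizable `z6Mk` prefix — which is why the identifiers in this post all look alike.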
### Signed Messages
When Agent A sends a message to Agent B, it signs it with its private key:
```python
message = {
    "from": identity.did,
    "to": "did:key:z6Mk...",
    "action": "analyze_dataset",
    "payload": {"dataset_id": "ds_001"},
    "timestamp": "2026-04-05T10:00:00Z",
}

signed_message = identity.sign(message)
# Includes: message + signature + public key
```
Agent B verifies the signature before acting:
```python
from agent_os.identity import verify_message

is_valid = verify_message(signed_message)
if not is_valid:
    raise SecurityError("Message signature invalid — rejecting")
```
If the message was tampered with in transit, the signature check fails. The action is rejected before execution.
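The sign-then-verify flow can be sketched with the standard library alone. Python's stdlib has no Ed25519, so an HMAC stands in for the asymmetric signature here — a deliberate simplification — but the essential mechanics are the same: serialize the message deterministically, sign the bytes, and recompute on receipt so any in-transit change breaks verification.

```python
import hashlib
import hmac
import json

SECRET = b"demo-signing-key"  # stand-in for the agent's Ed25519 private key

def sign_message(message: dict, key: bytes = SECRET) -> dict:
    # Canonical serialization: signer and verifier must hash identical bytes
    canonical = json.dumps(message, sort_keys=True, separators=(",", ":")).encode()
    signature = hmac.new(key, canonical, hashlib.sha256).hexdigest()
    return {"message": message, "signature": signature}

def verify_message(signed: dict, key: bytes = SECRET) -> bool:
    canonical = json.dumps(signed["message"], sort_keys=True, separators=(",", ":")).encode()
    expected = hmac.new(key, canonical, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids timing side channels
    return hmac.compare_digest(expected, signed["signature"])

signed = sign_message({"action": "analyze_dataset", "payload": {"dataset_id": "ds_001"}})
print(verify_message(signed))  # True

# Tamper with the payload "in transit": verification now fails
signed["message"]["payload"]["dataset_id"] = "ds_999"
print(verify_message(signed))  # False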
## Trust Scoring: Agents Earn and Lose Trust
Identity verification tells you who sent a message. Trust scoring tells you whether to act on it.
The toolkit maintains a trust score (0-1000) for each agent in the mesh:
```python
from agent_os.trust import TrustEngine

trust = TrustEngine()

# New agent starts with baseline trust
score = trust.get_score("research-agent-001")
print(score)  # 500 (baseline)

# Trust increases with successful verified interactions
trust.record_success("research-agent-001")
score = trust.get_score("research-agent-001")
print(score)  # 510

# Trust decreases with policy violations
trust.record_violation("research-agent-001", severity="high")
score = trust.get_score("research-agent-001")
print(score)  # 460
```
Agents with low trust scores get restricted automatically:
```python
from agent_os.trust import TrustPolicy

policy = TrustPolicy(
    minimum_score=400,           # below this, agent is quarantined
    require_human_approval=600,  # below this, human must approve actions
    full_autonomy=800,           # above this, agent operates freely
)
```
This is how the system handles compromised agents gracefully. An agent that starts behaving badly loses trust, gets restricted, and eventually gets quarantined — automatically, without human intervention.
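Mapping a score to a restriction tier is a simple threshold ladder. A sketch, using the thresholds from the policy above; the "supervised" middle tier (between 600 and 800) is my own assumed label for the band the example leaves implicit.

```python
from enum import Enum

class Restriction(Enum):
    QUARANTINED = "quarantined"        # below minimum_score: blocked entirely
    HUMAN_APPROVAL = "human_approval"  # a human must approve each action
    SUPERVISED = "supervised"          # assumed middle tier between thresholds
    FULL_AUTONOMY = "full_autonomy"    # operates freely

def restriction_for(score: int, minimum=400, approval=600, autonomy=800) -> Restriction:
    # Thresholds mirror the TrustPolicy example above
    if score < minimum:
        return Restriction.QUARANTINED
    if score < approval:
        return Restriction.HUMAN_APPROVAL
    if score < autonomy:
        return Restriction.SUPERVISED
    return Restriction.FULL_AUTONOMY

print(restriction_for(350))  # Restriction.QUARANTINED
print(restriction_for(700))  # Restriction.SUPERVISED
print(restriction_for(900))  # Restriction.FULL_AUTONOMY
```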
## Delegation Chains: Limited Capability Grants
In multi-agent systems, parent agents often spawn child agents to handle subtasks. The challenge: how do you give a child agent enough capability to do its job without giving it unlimited power?
The answer is delegation chains — cryptographically signed capability grants with explicit limits:
```python
from agent_os.delegation import DelegationChain

# Parent agent creates a limited delegation for the child
delegation = DelegationChain.create(
    parent_identity=orchestrator_identity,
    child_did="did:key:z6Mk...",
    granted_capabilities=["read_file"],  # subset of parent's capabilities
    excluded_capabilities=["delete_file", "send_email"],
    max_depth=2,             # child cannot delegate further than 2 levels deep
    expires_in_seconds=300,  # delegation expires in 5 minutes
)
```
The child agent presents this delegation when making requests:
```python
# Child agent acts with delegated authority
result = child_agent.execute(
    action="read_file",
    params={"path": "/data/analysis.csv"},
    delegation=delegation,
)
```
The toolkit verifies the entire delegation chain before execution:
- Is the delegation cryptographically valid?
- Is the requested capability within the granted scope?
- Has the delegation expired?
- Is the delegation depth within limits?
If any check fails, the action is rejected. A compromised child agent cannot exceed the capabilities its parent explicitly granted.
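The four checks above amount to a short guard function. A simplified sketch: the `Delegation` dataclass and `signature_valid` flag stand in for a real signed chain, where "cryptographically valid" would mean verifying every link's Ed25519 signature.

```python
import time
from dataclasses import dataclass

@dataclass
class Delegation:
    """Simplified stand-in for a signed delegation record."""
    granted_capabilities: list
    expires_at: float             # unix timestamp
    max_depth: int
    depth: int = 1                # hops from the root grantor
    signature_valid: bool = True  # stand-in for real signature-chain verification

def verify_delegation(delegation: Delegation, requested_capability: str, now=None) -> bool:
    now = time.time() if now is None else now
    if not delegation.signature_valid:                               # 1. cryptographically valid?
        return False
    if requested_capability not in delegation.granted_capabilities:  # 2. within granted scope?
        return False
    if now >= delegation.expires_at:                                 # 3. not expired?
        return False
    if delegation.depth > delegation.max_depth:                      # 4. depth within limits?
        return False
    return True

d = Delegation(granted_capabilities=["read_file"], expires_at=time.time() + 300, max_depth=2)
print(verify_delegation(d, "read_file"))    # True
print(verify_delegation(d, "delete_file"))  # False: outside granted scope
```

Note that the checks fail closed: any single failed condition rejects the action, which is the property that contains a compromised child agent.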
## A Practical Example: 3 Agents Collaborating
Here's how trust verification plays out in a real multi-agent workflow:
```python
from agent_os.identity import AgentIdentity
from agent_os.trust import TrustEngine
from agent_os.delegation import DelegationChain

# Agent 1: Orchestrator (high trust, broad capabilities)
orchestrator = AgentIdentity.create(
    agent_id="orchestrator",
    role="orchestrator",
    capabilities=["web_search", "read_file", "write_file", "send_email"],
)

# Agent 2: Researcher (medium trust, search only)
researcher = AgentIdentity.create(
    agent_id="researcher",
    role="researcher",
    capabilities=["web_search", "read_file"],
)

# Agent 3: Writer (medium trust, write only)
writer = AgentIdentity.create(
    agent_id="writer",
    role="writer",
    capabilities=["read_file", "write_file"],
)

trust = TrustEngine()
trust.set_score("orchestrator", 900)
trust.set_score("researcher", 700)
trust.set_score("writer", 700)

# Orchestrator delegates the research task to the researcher
research_delegation = DelegationChain.create(
    parent_identity=orchestrator,
    child_did=researcher.did,
    granted_capabilities=["web_search"],
    max_depth=1,
    expires_in_seconds=600,
)

# Researcher executes with a verified delegation
# Toolkit checks: valid signature + sufficient trust + within capability scope
result = researcher.execute(
    action="web_search",
    query="latest AI governance frameworks",
    delegation=research_delegation,
)

# Orchestrator delegates the writing task to the writer
write_delegation = DelegationChain.create(
    parent_identity=orchestrator,
    child_did=writer.did,
    granted_capabilities=["write_file"],
    max_depth=1,
    expires_in_seconds=600,
)

# Writer cannot exceed its delegated scope
# This is rejected — send_email is not in the delegation
writer.execute(
    action="send_email",
    delegation=write_delegation,
)
# SecurityError: capability 'send_email' not in delegation scope
```
Every interaction is verified. Every capability grant is explicit and time-limited. No agent can exceed what it was explicitly authorized to do.
## Comparison With Human Trust Models
The cryptographic trust model mirrors how humans establish trust in high-stakes environments:
| Human World | Agent Mesh |
|---|---|
| Government-issued ID | Ed25519 key pair + DID |
| Signed contract | Signed message |
| Professional reputation | Trust score |
| Power of attorney | Delegation chain |
| Expiry date on credentials | TTL on delegations |
| Revocation list | Trust score below threshold |
The difference: human trust systems have gaps. IDs can be forged. Signatures can be disputed. Reputation takes months to establish. The cryptographic model is mathematically verifiable — either the signature is valid or it isn't. Either the capability was granted or it wasn't.
## Why This Matters for Production Systems
Most teams building multi-agent systems today are operating on implicit trust — agents interact freely, permissions are not enforced, and there's no audit trail of inter-agent communication.
This works fine in development. It becomes a serious problem in production, where:
- Agents interact at scale across organizational boundaries
- Compromised agents can propagate bad behavior to other agents
- Regulatory requirements demand audit trails of automated decisions
- A single rogue agent in a mesh can corrupt the entire workflow
The zero-trust mesh approach — verify every identity, validate every capability, audit every interaction — is how you build multi-agent systems that are safe to run in production.
## Getting Started

```bash
pip install agent-governance-toolkit[full]
```
Full source code and documentation:
👉 github.com/microsoft/agent-governance-toolkit
I'm Kanish Tyagi — MS Data Science student at UT Arlington, open source contributor to Microsoft's agent-governance-toolkit. Find me on GitHub and LinkedIn.