Imagine you walk into a room full of strangers. Everyone is wearing a mask. Someone hands you a document and says "sign this — it's from the CEO." How do you know it's actually from the CEO? You don't. You have no way to verify identity.
This is the trust problem in multi-agent AI systems. And it's more serious than most teams realize.
## Why Agents Need Identity
When a single AI agent operates in isolation, trust is simple — you trust it or you don't. But modern AI systems are increasingly multi-agent. A research agent delegates to a writing agent. A planning agent spawns execution sub-agents. An orchestrator coordinates dozens of specialized agents in parallel.
In these systems, every interaction between agents is a potential attack surface.
Consider this scenario: Agent A receives a message claiming to be from Agent B, instructing it to delete a set of records. How does Agent A verify that:
- The message actually came from Agent B
- Agent B is authorized to make that request
- Agent B hasn't been compromised and is acting within its intended scope
Without cryptographic identity, the answer to all three is the same: it can't.
## The Solution: Cryptographic Agent Identity
The `agent-governance-toolkit` solves this with a trust mesh — a framework where every agent has a verifiable cryptographic identity, and every inter-agent interaction is validated before execution.
Here's how it works.
### Ed25519 Key Pairs
Each agent gets an Ed25519 key pair at creation time:
```python
from agent_os.identity import AgentIdentity

# Each agent gets a unique cryptographic identity
identity = AgentIdentity.create(
    agent_id="research-agent-001",
    role="researcher",
    capabilities=["web_search", "read_file"],
)

print(identity.did)
# did:key:z6MkhaXgBZDvotDkL5257faiztiGiC2QtKLGpbnnEGta2doK

print(identity.public_key_hex)
# ed25519 public key (hex-encoded)
```
The DID (Decentralized Identifier) is a self-describing, globally unique identifier derived from the public key. No central registry required.
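To make the "derived from the public key" step concrete, here is a from-scratch sketch of the `did:key` construction for Ed25519: prepend the multicodec prefix bytes `0xed 0x01` to the 32-byte public key, then multibase-encode with base58btc (the leading `z`). The 32-byte input below is a placeholder, not a real key.

```python
BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def base58btc_encode(data: bytes) -> str:
    # Treat the bytes as one big integer and convert it to base 58
    num = int.from_bytes(data, "big")
    encoded = ""
    while num > 0:
        num, rem = divmod(num, 58)
        encoded = BASE58_ALPHABET[rem] + encoded
    # Each leading zero byte is preserved as a leading '1'
    for byte in data:
        if byte != 0:
            break
        encoded = "1" + encoded
    return encoded

def ed25519_did_key(public_key: bytes) -> str:
    # did:key = "did:key:" + multibase 'z' + base58btc(0xed01 || public_key)
    assert len(public_key) == 32, "Ed25519 public keys are 32 bytes"
    return "did:key:z" + base58btc_encode(b"\xed\x01" + public_key)

# Placeholder 32-byte value standing in for a real public key
did = ed25519_did_key(bytes(range(32)))
print(did)  # all Ed25519 did:key identifiers share the z6Mk... prefix
```

Because the `0xed01` prefix dominates the high bits of the encoded integer, every Ed25519 `did:key` starts with the same recognizable `z6Mk` prefix — which is why the identifiers in this post all look alike.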
### Signed Messages
When Agent A sends a message to Agent B, it signs it with its private key:
```python
message = {
    "from": identity.did,
    "to": "did:key:z6Mk...",
    "action": "analyze_dataset",
    "payload": {"dataset_id": "ds_001"},
    "timestamp": "2026-04-05T10:00:00Z",
}

signed_message = identity.sign(message)
# Includes: message + signature + public key
```
Agent B verifies the signature before acting:
```python
from agent_os.identity import verify_message

is_valid = verify_message(signed_message)
if not is_valid:
    raise SecurityError("Message signature invalid — rejecting")
```
If the message was tampered with in transit, the signature check fails. The action is rejected before execution.
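The sign-then-verify flow can be sketched with the standard library alone. Python's stdlib has no Ed25519, so an HMAC stands in for the asymmetric signature here — a deliberate simplification — but the essential mechanics are the same: serialize the message deterministically, sign the bytes, and recompute on receipt so any in-transit change breaks verification.

```python
import hashlib
import hmac
import json

SECRET = b"demo-signing-key"  # stand-in for the agent's Ed25519 private key

def sign_message(message: dict, key: bytes = SECRET) -> dict:
    # Canonical serialization: signer and verifier must hash identical bytes
    canonical = json.dumps(message, sort_keys=True, separators=(",", ":")).encode()
    signature = hmac.new(key, canonical, hashlib.sha256).hexdigest()
    return {"message": message, "signature": signature}

def verify_message(signed: dict, key: bytes = SECRET) -> bool:
    canonical = json.dumps(signed["message"], sort_keys=True, separators=(",", ":")).encode()
    expected = hmac.new(key, canonical, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids timing side channels
    return hmac.compare_digest(expected, signed["signature"])

signed = sign_message({"action": "analyze_dataset", "payload": {"dataset_id": "ds_001"}})
print(verify_message(signed))  # True

# Tamper with the payload "in transit": verification now fails
signed["message"]["payload"]["dataset_id"] = "ds_999"
print(verify_message(signed))  # False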
## Trust Scoring: Agents Earn and Lose Trust
Identity verification tells you who sent a message. Trust scoring tells you whether to act on it.
The toolkit maintains a trust score (0-1000) for each agent in the mesh:
```python
from agent_os.trust import TrustEngine

trust = TrustEngine()

# New agent starts with baseline trust
score = trust.get_score("research-agent-001")
print(score)  # 500 (baseline)

# Trust increases with successful verified interactions
trust.record_success("research-agent-001")
score = trust.get_score("research-agent-001")
print(score)  # 510

# Trust decreases with policy violations
trust.record_violation("research-agent-001", severity="high")
score = trust.get_score("research-agent-001")
print(score)  # 460
```
Agents with low trust scores get restricted automatically:
```python
from agent_os.trust import TrustPolicy

policy = TrustPolicy(
    minimum_score=400,           # below this, agent is quarantined
    require_human_approval=600,  # below this, human must approve actions
    full_autonomy=800,           # above this, agent operates freely
)
```
This is how the system handles compromised agents gracefully. An agent that starts behaving badly loses trust, gets restricted, and eventually gets quarantined — automatically, without human intervention.
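Mapping a score to a restriction tier is a simple threshold ladder. A sketch, using the thresholds from the policy above; the "supervised" middle tier (between 600 and 800) is my own assumed label for the band the example leaves implicit.

```python
from enum import Enum

class Restriction(Enum):
    QUARANTINED = "quarantined"        # below minimum_score: blocked entirely
    HUMAN_APPROVAL = "human_approval"  # a human must approve each action
    SUPERVISED = "supervised"          # assumed middle tier between thresholds
    FULL_AUTONOMY = "full_autonomy"    # operates freely

def restriction_for(score: int, minimum=400, approval=600, autonomy=800) -> Restriction:
    # Thresholds mirror the TrustPolicy example above
    if score < minimum:
        return Restriction.QUARANTINED
    if score < approval:
        return Restriction.HUMAN_APPROVAL
    if score < autonomy:
        return Restriction.SUPERVISED
    return Restriction.FULL_AUTONOMY

print(restriction_for(350))  # Restriction.QUARANTINED
print(restriction_for(700))  # Restriction.SUPERVISED
print(restriction_for(900))  # Restriction.FULL_AUTONOMY
```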
## Delegation Chains: Limited Capability Grants
In multi-agent systems, parent agents often spawn child agents to handle subtasks. The challenge: how do you give a child agent enough capability to do its job without giving it unlimited power?
The answer is delegation chains — cryptographically signed capability grants with explicit limits:
```python
from agent_os.delegation import DelegationChain

# Parent agent creates a limited delegation for the child
delegation = DelegationChain.create(
    parent_identity=orchestrator_identity,
    child_did="did:key:z6Mk...",
    granted_capabilities=["read_file"],  # subset of parent's capabilities
    excluded_capabilities=["delete_file", "send_email"],
    max_depth=2,             # child cannot delegate further than 2 levels deep
    expires_in_seconds=300,  # delegation expires in 5 minutes
)
```
The child agent presents this delegation when making requests:
```python
# Child agent acts with delegated authority
result = child_agent.execute(
    action="read_file",
    params={"path": "/data/analysis.csv"},
    delegation=delegation,
)
```
The toolkit verifies the entire delegation chain before execution:
- Is the delegation cryptographically valid?
- Is the requested capability within the granted scope?
- Has the delegation expired?
- Is the delegation depth within limits?
If any check fails, the action is rejected. A compromised child agent cannot exceed the capabilities its parent explicitly granted.
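The four checks above amount to a short guard function. A simplified sketch: the `Delegation` dataclass and `signature_valid` flag stand in for a real signed chain, where "cryptographically valid" would mean verifying every link's Ed25519 signature.

```python
import time
from dataclasses import dataclass

@dataclass
class Delegation:
    """Simplified stand-in for a signed delegation record."""
    granted_capabilities: list
    expires_at: float             # unix timestamp
    max_depth: int
    depth: int = 1                # hops from the root grantor
    signature_valid: bool = True  # stand-in for real signature-chain verification

def verify_delegation(delegation: Delegation, requested_capability: str, now=None) -> bool:
    now = time.time() if now is None else now
    if not delegation.signature_valid:                               # 1. cryptographically valid?
        return False
    if requested_capability not in delegation.granted_capabilities:  # 2. within granted scope?
        return False
    if now >= delegation.expires_at:                                 # 3. not expired?
        return False
    if delegation.depth > delegation.max_depth:                      # 4. depth within limits?
        return False
    return True

d = Delegation(granted_capabilities=["read_file"], expires_at=time.time() + 300, max_depth=2)
print(verify_delegation(d, "read_file"))    # True
print(verify_delegation(d, "delete_file"))  # False: outside granted scope
```

Note that the checks fail closed: any single failed condition rejects the action, which is the property that contains a compromised child agent.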
## A Practical Example: 3 Agents Collaborating
Here's how trust verification plays out in a real multi-agent workflow:
```python
from agent_os.identity import AgentIdentity
from agent_os.trust import TrustEngine
from agent_os.delegation import DelegationChain

# Agent 1: Orchestrator (high trust, broad capabilities)
orchestrator = AgentIdentity.create(
    agent_id="orchestrator",
    role="orchestrator",
    capabilities=["web_search", "read_file", "write_file", "send_email"],
)

# Agent 2: Researcher (medium trust, search only)
researcher = AgentIdentity.create(
    agent_id="researcher",
    role="researcher",
    capabilities=["web_search", "read_file"],
)

# Agent 3: Writer (medium trust, write only)
writer = AgentIdentity.create(
    agent_id="writer",
    role="writer",
    capabilities=["read_file", "write_file"],
)

trust = TrustEngine()
trust.set_score("orchestrator", 900)
trust.set_score("researcher", 700)
trust.set_score("writer", 700)

# Orchestrator delegates the research task to the researcher
research_delegation = DelegationChain.create(
    parent_identity=orchestrator,
    child_did=researcher.did,
    granted_capabilities=["web_search"],
    max_depth=1,
    expires_in_seconds=600,
)

# Researcher executes with a verified delegation
# Toolkit checks: valid signature + sufficient trust + within capability scope
result = researcher.execute(
    action="web_search",
    query="latest AI governance frameworks",
    delegation=research_delegation,
)

# Orchestrator delegates the writing task to the writer
write_delegation = DelegationChain.create(
    parent_identity=orchestrator,
    child_did=writer.did,
    granted_capabilities=["write_file"],
    max_depth=1,
    expires_in_seconds=600,
)

# Writer cannot exceed its delegated scope
# This is rejected — send_email is not in the delegation
writer.execute(
    action="send_email",
    delegation=write_delegation,
)
# SecurityError: capability 'send_email' not in delegation scope
```
Every interaction is verified. Every capability grant is explicit and time-limited. No agent can exceed what it was explicitly authorized to do.
## Comparison With Human Trust Models
The cryptographic trust model mirrors how humans establish trust in high-stakes environments:
| Human World | Agent Mesh |
|---|---|
| Government-issued ID | Ed25519 key pair + DID |
| Signed contract | Signed message |
| Professional reputation | Trust score |
| Power of attorney | Delegation chain |
| Expiry date on credentials | TTL on delegations |
| Revocation list | Trust score below threshold |
The difference: human trust systems have gaps. IDs can be forged. Signatures can be disputed. Reputation takes months to establish. The cryptographic model is mathematically verifiable — either the signature is valid or it isn't. Either the capability was granted or it wasn't.
## Why This Matters for Production Systems
Most teams building multi-agent systems today are operating on implicit trust — agents interact freely, permissions are not enforced, and there's no audit trail of inter-agent communication.
This works fine in development. It becomes a serious problem in production, where:
- Agents interact at scale across organizational boundaries
- Compromised agents can propagate bad behavior to other agents
- Regulatory requirements demand audit trails of automated decisions
- A single rogue agent in a mesh can corrupt the entire workflow
The zero-trust mesh approach — verify every identity, validate every capability, audit every interaction — is how you build multi-agent systems that are safe to run in production.
## Getting Started

```bash
pip install agent-governance-toolkit[full]
```
Full source code and documentation:
👉 github.com/microsoft/agent-governance-toolkit
I'm Kanish Tyagi — MS Data Science student at UT Arlington, open source contributor to Microsoft's agent-governance-toolkit. Find me on GitHub and LinkedIn.