Vilius

7 Protocols for Agent Infrastructure

I run about 20 AI agents. They delegate work to each other, deploy code, scan for vulnerabilities, and handle compliance checks. Over time, I kept hitting the same gaps — things that made autonomous workflows fragile in ways that took hours to debug.

I published a 7-layer model of agent infrastructure that captures how I think about these problems. Two layers have strong industry standards: Google's A2A protocol handles agent-to-agent coordination (L5), and Anthropic's MCP standardises how agents discover and use tools (L3–L4). At the identity layer, the W3C DID standard defines decentralised identifiers. For governance, there's the NIST AI Risk Management Framework.

The rest of the stack — the layers that make autonomous agents trustworthy, auditable, and production-safe — still has gaps. These seven protocols fill them. They're what I wired into my own fleet when the existing standards didn't go far enough.

All are CC BY 4.0. Five have live reference implementations. Two are spec'd but still in the works.

Industry Standards This Builds On

| Standard | Layer | Organization |
| --- | --- | --- |
| A2A Protocol | L5 Coordination | Google / a2aproject |
| Model Context Protocol | L3–L4 Discovery + Session | Anthropic |
| W3C DID Core | L2 Communication | W3C |
| NIST AI RMF | L7 Governance | NIST |

1. Trust Score — Should I Delegate to This Agent?

When one of my agents delegates work to another, it needs to know if the target is reliable. Not "does it respond" — does it actually complete tasks correctly and consistently.

The score is weighted across success rate, pitfall history, skill quality, and uptime.

from workswithagents import TrustScoreClient

ts = TrustScoreClient()
if ts.get("target-agent")["tier"] == "trusted":
    delegate(task, to="target-agent")
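The exact weights live in the spec; as a rough illustration, here is a hypothetical sketch of how those four dimensions could combine into one score (the weight values and the 0.9 "trusted" cutoff are my assumptions, not the spec's):

```python
# Hypothetical sketch of a weighted trust score. The weights and the
# tier cutoff are illustrative assumptions, not values from the spec.
def trust_score(metrics, weights=None):
    """Combine per-dimension metrics (each 0.0-1.0) into a single score."""
    weights = weights or {
        "success_rate": 0.4,
        "pitfall_history": 0.2,
        "skill_quality": 0.2,
        "uptime": 0.2,
    }
    return sum(metrics[k] * w for k, w in weights.items())

score = trust_score({
    "success_rate": 0.97,
    "pitfall_history": 0.9,
    "skill_quality": 0.85,
    "uptime": 0.999,
})
tier = "trusted" if score >= 0.9 else "standard"  # cutoff is assumed
```

A weighted sum keeps the score explainable: you can always point at which dimension dragged a delegation target below the tier boundary.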

Spec


2. Deployment Manifest — Declare a Fleet, Deploy With One Command

I got tired of manually tracking which agents run where, how many instances, and what capabilities they have. One YAML file, one command.

fleet:
  name: "my-fleet"
  agents:
    - id: "builder"
      capabilities:
        - action: "build"
          target: "spfx"
      count: 3
wwa fleet deploy fleet.yaml

Spec


3. SLA Framework — Track Whether Agents Meet Their Promises

Three tiers: Best-Effort (free), Production (99.5% uptime, 90% task accuracy), Regulated (99.9% uptime, 95% accuracy, 7-year audit retention).

from workswithagents import SLAMetrics

sla = SLAMetrics("my-fleet", tier="production")
sla.report("agent-1", "task-42", duration_seconds=187, success=True)
status = sla.status()  # {breaches: [], status: "ok"}
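The breach check itself is simple once the tier thresholds are data. A minimal sketch, using the uptime and accuracy figures quoted above (the `breaches` helper is hypothetical, not part of the `SLAMetrics` API):

```python
# Tier thresholds from the post's figures; the breach-check helper
# is a hypothetical sketch, not the library's actual API.
TIERS = {
    "production": {"uptime": 0.995, "accuracy": 0.90},
    "regulated":  {"uptime": 0.999, "accuracy": 0.95},
}

def breaches(tier, uptime, accuracy):
    """Return the list of SLA dimensions the measured values violate."""
    t = TIERS[tier]
    failed = []
    if uptime < t["uptime"]:
        failed.append("uptime")
    if accuracy < t["accuracy"]:
        failed.append("accuracy")
    return failed
```

The same measurements can pass Production and breach Regulated, which is the point of encoding tiers as data rather than prose.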

Spec


4. Handoff Protocol — Cryptographic Handoff Between Agents

When an agent passes a task to another, how do you know the output wasn't tampered with? Ed25519-signed handoffs with chain-of-custody verification. Built above MCP's tool-use layer.

from workswithagents import Handoff

h = Handoff(from_agent="planner", to_agent="scanner", payload={"plan": "..."})
signed = h.sign(planner_key)
verified = Handoff.verify(signed, planner_public_key)
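Under the hood the pattern is sign-canonical-payload, verify-before-trust. A stdlib sketch of that pattern, using HMAC-SHA256 as a symmetric stand-in purely for illustration (the real protocol uses asymmetric Ed25519 keys, which need a crypto library):

```python
import hashlib
import hmac
import json

# Illustrative only: HMAC-SHA256 is a symmetric stand-in here.
# The actual protocol signs with Ed25519 keypairs.
def sign_handoff(payload, key):
    # Canonicalise the payload so both sides hash identical bytes.
    body = json.dumps(payload, sort_keys=True).encode()
    return hmac.new(key, body, hashlib.sha256).hexdigest()

def verify_handoff(payload, signature, key):
    # Constant-time comparison avoids timing side channels.
    return hmac.compare_digest(sign_handoff(payload, key), signature)

key = b"shared-secret"
sig = sign_handoff({"from": "planner", "to": "scanner"}, key)
ok = verify_handoff({"from": "planner", "to": "scanner"}, sig, key)
```

Canonicalisation (here `sort_keys=True`) matters as much as the signature: if sender and receiver serialise the payload differently, valid handoffs fail verification.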

Spec


5. Identity Protocol — Verifiable Agent Identity

Cryptographic agent identity with Ed25519 keypairs. Signed messages. Verification against registry. Extends the W3C DID standard with agent-specific key management and fleet-scoped verification.

from workswithagents import AgentIdentity

ai = AgentIdentity("my-agent")
ai.register()
sig = ai.sign({"type": "heartbeat"})

valid = AgentIdentity.verify("other-agent", message, signature)

6. Compliance-as-Code — Regulation as Executable Validation

NHS DTAC, FCA, GDS, GDPR — as rules agents can validate against at runtime. Extends frameworks like the NIST AI RMF from documentation into executable checks.

from workswithagents import ComplianceEngine

ce = ComplianceEngine()
dtac = ce.load("dtac-v2.1")

if dtac.validate(action).passed:
    execute(action)
else:
    escalate_to_human()
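"Regulation as executable validation" boils down to rules as predicates over actions. A hypothetical sketch of that shape (the rule IDs and checks below are invented examples, not actual DTAC clauses):

```python
# Hypothetical sketch: compliance rules as executable predicates.
# Rule IDs and checks are invented for illustration, not real DTAC clauses.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    id: str
    description: str
    check: Callable[[dict], bool]

RULES = [
    Rule("sec-01", "Action payload must not expose patient identifiers",
         lambda a: "patient_id" not in a.get("payload", {})),
    Rule("aud-01", "Action must carry an audit reference",
         lambda a: "audit_ref" in a),
]

def validate(action):
    failed = [r.id for r in RULES if not r.check(action)]
    return {"passed": not failed, "failed": failed}
```

Because each rule is a plain predicate, the same rule set serves runtime gating (`if validate(action)["passed"]`) and offline audit replay.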

Spec


7. Onboarding Protocol — Systematic Agent Creation

Interview → generate → calibrate → benchmark → register. Instead of writing a prompt file and hoping, run a pipeline that produces a scored agent.

from workswithagents import OnboardingClient

ob = OnboardingClient()
result = ob.full_onboard(
    "nhs-auditor",
    "Audit agent actions for NHS DTAC compliance",
    capabilities=["audit:compliance"],
    skills=["compliance-as-code"]
)
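The five-stage pipeline above can be sketched as a fold over the agent record being built. Everything here is a toy illustration (the stage bodies and the 0.92 benchmark score are placeholders), but it shows why a pipeline beats a one-shot prompt file — each stage adds verifiable state:

```python
# Toy sketch of interview -> generate -> calibrate -> benchmark -> register.
# Stage bodies and the benchmark score are placeholders for illustration.
def interview(agent):  return {**agent, "profile": "..."}
def generate(agent):   return {**agent, "prompt": "..."}
def calibrate(agent):  return {**agent, "calibrated": True}
def benchmark(agent):  return {**agent, "score": 0.92}   # placeholder score
def register(agent):   return {**agent, "registered": True}

PIPELINE = [interview, generate, calibrate, benchmark, register]

def full_onboard(name):
    agent = {"name": name}
    for stage in PIPELINE:
        agent = stage(agent)  # each stage enriches the agent record
    return agent
```

The output is a scored, registered record rather than an unmeasured prompt, which is what makes the resulting agent comparable against the rest of the fleet.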

The Stack

Where each protocol fits alongside existing industry standards:

L7 GOVERNANCE    ← NIST AI RMF           Compliance-as-Code · SLA Framework
L6 VERIFICATION  (no standard yet)       Agent Test Suite · Pitfall Registry
L5 COORDINATION  ← A2A (Google)          Trust Score
L4 SESSION       ← MCP (Anthropic)       Handoff Protocol
L3 DISCOVERY     ← MCP (Anthropic)       Trust Score · Capability Manifest
L2 COMMUNICATION ← W3C DID               Identity Protocol
L1 EXECUTION     (no standard yet)       Onboarding Protocol · Deployment Manifest

A2A (Google) — agent-to-agent task coordination at L5. MCP (Anthropic) — tool discovery and context sharing at L3–L4. W3C DID — decentralised identity at L2. NIST AI RMF — governance framework at L7. These seven protocols fill what those standards leave open: trust, deployment, handoff integrity, compliance execution, and systematic agent creation.


Get Started

pip install workswithagents

All specs: workswithagents.dev/specs/
All code: CC BY 4.0
