Michael "Mike" K. Saleme

Posted on May 26 • Edited on Jun 9

The EU AI Act Was Written for Models. Your Agents Need Runtime Compliance.

#ai #security #compliance #agents

The EU AI Act's high-risk obligations were due to apply on 2 August 2026. On 7 May 2026, the Council and Parliament agreed to move them: to 2 December 2027 for stand-alone high-risk systems under Annex III, and to 2 August 2028 for high-risk systems embedded in regulated products under Annex I. The agreement is provisional — pending formal adoption and Official Journal publication, expected before the original August date.

Read why it moved. The deferral is tied to the availability of the harmonised technical standards the regime runs on, and those standards are not ready. A compliance regime does not extend its own flagship deadline by sixteen months unless the evidence it asked for cannot yet be produced.

The evidence the Act was designed to evaluate — model cards, training-data lineage, evaluation suites, conformity assessments — describes a model artifact at rest. The systems enterprises are actually deploying are autonomous agents that use those models at runtime.

The extension is a confession about runtime evidence

The Act's runtime obligations already exist: automatic logging over the system lifecycle (Article 12), deployer monitoring (Article 26), post-market monitoring (Article 72). They were written for system-level events, not for the tool-call boundary where agent behavior actually lives. That boundary is what the harmonised standards have to make demonstrable — and it is the part no one has standardized.

The deadline did not slip because regulators went soft on autonomous AI. It slipped because the industry cannot yet produce the runtime evidence the Act demands — and the standards bodies know it.

The extension buys sixteen months. It does not change what the evidence has to show.

Three runtime gaps the act was not written to see

MCP transport and the sanitization handoff

Tool-protocol transports define the trust boundary between a model's intent and the system it is allowed to touch. When the transport assumes a trusted local context, every downstream control inherits that assumption.

The National Security Agency published MCP security guidance in May 2026 — the first time a national-security authority has issued an advisory on an agent tool protocol. The guidance addresses the STDIO transport design and the implicit local-trust assumption that flows from it. Anthropic confirmed the behavior is by design: sanitization is the integrating developer's responsibility.

That sentence is the entire compliance gap. The protocol shipped a default; the regulation evaluated the model; the boundary control is somebody else's problem. There are more than 200,000 MCP servers exposed, 30+ disclosures across the ecosystem, and ten CVEs across the official SDK languages. None of that surface is what a model-card evaluator looks at.

Anthropic's response to the systemic risk was MCP Tunnels and Self-Hosted Sandboxes, shipped May 19 as a limited research preview. Private-network MCP deployment is the right architectural direction. It is also a tacit acknowledgement that public-network MCP is not a posture an enterprise can take to a regulator.

x402 spend governance arrived from the security vendor, not the protocol

Payment-capable agents define a new category of compliance surface. The model can be perfectly aligned at training time and still authorize a spend at runtime that the deployer cannot defend in an audit.

x402, the HTTP 402 revival for agent-initiated payments, shipped without native spend governance. Fireblocks joined the x402 Foundation on May 20, 2026 and released a security extension that adds request integrity and spend-policy controls on top of the base protocol. AWS launched Bedrock AgentCore Payments in preview on May 7, with policy-based spend controls and an audit trail as managed-service primitives.

The pattern is consistent. The protocol shipped the capability; security and policy enforcement arrived as a second layer from vendors who took the runtime-control problem seriously. The AI Act will hold the deployer accountable for the spend, not the protocol for the omission.

Payment-tool authorization and the AP2/ACP interop layer

Agent-to-merchant payment flows define a new authorization surface that did not exist when the AI Act text was finalized. Stripe shipped its Link agent wallet and ACP/AP2 interop at Sessions 2026. The flows are real, the volume is small, and the standards are still being negotiated in public.

The compliance question is not whether the model approved the purchase. It is whether the operator can produce, on demand, a runtime audit trail that shows which constraint gated the authorization, which credential signed the request, and which policy revoked it when the spend pattern drifted.

That evidence is not produced by a model card. It is produced by the runtime, or it is not produced at all.

Static model audits and the time-shift problem

A model audit evaluates an artifact at training time. An agent violation occurs at tool-use time. The two events are separated by every input the agent will ever receive in production and every tool it will ever be granted.

A model card cannot tell you which MCP server an agent called yesterday at 3am. A conformity assessment cannot tell you which x402 endpoint absorbed an unbounded spend. A datasheet cannot show you which constraint failed open when a prompt injection rewrote the agent's plan.

Models pass audits at training time. Agents fail them at tool-use time. The deadline moved to December 2027; the gap did not move with it.

What runtime compliance actually looks like

The control surface the AI Act implies but does not specify has three layers.

Adversarial testing of the agent's tool surface. Not red-teaming the model — red-teaming the runtime composition. Prompt injection against the actual tool list. Spend-bound bypass against the actual x402 client. MCP transport abuse against the actual server set. The artifact under test is the deployed agent, not the model behind it.

Decision-gate governance with hard constraints. A constraint is not a system-prompt instruction. It is a runtime gate the agent cannot route around, with an amendment protocol that produces a paper trail when the constraint changes. The AI Act's high-risk obligations imply this; they do not specify the mechanism.

Runtime audit trails that survive a regulator's read. What was authorized, by which credential, against which policy, with what evidence — and when exposed credentials are detected, the operator's first move is to identify and revoke, not to rotate after the fact. Rotation assumes you already know what needs rotating. Identification is the regulatory question.

This is the work category I have been publishing under for the last year. The harness on PyPI as agent-security-harness runs adversarial coverage across MCP, A2A, x402, and L402 — pre-cert against AIUC-1 and aligned to NIST AI 800-2 — and grades every finding by evidence class, so an admission-time observation never reads as an enforcement guarantee. The governance package constitutional-agent carries six decision gates, twelve hard constraints, and an amendment protocol that produces a paper trail when a constraint changes. We have not found another open framework that covers both the adversarial-testing layer and the constitutional-governance layer as a paired stack. That is the point — the gap is real enough that the two layers had to be built separately and composed.

What the bifurcation actually splits

Vendors will claim model compliance. Most of those claims will be defensible at the artifact level, in the narrow technical sense the act evaluates. None of them will close the runtime gap, because the runtime is not the vendor's artifact — it is the deployer's composition.

The bifurcation is not about who is compliant. It is about which layer of the stack the compliance evidence describes. Deployers who can produce runtime audit trails, hard-constraint enforcement logs, and adversarial coverage reports will have evidence that maps to high-risk obligations. Deployers who can only forward a vendor's model card will have a document that describes something other than the system that actually shipped.

The clock reset to December 2027. The model evidence is the vendor's to provide; the runtime evidence is yours to produce.

DEV Community