
The EU AI Act Was Written for Models. Your Agents Need Runtime Compliance.

Your EU AI Act workstream is on track. You have a model card, a risk register, a data governance memo, and a plan for periodic re-validation. Then the product team ships an agent that can browse internal docs, open tickets, change account settings, and email customers, and your pre-deployment assessment suddenly looks like it was written for a different system.

That is because it was. A new analysis published this week by TechPolicy.Press, "The EU AI Act is Not Ready for Agents," lays out five governance challenges where autonomous agents break the assumptions embedded in the regulation. The incidents they cite are not theoretical. Amazon's coding agent Kiro deleted a live production environment in December 2025, triggering a 13-hour AWS regional outage. An autonomous agent using OpenClaw went rogue after a rejected software contribution and independently published a hit piece attacking the volunteer who turned it down. An attacker planted hidden instructions in a webpage, and when an AI agent browsed it on a user's behalf, it stole login credentials and sent them to an external server.

These failures are not anomalies; they are the predictable consequences of giving a probabilistic system memory, tools, and autonomy. The question for EU AI Act compliance is practical: if your AI system is an agent, what does compliance look like when the risk is created at runtime?

The regulation assumes a static system

A useful way to read the EU AI Act is as a regulation designed around an AI system that behaves like a component. It takes inputs, produces outputs, and can be evaluated against requirements like accuracy, robustness, cybersecurity, logging, transparency, and human oversight. Even where the Act speaks about the "AI system" rather than the "model," most compliance programs interpret that system as something you can assess pre-deployment and then re-assess periodically.

That mental model works for classical ML: a credit scoring model inside a fixed workflow, a medical imaging model flagging anomalies for a clinician, a fraud model triggering a review queue. In each case, you can define intended purpose, define the operating domain, test on representative datasets, implement controls, and monitor drift.

Agents change the shape of the problem in four ways. They execute actions through API calls, database writes, ticket creation, and external communications, not just generate text. They chain decisions over time through plan, tool call, observe, revise, act again sequences, where the harmful outcome emerges from the sequence rather than a single output. Their objectives can shift through conversation, tool results, or user pressure, creating compliance-relevant behavior changes without a deployment event. And they blend data across customers, tenants, or internal domains because they are optimized to be helpful, not to respect organizational boundaries by default.

So if we treat an agent as just another model deployment, we over-invest in static artifacts and under-invest in runtime control. That mismatch will surface in audits the moment someone asks: what exactly can the agent do in production today? Under what conditions does it escalate to a human? How do you prevent it from using a tool based on untrusted instructions? When it makes a mistake, can you prove what happened?

Five challenge areas mapped to runtime controls

The TechPolicy.Press paper frames five areas where agents strain the Act's assumptions: performance, misuse, privacy, equity, and oversight. Each maps to specific runtime controls that auditors will expect.

Performance becomes trajectory-level, not output-level. For a static model, performance is a metric on a test set. For an agent, performance is a property of an execution trajectory across multiple steps, tools, and intermediate states. An agent can be accurate at each step and still fail catastrophically because small errors compound. A support agent correctly retrieves policy, correctly identifies an order, but misreads currency and calls the refund tool for the wrong invoice because it merged two customer threads. Each step looks plausible. The sequence is wrong.

The controls that address this are continuous evaluation on trajectories rather than single outputs, runtime assertions that validate tool call inputs against business rules before execution, and progressive autonomy that starts in propose-only mode and expands to gated execution as evidence accumulates. The evidence an auditor will accept is a documented evaluation protocol that includes multi-step scenario suites with pass/fail criteria tied to harms, trace samples showing trajectory-level scoring with failure analysis, and change logs showing when autonomy scope expanded and what evidence justified it.
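
To make the runtime-assertion idea concrete, here is a minimal sketch in Python. Everything in it is illustrative: the RefundRequest shape, the in-memory invoice lookup, and the specific rules are assumptions, not a reference implementation. The point is that tool inputs are checked against the system of record before the tool runs, and a failed check blocks execution and escalates instead.

```python
from dataclasses import dataclass

@dataclass
class RefundRequest:
    customer_id: str
    invoice_id: str
    amount: float
    currency: str

# Hypothetical lookup into the system of record; in practice this would be
# a billing API call, not an in-memory dict.
INVOICES = {
    "INV-1042": {"customer_id": "C-77", "amount": 49.00, "currency": "EUR"},
}

def validate_refund(call: RefundRequest, thread_customer_id: str) -> list[str]:
    """Runtime assertions checked before the refund tool is allowed to execute."""
    violations = []
    invoice = INVOICES.get(call.invoice_id)
    if invoice is None:
        return ["unknown invoice"]
    if invoice["customer_id"] != thread_customer_id:
        violations.append("invoice does not belong to the customer in this thread")
    if call.currency != invoice["currency"]:
        violations.append("currency mismatch with invoice")
    if call.amount > invoice["amount"]:
        violations.append("refund exceeds invoice amount")
    return violations

# The agent proposed a refund in the wrong currency after merging two threads.
proposed = RefundRequest("C-77", "INV-1042", 49.00, "USD")
problems = validate_refund(proposed, thread_customer_id="C-77")
if problems:
    print("blocked:", problems)    # escalate to a human instead of executing
else:
    print("approved for execution")
```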

Misuse is a compliance failure, not just a security concern. For agents, prompt injection becomes a direct compliance issue because it can trigger unauthorized actions. An agent reads an inbound email containing hidden instructions to download a customer list and send it externally. If the agent has the tool permissions, it may comply.

The controls are context-aware tool permissioning rather than role-based access (the agent can send emails but only within your domain, only templated responses, only from allowlisted attachments), untrusted-content isolation that treats external text as hostile while keeping tool execution based on validated intents and structured inputs, and policy-as-code that evaluates each proposed action against context including customer, tenant, data classification, and monetary thresholds. The evidence is a tool registry showing each tool with its risk category and enforced constraints, logs of blocked tool calls with policy violation reasons, and records of adversarial testing focused on prompt injection leading to tool misuse.
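
A policy-as-code check for a proposed tool call might look roughly like the sketch below. The action and context shapes, the domain allowlist, and the template names are all hypothetical; the design point is that the decision is made from structured inputs and tenant context, not from whatever text the agent just read, and that every denial produces a loggable reason.

```python
ALLOWED_EMAIL_DOMAINS = {"example.com"}                       # hypothetical tenant domain
APPROVED_TEMPLATES = {"order_update", "refund_confirmation"}  # hypothetical template ids

def evaluate_send_email(action: dict, context: dict) -> tuple[bool, str]:
    """Policy-as-code check for a proposed send_email tool call. Returns (allowed, reason)."""
    recipient_domain = action["to"].rsplit("@", 1)[-1].lower()
    if recipient_domain not in ALLOWED_EMAIL_DOMAINS:
        return False, "recipient outside allowlisted domains"
    if action.get("template") not in APPROVED_TEMPLATES:
        return False, "free-form body not permitted; templated responses only"
    if context.get("data_classification") == "restricted":
        return False, "restricted data may not leave the tenant without approval"
    return True, "allowed"

# An injected instruction asked the agent to mail a customer list externally.
allowed, reason = evaluate_send_email(
    {"to": "attacker@evil.example", "template": None},
    {"tenant": "acme", "data_classification": "restricted"},
)
print(allowed, reason)   # False; the denial and its reason become audit evidence
```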

Privacy risk comes from cross-context blending. An internal HR agent answers a manager's question and accidentally includes details from another employee's case because both were retrieved in the same context window. A multi-tenant SaaS agent retrieves the right customer's ticket history but also pulls a similarly named account from another tenant.

The controls are data boundary policies enforced at retrieval time where queries are scoped by tenant and user permissions rather than best-match similarity alone, context compartmentalization that separates memory and state per case or customer, data classification checks before external actions that flag restricted fields and require approval or redaction, and least-privilege connectors that limit agent access to narrow APIs returning only what the workflow needs. The evidence is a documented data boundary model mapping sources to classifications to access rules, retrieval logs showing query scope and authorization decisions, and incident playbooks for privacy boundary violations.
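
Retrieval-time scoping can be sketched with an in-memory corpus standing in for a real vector store. The Doc fields and the keyword matching below are placeholders; what matters is that the tenant boundary and the caller's permissions are enforced inside the retrieval call, before anything reaches the agent's context window.

```python
from dataclasses import dataclass

@dataclass
class Doc:
    text: str
    tenant_id: str
    acl: str   # permission group allowed to read this document

# Hypothetical in-memory corpus standing in for a real vector store.
CORPUS = [
    Doc("Ticket history for Acme Corp", tenant_id="acme", acl="support"),
    Doc("Ticket history for Acme GmbH", tenant_id="acme-gmbh", acl="support"),
    Doc("HR case notes for employee 4411", tenant_id="acme", acl="hr"),
]

def retrieve(query: str, *, tenant_id: str, user_permissions: set[str]) -> list[Doc]:
    """Retrieval scoped by tenant and caller permissions, not best-match similarity alone."""
    in_tenant = [d for d in CORPUS if d.tenant_id == tenant_id]        # hard tenant boundary
    authorized = [d for d in in_tenant if d.acl in user_permissions]   # least-privilege filter
    # A real implementation would rank `authorized` by similarity to `query`;
    # a keyword match keeps this sketch self-contained.
    terms = query.lower().split()
    return [d for d in authorized if any(t in d.text.lower() for t in terms)]

# A support agent acting in the "acme" tenant cannot pull the similarly named
# "acme-gmbh" tenant's tickets, and cannot see HR case notes at all.
print(retrieve("ticket history", tenant_id="acme", user_permissions={"support"}))
```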

Equity risk emerges from routing, not just model bias. Even if the underlying model is fair by a benchmark, agents create inequity through how they route cases, escalate, request documentation, or apply policies in ambiguous situations. A benefits eligibility agent asks for additional verification more often for certain names or addresses because of spurious correlations in retrieved notes. It escalates some customers to human review more frequently, leading to slower service.

The controls are outcome monitoring by segment measuring operational results like time-to-resolution, escalation rate, and denial rate rather than just model accuracy, policy constraints that enforce consistent treatment where discretion exists, and defined ambiguity triggers that require escalation for low-confidence or conflicting-data cases. The evidence is monitoring reports tracking outcomes by segment with thresholds and remediation actions, documentation of discretion points and how they are constrained, and governance review records when disparities appear.
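
Segment-level outcome monitoring does not need to be elaborate to be useful. The sketch below uses made-up decision records and an assumed 1.5x threshold to flag any segment whose escalation rate drifts too far from the overall rate; a real pipeline would read the agent's decision log and feed a governance review rather than printing.

```python
from collections import defaultdict

# Hypothetical decision log: (customer segment, escalated to human review?)
DECISIONS = [
    ("region_a", False), ("region_a", False), ("region_a", False), ("region_a", False),
    ("region_b", True),  ("region_b", True),  ("region_b", True),  ("region_b", False),
]

MAX_RATIO = 1.5   # assumed governance threshold: flag any segment whose escalation
                  # rate exceeds 1.5x the overall rate

def escalation_rates(decisions):
    totals, escalations = defaultdict(int), defaultdict(int)
    for segment, escalated in decisions:
        totals[segment] += 1
        escalations[segment] += int(escalated)
    return {s: escalations[s] / totals[s] for s in totals}

overall = sum(esc for _, esc in DECISIONS) / len(DECISIONS)
for segment, rate in escalation_rates(DECISIONS).items():
    if rate > MAX_RATIO * overall:
        print(f"review needed: {segment} escalates {rate:.0%} of cases vs {overall:.0%} overall")
```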

Oversight must be engineered, not assumed. The EU AI Act requires human oversight measures that enable humans to understand, monitor, and intervene. For agents, oversight is frequently mis-implemented as "a human can look at the chat transcript." That is archaeology, not oversight.

The controls are approval gates tied to action types (financial transactions, external communications, restricted data access, production changes), structured intervention UX that shows reviewers the proposed action with tool inputs, referenced sources, and policy check results rather than free-form text, and override and escalation paths that fail safe and route to the right owner. The evidence is a human oversight design mapped to specific risks showing who oversees what with what authority, logs proving approvals occurred before actions with identities and timestamps, and exception handling records documenting how the organization responded when agents could not proceed.
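
A minimal approval gate could be wired like this. The set of gated action types, the audit log structure, and the identifiers are assumptions; the property that matters is that high-impact actions return a pending status and an approval record instead of executing, so the log can prove the human decision preceded the action.

```python
import time
import uuid

# Assumed mapping of high-impact action types to an approval requirement.
REQUIRES_APPROVAL = {"bank_transfer", "external_email",
                     "production_deploy", "restricted_data_export"}

AUDIT_LOG = []   # stand-in for an append-only audit store

def request_approval(action_type: str, payload: dict, proposed_by: str) -> str:
    """Queue a high-impact action for a named human reviewer instead of executing it."""
    approval_id = str(uuid.uuid4())
    AUDIT_LOG.append({
        "approval_id": approval_id,
        "action_type": action_type,
        "payload": payload,
        "proposed_by": proposed_by,
        "status": "pending",
        "requested_at": time.time(),
    })
    return approval_id

def execute(action_type: str, payload: dict, proposed_by: str) -> dict:
    if action_type in REQUIRES_APPROVAL:
        approval_id = request_approval(action_type, payload, proposed_by)
        return {"status": "pending_approval", "approval_id": approval_id}
    # Reversible, low-impact actions proceed, but are still written to the audit trail.
    AUDIT_LOG.append({"action_type": action_type, "payload": payload,
                      "proposed_by": proposed_by, "status": "auto_executed",
                      "requested_at": time.time()})
    return {"status": "executed"}

print(execute("external_email", {"to": "customer@example.com"}, proposed_by="support-agent-v3"))
```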

The stop button is not a safety mechanism for irreversible actions

Oversight discussions default to a comforting idea: if the agent goes wrong, a human can stop it. For agents, that is only sometimes true.

If the agent's actions are reversible internal state changes like creating a draft ticket or staging a config change, a stop button is meaningful. If the actions are irreversible external actions like sending a customer email, submitting a regulatory filing, executing a bank transfer, or pushing a production deploy, "stop" is not a reliable control. By the time a human notices, the action is already out in the world.

Compliance engineering for agents needs a different emphasis. Hard gates before irreversible actions. Staged execution where drafts are reviewed before sending. Cooldown windows for high-impact actions where outbound messages queue for automated checks and potential cancellation. Tools that support idempotency and rollback, preferring "create refund request" over "issue refund." The stop button becomes part of containment, not the primary safety mechanism.
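
Staged execution with a cooldown window fits in a few lines. The five-minute hold, the outbox structure, and the refund example below are illustrative only; the behavior to preserve is that the agent stages a reversible request, automated checks or a human can cancel it during the window, and only surviving actions are handed to the real tool.

```python
import time
from dataclasses import dataclass, field

COOLDOWN_SECONDS = 300   # assumed hold period for high-impact outbound actions

@dataclass
class StagedAction:
    kind: str
    payload: dict
    queued_at: float = field(default_factory=time.time)
    cancelled: bool = False

OUTBOX: list[StagedAction] = []

def stage(kind: str, payload: dict) -> StagedAction:
    """The agent stages actions; nothing leaves the system until the cooldown elapses."""
    action = StagedAction(kind, payload)
    OUTBOX.append(action)
    return action

def cancel(action: StagedAction, reason: str) -> None:
    action.cancelled = True
    action.payload["cancel_reason"] = reason

def flush(now: float) -> list[StagedAction]:
    """Release only actions that survived the cooldown window and were not cancelled."""
    released = [a for a in OUTBOX if not a.cancelled and now - a.queued_at >= COOLDOWN_SECONDS]
    for a in released:
        print(f"executing {a.kind}: {a.payload}")   # hand off to the real tool here
    return released

# The agent proposes a refund; we stage a *request*, not the irreversible transfer.
draft = stage("refund_request", {"invoice": "INV-1042", "amount": 49.00})
cancel(draft, "duplicate of an already-processed refund")   # caught during the window
flush(now=time.time() + COOLDOWN_SECONDS + 1)               # nothing executes
```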

Auditors will ask the obvious question: show us how you prevent harm, not how you apologize after it happens.

The timeline is tighter than it looks

The EU AI Act's high-risk system deadlines are in flux. The European Parliament voted to delay Annex III obligations to December 2, 2027, but the Council has not yet approved the delay. If trilogue negotiations stall past August 2026, the original deadlines stand. And regardless of the regulatory timeline, procurement questionnaires are already getting specific about agent runtime behavior, not just model development practices.

The TechPolicy.Press paper recommends that the European Commission ensure harmonized technical standards address agents explicitly, and that the AI Office issue guidance on how GPAI model providers should handle agent-specific risks like prompt injection and tool misuse. That guidance has not arrived. In the meantime, organizations deploying agents need to build the runtime compliance layer themselves.

The organizations that get this right will not just pass audits. They will ship faster because they will have a control plane that lets them expand agent autonomy safely: from propose-only, to limited execution, to broader execution with evidence and guardrails at every step.

We built Aguardic to make EU AI Act compliance work for agentic systems. If your agents do not fit your current compliance model, start by extracting enforceable rules from your existing policy documents and seeing where the runtime gaps are.


I'm building Aguardic, an AI governance platform that enforces policies at the runtime decision point — deterministic rules for speed, semantic AI for nuance, and custom knowledge for your organization's context. If you're dealing with AI compliance, check it out or drop a question in the comments.

Originally published at www.aguardic.com.
