Ilya Denisov
EU AI Act Starts Aug 2026 - A Practical Checklist for AI Agent Developers

The EU AI Act's high-risk AI system requirements take effect on August 2, 2026. If you're building AI agents that make decisions affecting people -- purchasing, customer service, hiring, content moderation -- this applies to you.

Fines: up to €35 million or 7% of global annual turnover, whichever is higher.

I'm not a lawyer, but I've read the regulation and built tooling around it. Here's what developers actually need to do, with code examples.


What Article 14 Requires (Plain English)

Article 14 is about Human Oversight. In summary:

| Requirement | What it means for developers |
| --- | --- |
| Understand capabilities and limitations | Log what the agent can and can't do |
| Monitor operation and detect anomalies | Record every decision, detect failures |
| Interpret outputs correctly | Show why the agent made each decision |
| Decide not to use or override | Allow humans to block actions |
| Intervene or interrupt | Detect and flag instruction changes |

The common thread: you need a record of what your agent decided, why, and whether anything went wrong.


The Checklist

1. Record Every Decision Point

Not just inputs and outputs -- record why the agent chose each action.

```python
# [BAD] Insufficient
logger.info(f"Agent called tool: {tool_name}")

# [GOOD] What auditors want to see
{
    "timestamp": "2026-03-29T10:15:32Z",
    "event_type": "decision",
    "action": "purchase_product",
    "input": {"product": "Logitech M750", "price": 45.00},
    "reasoning": "Cheapest option matching user's 'wireless mouse' query",
    "agent_id": "shopping-agent",
    "session_id": "order-123"
}
```
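Emitting events like this is easy to wrap in a small helper. A minimal sketch -- `log_decision` and the JSONL-sink pattern are illustrative, not a specific library's API:

```python
import json
from datetime import datetime, timezone

def log_decision(action, input_data, reasoning, agent_id, session_id,
                 sink=None):
    """Append one structured decision event as a JSON line.

    Field names mirror the example event above. `sink` is any writable
    file-like object (illustrative; plug in your own logging backend).
    """
    event = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event_type": "decision",
        "action": action,
        "input": input_data,
        "reasoning": reasoning,
        "agent_id": agent_id,
        "session_id": session_id,
    }
    if sink is not None:
        sink.write(json.dumps(event) + "\n")
    return event
```

One JSON object per line (JSONL) keeps the audit trail greppable and easy to replay later.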

2. Track Which External Data Influenced Decisions

If your agent uses RAG, memory, or retrieved documents, log which documents were used and how relevant they were.

```json
{
    "event_type": "context_injection",
    "source": "vector_db",
    "content": {
        "document": "refund_policy_v2.md",
        "similarity_score": 0.92
    },
    "reasoning": "Retrieved refund policy for customer question"
}
```

This creates a chain: "this decision was influenced by this specific document."
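One way to build that chain is to log an event for each retrieved document right after the retrieval call. A sketch, assuming your vector store returns (document name, similarity score) pairs -- the function name and field layout are illustrative:

```python
import json

def log_context_injection(query, retrieved, reasoning, sink=None):
    """Record which retrieved documents entered the agent's context.

    `retrieved` is a list of (document_name, similarity_score) pairs,
    whatever your vector store returned for `query`. Not a specific
    vector-DB API -- adapt to your retriever's return type.
    """
    events = []
    for doc, score in retrieved:
        event = {
            "event_type": "context_injection",
            "source": "vector_db",
            "content": {"document": doc, "similarity_score": score},
            "query": query,
            "reasoning": reasoning,
        }
        events.append(event)
        if sink is not None:
            sink.write(json.dumps(event) + "\n")
    return events
```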

3. Detect Instruction Changes (Prompt Drift)

If your system prompt changes between agent steps -- config updates, middleware injections, A/B tests -- you need to detect and log it.

```python
import difflib

# Record the system prompt at each step
prompt_v1 = "You are a helpful shopping assistant."
prompt_v2 = "You are a helpful shopping assistant. Prioritize conversion rate."

# If they differ -> flag as prompt drift and log the unified diff
if prompt_v1 != prompt_v2:
    diff = "\n".join(difflib.unified_diff(
        prompt_v1.splitlines(), prompt_v2.splitlines(), lineterm=""))
    log_event("prompt_drift", diff=diff)  # log_event: whatever sink you use
```

4. Add Approval Checkpoints for Critical Actions

Financial transactions, data deletion, external communications -- these need explicit guardrails.

```python
# Before any critical action, record approval/denial
{
    "event_type": "guardrail_pass",  # or "guardrail_block"
    "intent": "user asked to check refund status",
    "action": "process_refund",
    "allowed": True,
    "reason": "Refund amount ($45) within auto-approval limit"
}
```

If an auditor asks "why did the agent process this refund?", you have the answer.
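A checkpoint like this can be a plain function that both decides and records, so the decision and its audit record can never drift apart. A minimal sketch -- the threshold, function name, and field names are illustrative, not from any standard:

```python
import json

AUTO_APPROVE_LIMIT = 100.00  # illustrative policy threshold

def check_refund_guardrail(amount, intent, sink=None):
    """Approve or block a refund and record the decision either way.

    Returns the event dict so the caller can act on `allowed`.
    """
    allowed = amount <= AUTO_APPROVE_LIMIT
    event = {
        "event_type": "guardrail_pass" if allowed else "guardrail_block",
        "intent": intent,
        "action": "process_refund",
        "allowed": allowed,
        "reason": (f"Refund amount (${amount:.2f}) within auto-approval limit"
                   if allowed else
                   f"Refund amount (${amount:.2f}) exceeds auto-approval limit"),
    }
    if sink is not None:
        sink.write(json.dumps(event) + "\n")
    return event
```

The key design choice: blocked actions produce an event too, so "the guardrail fired" is just as auditable as "the action went through".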

5. Generate Audit-Ready Reports

You need to produce reports that non-technical people (compliance officers, legal) can read. A JSON log dump won't work.

A good forensic report includes:

  • Timeline -- chronological record of all actions
  • Decision chain -- each decision with its reasoning
  • Incident analysis -- what went wrong and why
  • Causal chain -- how one failure led to the next
  • Statistics -- how many decisions, errors, guardrail checks
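If your events are already structured dicts, rendering a readable report is mostly string formatting. A minimal sketch covering two of the sections above (timeline and statistics); a real report would also include decision chains and incident analysis:

```python
from collections import Counter

def render_report(events):
    """Render a minimal Markdown report from a list of event dicts."""
    lines = ["# Session Report", "", "## Timeline", ""]
    for e in events:
        lines.append(f"- {e.get('timestamp', '?')} -- {e['event_type']}: "
                     f"{e.get('reasoning', '')}")
    counts = Counter(e["event_type"] for e in events)
    lines += ["", "## Statistics", ""]
    for kind, n in sorted(counts.items()):
        lines.append(f"- {kind}: {n}")
    return "\n".join(lines)
```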

6. Analyze Failure Patterns Across Sessions

One session's failure is a bug. The same failure across 50 sessions is a systemic risk. Track patterns:

  • How often does the agent ignore tool errors?
  • How often are critical actions taken without approval?
  • Is prompt drift correlated with incorrect decisions?
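Questions like these reduce to counting failure events across sessions. A sketch, assuming sessions are stored as `session_id -> list of event dicts` and using illustrative failure labels:

```python
from collections import Counter

FAILURE_TYPES = {"guardrail_block", "tool_error_ignored", "prompt_drift"}

def failure_patterns(sessions):
    """Count failures across sessions.

    Returns (total occurrences per pattern, number of distinct
    sessions affected per pattern). The gap between the two is the
    signal: 50 occurrences in 50 sessions is systemic; 50 in one
    session is a single runaway bug.
    """
    total = Counter()
    affected = Counter()
    for events in sessions.values():
        kinds = set()
        for e in events:
            if e["event_type"] in FAILURE_TYPES:
                total[e["event_type"]] += 1
                kinds.add(e["event_type"])
        for k in kinds:
            affected[k] += 1
    return total, affected
```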

Timeline

| Date | What happens |
| --- | --- |
| Aug 1, 2024 | EU AI Act entered into force |
| Feb 2, 2025 | Prohibited practices apply |
| Aug 2, 2025 | General-purpose AI obligations apply |
| Aug 2, 2026 | High-risk AI system requirements apply |

You have ~4 months. If your agents handle anything high-risk, start logging now -- retrofitting decision traceability into a production system is much harder than building it in from day one.


Tools

I built Agent Forensics as an open-source tool that handles all 6 checklist items above. One-line integration for LangChain, OpenAI Agents SDK, and CrewAI:

```python
from agent_forensics import Forensics

f = Forensics(session="order-123")
agent.invoke({"input": "..."}, config={"callbacks": [f.langchain()]})

# Generates compliance-ready report
f.save_markdown()

# Auto-classifies 6 failure patterns
failures = f.classify()
```

Whichever tool you use, the important thing is to start recording now. The longer you wait, the more sessions go untracked.


What's your team's plan for EU AI Act compliance? Are you tracking agent decisions today?
