In 2024, a compliance-automation startup called Delve generated SOC 2 and ISO 27001 reports for 494 companies. 493 of those reports were 99.8% identical boilerplate. All 494 passed compliance checks.
The Delve case is the clearest argument I've seen for why the EU went with behavioral requirements in the AI Act, not declarative ones.
What Article 12 Actually Says
Article 12 of the EU AI Act (enforcement: August 2, 2026) requires high-risk AI systems to provide:

- automatic recording of events over the system's lifetime,
- tamper-evident logging the system cannot suppress or modify,
- a queryable audit trail for post-market monitoring, and
- retention of the logs for a minimum of six months.
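A minimal sketch of what one such record might look like. The field names here are illustrative assumptions, not terms from the Act; the key point is that the timestamp and chain link are set by the logging layer, not by the agent.

```python
from dataclasses import dataclass, asdict
import json
import time

@dataclass(frozen=True)
class AuditRecord:
    """One automatically captured event. Illustrative schema only."""
    ts: float        # capture time, stamped by the logging layer
    event: str       # e.g. "tool_call", "completion", "decision"
    payload: str     # serialized inputs/outputs of the event
    prev_hash: str   # hash of the previous record, for tamper evidence

record = AuditRecord(
    ts=time.time(),
    event="tool_call",
    payload='{"tool": "search", "query": "..."}',
    prev_hash="0" * 64,  # genesis record links to an all-zero hash
)
print(json.dumps(asdict(record)))
```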
The operative word is automatic. Not "the system should log." Not "logging must be enabled by policy." The logs must be generated independently of the AI system making the decisions.
Why System Prompts and Policy Files Don't Count
The most common compliance mistake I see: teams treating their system prompt as an audit trail.
"Our agent is instructed to log all actions" is a declaration of intended behavior, not evidence of actual behavior. The agent can hallucinate, be prompted away from this instruction, or simply fail to log edge cases.
The Delve problem applied to AI: an agent that controls its own audit trail can, by definition, produce a compliant-looking log of non-compliant behavior. Article 12 closes this loophole architecturally. The logging layer must sit outside the agent's control boundary.
What "Architectural Independence" Means in Practice
Compliant logging needs to intercept tool calls, LLM completions, and decisions at the middleware level, before the agent processes them. Signatures need to come from infrastructure the agent doesn't touch. Hash-chaining makes retroactive modification detectable. And the AI system should not be able to distinguish "logging on" from "logging off."
This is an architectural constraint, not a logging configuration.
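The pattern can be sketched in a few lines of stdlib Python. This is an assumption-laden toy, not a reference implementation: `ChainedLog` and `call_tool` are hypothetical names, and in production the log would live in separate infrastructure rather than in-process. What it does show is the shape: the middleware writes the entry whether or not the agent cooperates, and any retroactive edit breaks the hash chain.

```python
import hashlib
import json
import time

class ChainedLog:
    """Append-only, hash-chained log owned by middleware, not the agent."""
    def __init__(self):
        self.entries = []
        self._last_hash = "0" * 64

    def append(self, event: dict) -> None:
        body = json.dumps(
            {"ts": time.time(), "prev": self._last_hash, "event": event},
            sort_keys=True,
        )
        digest = hashlib.sha256(body.encode()).hexdigest()
        self.entries.append({"body": body, "hash": digest})
        self._last_hash = digest

    def verify(self) -> bool:
        """Recompute the chain; any edited or reordered entry fails."""
        prev = "0" * 64
        for e in self.entries:
            if json.loads(e["body"])["prev"] != prev:
                return False
            if hashlib.sha256(e["body"].encode()).hexdigest() != e["hash"]:
                return False
            prev = e["hash"]
        return True

def call_tool(log: ChainedLog, tool, *args):
    """Middleware wrapper: entries are written around the tool call,
    outside the agent's control boundary."""
    log.append({"type": "tool_call", "tool": tool.__name__, "args": args})
    result = tool(*args)
    log.append({"type": "tool_result", "tool": tool.__name__,
                "result": repr(result)})
    return result

log = ChainedLog()
call_tool(log, abs, -3)
assert log.verify()
# A retroactive edit to any recorded entry is detectable:
log.entries[0]["body"] = log.entries[0]["body"].replace("-3", "-4")
assert not log.verify()
```

Because `call_tool` sits between the agent and its tools, the agent never sees a code path where logging is optional, which is what makes "logging on" and "logging off" indistinguishable from its side.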
The Clock Is Running
Building Ed25519 signing, hash-chaining, retention enforcement, and a queryable audit API from scratch takes 6 to 8 weeks minimum. Conformity assessments for high-risk systems add further time on top of that.
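The shape of that work, sketched with stdlib HMAC-SHA256 standing in for Ed25519 (a real build would use an asymmetric signature scheme with the private key held in infrastructure the agent never touches). The retention constant and the `query` interface are assumptions for illustration.

```python
import hashlib
import hmac
import json
from datetime import datetime, timedelta, timezone

# Stand-in for an Ed25519 private key held outside the agent's process.
SIGNING_KEY = b"infrastructure-held-key"
RETENTION = timedelta(days=183)  # "minimum six months"

def sign(entry: dict) -> dict:
    body = json.dumps(entry, sort_keys=True).encode()
    return {"entry": entry,
            "sig": hmac.new(SIGNING_KEY, body, hashlib.sha256).hexdigest()}

def verify(signed: dict) -> bool:
    body = json.dumps(signed["entry"], sort_keys=True).encode()
    expected = hmac.new(SIGNING_KEY, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(signed["sig"], expected)

def query(store, since: datetime, event_type: str):
    """Post-market monitoring: signed entries of one type since a date."""
    return [s for s in store
            if verify(s)
            and s["entry"]["type"] == event_type
            and datetime.fromisoformat(s["entry"]["ts"]) >= since]

now = datetime.now(timezone.utc)
store = [sign({"ts": now.isoformat(), "type": "tool_call", "detail": "..."})]
assert verify(store[0])
assert len(query(store, now - RETENTION, "tool_call")) == 1
```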
The August 2 deadline is about 95 days away. A proposed deferral to December 2027 hasn't passed. Planning for the current deadline is the only defensible position.
A full breakdown of Article 12's technical requirements and compliance mapping is here.
If you're building or deploying high-risk AI systems in the EU (creditworthiness assessment, employment screening, biometric identification, critical infrastructure), August 2 is not a soft deadline.