Gartner's prediction isn't about bad models. It's about a missing infrastructure layer.
Gartner predicts that more than 40% of agentic AI projects will be cancelled by the end of 2027. The reasons given: cost, weak ROI, and poor governance.
Most teams hear "poor governance" and reach for a framework. They write policies. They set up review committees. They document acceptable outputs. They file compliance reports.
None of this prevents the failure mode that's actually killing projects.
The Failure Mode Nobody Talks About
Here's what a typical agentic AI failure looks like in production:
The pipeline runs. The output looks reasonable. No exceptions are thrown. No alerts fire.
Three steps earlier, an agent received state that didn't meet the conditions for that step. It ran anyway. It produced output that looked plausible but was built on invalid foundations. Two steps later, another agent acted on that output. By the time the final result surfaces, the error is untraceable.
This isn't a governance failure in the policy sense. The policies existed. The model was aligned. The guardrails were in place.
It's an enforcement failure. Nobody checked whether the state was valid before the agent got control.
What Governance Frameworks Actually Cover
Most AI governance frameworks operate at two levels:
Design-time governance: policies about what the system is allowed to do, data handling requirements, human oversight requirements, documentation standards.
Output-time governance: guardrails on what the model returns, content filtering, output validation.
Both are necessary. Neither is sufficient.
There's a third level that frameworks consistently miss: runtime enforcement at stage boundaries — checking whether the state entering each agent step is valid, and whether the action the agent is about to take is permitted given current runtime context.
Design-time governance says: "agents should not process data from restricted jurisdictions."
Output-time governance says: "flag outputs that reference restricted jurisdictions."
Runtime enforcement says: "before this agent runs, verify that jurisdiction is not restricted. If it is — block the action, preserve state, write an audit entry."
The first two are policy. The third is enforcement. Policy without enforcement is documentation.
Why Projects Get Cancelled
The predicted 40% cancellation rate isn't the result of organizations lacking policies. Most organizations that reach the agentic AI stage have governance policies in place.
Projects get cancelled for three concrete reasons:
- Failures are invisible until they're catastrophic. Agent pipelines fail silently. Shared mutable state passes through multiple LLM calls, and each call can degrade that state in ways that look like normal output. By the time the failure is visible, it has propagated through the system. Reconstruction is expensive or impossible.
- Replay without idempotency is dangerous. When a pipeline fails mid-run, teams face a choice: restart from the beginning and risk re-executing side effects (duplicate API calls, double charges, repeated writes), or investigate manually. Neither is acceptable at scale.
- Audit trails don't prove enforcement. Regulators and compliance teams increasingly ask not just "what did the system do" but "can you prove the system was constrained before it acted." Logging outputs answers the first question. It doesn't answer the second.
The Missing Layer
The infrastructure that would prevent these failures exists in traditional software. It just hasn't been applied to agent pipelines.
Pre-conditions: before an agent step runs, verify that the state meets required conditions. If it doesn't — reject the step, preserve state, write a structured failure event with full context.
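A minimal sketch of a pre-condition check, assuming pipeline state is a plain dict and each step declares named predicates. `PreconditionFailed`, `run_step`, and the `taxonomy` step are hypothetical names chosen for illustration; they are not taken from DEED or any other library.

```python
from typing import Any, Callable

State = dict[str, Any]
Predicate = Callable[[State], bool]


class PreconditionFailed(Exception):
    """Raised when a step is rejected before it runs (hypothetical name)."""

    def __init__(self, stage: str, predicate: str, state: State) -> None:
        super().__init__(f"{stage}: pre-condition '{predicate}' failed")
        # Structured failure event: stage, failing predicate, state snapshot.
        self.event = {"stage": stage, "predicate": predicate, "context": dict(state)}


def run_step(stage: str, pre: dict[str, Predicate],
             step: Callable[[State], State], state: State) -> State:
    # Check every named pre-condition before the step gets control.
    for name, check in pre.items():
        if not check(state):
            raise PreconditionFailed(stage, name, state)
    # Run on a copy so the caller's state is untouched if the step raises.
    return step(dict(state))


def taxonomy(state: State) -> State:
    # Hypothetical step body; a real one would call a model here.
    state["species"] = "unknown"
    return state


pre = {"normalized": lambda s: s.get("normalized") is True}

try:
    run_step("taxonomy", pre, taxonomy, {"observation": "raw text"})
except PreconditionFailed as exc:
    failure_event = exc.event  # the step body never ran
```

The step receives a copy of the state, so a rejected or crashing step cannot leave the caller's state half-modified.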
Policy gates: before the LLM call, evaluate whether the action is permitted given current runtime context. Not after the output — before the call. If jurisdiction is restricted, the model never runs. No tokens consumed. No side effects.
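One way to sketch a policy gate, under the assumption that runtime context is a dict carrying a `jurisdiction` field. `gated_call`, `PolicyDenied`, the placeholder jurisdiction codes, and `fake_model` (a stand-in for a real LLM client) are all hypothetical.

```python
from typing import Any, Callable

RESTRICTED_JURISDICTIONS = {"XX"}  # placeholder code, not a real jurisdiction


class PolicyDenied(Exception):
    """Raised when a policy gate blocks an action (hypothetical name)."""


def jurisdiction_gate(context: dict[str, Any]) -> bool:
    # Permit only when the jurisdiction is known and not restricted.
    jurisdiction = context.get("jurisdiction")
    return jurisdiction is not None and jurisdiction not in RESTRICTED_JURISDICTIONS


def gated_call(context: dict[str, Any], call_model: Callable[[str], str],
               prompt: str, audit: list[dict[str, Any]]) -> str:
    allowed = jurisdiction_gate(context)
    # Record the decision whether or not the call proceeds.
    audit.append({"policy": "jurisdiction_gate", "allowed": allowed,
                  "context": dict(context)})
    if not allowed:
        raise PolicyDenied("jurisdiction_gate")  # the model is never invoked
    return call_model(prompt)


calls: list[str] = []


def fake_model(prompt: str) -> str:
    # Stand-in for an LLM call; records each invocation.
    calls.append(prompt)
    return "ok"


audit: list[dict[str, Any]] = []
result = gated_call({"jurisdiction": "YY"}, fake_model, "summarize", audit)
try:
    gated_call({"jurisdiction": "XX"}, fake_model, "summarize", audit)
    blocked = False
except PolicyDenied:
    blocked = True
```

The gate treats a missing jurisdiction as a denial. Failing closed is a design choice here, not something the article prescribes.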
Checkpoints: after each stage completes, write the state to a checkpoint. If the pipeline fails mid-run, replay from the last checkpoint. Steps already completed are skipped via idempotency keys — no double execution.
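A small sketch of checkpoint and replay, assuming state is JSON-serializable and an in-memory dict stands in for a durable checkpoint store. The key scheme (a hash of run ID and stage name) and the `charge` and `notify` stages are illustrative assumptions.

```python
import hashlib
import json
from typing import Any, Callable

State = dict[str, Any]
Stage = tuple[str, Callable[[State], State]]


def idempotency_key(run_id: str, stage: str) -> str:
    # Same run and stage always map to the same key.
    return hashlib.sha256(f"{run_id}:{stage}".encode()).hexdigest()


def run_pipeline(run_id: str, stages: list[Stage], state: State,
                 checkpoints: dict[str, str]) -> State:
    for name, step in stages:
        key = idempotency_key(run_id, name)
        if key in checkpoints:
            # Completed in an earlier attempt: restore its output, skip execution.
            state = json.loads(checkpoints[key])
            continue
        state = step(dict(state))
        checkpoints[key] = json.dumps(state)  # written only after the stage succeeds
    return state


charges: list[int] = []
attempts = {"notify": 0}


def charge(state: State) -> State:
    charges.append(state["order"])  # side effect that must not repeat
    return {**state, "charged": True}


def notify(state: State) -> State:
    attempts["notify"] += 1
    if attempts["notify"] == 1:
        raise RuntimeError("transient failure")  # simulated mid-run failure
    return {**state, "notified": True}


stages: list[Stage] = [("charge", charge), ("notify", notify)]
store: dict[str, str] = {}
try:
    run_pipeline("run-1", stages, {"order": 42}, store)
except RuntimeError:
    pass  # first attempt fails after "charge" has been checkpointed
final = run_pipeline("run-1", stages, {"order": 42}, store)
```

On the second attempt `charge` is skipped and `notify` resumes from the restored state. A real deployment would also need a durable store, plus idempotency on the external API itself to cover a crash between the side effect and the checkpoint write.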
Structured audit trail: not just logs of what the model returned, but records of what conditions were true when each step ran, which policies evaluated, and whether they passed or failed.
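As a rough illustration, an audit record of this kind could capture each predicate and policy result alongside the decision, serialized as one JSON line per step. The field names and the `audit_record` helper are assumptions, not a standard schema.

```python
import json
from datetime import datetime, timezone
from typing import Any, Callable

State = dict[str, Any]
Checks = dict[str, Callable[[State], bool]]


def audit_record(stage: str, state: State,
                 preconditions: Checks, policies: Checks) -> dict[str, Any]:
    # Evaluate every check and keep each individual result, not just the verdict.
    pre = {name: bool(check(state)) for name, check in preconditions.items()}
    pol = {name: bool(check(state)) for name, check in policies.items()}
    return {
        "stage": stage,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "preconditions": pre,
        "policies": pol,
        "permitted": all(pre.values()) and all(pol.values()),
    }


record = audit_record(
    "taxonomy",
    {"normalized": True, "jurisdiction": "XX"},  # "XX" is a placeholder code
    preconditions={"normalized": lambda s: s.get("normalized") is True},
    policies={"jurisdiction_allowed": lambda s: s.get("jurisdiction") != "XX"},
)
line = json.dumps(record)  # one JSON line per step, suitable for an append-only log
```

Because the record is built before the step runs, it can show that the constraint was evaluated and what it decided, which an output log alone cannot.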
What This Looks Like in Practice
A pipeline with enforcement built in behaves differently when something goes wrong:
[intake]   ✓ pre:  observation_present
[intake]   ✓ post: normalized
[taxonomy] ✓ pre:  normalized
[taxonomy] ✕ post: species_identified → ContractViolation
  → state preserved at failure point
  → DLQ entry written
      stage: taxonomy
      predicate: species_identified
      context: { ...full state snapshot... }
  → replay available from: taxonomy
The failure is caught at the exact point it occurs. The state is preserved. The audit entry, written to a dead-letter queue (DLQ), contains everything needed for diagnosis. Replay from the failure point doesn't re-execute completed steps.
Compare this to the standard failure mode: exception in production, stack trace in logs, restart from the beginning, hope the side effects don't cause problems.
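The trace above could be produced by a runner along these lines. This is a sketch under stated assumptions rather than DEED's implementation: the `Stage` tuple, the `PREDICATES` table, the `execute` function, and the DLQ entry fields are hypothetical, and the `taxonomy` stage simulates a model that fails to identify a species.

```python
from typing import Any, Callable, NamedTuple

State = dict[str, Any]


class Stage(NamedTuple):
    name: str
    pre: str   # name of the pre-condition predicate
    post: str  # name of the post-condition predicate
    run: Callable[[State], State]


PREDICATES: dict[str, Callable[[State], bool]] = {
    "observation_present": lambda s: "observation" in s,
    "normalized": lambda s: s.get("normalized") is True,
    "species_identified": lambda s: bool(s.get("species")),
}


class ContractViolation(Exception):
    """Raised when a pre- or post-condition fails."""


def violation(stage: Stage, phase: str, predicate: str,
              state: State, dlq: list[dict[str, Any]]) -> ContractViolation:
    # Write the DLQ entry with the state as it entered the failed stage.
    dlq.append({"stage": stage.name, "phase": phase, "predicate": predicate,
                "context": dict(state), "replay_from": stage.name})
    return ContractViolation(f"[{stage.name}] {phase}: {predicate}")


def execute(stages: list[Stage], state: State, dlq: list[dict[str, Any]]) -> State:
    for stage in stages:
        if not PREDICATES[stage.pre](state):
            raise violation(stage, "pre", stage.pre, state, dlq)
        result = stage.run(dict(state))
        if not PREDICATES[stage.post](result):
            raise violation(stage, "post", stage.post, state, dlq)
        state = result  # commit only after the post-condition passes
    return state


stages = [
    Stage("intake", "observation_present", "normalized",
          lambda s: {**s, "observation": s["observation"].strip().lower(),
                     "normalized": True}),
    Stage("taxonomy", "normalized", "species_identified",
          lambda s: {**s, "species": None}),  # simulated model miss
]
dlq: list[dict[str, Any]] = []
try:
    execute(stages, {"observation": " Parus major "}, dlq)
except ContractViolation as exc:
    message = str(exc)
```

The DLQ entry holds the state as it entered `taxonomy`, so a replay can start at that stage without re-running `intake`.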
Governance Frameworks Are Necessary But Not Sufficient
This isn't an argument against governance frameworks. EU AI Act compliance, internal audit requirements, responsible AI policies — all of these matter and all of them are necessary.
The argument is that governance frameworks operate at the wrong layer to prevent the failure mode that's causing the 40% cancellation rate.
Policy documents don't run before agent steps. Review committees don't evaluate runtime state. Compliance reports don't prevent invalid state from entering a pipeline.
The teams that will not be in the 40% aren't the ones with the most comprehensive governance policies. They're the ones that built enforcement into the pipeline itself — pre-conditions, policy gates, checkpoints, structured audit trails — as first-class primitives, not afterthoughts.
Governance tells you what's allowed.
Enforcement ensures only what's allowed actually happens.
The gap between those two things is where the 40% lives.
If you're building agentic AI pipelines and thinking about the enforcement layer, DEED is a runtime contract engine for Python agent pipelines: pre/post conditions, policy gates, checkpoint/replay. Zero dependencies. Python 3.10+.
github.com/Deadly-Reiter/deed · pip install deed-runtime