In December 2025, OWASP published the Top 10 for Agentic Applications. Item eight, ASI08, covers cascading failures in agentic AI systems. If you run multiple AI agents that interact with each other, this applies to you.
What ASI08 Actually Says
The formal definition: "A cascading failure in agentic AI occurs when a single fault - whether a hallucination, malicious input, corrupted tool, or poisoned memory - propagates across autonomous agents and compounds into system-wide harm."
Three propagation mechanisms make agent cascades different from traditional software failures:
Feedback loops. Two agents whose outputs become each other's inputs create self-reinforcing cycles. A small error in Agent A's output gets amplified by Agent B, which feeds it back to Agent A, which amplifies it further. Traditional circuit breakers don't catch this because each individual request looks normal.
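One check that does catch it, as a sketch: assume each inter-agent message carries the ordered list of agent IDs it has passed through (a field you'd have to add to your message envelope yourself), and flag any agent that shows up too many times in one trace:

```typescript
// Sketch: loop detection over a message's hop trace.
// Assumes every inter-agent message carries the ordered list of
// agent IDs it has passed through (names here are illustrative).

function detectFeedbackLoop(hops: string[], maxVisits = 2): string | null {
  const visits = new Map<string, number>();
  for (const agentId of hops) {
    const count = (visits.get(agentId) ?? 0) + 1;
    visits.set(agentId, count);
    // An agent seen more than maxVisits times in one trace means the
    // message is cycling, even though each individual request looks normal.
    if (count > maxVisits) return agentId;
  }
  return null;
}

// A message bouncing between two agents:
console.log(detectFeedbackLoop(["A", "B", "A", "B", "A"])); // "A"
console.log(detectFeedbackLoop(["A", "B", "C"]));           // null
```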
Transitive trust chains. Agent B trusts Agent A's output. Agent C trusts Agent B's output. If Agent A gets compromised or hallucinates, the bad data flows through the entire chain. No agent independently verifies against ground truth because they trust what comes from upstream.
Persistent memory poisoning. An error written to a vector store, knowledge base, or long-term memory continues influencing agent reasoning in future runs. The failure isn't a one-time event. It becomes a persistent corruption that survives restarts.
The OWASP expert review board (100+ contributors including NIST and European Commission representatives) recommends a defense-in-depth approach with three layers:
- Architectural isolation with trust boundaries and circuit breakers (a minimal sketch follows this list)
- Runtime verification with multi-agent consensus and ground truth validation
- Comprehensive observability with automated cascade pattern detection and kill switches
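To make the first layer concrete, here's a minimal circuit breaker guarding a call across a trust boundary between agents. This is a sketch, not AgentWatch's API or OWASP's reference code:

```typescript
// Sketch: a circuit breaker at a trust boundary. After enough consecutive
// failures, calls to the downstream agent are cut off for a cooldown
// period, containing the blast radius. All names are illustrative.

type CallAgent = (input: string) => Promise<string>;

class AgentCircuitBreaker {
  private failures = 0;
  private openedAt = 0;

  constructor(
    private call: CallAgent,
    private maxFailures = 3,     // trips after this many consecutive errors
    private cooldownMs = 30_000, // stays open this long before retrying
  ) {}

  async invoke(input: string): Promise<string> {
    if (this.failures >= this.maxFailures) {
      if (Date.now() - this.openedAt < this.cooldownMs) {
        throw new Error("circuit open: downstream agent isolated");
      }
      this.failures = 0; // half-open: let one probe call through
    }
    try {
      const output = await this.call(input);
      this.failures = 0;
      return output;
    } catch (err) {
      if (++this.failures >= this.maxFailures) this.openedAt = Date.now();
      throw err;
    }
  }
}
```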
The third layer, observability, is where most teams have a gap.
The Observability Gap
Your agents probably have logging. They might have LLM tracing through Langfuse or LangSmith. But when Agent C errors at 14:32, can you answer:
- Which upstream agent's output caused it?
- What was the exact payload at each hop in the chain?
- Did the error originate from a tool call failure, a hallucination, or a memory corruption?
- How many other agents are now operating on the same bad data?
If your answer is "we'd read through the logs and try to piece it together," you're doing forensics by hand. ASI08's mitigation specifically calls for automated cascade pattern detection. Not log analysis after the fact.
The Math Problem
Your 10-agent system has a reliability problem nobody talks about. If each agent is 95% reliable on its own (which is generous), the probability that all 10 succeed on a given run is 0.95^10 ≈ 0.60.
Your system fails roughly 40% of its runs even though every individual component is 95% reliable.
And that assumes failures are independent. Agent cascades create correlated failures: one agent's problem breaks three others simultaneously, which raises the effective failure rate of every downstream agent. In practice, reliability is worse than the independence math suggests.
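A few lines of code make the compounding visible, and show how fast it decays as the fleet grows:

```typescript
// P(all n agents succeed) = p^n, under the optimistic assumption
// that failures are independent.

function chainReliability(perAgent: number, agents: number): number {
  return Math.pow(perAgent, agents);
}

for (const n of [1, 3, 5, 10, 20]) {
  const pct = (chainReliability(0.95, n) * 100).toFixed(1);
  console.log(`${n} agents at 95%: ${pct}% of runs fully succeed`);
}
// 10 agents at 95%: 59.9% of runs fully succeed
// 20 agents at 95%: 35.8% of runs fully succeed
```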
What Compliance Looks Like
To meet ASI08's observability requirements, you need:
Heartbeat monitoring. Every agent reports its status at regular intervals. If an agent stops reporting, that's signal. If an agent reports degraded status, that's signal. A fleet-wide view shows you which agents are healthy, which are degraded, and which have gone silent.
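A minimal version of that fleet view, with illustrative types rather than any particular library's API:

```typescript
// Sketch: fleet heartbeat tracking. Each agent reports at a fixed
// interval; the monitor classifies anyone who misses beats as silent
// and anyone self-reporting trouble as degraded.

type Status = "healthy" | "degraded" | "silent";

interface Heartbeat {
  agentId: string;
  reportedStatus: "healthy" | "degraded";
  at: number; // epoch ms
}

class FleetMonitor {
  private latest = new Map<string, Heartbeat>();

  constructor(private intervalMs = 10_000) {}

  record(beat: Heartbeat): void {
    this.latest.set(beat.agentId, beat);
  }

  // Fleet-wide view: healthy, degraded, or gone silent.
  snapshot(now = Date.now()): Record<string, Status> {
    const view: Record<string, Status> = {};
    for (const [id, beat] of this.latest) {
      // Two missed intervals counts as silence, not just one late beat.
      view[id] =
        now - beat.at > this.intervalMs * 2 ? "silent" : beat.reportedStatus;
    }
    return view;
  }
}
```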
Cross-agent event correlation. When a failure happens, trace backward through the event log. Agent C failed because it received bad input. That input came from Agent B. Agent B's output was corrupted because Agent A's tool call returned an error that B didn't handle. The full chain, with timestamps and payloads, stored automatically.
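As a sketch, assuming each stored event records the ID of the event that caused it (the edges of the trace DAG), the backward walk is only a few lines:

```typescript
// Sketch: walking the event log backward from a failure to its root
// cause. The causedBy field is an assumption about your event schema.

interface AgentEvent {
  id: string;
  agentId: string;
  causedBy: string | null; // upstream event, null at the root
  payload: string;
  at: number; // epoch ms
}

function traceToRoot(
  failureId: string,
  log: Map<string, AgentEvent>,
): AgentEvent[] {
  const chain: AgentEvent[] = [];
  const seen = new Set<string>(); // guard against cyclic traces
  let current = log.get(failureId);
  while (current && !seen.has(current.id)) {
    seen.add(current.id);
    chain.push(current);
    current = current.causedBy ? log.get(current.causedBy) : undefined;
  }
  return chain.reverse(); // root cause first: A -> B -> C
}
```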
Alert deduplication. The same alert from the same agent every 5 minutes is noise. You need escalation tiers: the first occurrence is info, the fifth is a warning, the tenth is critical. Without dedup, your team develops alert fatigue, which is itself a cascade risk.
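The escalation logic itself is small. Here's a sketch using the tiers above:

```typescript
// Sketch: dedup with escalation tiers. Repeated identical alerts are
// counted rather than re-emitted, and severity rises with the count.

type Severity = "info" | "warning" | "critical";

class AlertDeduplicator {
  private counts = new Map<string, number>();

  // Key identical alerts by agent + alert type so repeats collapse.
  report(
    agentId: string,
    alertType: string,
  ): { severity: Severity; occurrence: number } {
    const key = `${agentId}:${alertType}`;
    const occurrence = (this.counts.get(key) ?? 0) + 1;
    this.counts.set(key, occurrence);
    const severity: Severity =
      occurrence >= 10 ? "critical" : occurrence >= 5 ? "warning" : "info";
    return { severity, occurrence };
  }
}
```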
Forensic replay. For any failure, reconstruct what happened: "Agent A returned X. Agent B transformed it to Y. Agent C received Y and failed because Z." This is deterministic replay from stored evidence, not reconstruction from partial logs.
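A sketch of what replay over that stored evidence looks like; the Hop shape is illustrative, not a real schema:

```typescript
// Sketch: deterministic replay from stored evidence. Given the ordered
// chain recovered by the backward walk, rebuild the narrative of what
// each agent received and produced.

interface Hop {
  agentId: string;
  input: string;
  output: string | null; // null when the agent failed at this hop
  error?: string;
}

function replay(chain: Hop[]): string[] {
  return chain.map((hop, i) =>
    hop.output !== null
      ? `step ${i + 1}: ${hop.agentId} received ${hop.input} and returned ${hop.output}`
      : `step ${i + 1}: ${hop.agentId} received ${hop.input} and failed: ${hop.error}`,
  );
}

// "Agent A returned X. Agent B transformed it to Y. Agent C failed because Z."
console.log(replay([
  { agentId: "A", input: "task", output: "X" },
  { agentId: "B", input: "X", output: "Y" },
  { agentId: "C", input: "Y", output: null, error: "Z" },
]).join("\n"));
```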
Building This
We built AgentWatch because we hit every one of these problems running our own 7-agent system. One agent would fail, three others would break, and we'd spend hours reading through separate log files trying to trace the chain.
It's a TypeScript library: `npm install @nicofains1/agentwatch`, wire in heartbeats and event reporting, and you get cascade detection, fleet dashboards, alert dedup, and forensic replay out of the box. Self-hosted with SQLite, no external services needed.
The approach is deterministic: store evidence at each hop, walk the trace DAG backward on failure. No ML model trying to guess what happened. The data tells you directly.
What to Do Right Now
If you're running multiple AI agents:
Audit your observability. Can you trace a failure from one agent back to its root cause in another? If not, you have an ASI08 gap.
Check your alert setup. Are you getting the same alert 19 times in an hour? That's a sign you need deduplication and escalation.
Test a cascade scenario. Intentionally break one agent's output and watch what happens to agents downstream. If you can't see the propagation path in your current tooling, that's the gap ASI08 is warning about.
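A minimal fault-injection harness for that drill might look like this. The wrapper and pipeline shapes are assumptions, stand-ins for however your agents actually hand off data:

```typescript
// Sketch: a cascade drill. Wrap one agent so it returns corrupted output,
// run the pipeline, and check whether your tooling surfaces the
// propagation path. All wiring here is illustrative.

type Agent = (input: string) => Promise<string>;

function corrupt(agent: Agent): Agent {
  // Fault injection: the agent still "succeeds", but with bad data,
  // which is exactly the case plain error monitoring misses.
  return async (input) => {
    await agent(input);
    return "CORRUPTED_PAYLOAD";
  };
}

async function runPipeline(agents: Agent[], seed: string): Promise<string> {
  let data = seed;
  for (const agent of agents) data = await agent(data);
  return data;
}

// Break agent A, then watch what B and C do with its output downstream:
//   await runPipeline([corrupt(agentA), agentB, agentC], "task");
// If your observability can't show that path, that's the ASI08 gap.
```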
The speed and scale of agentic AI outpace human reaction times. By the time you notice a cascade manually, it's already propagated. Automated detection isn't a nice-to-have. According to OWASP, it's part of the mitigation framework.
The OWASP Agentic AI Top 10 is worth reading in full. ASI08 is item eight, but the other nine items interact with cascade risk in ways that compound the problem.