<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: AndrewSispoidis</title>
    <description>The latest articles on DEV Community by AndrewSispoidis (@andrewsispoidis).</description>
    <link>https://dev.to/andrewsispoidis</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3840825%2Fce6d47ac-8479-4d4e-b185-321edc5621cd.png</url>
      <title>DEV Community: AndrewSispoidis</title>
      <link>https://dev.to/andrewsispoidis</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/andrewsispoidis"/>
    <language>en</language>
    <item>
      <title>The OWASP Agentic AI Top 10 is Live. Here's What the Attacks Actually Look Like — and What to Do About Them.</title>
      <dc:creator>AndrewSispoidis</dc:creator>
      <pubDate>Mon, 30 Mar 2026 13:27:55 +0000</pubDate>
      <link>https://dev.to/andrewsispoidis/the-owasp-agentic-ai-top-10-is-live-heres-what-the-attacks-actually-look-like-and-what-to-do-3nc2</link>
      <guid>https://dev.to/andrewsispoidis/the-owasp-agentic-ai-top-10-is-live-heres-what-the-attacks-actually-look-like-and-what-to-do-3nc2</guid>
      <description>&lt;p&gt;Anthropic confirmed this week that their next model poses "unprecedented cybersecurity risks" and can "exploit vulnerabilities in ways that far outpace the efforts of defenders." Cybersecurity stocks dropped 4–9% on the news. The story ran in Fortune, Axios, and CNBC.&lt;/p&gt;

&lt;p&gt;Here's what those headlines missed: &lt;strong&gt;the threat isn't the next model. It's the one you're running right now.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In February 2026, Amazon's threat intelligence team documented a single attacker — low to medium skill, financially motivated — who used commercially available AI to compromise 600 FortiGate firewall devices across 55 countries in 38 days. Amazon's CISO noted that the volume and variety of custom tooling would typically indicate a well-resourced development team. Instead, one person with AI access built the entire toolkit. The model didn't change. The scaffolding — the agentic workflows — is what turned a general-purpose LLM into a global offensive capability.&lt;/p&gt;

&lt;p&gt;OWASP published their Top 10 for Agentic Applications in December 2025. It's the most important security framework most AI developers haven't read yet.&lt;/p&gt;

&lt;p&gt;This is a technical breakdown of each risk, the real CVEs behind them, and how to actually defend against them.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why a new Top 10
&lt;/h2&gt;

&lt;p&gt;The original OWASP LLM Top 10 was designed for single-turn applications: a user sends a message, the model responds. Agentic systems are different in three critical ways:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;They act.&lt;/strong&gt; Agents don't just generate text — they execute shell commands, call APIs, read and write files, send emails, and browse the web. A single compromised prompt can cause irreversible real-world damage.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;They persist.&lt;/strong&gt; Agents maintain memory across sessions. A single successful injection can poison an agent's behavior permanently, not just for one response.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;They delegate.&lt;/strong&gt; Multi-agent systems trust each other by default. A compromised sub-agent can influence the entire pipeline.&lt;/p&gt;

&lt;p&gt;The OWASP ASI Top 10 formalizes 10 failure modes that don't exist in traditional applications. Here they are, with real incidents.&lt;/p&gt;




&lt;h2&gt;
  
  
  ASI01 — Agent Goal Hijack
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt; An attacker redirects an agent's objectives through malicious text in any content the agent reads. The agent isn't hacked in the traditional sense — it's simply told to do something else, and it complies.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why it's #1:&lt;/strong&gt; Every other attack on this list is a pathway to this outcome. Prompt injection (ASI02), tool misuse (ASI05), memory poisoning (ASI04) — they're all mechanisms for achieving ASI01. A fully hijacked agent is an insider threat that works at machine speed.&lt;/p&gt;

&lt;h3&gt;
  
  
  The real incident — EchoLeak (CVE-2025-32711, CVSS 9.3)
&lt;/h3&gt;

&lt;p&gt;In June 2025, researchers at Aim Security disclosed a zero-click vulnerability in Microsoft 365 Copilot. The attack required no user interaction whatsoever. Here's how it worked:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Attacker sends a carefully crafted email to the target organization&lt;/li&gt;
&lt;li&gt;The email contains hidden prompt injection instructions, phrased as if directed at a human — never mentioning Copilot, AI, or anything that would trigger detection filters&lt;/li&gt;
&lt;li&gt;When Copilot later retrieves that email as context for an unrelated query, it reads the hidden instructions&lt;/li&gt;
&lt;li&gt;Copilot exfiltrates sensitive internal files by embedding them in an outbound image URL&lt;/li&gt;
&lt;li&gt;The victim's browser auto-fetches the image, completing the exfiltration without any click&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The attack bypassed Microsoft's XPIA classifier, link redaction, and Content Security Policy. The payload was pure natural language. No code. No malware. No signatures to detect.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What this means for your agents:&lt;/strong&gt; Every document, email, web page, or tool output your agent reads is a potential attack vector. The attack surface isn't your API endpoint. It's every piece of text your agent ingests.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Defense:&lt;/strong&gt; Scan all retrieved content before it enters the context window. Don't just scan user inputs — scan tool outputs, web search results, and documents. Treat every external string as potentially hostile.&lt;/p&gt;
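&lt;p&gt;As a minimal sketch of that gate (the patterns and function names here are illustrative, not Crawdad's actual API), scan every external string before it is appended to the context:&lt;/p&gt;

```python
import re
import unicodedata

# Illustrative patterns only -- a real scanner needs far broader coverage.
INJECTION_PATTERNS = [
    r"ignore (all )?(previous|prior) instructions",
    r"you are now",
    r"disregard .{0,40}(system prompt|rules)",
    r"forward .{0,60}to https?://",
]

def scan_external_text(text: str) -> bool:
    """Return True if the text looks like an injection attempt."""
    normalized = unicodedata.normalize("NFKC", text).lower()
    return any(re.search(p, normalized) for p in INJECTION_PATTERNS)

def add_to_context(context: list, source: str, text: str) -> None:
    """Gate *every* external string -- emails, tool output, web results."""
    if scan_external_text(text):
        raise ValueError(f"blocked suspicious content from {source}")
    context.append({"source": source, "text": text})
```

&lt;p&gt;The point is architectural: the gate sits outside the model, so a hostile document is rejected before the LLM ever reasons about it.&lt;/p&gt;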




&lt;h2&gt;
  
  
  ASI02 — Prompt Injection
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt; Direct injection of instructions that override the agent's system prompt or intended behavior. The classic "ignore all previous instructions."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why it matters more for agents:&lt;/strong&gt; A chatbot that gets prompt-injected gives a bad response. An agent that gets prompt-injected executes arbitrary actions with whatever permissions it was granted.&lt;/p&gt;

&lt;h3&gt;
  
  
  Real incident — IDEsaster (2026)
&lt;/h3&gt;

&lt;p&gt;Security researcher Ari Marzouk disclosed 24 CVEs across GitHub Copilot, Cursor, Windsurf, and five other AI coding assistants. Every tested AI IDE was vulnerable to prompt injection leading to code execution. AWS issued security advisory AWS-2025-019.&lt;/p&gt;

&lt;p&gt;Attack vector: malicious repository content → agent reads it → agent executes attacker-controlled commands with developer-level privileges.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Defense:&lt;/strong&gt; 5-layer local detection — pattern matching (27+ categories), semantic analysis (role hijacking, authority impersonation, boundary dissolution), indirect injection detection, session context tracking, and PII/credential exfiltration detection.&lt;/p&gt;
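&lt;p&gt;Two of those layers — deobfuscation and pattern matching — can be sketched like this (the patterns are illustrative stand-ins, not the real signature set):&lt;/p&gt;

```python
import base64
import re
import unicodedata

PATTERNS = [r"ignore (all )?previous instructions", r"act as (the )?system"]

def deobfuscation_layers(text: str):
    """Yield candidate views of the text: raw, Unicode-normalized, and any
    long base64 runs decoded (a common way to smuggle payloads past filters)."""
    yield text
    yield unicodedata.normalize("NFKC", text)
    for run in re.findall(r"[A-Za-z0-9+/=]{16,}", text):
        try:
            yield base64.b64decode(run, validate=True).decode("utf-8", "ignore")
        except Exception:
            continue

def detect_injection(text: str) -> bool:
    """Match patterns against every deobfuscated view of the input."""
    return any(
        re.search(p, view.lower())
        for view in deobfuscation_layers(text)
        for p in PATTERNS
    )
```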




&lt;h2&gt;
  
  
  ASI03 — Identity and Privilege Abuse
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt; Agents inherit user roles, cache credentials, and call each other. Attackers exploit the delegation chain to escalate privileges or reuse cached secrets.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The problem:&lt;/strong&gt; When an agent is allowed to act "as the user," you extend that user's entire blast radius to anything the agent can be manipulated into doing. There's no least-privilege boundary between the agent and the user.&lt;/p&gt;

&lt;h3&gt;
  
  
  Real incident — Amazon Q Code Assistant (CVE-2025-8217)
&lt;/h3&gt;

&lt;p&gt;Attackers compromised a GitHub token and merged malicious code into the Amazon Q VS Code extension (version 1.84.0). The injected code contained destructive prompt instructions, including commands to wipe local files and cloud resources. With &lt;code&gt;--trust-all-tools --no-interactive&lt;/code&gt; flags active, the agent executed without confirmation. Nearly one million developers had the extension installed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Defense:&lt;/strong&gt; Cryptographic agent identity (Ed25519 keypairs), mTLS between agents and services, scoped credentials that expire, and audit trails that capture which agent performed which action under which identity.&lt;/p&gt;
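&lt;p&gt;The scoped-credential idea can be sketched with an HMAC standing in for a real Ed25519 signature (the key handling and claim names here are illustrative):&lt;/p&gt;

```python
import hashlib
import hmac
import json
import time

SERVICE_KEY = b"demo-secret"  # stand-in for a real Ed25519 private key

def issue_token(agent_id: str, scope: str, ttl_s: int) -> dict:
    """Mint a credential bound to one agent, one scope, and an expiry."""
    claims = {"agent": agent_id, "scope": scope, "exp": time.time() + ttl_s}
    payload = json.dumps(claims, sort_keys=True).encode()
    sig = hmac.new(SERVICE_KEY, payload, hashlib.sha256).hexdigest()
    return {"claims": claims, "sig": sig}

def check_token(token: dict, required_scope: str) -> bool:
    payload = json.dumps(token["claims"], sort_keys=True).encode()
    expected = hmac.new(SERVICE_KEY, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, token["sig"]):
        return False  # tampered or forged
    if token["claims"]["exp"] < time.time():
        return False  # expired -- cached credentials stop working
    return token["claims"]["scope"] == required_scope
```

&lt;p&gt;Because scope and expiry are inside the signed payload, an agent cannot quietly widen its own permissions by editing the claims.&lt;/p&gt;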




&lt;h2&gt;
  
  
  ASI04 — Memory and Context Poisoning
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt; Injecting malicious content into an agent's persistent memory, RAG database, or long-term context so that future behavior is corrupted — long after the initial attack.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why it's worse than standard injection:&lt;/strong&gt; Prompt injection affects one interaction. Memory poisoning affects every future interaction until the memory is cleared and audited. The agent "learns" the attacker's instructions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Attack patterns:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;RAG poisoning:&lt;/strong&gt; inject malicious content into a vector database the agent queries for context&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cross-tenant leakage:&lt;/strong&gt; agent memory shared across tenants leaks sensitive data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Long-term drift:&lt;/strong&gt; repeated exposure to adversarial content gradually shifts agent behavior&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Defense:&lt;/strong&gt; Merkle-chained memory with Ed25519 signatures. Any tampered memory entry fails verification at query time. An append-only audit log means you can always reconstruct what the agent was told, and when.&lt;/p&gt;
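&lt;p&gt;A hash chain is the simplest form of this idea; the sketch below omits the per-entry signatures a production system would add:&lt;/p&gt;

```python
import hashlib
import json

class ChainedMemory:
    """Append-only memory log where each entry commits to the one before it,
    so any later tampering breaks verification at query time."""

    def __init__(self):
        self.entries = []

    def append(self, content: str) -> None:
        prev = self.entries[-1]["hash"] if self.entries else "genesis"
        record = {"content": content, "prev": prev}
        record["hash"] = hashlib.sha256(
            json.dumps({"content": content, "prev": prev}, sort_keys=True).encode()
        ).hexdigest()
        self.entries.append(record)

    def verify(self) -> bool:
        """Recompute the whole chain; any edited entry fails the check."""
        prev = "genesis"
        for e in self.entries:
            expected = hashlib.sha256(
                json.dumps({"content": e["content"], "prev": prev}, sort_keys=True).encode()
            ).hexdigest()
            if expected != e["hash"] or e["prev"] != prev:
                return False
            prev = e["hash"]
        return True
```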




&lt;h2&gt;
  
  
  ASI05 — Tool and Integration Misuse
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt; Agents call tools — shell commands, database queries, API calls, file operations. If the agent can be convinced to pass attacker-controlled parameters to these tools, you have RCE through natural language.&lt;/p&gt;

&lt;h3&gt;
  
  
  Real incident — Langflow AI RCE (CVE-2025-34291)
&lt;/h3&gt;

&lt;p&gt;CrowdStrike documented multiple threat actors exploiting an unauthenticated code injection vulnerability in Langflow AI. Attackers gained credentials and deployed malware through the agent's tool execution capability. The vulnerability wasn't in the LLM. It was in the trust boundary between the agent's output and the tool execution layer.&lt;/p&gt;

&lt;h3&gt;
  
  
  Real incident — OpenAI Operator Data Exposure
&lt;/h3&gt;

&lt;p&gt;Security researcher Johann Rehberger demonstrated that malicious webpage content could trick OpenAI's Operator agent into accessing authenticated internal pages and exfiltrating data to an attacker-controlled server.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Defense:&lt;/strong&gt; Policy engine that validates every tool call before execution. Scoped, signed, revocable tokens for each action. The agent proposes; the policy engine authorizes.&lt;/p&gt;
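&lt;p&gt;The propose/authorize split can be sketched as follows (the policy rules are illustrative, not a recommended allowlist):&lt;/p&gt;

```python
import shlex

ALLOWED_BINARIES = {"ls", "git", "pytest"}

def check_read_file(path: str) -> bool:
    # Confine reads to the workspace and reject traversal attempts.
    return path.startswith("/workspace/") and ".." not in path

def check_shell(cmd: str) -> bool:
    # Only the first word (the binary) matters for this toy policy.
    parts = shlex.split(cmd)
    return bool(parts) and parts[0] in ALLOWED_BINARIES

POLICY = {"read_file": check_read_file, "shell": check_shell}

def authorize(tool: str, arg: str) -> bool:
    """The agent proposes a (tool, arg) pair; only the policy engine approves."""
    check = POLICY.get(tool)
    return bool(check and check(arg))
```

&lt;p&gt;Crucially, the engine runs outside the agent's control: a hijacked agent can propose anything it likes, but it cannot rewrite the policy.&lt;/p&gt;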




&lt;h2&gt;
  
  
  ASI06 — Resource and Service Abuse
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt; Agents running in loops can be exploited for financial denial-of-service. An attacker who can trigger expensive inference loops, or cause an agent to repeatedly call costly external APIs, can run up massive costs or exhaust quotas.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why it matters:&lt;/strong&gt; Unlike traditional DDoS, this attack uses the victim's own authorized systems against them. The agent is behaving "correctly" from the provider's perspective.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Defense:&lt;/strong&gt; Hard cost ceilings, rate limiting at the agent level, circuit breakers that pause agents when anomalous consumption patterns appear.&lt;/p&gt;
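&lt;p&gt;A hard ceiling with a tripping breaker can be sketched like this (thresholds are illustrative):&lt;/p&gt;

```python
class CostCircuitBreaker:
    """Hard spending ceiling per agent: once tripped, every further
    call is refused until a human resets the breaker."""

    def __init__(self, ceiling_usd: float):
        self.ceiling = ceiling_usd
        self.spent = 0.0
        self.tripped = False

    def allow(self, estimated_cost_usd: float) -> bool:
        if self.tripped:
            return False
        if self.spent + estimated_cost_usd > self.ceiling:
            self.tripped = True   # pause the agent, page a human
            return False
        self.spent += estimated_cost_usd
        return True
```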

&lt;blockquote&gt;
&lt;p&gt;This is the ASI risk with the least coverage across the industry right now.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  ASI07 — Data and Model Exfiltration
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt; Agents exfiltrating training data, system prompts, model weights, or sensitive business data. Beyond PII — this includes intellectual property, strategic information, and the agent's own configuration.&lt;/p&gt;

&lt;p&gt;The same mechanism that made EchoLeak work — agent reads malicious content → agent exfiltrates data to attacker-controlled URL — applies to any agent that has outbound network access and can read sensitive context.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Defense:&lt;/strong&gt; 15-category PII and credential detection on all outbound content. Pattern matching for API keys, tokens, SSNs, internal URLs. Block exfiltration attempts before they reach the network.&lt;/p&gt;
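&lt;p&gt;A few of those outbound detectors, sketched with illustrative regexes (real coverage needs many more categories and context-aware matching):&lt;/p&gt;

```python
import re

# Illustrative detectors -- a handful of the categories a real scanner covers.
OUTBOUND_DETECTORS = {
    "aws_access_key": r"\bAKIA[0-9A-Z]{16}\b",
    "ssn": r"\b\d{3}-\d{2}-\d{4}\b",
    "bearer_token": r"\bBearer\s+[A-Za-z0-9._-]{20,}\b",
    "internal_url": r"https?://[a-z0-9.-]*\.internal\b",
}

def scan_outbound(text: str) -> list:
    """Return the categories found; any hit should block the send."""
    return [name for name, pat in OUTBOUND_DETECTORS.items() if re.search(pat, text)]
```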




&lt;h2&gt;
  
  
  ASI08 — Cascading Agent Failures
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt; In multi-agent systems, a single compromised agent can corrupt the entire pipeline. Agents are often designed to trust collaborating agents by default.&lt;/p&gt;

&lt;h3&gt;
  
  
  Real incident — Agent Session Smuggling (November 2025)
&lt;/h3&gt;

&lt;p&gt;Palo Alto Unit 42 demonstrated how malicious agents exploit built-in trust relationships in the Agent-to-Agent (A2A) protocol. Unlike single-shot prompt injection, a rogue agent can hold multi-turn conversations, adapt strategy, and build false trust over time.&lt;/p&gt;

&lt;h3&gt;
  
  
  Real incident — ServiceNow Now Assist
&lt;/h3&gt;

&lt;p&gt;OWASP documented cases where spoofed inter-agent messages caused downstream procurement and payment agents to process orders from attacker front companies.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Defense:&lt;/strong&gt; Cryptographic authentication of all inter-agent messages. An unsigned message claiming to be from a trusted agent gets blocked. Byzantine fault detection across agent clusters.&lt;/p&gt;
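&lt;p&gt;Message authentication can be sketched with HMACs standing in for per-agent Ed25519 keys (key distribution is out of scope for this sketch):&lt;/p&gt;

```python
import hashlib
import hmac

# Keys the orchestrator provisioned -- stand-ins for per-agent Ed25519 keys.
AGENT_KEYS = {"planner": b"planner-key", "executor": b"executor-key"}

def sign_message(sender: str, body: str) -> dict:
    mac = hmac.new(AGENT_KEYS[sender], body.encode(), hashlib.sha256).hexdigest()
    return {"from": sender, "body": body, "mac": mac}

def accept(msg: dict) -> bool:
    """Refuse any inter-agent message whose signature does not verify."""
    key = AGENT_KEYS.get(msg.get("from", ""))
    if key is None:
        return False  # unknown sender: no implicit trust
    expected = hmac.new(key, msg["body"].encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, msg.get("mac", ""))
```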




&lt;h2&gt;
  
  
  ASI09 — Human-Agent Trust Exploitation
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt; Exploiting the human tendency to over-trust AI outputs. Agents producing authoritative-sounding responses for false premises. Attackers impersonating agents to humans or humans to agents.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why this is different from misinformation:&lt;/strong&gt; The agent isn't hallucinating — it's been injected with specific false information and is now confidently presenting it as fact.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Defense:&lt;/strong&gt; This is primarily a UX and workflow problem. Agents should clearly attribute claims to verifiable sources. Humans should never make irreversible decisions based solely on agent output without independent verification.&lt;/p&gt;




&lt;h2&gt;
  
  
  ASI10 — Rogue and Emergent Agent Behavior
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt; Agents that deviate from intended behavior in ways that weren't explicitly programmed or injected — emergent behavior from complex multi-agent interactions, unexpected capability combinations, or goal generalization.&lt;/p&gt;

&lt;p&gt;This is the hardest one. No signature, no pattern, no injection. The agent is behaving according to its training and instructions in a way that produces harmful outcomes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Defense:&lt;/strong&gt; Immutable cryptographic audit trails. If something goes wrong and you can't explain why, you need to reconstruct every decision the agent made, what information it had, and what actions it took. Behavioral monitoring for statistical anomalies.&lt;/p&gt;




&lt;h2&gt;
  
  
  Where the industry is right now
&lt;/h2&gt;

&lt;p&gt;OpenAI said in December 2025 that prompt injection may "never be solved" for browser agents. That's an honest statement — and it's not a reason to give up. It's a reason to build independent runtime security that doesn't rely on the model being incorruptible.&lt;/p&gt;

&lt;p&gt;Nearly half of security professionals (48%) now rank agentic AI as the #1 attack vector for 2026. Federal procurement guidance published in March 2026 recommends OWASP Agentic Top 10 compliance as a formal standard.&lt;/p&gt;

&lt;p&gt;The arms race is real. The defenses are real too.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Crawdad covers today
&lt;/h2&gt;

&lt;p&gt;Crawdad is a zero-knowledge runtime security layer for autonomous AI agents. One environment variable routes any agent framework through a local sidecar that scans every message in &amp;lt;1ms. No content leaves the customer's network.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;ASI Risk&lt;/th&gt;
&lt;th&gt;Coverage&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;ASI01 Agent Goal Hijack&lt;/td&gt;
&lt;td&gt;✅ 27 pattern categories + semantic heuristics&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ASI02 Prompt Injection&lt;/td&gt;
&lt;td&gt;✅ 5-layer pipeline, session context tracking&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ASI03 Identity Abuse&lt;/td&gt;
&lt;td&gt;✅ Ed25519 identity, mTLS, scoped credentials&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ASI04 Memory Poisoning&lt;/td&gt;
&lt;td&gt;✅ Merkle-chained memory, Ed25519 signed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ASI05 Tool Misuse&lt;/td&gt;
&lt;td&gt;✅ Policy engine, action authorization&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ASI06 Resource Abuse&lt;/td&gt;
&lt;td&gt;🔄 Roadmap Q2 2026&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ASI07 Data Exfiltration&lt;/td&gt;
&lt;td&gt;✅ 15-category PII/credential detection outbound&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ASI08 Cascading Failures&lt;/td&gt;
&lt;td&gt;✅ Byzantine fault detection (partial)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ASI09 Trust Exploitation&lt;/td&gt;
&lt;td&gt;🔄 Roadmap Q3 2026&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ASI10 Rogue Behavior&lt;/td&gt;
&lt;td&gt;✅ Cryptographic audit trail (partial)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;As of this week: live threat intelligence feeds monitoring 10 sources every 4 hours, with signatures that auto-update to deployed sidecars within minutes of admin approval — cryptographically signed, verified by each sidecar before loading. When the LiteLLM supply chain attack was confirmed on March 25, 2026, a blocking signature was proposed, tested, and available for deployment within 24 hours.&lt;/p&gt;




&lt;h2&gt;
  
  
  Getting started
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install&lt;/span&gt;
curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://getcrawdad.dev/install.sh | sh

&lt;span class="c"&gt;# Configure your agent&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;ANTHROPIC_BASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;http://localhost:7748

&lt;span class="c"&gt;# Everything else stays the same&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Works with OpenClaw, LangChain, CrewAI, AutoGen, Claude Code, and any agent framework using Anthropic, OpenAI, or Google SDKs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Free tier: 10,000 scans/month. No credit card required.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://getcrawdad.dev" rel="noopener noreferrer"&gt;getcrawdad.dev&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Andrew Sispoidis is the founder of Crawdad. He has founded 7 companies and had 4 exits. Crawdad is live in production, source-available under BSL 1.1.&lt;/em&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>How CVE-2026-25253 exposed every OpenClaw user to RCE — and how to fix it in one command</title>
      <dc:creator>AndrewSispoidis</dc:creator>
      <pubDate>Mon, 23 Mar 2026 22:24:10 +0000</pubDate>
      <link>https://dev.to/andrewsispoidis/how-cve-2026-25253-exposed-every-openclaw-user-to-rce-and-how-to-fix-it-in-one-command-2dj</link>
      <guid>https://dev.to/andrewsispoidis/how-cve-2026-25253-exposed-every-openclaw-user-to-rce-and-how-to-fix-it-in-one-command-2dj</guid>
      <description>&lt;p&gt;CVE-2026-25253 scored 8.8 on the CVSS scale. It let any website steal your OpenClaw auth token and get remote code execution on your machine through a single malicious link.&lt;/p&gt;

&lt;p&gt;You didn't have to click anything suspicious. You just had to visit a webpage while OpenClaw was running.&lt;/p&gt;

&lt;p&gt;This is the attack surface problem with autonomous AI agents — and CVE-2026-25253 is just the most visible example.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why AI agents are uniquely dangerous
&lt;/h2&gt;

&lt;p&gt;Traditional software has a clear boundary between the application and the outside world. AI agents don't.&lt;/p&gt;

&lt;p&gt;An OpenClaw agent can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Execute arbitrary shell commands&lt;/li&gt;
&lt;li&gt;Control a browser and interact with any website&lt;/li&gt;
&lt;li&gt;Read and write files anywhere on your system&lt;/li&gt;
&lt;li&gt;Send emails and messages on your behalf&lt;/li&gt;
&lt;li&gt;Install new skills from external registries&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All of this happens autonomously. The agent decides what to do based on instructions — and those instructions can come from anywhere: a webpage it visits, a document it reads, an email it processes, a skill it installs.&lt;/p&gt;

&lt;p&gt;This creates a class of attacks called &lt;strong&gt;prompt injection&lt;/strong&gt; — malicious instructions embedded in data that hijack the agent's behavior. OWASP formalized ten risk categories for LLM applications in its LLM Top 10, the foundation its newer agentic guidance builds on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;LLM01&lt;/strong&gt; — Prompt Injection&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LLM02&lt;/strong&gt; — Insecure Output Handling&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LLM03&lt;/strong&gt; — Training Data Poisoning&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LLM04&lt;/strong&gt; — Model Denial of Service&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LLM05&lt;/strong&gt; — Supply Chain Vulnerabilities&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LLM06&lt;/strong&gt; — Sensitive Information Disclosure&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LLM07&lt;/strong&gt; — Insecure Plugin Design&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LLM08&lt;/strong&gt; — Excessive Agency&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LLM09&lt;/strong&gt; — Overreliance&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LLM10&lt;/strong&gt; — Model Theft&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;CVE-2026-25253 is a direct example of LLM01 and LLM08 in combination. The agent had excessive agency (full system access) and no semantic firewall to detect it was being hijacked.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's missing from every AI agent framework
&lt;/h2&gt;

&lt;p&gt;CrowdStrike, Cisco, and Microsoft have all published research on the security gaps in autonomous AI agents. The findings overlap:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No identity layer&lt;/strong&gt; — any process can claim to be any agent&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No action authorization&lt;/strong&gt; — agents decide what to execute themselves, based on instructions that can be manipulated&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No memory integrity&lt;/strong&gt; — an agent's past context can be silently poisoned across sessions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No skill vetting&lt;/strong&gt; — plugins are markdown files with no hash verification or capability attestation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No PII guardrails&lt;/strong&gt; — agents can exfiltrate sensitive data through third-party skills without detection&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;OpenClaw patched CVE-2026-25253. But the underlying architecture — an autonomous agent with full system access and no independent security layer — remains unchanged.&lt;/p&gt;

&lt;h2&gt;
  
  
  The fix: a runtime security layer the agent can't override
&lt;/h2&gt;

&lt;p&gt;I spent the past several months building &lt;strong&gt;Crawdad&lt;/strong&gt; — a runtime security API that sits between your AI agent and everything it can do.&lt;/p&gt;

&lt;p&gt;The key design principle: the security layer has to be independent of the agent. If the agent controls its own security, a successful prompt injection attack can simply disable it.&lt;/p&gt;

&lt;p&gt;Crawdad intercepts at three points:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Inbound&lt;/strong&gt; — every message the agent receives is scanned for prompt injection patterns before the LLM sees it. 27 pattern categories, structural deobfuscation, Unicode normalization, base64 detection. An injected instruction in a webpage, document, or email gets caught here.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Action authorization&lt;/strong&gt; — every tool call goes through a policy engine before execution. Shell commands, file writes, browser actions, external API calls — each one is evaluated against configurable policies and a 5-factor risk score. The Rule of Two prevents any agent from simultaneously holding untrusted input, sensitive data, and code execution capability.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Outbound&lt;/strong&gt; — every response is scanned for PII (15 categories), credentials, and API keys before it leaves the agent. Data exfiltration through third-party skills gets caught here.&lt;/p&gt;
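&lt;p&gt;The Rule of Two mentioned in the authorization step can be sketched as a simple capability check (the capability names are illustrative):&lt;/p&gt;

```python
CAPABILITIES = {"untrusted_input", "sensitive_data", "code_execution"}

def rule_of_two_ok(granted: set) -> bool:
    """An agent may hold at most two of the three dangerous capabilities;
    holding all three is the lethal combination for injection attacks."""
    return len(granted & CAPABILITIES) <= 2

def grant(current: set, new_cap: str) -> set:
    proposed = current | {new_cap}
    if not rule_of_two_ok(proposed):
        raise PermissionError("rule of two violated: " + ", ".join(sorted(proposed)))
    return proposed
```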

&lt;p&gt;Beyond these three intercept points, Crawdad provides:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cryptographic agent identity&lt;/strong&gt; — Ed25519 + CRYSTALS-Kyber1024 hybrid keypairs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory integrity&lt;/strong&gt; — Merkle-chained memory entries with Ed25519 signatures, preventing context poisoning&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Skill attestation&lt;/strong&gt; — SHA-256 hash verification and static analysis on every installed skill&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Byzantine fault detection&lt;/strong&gt; — automatic isolation of agents showing anomalous behavior&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Immutable audit log&lt;/strong&gt; — cryptographically sealed, tamper-evident record of every security decision&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Post-quantum cryptography&lt;/strong&gt; — CRYSTALS-Kyber1024 (NIST FIPS 203) for key encapsulation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Built in Rust. 607 tests passing. Under 10ms p99 latency.&lt;/p&gt;

&lt;h2&gt;
  
  
  For OpenClaw users: one command
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/AndrewSispoidis/crawdad-openclaw ~/.openclaw/skills/crawdad
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Crawdad skill hooks into every OpenClaw agent automatically — scanning every inbound message, authorizing every tool call, filtering every outbound response. A free API key is provisioned on first run. No configuration required.&lt;/p&gt;

&lt;p&gt;The skill code is open source: &lt;a href="https://github.com/AndrewSispoidis/crawdad-openclaw" rel="noopener noreferrer"&gt;github.com/AndrewSispoidis/crawdad-openclaw&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  For everyone else
&lt;/h2&gt;

&lt;p&gt;Crawdad works with any agent framework — LangChain, CrewAI, AutoGen, or anything you've built yourself.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;crawdad-sdk
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;crawdad.openclaw&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;CrawdadMiddleware&lt;/span&gt;

&lt;span class="n"&gt;mw&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;CrawdadMiddleware&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://crawdad-production.up.railway.app&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Scan inbound for prompt injection
&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mw&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;scan_inbound&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user message&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Gate tool execution through policy
&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mw&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;authorize_action&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agent_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;shell_exec&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/bin/bash&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Scan outbound for PII
&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mw&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;scan_outbound&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Contact john at example.com&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Free tier: 10,000 API calls/month. No credit card.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://getcrawdad.dev" rel="noopener noreferrer"&gt;getcrawdad.dev&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What CVE-2026-25253 tells us
&lt;/h2&gt;

&lt;p&gt;The vulnerability was patched. But the conditions that made it possible — an autonomous agent with full system access, no independent security layer, no action authorization — are present in every AI agent framework shipping today.&lt;/p&gt;

&lt;p&gt;CVE-2026-25253 is the first of many. If you're running AI agents in any environment that matters, the time to add a security layer is before the next CVE, not after it.&lt;/p&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>openclaw</category>
      <category>security</category>
    </item>
  </channel>
</rss>
