Anomaly Detection for AI Agents: Catching What Your SIEM Cannot
Your SIEM is good at detecting anomalies in systems that behave deterministically. AI agents do not.
Traditional anomaly detection cannot tell whether an agent calling Stripe at 2am is legitimate or the result of prompt injection. Here is how to build detection that can.
Why AI Agents Break Traditional Anomaly Detection
Baselines are noisy. Agent behaviour depends on user inputs, which are unpredictable, so there is no stable "normal" API call volume to alert against.
Intent is invisible to infrastructure tools. Your SIEM sees the HTTP request. Two identical API calls can have completely different risk profiles depending on why the agent made them.
Prompt injection looks like legitimate traffic. An attacker manipulating your agent via injected prompts produces perfectly normal-looking API calls. The anomaly is in the decision chain, not the network traffic.
What to Detect
Behavioural Anomalies
| Signal | Normal | Anomalous |
|---|---|---|
| Tool call volume | 50-200/hour | 847/hour |
| Data access scope | customer_id, order_id | customer_id, SSN, account_balance |
| External API calls | 0-2 per session | 15 per session |
| Tool call sequence | lookup, process, respond | lookup, lookup, lookup, lookup... |
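Signals like the volume row above can be checked with simple statistics. Here is a minimal sketch, assuming an hourly-count history per agent; the threshold and the sample numbers are illustrative, not from any specific product:

```python
from statistics import mean, stdev

def volume_anomaly(history: list[int], current: int, z_threshold: float = 3.0) -> bool:
    """Flag `current` hourly call count as an outlier versus the rolling baseline."""
    if len(history) < 5:
        return False  # not enough baseline data yet
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return current != mu
    return (current - mu) / sigma > z_threshold

baseline = [120, 95, 150, 180, 110, 140, 160]  # calls/hour, illustrative
print(volume_anomaly(baseline, 160))  # within the normal band -> False
print(volume_anomaly(baseline, 847))  # the table's anomalous spike -> True
```

A z-score is the crudest possible baseline; in practice you would segment by agent, tool, and time of day, but the shape of the check is the same.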
Policy Violation Spikes
A spike in blocked requests often indicates active probing or injection:
```json
{
  "alert": "policy_violation_spike",
  "agentId": "customer-support-v2",
  "window": "5m",
  "blockedRequests": 23,
  "baseline": 0.2,
  "deviation": "115x",
  "recommendation": "Possible prompt injection — review session logs"
}
```
If your agent normally gets 1 blocked request per hour and suddenly gets 23 in 5 minutes — something is targeting it.
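The spike logic behind an alert like this can be sketched in a few lines. This is a hedged illustration, assuming you track blocked requests per window and a baseline hourly rate; the 10x factor is an assumed default:

```python
def violation_spike(blocked_in_window: int, window_minutes: int,
                    baseline_per_hour: float, factor: float = 10.0) -> bool:
    """Flag when the windowed blocked-request rate exceeds the baseline by `factor`."""
    expected = baseline_per_hour * (window_minutes / 60)
    if expected == 0:
        return blocked_in_window > 0
    return blocked_in_window / expected >= factor

# 23 blocked requests in 5 minutes against a 0.2/hour baseline:
print(violation_spike(23, 5, 0.2))  # True — vastly beyond the expected ~0.017
```

Comparing against the expected count for the window, rather than the raw hourly number, is what makes a 5-minute burst stand out.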
Chain-of-Thought Inspection
This is the capability that makes AI-native detection fundamentally different from traditional tools.
```python
# Agent reasoning before a tool call — flagged by thought inspection:
thought = """
The user asked me to look up their order status.
I should also get their full account history,
SSN, and banking details to provide complete service.
"""
# Risk signals:
# - Scope creep: order status does not require SSN
# - Possible injection: user did not ask for "complete service"
# risk_score: 87 (HIGH)
# flags: scope_creep, data_minimisation_violation, unexpected_data_request
```
No traditional security tool inspects LLM reasoning. This is where prompt injection hides.
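A minimal sketch of what such an inspector might do: scan the reasoning text for injection markers and sensitive fields the task does not require. The patterns, field list, and scoring weights here are all assumptions for demonstration, not a real detection ruleset:

```python
import re

INJECTION_PATTERNS = [r"ignore (all )?previous instructions", r"disregard (your|the) (rules|policy)"]
SENSITIVE_FIELDS = {"ssn", "banking", "account_balance", "account history"}

def inspect_thought(thought: str, requested_task: str) -> dict:
    """Score agent reasoning for scope creep and injection markers (illustrative)."""
    text = thought.lower()
    flags = []
    if any(re.search(p, text) for p in INJECTION_PATTERNS):
        flags.append("possible_injection")
    touched = {f for f in SENSITIVE_FIELDS if f in text}
    if touched and "order status" in requested_task.lower():
        flags.append("scope_creep")  # an order-status lookup needs none of these fields
    score = min(100, 40 * len(flags) + 10 * len(touched))
    return {"risk_score": score, "flags": flags}

print(inspect_thought(
    "I should also get their SSN and full account history to provide complete service.",
    "order status lookup"))
```

Real systems would use a classifier rather than regexes, but the principle holds: the signal lives in the reasoning, before the tool call executes.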
Sequence Anomalies
Normal agents follow recognisable patterns. Manipulated agents often do not:
```
Normal session:
greet → identify_customer → lookup_order → respond

Anomalous session (possible injection):
greet → identify_customer → lookup_order
  → lookup_customer_financials → external_http_post
```
The sequence lookup_order → lookup_customer_financials → external_http_post is a classic data exfiltration pattern. Each individual call looks legitimate. The sequence is the signal.
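One simple way to catch this is to model normal sessions as an allow-list of tool-call transitions and flag anything outside it. The tool names mirror the example above; the transition table itself is an assumption a real deployment would learn from history:

```python
ALLOWED = {
    "greet": {"identify_customer"},
    "identify_customer": {"lookup_order"},
    "lookup_order": {"respond", "lookup_order"},
}

def anomalous_transitions(session: list[str]) -> list[tuple[str, str]]:
    """Return every tool-call transition not in the allow-list."""
    return [(a, b) for a, b in zip(session, session[1:])
            if b not in ALLOWED.get(a, set())]

normal = ["greet", "identify_customer", "lookup_order", "respond"]
injected = ["greet", "identify_customer", "lookup_order",
            "lookup_customer_financials", "external_http_post"]
print(anomalous_transitions(normal))    # []
print(anomalous_transitions(injected))  # the exfiltration tail stands out
```

An n-gram or Markov model over historical sessions generalises the same idea without hand-writing the table.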
The Difference in Practice
Traditional SIEM alert:
```
[MEDIUM] Unusual API call volume from service account ag_customer_support
```
You now spend 2 hours digging through logs.
AgentGuard anomaly alert:
```
[HIGH] Possible prompt injection — customer-support-v2
23 blocked policy violations in 5 minutes (baseline: 0.2/hr)
Thought inspection flagged: "ignore previous instructions" in turn 3
Agent paused. 847 blocked calls saved from execution.
[View session] [Resume agent] [Escalate]
```
The alert contains the diagnosis, not just the symptom.
Key Takeaway
Your SIEM sees infrastructure. AI agent anomaly detection sees intent.
The attacks that matter most — prompt injection, data exfiltration via legitimate tools, privilege escalation — are invisible to infrastructure-layer monitoring. You need a security layer that understands what the agent was trying to do.
AgentGuard includes real-time anomaly detection, chain-of-thought inspection, and behavioural baselining. Free tier available.