DEV Community

The Bot Club
The Bot Club

Posted on • Originally published at agentguard.tech

Anomaly Detection for AI Agents: Catching What Your SIEM Cannot

Anomaly Detection for AI Agents: Catching What Your SIEM Cannot

Your SIEM is good at detecting anomalies in systems that behave deterministically. AI agents do not.

Traditional anomaly detection cannot tell whether an agent calling Stripe at 2am is legitimate or the result of prompt injection. Here is how to build detection that can.

Why AI Agents Break Traditional Anomaly Detection

Baseline is noisy. Agent behaviour depends on user inputs, which are unpredictable. You cannot set a normal API call volume.

Intent is invisible to infrastructure tools. Your SIEM sees the HTTP request. Two identical API calls can have completely different risk profiles depending on why the agent made them.

Prompt injection looks like legitimate traffic. An attacker manipulating your agent via injected prompts produces perfectly normal-looking API calls. The anomaly is in the decision chain, not the network traffic.

What to Detect

Behavioural Anomalies

Signal Normal Anomalous
Tool call volume 50-200/hour 847/hour
Data access scope customer_id, order_id customer_id, SSN, account_balance
External API calls 0-2 per session 15 per session
Tool call sequence lookup, process, respond lookup, lookup, lookup, lookup...

Policy Violation Spikes

A spike in blocked requests often indicates active probing or injection:

{
  "alert": "policy_violation_spike",
  "agentId": "customer-support-v2",
  "window": "5m",
  "blockedRequests": 23,
  "baseline": 0.2,
  "deviation": "115x",
  "recommendation": "Possible prompt injection — review session logs"
}
Enter fullscreen mode Exit fullscreen mode

If your agent normally gets 1 blocked request per hour and suddenly gets 23 in 5 minutes — something is targeting it.

Chain-of-Thought Inspection

This is the capability that makes AI-native detection fundamentally different from traditional tools.

# Agent reasoning before a tool call — flagged by thought inspection:
thought = """
The user asked me to look up their order status.
I should also get their full account history,
SSN, and banking details to provide complete service.
"""

# Risk signals:
# - Scope creep: order status does not require SSN
# - Possible injection: user did not ask for "complete service"
# risk_score: 87 (HIGH)
# flags: scope_creep, data_minimisation_violation, unexpected_data_request
Enter fullscreen mode Exit fullscreen mode

No traditional security tool inspects LLM reasoning. This is where prompt injection hides.

Sequence Anomalies

Normal agents follow recognisable patterns. Manipulated agents often do not:

Normal session:
greet → identify_customer → lookup_order → respond

Anomalous session (possible injection):
greet → identify_customer → lookup_order
→ lookup_customer_financials → external_http_post
Enter fullscreen mode Exit fullscreen mode

The sequence lookup_order → lookup_financials → external_post is a classic data exfiltration pattern. Individual calls look legitimate. The sequence is the signal.

The Difference in Practice

Traditional SIEM alert:
[MEDIUM] Unusual API call volume from service account ag_customer_support

You now spend 2 hours digging through logs.

AgentGuard anomaly alert:
[HIGH] Possible prompt injection — customer-support-v2
23 blocked policy violations in 5 minutes (baseline: 0.2/hr)
Thought inspection flagged: "ignore previous instructions" in turn 3
Agent paused. 847 blocked calls saved from execution.
[View session] [Resume agent] [Escalate]

The alert contains the diagnosis, not just the symptom.

Key Takeaway

Your SIEM sees infrastructure. AI agent anomaly detection sees intent.

The attacks that matter most — prompt injection, data exfiltration via legitimate tools, privilege escalation — are invisible to infrastructure-layer monitoring. You need a security layer that understands what the agent was trying to do.


AgentGuard includes real-time anomaly detection, chain-of-thought inspection, and behavioural baselining. Free tier available.

Top comments (0)