Stephen Souza

If your AI agent looped 40 times last night, would you know?

Probably not. And that's the problem.

Your process was running. Tool calls were firing. No exceptions thrown. From every angle your stack can see — healthy agent doing legitimate work.

What was actually happening: the agent called web_search with "latest AI news". Got results. Called it again with "recent AI developments". Got similar results. Called it again. And again. 47 tool calls. Zero useful output. $4.80 in tokens. Nobody knew until the OpenAI invoice arrived.

This is the failure mode conventional monitoring was never built to catch.


Why your current monitoring misses it

Your uptime monitor sees: process running.
Your error monitor sees: no exceptions thrown.
Your log monitor sees: tool calls completing successfully.
Your OpenAI dashboard sees: token consumption — but no per-task breakdown.

Nothing in that stack identifies that the agent is calling the same tool repeatedly with no progress toward completion. A looping agent looks identical to a healthy agent doing legitimate multi-step research.

The signal isn't an error. It's a pattern — same tool, repeated calls, no convergence toward a completion event. Catching that requires monitoring that understands what the agent is supposed to be doing, not just whether it's technically running.


What actually causes loops

Ambiguous task objective — the agent can't determine if it has "enough" to complete the task. It keeps searching for a confidence threshold it can never reach.

Tool output that looks like progress but isn't — each result genuinely looks like partial progress. The agent never recognises the cycle.

Malformed tool responses — the agent retries with slightly different parameters, gets the same malformed response, retries again.

Context window pressure — as context fills with results, the agent loses track of what it already tried and starts repeating earlier tool calls.

max_iterations as the only safeguard — stops the loop but doesn't alert you, doesn't tell you how many times the agent looped, and doesn't fire until the budget is already spent. A circuit breaker, not a monitor, as the sketch below shows.
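
Here is that circuit-breaker failure mode in miniature — a sketch of a bare agent loop where max_iterations is the only safety net. agent and execute_tool are the same stand-ins used in the NotiLens examples later in this post:

MAX_ITERATIONS = 50

def run_agent(task):
    for _ in range(MAX_ITERATIONS):
        tool_name, tool_input = agent.decide_next_action()
        result = execute_tool(tool_name, tool_input)
        if agent.is_done(result):
            return result
    # The loop is stopped, but silently: no alert fires, no iteration
    # count is recorded, and the full budget has already been spent.
    return None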


Three loop signatures to watch

Signature 1 — Same tool, repeated calls

web_search → web_search → web_search ← loop detected

Same tool appears in 3 of the last 5 calls without a completion event.
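
If you wanted to hand-roll this signature, it reduces to a sliding window over recent tool names. A minimal sketch of exactly the heuristic above — the window and threshold come from that sentence, not from NotiLens internals:

from collections import Counter, deque

WINDOW_SIZE = 5
REPEAT_THRESHOLD = 3

recent_tools = deque(maxlen=WINDOW_SIZE)

def looks_like_loop(tool_name, completed=False):
    """True if one tool dominates the recent-call window with no completion."""
    if completed:
        recent_tools.clear()  # a completion event resets the window
        return False
    recent_tools.append(tool_name)
    (_, count), = Counter(recent_tools).most_common(1)
    return count >= REPEAT_THRESHOLD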

Signature 2 — High iteration count, no completion

Tool calls: 18
Normal completion at: 4–6 tool calls
Status: no completion event
→ Loop detected

Signature 3 — Token budget anomaly

Tokens consumed: 42,000
Normal consumption: 3,000–6,000
Status: no completion event
→ Budget anomaly detected
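
Signatures 2 and 3 are both baseline comparisons, so they can share one check. A sketch with illustrative numbers taken from the figures above; in practice the baselines come from historical runs, which is what NotiLens learns for you:

BASELINE_CALLS = 5        # normal completion at 4-6 tool calls
BASELINE_TOKENS = 4_500   # normal consumption 3,000-6,000
MULTIPLIER = 3            # how far past baseline counts as anomalous

def run_anomalies(tool_calls, tokens, completed):
    """Returns a list of anomaly descriptions for an in-flight run."""
    if completed:
        return []
    anomalies = []
    if tool_calls > BASELINE_CALLS * MULTIPLIER:
        anomalies.append(f"{tool_calls} tool calls, baseline ~{BASELINE_CALLS}, no completion event")
    if tokens > BASELINE_TOKENS * MULTIPLIER:
        anomalies.append(f"{tokens} tokens, baseline ~{BASELINE_TOKENS}, no completion event")
    return anomalies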

Setting up loop detection with NotiLens

Install:

pip install notilens
npm install @notilens/notilens

The pattern is simple — call run.loop() on every agent iteration. NotiLens's ML baseline learns how many iterations your agent normally runs and alerts when a run's iteration count is anomalously high with no run.complete() arriving. No threshold to configure. No detection logic to write.

Python

import notilens

nl  = notilens.init(name="ai-agent")
run = nl.task("research")
run.start()

for iteration in range(max_iterations):
    tool_name, tool_input = agent.decide_next_action()

    run.loop(f"[{iteration + 1}] Tool: {tool_name}")  # every iteration
    run.metric("tool_calls", 1)                        # accumulates

    result = execute_tool(tool_name, tool_input)

    if agent.is_done(result):
        break

run.metric("tokens", total_tokens_used)
run.complete("Task completed")

Node.js

import { NotiLens } from '@notilens/notilens';

const nl  = NotiLens.init('ai-agent');
const run = nl.task('research');
run.start();

for (let i = 0; i < maxIterations; i++) {
  const { toolName, toolInput } = agent.decideNextAction();

  run.loop(`[${i + 1}] Tool: ${toolName}`);  // every iteration
  run.metric('tool_calls', 1);               // accumulates

  const result = await executeTool(toolName, toolInput);

  if (agent.isDone(result)) break;
}

run.metric('tokens', totalTokensUsed);
run.complete('Task completed');

Token and cost tracking

Track token consumption per run — NotiLens uses this alongside iteration count to detect cost anomalies:

run.metric("tokens", tokens_used_this_call)
run.metric("cost_usd", round(tokens_used_this_call * 0.0000002, 6))

If a run consumes 5x more tokens than your baseline without completing, NotiLens flags it — even without a manually configured budget limit.


Agent stall detection

For agents pausing on slow tools or external APIs:

run.wait("Awaiting API response")
result = call_slow_external_api()
run.progress("API response received")

run.wait() is non-terminal — the run continues. NotiLens learns how long your agent normally spends between events and fires if the gap becomes anomalous.
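
To keep the wait/progress pair from drifting apart as the code grows, you can wrap them. The waiting() helper below is my own convenience, not part of the NotiLens API; it only uses the run.wait() and run.progress() calls shown above:

from contextlib import contextmanager

@contextmanager
def waiting(run, label):
    run.wait(f"Awaiting {label}")      # non-terminal: the run stays open
    yield
    run.progress(f"{label} received")  # ends the gap NotiLens is timing

with waiting(run, "API response"):
    result = call_slow_external_api()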


What the alert looks like

✅ task.started      Research agent — task started
🔄 task.loop         [1] Tool: web_search
🔄 task.loop         [2] Tool: web_search
🔄 task.loop         [3] Tool: web_search
🔄 task.loop         [4] Tool: web_search
🔄 task.loop         [5] Tool: web_search
🔄 task.loop         [6] Tool: web_search
⚠️  Anomaly detected  Iteration count 6 — exceeds learned baseline (avg: 3.2)
                     No task.complete received
                     tool_calls: 6 | tokens: 8,420 | cost: $0.0017
→ Push notification fired

One line per iteration. NotiLens detected the pattern automatically.


Full agent monitoring checklist

  • run.loop() called on every agent iteration
  • run.start() fires when task begins
  • run.complete() fires on successful completion
  • run.fail() fires on any unhandled exception
  • run.error() fires on non-fatal tool errors
  • run.timeout() fires if agent exceeds SLA window
  • run.wait() fires when agent pauses on slow external call
  • run.metric("tool_calls", 1) accumulates per iteration
  • run.metric("tokens", ...) tracks token usage
  • run.metric("cost_usd", ...) tracks cost per run

The short answer

max_iterations stops the loop. NotiLens tells you it happened at call 6, not call 47 — while there's still something to investigate.

Works with LangChain, CrewAI, AutoGen, LlamaIndex, Pydantic AI, or any custom agent loop. No framework dependency.

notilens.com
