Kacper Włodarczyk

Posted on • Originally published at Medium

StuckLoopDetection: How We Stopped an Agent Burning $12 on 47 Identical Calls

TL;DR: Most agent loops aren't model failures — they're mechanical repetitions that the model itself doesn't recognize. pydantic-deep v0.3.8 introduces StuckLoopDetection, a capability that catches three loop patterns before they waste tokens.

This is post 1/3 in the "Self-Aware Agents" series. Overview of all 5 releases here.


Here's the incident that made this necessary.

A coding agent was working on a refactor task overnight. It hit a file with an unusual import pattern, couldn't parse the result, and defaulted to reading the file again.

By morning: 47 calls to read_file on the same path. $12 in API costs. Zero progress.

The model wasn't broken. Each call looked locally reasonable. From outside: it was stuck.

Why Prompting Isn't Enough

"Don't repeat tool calls" in a system prompt works sometimes. The problems:

  1. The model often doesn't recognize loops as loops — each repeated call looks locally justified
  2. Prompt compliance degrades under cognitive load (long tasks, many tools, complex context)
  3. You have to add the instruction to every agent separately

Detection at the capability level fixes all three.

The Three Loop Patterns

Pattern 1: Repeated Identical Calls

Turn 12: read_file(path="src/config.json")   {"imports": [...], "unknown_field": ...}
Turn 13: read_file(path="src/config.json")   {"imports": [...], "unknown_field": ...}
Turn 14: read_file(path="src/config.json")   same result

The agent can't process the result, has no fallback, and tries again. Default threshold: 3 calls.
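Pattern 1 can be sketched in a few lines of plain Python. This is an illustration only, not pydantic-deep's implementation: RepeatDetector and call_signature are hypothetical names, and the only detail taken from the post is the default threshold of 3.

```python
import json
from typing import Any, Optional


def call_signature(tool: str, args: dict[str, Any]) -> str:
    """Canonical string for a tool call; sorted keys make equal args compare equal."""
    return f"{tool}:{json.dumps(args, sort_keys=True)}"


class RepeatDetector:
    """Counts consecutive identical tool calls (hypothetical helper for illustration)."""

    def __init__(self, max_repeated: int = 3) -> None:
        self.max_repeated = max_repeated
        self._last: Optional[str] = None
        self._count = 0

    def record(self, tool: str, args: dict[str, Any]) -> bool:
        """Record one call; return True once it has run max_repeated times in a row."""
        sig = call_signature(tool, args)
        if sig == self._last:
            self._count += 1
        else:
            # A different call breaks the streak and starts a new one.
            self._last, self._count = sig, 1
        return self._count >= self.max_repeated


detector = RepeatDetector(max_repeated=3)
flags = [detector.record("read_file", {"path": "src/config.json"}) for _ in range(3)]
print(flags)  # [False, False, True]
```

Comparing a canonical signature rather than the raw arguments means that a reordered but otherwise identical argument dict still counts as a repeat.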

Pattern 2: A-B-A-B Alternating

Turn 8:  list_directory(path="src/")
Turn 9:  read_file(path="src/main.py")
Turn 10: list_directory(path="src/")
Turn 11: read_file(path="src/main.py")

Tool A suggests Tool B, Tool B suggests Tool A. Looks like progress — it's not.
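One way to catch the alternating pattern is to look at the tail of the call history and check whether it is a strict A, B, A, B sequence. The sketch below assumes calls have already been reduced to signature strings; is_alternating and its cycles parameter are hypothetical and not part of the library's API.

```python
from typing import Sequence


def is_alternating(history: Sequence[str], cycles: int = 2) -> bool:
    """Return True if the last 2*cycles calls are A, B, A, B, ... with A != B."""
    window = list(history[-2 * cycles:])
    if len(window) < 2 * cycles:
        return False  # not enough history to decide yet
    a, b = window[0], window[1]
    if a == b:
        return False  # identical calls are the repeated-call pattern, not alternation
    return all(sig == (a if i % 2 == 0 else b) for i, sig in enumerate(window))


history = [
    'list_directory:{"path": "src/"}',
    'read_file:{"path": "src/main.py"}',
    'list_directory:{"path": "src/"}',
    'read_file:{"path": "src/main.py"}',
]
print(is_alternating(history))      # True
print(is_alternating(history[:3]))  # False: only one and a half cycles so far
```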

Pattern 3: No-Op Loops

The call runs, returns the same result as before, and the agent keeps going anyway. Common with writes, status checks, and verification calls.
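A no-op check has to look at the result as well as the call. A minimal sketch, assuming results can be rendered as strings: it counts identical (call, result) pairs and fires when one pair reaches the threshold. NoOpDetector is a hypothetical name used for illustration, not the library's class.

```python
import hashlib
from collections import Counter


class NoOpDetector:
    """Flags a call once the same (call, result) pair has been seen max_repeated times."""

    def __init__(self, max_repeated: int = 3) -> None:
        self.max_repeated = max_repeated
        self._seen: Counter = Counter()

    def record(self, signature: str, result: str) -> bool:
        # Hash the result so large tool outputs are not kept in memory.
        digest = hashlib.sha256(result.encode("utf-8")).hexdigest()
        key = (signature, digest)
        self._seen[key] += 1
        return self._seen[key] >= self.max_repeated


detector = NoOpDetector(max_repeated=3)
print(detector.record("check_status:{}", "pending"))  # False
print(detector.record("check_status:{}", "pending"))  # False
print(detector.record("check_status:{}", "running"))  # False: the result changed
print(detector.record("check_status:{}", "pending"))  # True: third identical outcome
```

Keying on the pair means a status poll whose output keeps changing is not counted against the outputs it has already moved past.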

The Implementation

from pydantic_deep import create_deep_agent
from pydantic_deep.capabilities import StuckLoopDetection

# Default: enabled, triggers after 3 repeated calls
agent = create_deep_agent(
    model="anthropic:claude-opus-4-6",
    stuck_loop_detection=True,
)

# Custom config
agent = create_deep_agent(
    model="anthropic:claude-opus-4-6",
    capabilities=[
        StuckLoopDetection(
            max_repeated=5,
            action="warn",   # "warn" = ModelRetry, "error" = StuckLoopError
        )
    ],
)

action="warn" (default)

Triggers ModelRetry. The model gets a message:

You have called read_file(path="src/config.json") 3 times with identical arguments
and received the same result. This indicates a stuck loop. Try a different approach.

Most of the time, the model pivots. If it doesn't, the threshold triggers again.

action="error"

Raises StuckLoopError. Clean failure for automated pipelines.

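import asyncio

from pydantic_deep.capabilities import StuckLoopDetection, StuckLoopError

async def main() -> None:
    try:
        # `agent` is an agent configured with action="error"
        result = await agent.run("refactor the imports in src/")
    except StuckLoopError as e:
        print(f"Agent got stuck: {e.pattern} pattern detected")

asyncio.run(main())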
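The difference between the two actions comes down to which exception is raised once a loop is detected. The sketch below is a stand-in, not the library's code: RetrySignal and LoopError are local placeholder classes for ModelRetry and StuckLoopError, on_loop_detected is a hypothetical helper, and the "repeated" pattern name is an assumption. Only the message text is taken from the post.

```python
import json
from typing import Any


class RetrySignal(Exception):
    """Placeholder for a retry exception: its message is sent back to the model."""


class LoopError(Exception):
    """Placeholder for a hard failure that ends the run."""

    def __init__(self, pattern: str, message: str) -> None:
        super().__init__(message)
        self.pattern = pattern


def on_loop_detected(tool: str, args: dict[str, Any], count: int, action: str = "warn") -> None:
    """Raise the exception that matches the configured action."""
    rendered = ", ".join(f"{k}={json.dumps(v)}" for k, v in args.items())
    message = (
        f"You have called {tool}({rendered}) {count} times with identical arguments "
        "and received the same result. This indicates a stuck loop. Try a different approach."
    )
    if action == "warn":
        raise RetrySignal(message)  # the model sees the message and can change course
    if action == "error":
        raise LoopError("repeated", message)  # "repeated" is an assumed pattern name
    raise ValueError(f"unknown action: {action!r}")


try:
    on_loop_detected("read_file", {"path": "src/config.json"}, 3, action="warn")
except RetrySignal as e:
    print(e)
```

With "warn" the run continues and the model gets a chance to recover; with "error" the caller decides what to do, which suits unattended pipelines.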

Per-Run Isolation

Parallel agent.run() calls don't share stuck-detection state. Each run is isolated via for_run() — no leaked state between concurrent tasks.

import asyncio

async def main():
    # Safe to run concurrently with a shared agent instance
    return await asyncio.gather(
        agent.run("analyze src/module_a.py"),
        agent.run("analyze src/module_b.py"),
    )

results = asyncio.run(main())
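The isolation idea can be shown without the library. In this sketch the shared object holds only configuration, and a for_run() method hands each run its own mutable state. The post confirms that a for_run() hook exists; what it returns, and the Detector, LoopState and fake_run names, are assumptions made for illustration.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class LoopState:
    """Mutable per-run state: how often each call signature was seen."""
    counts: dict = field(default_factory=dict)


class Detector:
    """Shared configuration only; hands out fresh state for every run."""

    def __init__(self, max_repeated: int = 3) -> None:
        self.max_repeated = max_repeated

    def for_run(self) -> LoopState:
        return LoopState()


async def fake_run(detector: Detector, signature: str, calls: int) -> int:
    state = detector.for_run()  # nothing is stored on the shared detector
    for _ in range(calls):
        state.counts[signature] = state.counts.get(signature, 0) + 1
        await asyncio.sleep(0)  # yield so the two runs interleave
    return state.counts[signature]


async def main() -> list:
    shared = Detector()
    return await asyncio.gather(
        fake_run(shared, "read_file:a", 2),
        fake_run(shared, "read_file:a", 2),
    )


results = asyncio.run(main())
print(results)  # [2, 2]
```

If the counts lived on the shared detector instead, the two interleaved runs would push the same counter to 4 and trip the threshold even though neither run repeated a call three times.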

The Business Case

A 47-call loop at Claude Opus pricing: ~$12. Same task with detection: ~$0.50 + one ModelRetry.

Cost of stuck_loop_detection=True: zero API calls, negligible latency, enabled by default.

Even false positives are cheap: one ModelRetry message, then the model tries a different approach.


Tomorrow: LimitWarnerCapability — teaching agents to know their context window is almost full.

Get Started

curl -fsSL https://raw.githubusercontent.com/vstorm-co/pydantic-deep/main/install.sh | bash
