AI agent reasoning loops occur when an agent calls the same tool repeatedly without making progress, convinced that one more attempt will produce the perfect answer. The agent wastes tokens, time, and money without delivering a result. This post shows how to detect and block repeated calls, validated with a demo where ambiguous tool feedback caused 14 calls while clear SUCCESS states stopped the agent after 2.
This demo uses Strands Agents. The patterns — debounce hooks, clear tool states, and call limits — are framework-agnostic and apply to any agent that supports lifecycle hooks, including LangGraph, AutoGen, and CrewAI.
Working code: github.com/aws-samples/sample-why-agents-fail
Series: Why AI Agents Fail
- Context Window Overflow — Memory Pointer Pattern for large data
- MCP Tools That Never Respond — Async pattern for slow external APIs
- AI Agent Reasoning Loops (this post) — Detect and block repeated tool calls
The Problem: Agents That Overthink
AI agents don't just fail by giving wrong answers; they fail by never finishing. Research shows agents get trapped in reasoning loops where they call the same tool repeatedly, convinced that "one more step" will produce the perfect answer — burning tokens and time without ever delivering a result.
The Decoder (Jan 2025) found that even with unlimited computing power, overthinking leads to poor decisions. Incomplete understanding of the world causes compounding errors. Each additional reasoning step makes things worse, not better.
Particula (Jul 2025) (community observation) documented an extreme case: an agent executed 847 reasoning steps at $47 per minute and never delivered a final answer. It kept refining logic, questioning conclusions, and requesting more data in an endless cycle.
CodiesHub (Dec 2025) (community observation) identifies the root causes:
- Unclear goals — agent doesn't know when the task is complete
- Ambiguous tool feedback — tools don't return clear success/failure states
- No stopping criteria — no hard limits on iterations or time
Why Loops Happen: Ambiguous Tool Feedback
Ambiguous tool feedback occurs when tools return partial results or hint that "more data may be available" without a clear terminal state. The agent reads this as an invitation to retry the same call:
```python
import random

from strands import tool

@tool
def search_flights(origin: str, destination: str, max_price: float) -> str:
    """Search for flights under a max price."""
    prices = [random.randint(200, 800) for _ in range(3)]
    matching = [p for p in prices if p <= max_price]
    # The problem: "More results may be available" signals the LLM to retry.
    # The agent interprets this as "I should search again to find a better deal."
    return (
        f"Found {len(matching)} flights under ${max_price} "
        f"(out of {len(prices)} checked). "
        "Note: More results may be available. Prices change frequently."
    )
```
That "Note: More results may be available" triggers the loop. The agent sees it and thinks: "Maybe if I search again, I'll find a better deal." It retries with the same parameters, gets similar results, and the cycle continues.
Solution 1: Debounce Hook with Strands
Strands Hooks intercept the agent lifecycle at any point. A Debounce Hook uses BeforeToolCallEvent to detect duplicate calls before they execute:
```python
from strands import Agent
from strands.hooks import BeforeInvocationEvent, BeforeToolCallEvent, HookProvider

class DebounceHook(HookProvider):
    def __init__(self, window_size=3):
        self.call_history = []  # Tracks (tool_name, input) pairs
        self.window_size = window_size  # Sliding window size for duplicate detection
        self.blocked_count = 0

    def register_hooks(self, registry):
        # BeforeInvocationEvent fires once at the start of each agent.invoke() call
        registry.add_callback(BeforeInvocationEvent, self.reset)
        # BeforeToolCallEvent fires before every tool execution — this is where we intercept
        registry.add_callback(BeforeToolCallEvent, self.check_duplicate)

    def reset(self, event):
        # Clear history at the start of each invocation so limits don't bleed across calls
        self.call_history = []

    def check_duplicate(self, event):
        # Build a fingerprint from tool name + exact inputs
        key = (event.tool_use["name"], str(event.tool_use["input"]))
        recent = self.call_history[-self.window_size:]
        if recent.count(key) >= 2:
            # cancel_tool is a native Strands API: blocks execution and returns this message to the LLM
            event.cancel_tool = "BLOCKED: Duplicate call detected"
            self.blocked_count += 1
            return
        self.call_history.append(key)

agent = Agent(tools=[search_flights], hooks=[DebounceHook()])
```
The hook tracks the last 3 tool calls. If the same tool with the same parameters appears twice, the third attempt is blocked via event.cancel_tool, a native Strands API that prevents tool execution and returns an error message to the LLM.
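The sliding-window check itself has no Strands dependency, which is why the pattern ports to other frameworks. Here is a rough framework-free sketch of the same fingerprinting logic (the class and method names are illustrative, not part of any SDK):

```python
class Debouncer:
    """Sliding-window duplicate detector for (tool_name, input) fingerprints."""

    def __init__(self, window_size=3, max_repeats=2):
        self.window_size = window_size
        self.max_repeats = max_repeats
        self.history = []

    def allow(self, tool_name, tool_input):
        # Fingerprint the call; str() makes dict inputs comparable
        key = (tool_name, str(tool_input))
        recent = self.history[-self.window_size:]
        if recent.count(key) >= self.max_repeats:
            return False  # this would be the 3rd identical call in the window
        self.history.append(key)
        return True

d = Debouncer()
params = {"origin": "JFK", "destination": "LHR", "max_price": 400}
print(d.allow("search_flights", params))  # True  (1st call)
print(d.allow("search_flights", params))  # True  (2nd call)
print(d.allow("search_flights", params))  # False (3rd identical call blocked)
```

Note that a blocked call is not appended to the history, so the agent can recover by changing parameters or switching tools.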
Solution 2: Clear SUCCESS/FAILED States
Tools that return explicit terminal states help agents know when to stop:
```python
import random

from strands import tool

@tool
def book_hotel(hotel: str, guest: str, nights: int) -> str:
    """Book a hotel room. Returns clear SUCCESS or FAILED.

    Returns:
        SUCCESS: Booking confirmed with ID
        FAILED: Booking failed with reason
    """
    if random.random() > 0.15:
        conf = f"HT{random.randint(10000, 99999)}"
        price = random.randint(150, 350)
        return f"SUCCESS: Booking {conf} confirmed — {guest} at {hotel}, {nights} nights, ${price * nights} total"
    return f"FAILED: {hotel} fully booked"
```
When the agent receives "SUCCESS: Booking HT79265 confirmed", it knows the task is done. No ambiguity, no extra calls.
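A side benefit of the prefix convention is that it can be checked mechanically, for example in tests that assert every tool returns a terminal state. A minimal sketch (the SUCCESS:/FAILED: prefixes come from the tool above; the helper name is my own):

```python
def is_terminal(tool_response: str) -> bool:
    """A tool response is terminal when it starts with an explicit state prefix."""
    return tool_response.startswith(("SUCCESS:", "FAILED:"))

print(is_terminal("SUCCESS: Booking HT79265 confirmed"))       # True
print(is_terminal("Found 2 flights, more may be available"))   # False
```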
Solution 3: Hard Limits with LimitToolCounts
CodiesHub recommends: "Iterations, tokens, time, spend are non-negotiable." Strands provides LimitToolCounts in the Hooks Cookbook — a hook that caps tool calls per invocation:
```python
from threading import Lock

from strands import Agent
from strands.hooks import BeforeInvocationEvent, BeforeToolCallEvent, HookProvider

class LimitToolCounts(HookProvider):
    """Limits tool calls per invocation. From the Strands Hooks Cookbook."""

    def __init__(self, max_tool_counts: dict[str, int]):
        # Per-tool call budgets: {"search_flights": 2} means max 2 searches per invocation
        self.max_tool_counts = max_tool_counts
        self.tool_counts = {}
        self._lock = Lock()  # Thread-safe for concurrent tool calls in Swarm scenarios

    def register_hooks(self, registry):
        registry.add_callback(BeforeInvocationEvent, self.reset_counts)
        registry.add_callback(BeforeToolCallEvent, self.intercept_tool)

    def reset_counts(self, event):
        # Reset per invocation so limits apply per task, not per agent lifetime
        with self._lock:
            self.tool_counts = {}

    def intercept_tool(self, event):
        tool_name = event.tool_use["name"]
        with self._lock:
            max_count = self.max_tool_counts.get(tool_name)
            count = self.tool_counts.get(tool_name, 0) + 1
            self.tool_counts[tool_name] = count
        if max_count and count > max_count:
            # Hard ceiling: block the call and tell the LLM explicitly to stop
            event.cancel_tool = f"Tool '{tool_name}' limit reached. DO NOT CALL ANYMORE."

# Enforce a hard limit of 2 flight searches per booking task — prevents runaway costs
limit_hook = LimitToolCounts(max_tool_counts={"search_flights": 2})
agent = Agent(tools=[search_flights], hooks=[limit_hook])
```
Even if the agent wants to search 10 times, it's capped at 2. Hard ceiling, predictable costs.
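Like the debounce check, the budget logic is portable. A framework-free sketch of the same reset-per-task ceiling (illustrative class name, not a Strands API):

```python
class CallBudget:
    """Per-tool call ceiling; reset() starts a fresh task."""

    def __init__(self, limits: dict[str, int]):
        self.limits = limits
        self.counts: dict[str, int] = {}

    def reset(self):
        # New task, fresh budget: limits apply per task, not per agent lifetime
        self.counts = {}

    def allow(self, tool_name: str) -> bool:
        count = self.counts.get(tool_name, 0) + 1
        self.counts[tool_name] = count
        limit = self.limits.get(tool_name)
        return limit is None or count <= limit

budget = CallBudget({"search_flights": 2})
print([budget.allow("search_flights") for _ in range(3)])  # [True, True, False]
```

Tools without an entry in `limits` are unbounded, so in production it's safer to also set a global default ceiling.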
Demo Results
We tested with a travel booking agent that searches for flights and hotels:
| Scenario | Tool Calls | Time | Result |
|---|---|---|---|
| Ambiguous Feedback | 14 | 21s | Agent retried organically — "prices may change" caused loops |
| DebounceHook | 12 | 15s | Reduced retries but some variation in parameters |
| Clear SUCCESS States | 2 | 4s | Agent stopped immediately after SUCCESS |
| LimitToolCounts | 6 (2 blocked) | 6s | Hard ceiling enforced — no runaway |
The contrast is dramatic: 14 calls with ambiguous tools vs 2 calls with clear SUCCESS states. That is a 7x difference caused purely by tool feedback design.
When to Use Each Solution
DebounceHook — prevents duplicate calls with identical parameters. Use when tools are idempotent and retrying with the same input is wasteful.
Clear SUCCESS/FAILED states — the simplest solution. Design tools to return explicit terminal states. The agent knows when to stop.
LimitToolCounts — hard ceiling on tool calls per invocation. Use in production to prevent runaway costs regardless of tool design. From the Strands Hooks Cookbook.
All three together — defense in depth. Clear states prevent most loops, debounce catches duplicates, and hard limits guarantee bounded execution.
Try It Yourself
You need Python 3.9+, uv, and an OpenAI API key.
```bash
git clone https://github.com/aws-samples/sample-why-agents-fail
cd sample-why-agents-fail/stop-ai-agents-wasting-tokens/03-reasoning-loops-demo
uv venv && uv pip install -r requirements.txt
export OPENAI_API_KEY="your-key-here"
uv run python test_reasoning_loops.py  # Runs all 4 scenarios
```
Or open test_reasoning_loops.ipynb in Jupyter, JupyterLab, VS Code, or your preferred notebook environment.
Key Takeaways
- Ambiguous tool feedback causes organic loops — "more results may be available" makes agents retry
- 14 calls vs 2 calls — clear SUCCESS states reduce calls by 7x in our demo
- Hooks intercept before execution — `BeforeToolCallEvent.cancel_tool` blocks the call before the tool runs. The `DebounceHook` is ~30 lines of code
- Hard limits are mandatory — every agent needs caps on iterations, time, and spend
- 847 steps at $47/min was documented (Particula, community observation) — unbounded agents burn money without delivering answers
Frequently Asked Questions
Why do AI agents repeat the same tool call?
Agents repeat tool calls when tool responses contain ambiguous feedback such as "more results may be available" or "prices change frequently." The LLM interprets these signals as a reason to retry, expecting different or better results. Without clear terminal states (SUCCESS/FAILED), the agent has no way to know the task is complete.
What is a DebounceHook and how does it prevent reasoning loops?
A DebounceHook tracks recent tool calls in a sliding window. When the same tool is called with identical parameters more than a set threshold (typically 2 times within a window of 3), the hook blocks the call using event.cancel_tool before the tool executes. The LLM receives a "BLOCKED: Duplicate call" message and must try a different approach. In Strands Agents, this is about 30 lines of code using the HookProvider API.
How do clear SUCCESS/FAILED states reduce tool calls?
When a tool returns "SUCCESS: Booking HT79265 confirmed," the LLM recognizes the task is complete and stops calling that tool. Ambiguous responses such as "Found 2 flights, more may be available" lack this signal, causing the agent to retry. In our demo, clear states reduced tool calls from 14 to 2, a 7x improvement.
References
Research
- Language models can overthink — The Decoder, Jan 2025
- How many reasoning steps do AI agents need — Particula (community observation), Jul 2025
- How to Prevent Infinite Loops and Spiraling Costs — CodiesHub (community observation), Dec 2025
Implementation
- Strands Hooks — Lifecycle event interception and tool cancellation
Thanks!

