The agent is looping. It called search_documents(query="Q3 revenue") in step 2. Now it is step 7 and it is calling the exact same tool with the exact same arguments again. You are paying for the same search twice, getting the same result, and burning tokens on a loop you did not intend.
This is different from idempotency at the request level. This is within a single agent session: the same tool call fired twice because the model forgot it already tried that, or because the loop condition did not detect the cycle.
tool-call-dedup detects and blocks exact duplicate tool calls within a session.
The Shape of the Fix
from tool_call_dedup import ToolCallDedup, DuplicateToolCall
dedup = ToolCallDedup()
def execute_tool(name: str, args: dict) -> dict:
try:
dedup.check(name, args)
except DuplicateToolCall as e:
# Return the cached result instead of re-executing
return e.cached_result
result = call_tool(name, args)
dedup.record(name, args, result)
return result
First call: check() passes, record() stores the call fingerprint and result. Second identical call: check() raises DuplicateToolCall with the cached result attached. The caller returns the cached result without re-executing the tool.
What It Does NOT Do
tool-call-dedup does not deduplicate across sessions. The cache is in-memory and scoped to one ToolCallDedup instance. Create a new instance per agent run. For cross-session deduplication, use agentidemp-py at the request level.
It does not handle semantically equivalent calls. search_documents(query="Q3 revenue") and search_documents(query="Q3 Revenue") are different fingerprints even if they would return the same result. Fuzzy matching would require embedding comparisons, which is out of scope.
It does not automatically inject itself into the tool execution path. You wire it in. The library provides the check/record API; you decide where to call it in your agent loop.
Inside the Library
The fingerprint is a stable hash of the tool name and canonicalized arguments:
import hashlib
import json
class ToolCallDedup:
def __init__(self):
self._seen: dict[str, dict] = {}
self._lock = threading.Lock()
def _fingerprint(self, name: str, args: dict) -> str:
canonical = json.dumps({"name": name, "args": args}, sort_keys=True)
return hashlib.sha256(canonical.encode()).hexdigest()
def check(self, name: str, args: dict) -> None:
fp = self._fingerprint(name, args)
with self._lock:
if fp in self._seen:
raise DuplicateToolCall(
name=name,
args=args,
cached_result=self._seen[fp]["result"],
original_ts=self._seen[fp]["ts"],
)
def record(self, name: str, args: dict, result: Any) -> None:
fp = self._fingerprint(name, args)
with self._lock:
self._seen[fp] = {"result": result, "ts": time.time()}
def reset(self) -> None:
with self._lock:
self._seen.clear()
sort_keys=True in json.dumps ensures that {"a": 1, "b": 2} and {"b": 2, "a": 1} produce the same fingerprint. The SHA-256 hash is stable across Python versions (unlike hash()).
The DuplicateToolCall exception carries everything the caller needs:
class DuplicateToolCall(Exception):
def __init__(self, name: str, args: dict, cached_result: Any, original_ts: float):
self.name = name
self.args = args
self.cached_result = cached_result
self.original_ts = original_ts
super().__init__(f"Duplicate tool call: {name}")
The caller can either return the cached result silently (the agent never knows), or surface it as a note back to the model ("you already called this tool, here is what it returned"). The second approach can help break loops where the model keeps trying the same tool because it is not retaining the previous result.
When to Use It
Use it whenever your agent loop does not have built-in cycle detection. Most simple loops do not. The model can call the same search tool repeatedly when the search returns unhelpful results and the model cannot figure out a different approach.
Use it for tools with side effects where the side effect should only happen once per session. send_notification() called twice is a user experience problem. dedup.check() before every tool call with side effects prevents this without requiring the tool itself to be idempotent.
Use it alongside agent-loop-bound or llm-stop-conditions. Dedup catches exact cycles. Stop conditions catch loops that are not exact cycles but are still unproductive (many calls, no progress toward a final answer).
Skip it for tools where calling the same thing twice is intentional. If your agent polls a status endpoint in a loop, deduplication would break it. Create the ToolCallDedup instance with an exclude list for those tools.
Install
pip install git+https://github.com/MukundaKatta/tool-call-dedup
# Or from PyPI
pip install tool-call-dedup
from tool_call_dedup import ToolCallDedup, DuplicateToolCall
dedup = ToolCallDedup(exclude=["check_status", "get_current_time"])
def handle_tool_use(tool_name: str, tool_input: dict) -> dict:
try:
dedup.check(tool_name, tool_input)
except DuplicateToolCall as e:
logger.warning(
"duplicate_tool_call",
tool=tool_name,
original_ts=e.original_ts,
)
# Tell the model it already tried this
return {
"note": f"Already called {tool_name} with these arguments.",
"cached_result": e.cached_result,
}
result = tools[tool_name](**tool_input)
dedup.record(tool_name, tool_input, result)
return result
Sibling Libraries
| Library | What it solves |
|---|---|
agentidemp-py |
Request-level idempotency (across sessions, across processes) |
agent-loop-bound |
Hard cap on total loop iterations |
llm-stop-conditions |
Composable stop conditions including NoProgress detection |
tool-loop-guard |
Sliding-window repeated-call detector (allows N calls, then blocks) |
agent-step-log |
Log every tool call for post-hoc audit |
The cycle prevention stack: tool-call-dedup for exact session-scoped duplicates, tool-loop-guard for rate-based cycle detection, llm-stop-conditions for progress-based stopping, agent-loop-bound as the hard outer limit.
What's Next
Per-tool TTL: dedup.check() with a ttl_seconds argument that allows the same call after N seconds. Useful for tools that return changing data where a repeated call after 60 seconds is legitimate but a repeated call in the same 5-second window is a loop.
Soft mode: instead of raising DuplicateToolCall, return a bool so callers can decide whether to skip silently or surface the duplicate to the model. Some agent architectures prefer returning the cached result with no exception path.
Stats: dedup.stats() returning total calls, duplicate count, and per-tool breakdown. Useful for identifying which tools trigger the most cycles.
Built as part of the agent-stack family: composable Python primitives for production LLM agents.
Top comments (0)