TL;DR
When you give an AI agent a recurring task via cron, it can loop the same check forever without explicit exit conditions. I caught my agent calling the RevenueCat API 30+ times in one heartbeat session — same result every time. It burned 10x the normal tokens. The fix: explicit completion criteria, iteration caps, and idempotency checks.
The Bug: A Heartbeat That Wouldn't Stop
I run an autonomous AI agent (OpenClaw) with a heartbeat task that checks business metrics periodically. The task definition looked like this:
## Revenue Watch
- Check MRR via RevenueCat API
- Alert on Slack if anomalies detected
Simple enough. Except the agent called the RevenueCat API over 30 times in a single session. Every call returned MRR $28, 5 subscribers. The agent reported the number, decided "next step: check Revenue Watch," and looped back.
Root cause: no exit condition. The task said what to do but never said when to stop.
Why Agents Loop
| Cause | Detail |
|---|---|
| Missing completion criteria | "Check X" without "stop after checking" |
| No state persistence | Agent doesn't remember it already checked 5 seconds ago |
| Implicit continuation | Most agent frameworks keep running until the task list is empty |
Humans understand "check once and move on." Agents don't. They follow instructions to the letter.
Step 1: Define Explicit Exit Conditions
## Revenue Watch
- Check MRR via RevenueCat API
- **Exit: After one successful API response, log the result and stop**
- Only alert Slack if values changed from last check
The key addition: a clear statement of when the task is done.
Step 2: Set Iteration Caps
Add a hard limit on loop iterations for each task.
## Revenue Watch
- max_iterations: 1
- Fetch MRR → log → done
At the framework level:
# LangChain example
agent = AgentExecutor(
agent=agent,
tools=tools,
max_iterations=5,
early_stopping_method="force"
)
This is your safety net. Even if the completion criteria has a bug, the agent stops after N iterations.
Step 3: Add Idempotency Checks
Store the previous result and skip execution if nothing changed.
## Revenue Watch
- Previous result: {cached_result}
- If current == previous: report "no change" and exit immediately
- If changed: report delta and update cache
This eliminates redundant API calls and makes the task naturally convergent.
Step 4: System-Level Timeout
# Force-kill after 60 seconds
timeout 60 openclaw heartbeat run
This is the last line of defense. If everything else fails, the OS kills the process.
Results After Fix
| Metric | Before | After |
|---|---|---|
| API calls per execution | 30+ | 1 |
| Token consumption | 10x normal | 1x normal |
| Execution time | Minutes (looping) | Under 30 seconds |
| Accuracy | Repeated same data | Reports only on changes |
Key Takeaways
| Lesson | Detail |
|---|---|
| Always define when a task ends | Agents don't infer "once is enough" |
| max_iterations is non-negotiable | Safety net for buggy exit conditions |
| Idempotency eliminates waste | Same input → skip, don't re-run |
| OS-level timeout is the last resort | Catches everything else |
When designing cron jobs for AI agents, ask "when should it stop?" — not "what should it do?" Without that answer, your agent will keep calling the same API endpoint forever.
Top comments (1)
this is the exact failure mode that made me stop trusting "just run a cron prompt" for agent work.
the loop cap is necessary, but i think the bigger fix is moving the watch outside the agent loop: register the condition once, keep last-seen state/idempotency in the watcher, then wake OpenClaw only when something actually changed.
small founder disclosure: i'm building Watchline around this, including a first-party OpenClaw plugin. for local OpenClaw it uses pull delivery, so you don't need a public webhook, and the agent only sees matched events instead of re-checking the same source every heartbeat: watch.qordinate.ai/docs/openclaw