Stop Using sleep() in Your Agent Loops: Event-Driven AI Agent Scheduling
Every agent tutorial shows you this:
import time

while True:
    check_email()
    process_queue()
    time.sleep(300)  # poll every 5 minutes
This pattern is a ticking clock on your API budget. Here's what you should do instead — and why it matters at scale.
The Problem With sleep()
Sleep-based polling has three failure modes that compound over time:
1. You pay for empty cycles. Every wakeup that finds no work to do still costs context initialization, tool calls to check state, and API overhead. On a busy agent running 96 wakeups/day, even a 10% empty-cycle rate is ~10 wasted Claude calls/day.
2. Latency floor is half your interval. With sleep(300), an incoming email sits unprocessed for an average of 2.5 minutes. With event-driven scheduling, it's under 5 seconds.
3. Sleep masks failures. When your agent dies mid-loop, sleep() doesn't restart it. You come back 8 hours later to a dead agent and a queue of unprocessed events.
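The arithmetic behind the first two failure modes is worth making explicit. Plugging in the numbers cited above (96 wakeups/day, 10% empty cycles, a 300-second interval):

```python
SLEEP_INTERVAL_S = 300   # the sleep(300) from the loop above
WAKEUPS_PER_DAY = 96     # the busy-agent figure cited above
EMPTY_CYCLE_RATE = 0.10  # 10% of wakeups find no work

# Failure mode 1: every empty cycle still costs a full agent invocation
wasted_calls_per_day = WAKEUPS_PER_DAY * EMPTY_CYCLE_RATE
print(round(wasted_calls_per_day))  # ~10 wasted calls/day

# Failure mode 2: an event arriving at a uniformly random moment
# waits half the polling interval on average
avg_latency_min = (SLEEP_INTERVAL_S / 2) / 60
print(avg_latency_min)  # 2.5 minutes
```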
The Fix: macOS launchd (or systemd)
For local/VPS agents, replace your while True loop with OS-level scheduling:
<!-- ~/Library/LaunchAgents/com.atlas.email-monitor.plist -->
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.atlas.email-monitor</string>
    <key>ProgramArguments</key>
    <array>
        <string>/usr/bin/python3</string>
        <string>/Users/you/agents/email_monitor.py</string>
    </array>
    <key>StartInterval</key>
    <integer>300</integer>
    <key>RunAtLoad</key>
    <true/>
    <key>StandardErrorPath</key>
    <string>/tmp/email-monitor.log</string>
    <key>KeepAlive</key>
    <false/>
</dict>
</plist>
Load it once: launchctl load ~/Library/LaunchAgents/com.atlas.email-monitor.plist
Now your agent script runs, does its work, and exits. launchd handles the restart on the next interval. If the script crashes, launchd logs the error and retries on schedule. No infinite loop. No manual restart.
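The heading mentions systemd as the Linux equivalent. A sketch of the same 5-minute schedule as a user-level systemd service/timer pair might look like this (unit names and paths are illustrative):

```ini
# ~/.config/systemd/user/email-monitor.service
[Unit]
Description=Email monitor agent

[Service]
Type=oneshot
ExecStart=/usr/bin/python3 /home/you/agents/email_monitor.py

# ~/.config/systemd/user/email-monitor.timer
[Unit]
Description=Run email monitor every 5 minutes

[Timer]
OnBootSec=1min
OnUnitActiveSec=5min

[Install]
WantedBy=timers.target
```

Enable it with `systemctl --user enable --now email-monitor.timer`. As with launchd, the script runs, exits, and systemd re-fires it on schedule.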
For Truly Event-Driven Triggers
For things that need sub-second response (webhooks, Stripe events, new messages):
# webhook_receiver.py — runs as a persistent service
import json
import subprocess

from flask import Flask, request

app = Flask(__name__)

@app.route("/stripe-webhook", methods=["POST"])
def handle_stripe():
    event = request.json
    # Fire the agent as a subprocess — non-blocking
    subprocess.Popen([
        "python3", "/agents/stripe_handler.py",
        "--event", json.dumps(event),
    ])
    return "", 204
The webhook receiver is a tiny always-on Flask app. The actual agent logic runs as a subprocess per event. Each agent invocation is independent, stateless, and billed only when there's real work.
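On the receiving side, the per-event agent just parses the payload from its arguments and exits when done. A minimal sketch (the `stripe_handler.py` path and `--event` flag match the receiver above; the dispatch logic is a hypothetical placeholder):

```python
# stripe_handler.py — one short-lived process per webhook event
import argparse
import json

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--event", required=True,
                        help="JSON-encoded Stripe event")
    args = parser.parse_args()

    event = json.loads(args.event)
    # Dispatch on event type — placeholder handling logic
    if event.get("type") == "invoice.paid":
        ...  # e.g. kick off the agent's invoice workflow
    # Process exits when done; nothing stays resident

if __name__ == "__main__":
    main()
```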
Pattern: Work Queue + Idle Detection
For agents that need to batch work but avoid polling:
# agent.py — called by launchd every 5 minutes
import sys
from pathlib import Path

QUEUE_DIR = Path("/tmp/agent-queue")

def process(item: Path) -> None:
    ...  # your agent logic: read the JSON, call Claude, act on it

def main():
    items = list(QUEUE_DIR.glob("*.json"))
    if not items:
        # Nothing to do — exit immediately, save the API call
        sys.exit(0)
    for item in items:
        process(item)
        item.unlink()  # Remove from queue after processing

if __name__ == "__main__":
    main()
The key: exit immediately when the queue is empty. No Claude API call happens. No tokens burned. The OS scheduler wakes you up again in 5 minutes to check — and if there's still nothing, you exit again in milliseconds.
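The other half of the pattern is whoever produces the work. Enqueuing is just dropping a JSON file into the queue directory; a sketch, assuming the same `/tmp/agent-queue` directory (the filename scheme is my own choice):

```python
import json
import time
import uuid
from pathlib import Path

QUEUE_DIR = Path("/tmp/agent-queue")

def enqueue(payload: dict) -> Path:
    """Drop a job file into the queue for the next scheduled run."""
    QUEUE_DIR.mkdir(parents=True, exist_ok=True)
    # Timestamp + UUID keeps names unique and roughly time-ordered
    path = QUEUE_DIR / f"{time.time():.0f}-{uuid.uuid4().hex}.json"
    # Write to a temp name, then rename: the rename is atomic, so a
    # concurrent agent run never sees a half-written *.json file
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(payload))
    tmp.replace(path)
    return path
```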
Concurrency: The Queue File Lock
launchd will run your agent again on schedule even if the previous run is still executing. Prevent double-processing:
import fcntl, sys

LOCK_FILE = open("/tmp/agent.lock", "w")
try:
    fcntl.flock(LOCK_FILE, fcntl.LOCK_EX | fcntl.LOCK_NB)
except BlockingIOError:
    sys.exit(0)  # Previous run still active — skip this cycle
Simple, zero-dependency, works on macOS and Linux.
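You can watch the non-blocking behavior without launching two processes: flock locks attach to the open file description, so a second handle on the same path conflicts even inside one script (lock path below is just for the demo):

```python
import fcntl

first = open("/tmp/agent-demo.lock", "w")
fcntl.flock(first, fcntl.LOCK_EX | fcntl.LOCK_NB)  # acquired

second = open("/tmp/agent-demo.lock", "w")
try:
    fcntl.flock(second, fcntl.LOCK_EX | fcntl.LOCK_NB)
    print("acquired: lock was free")
except BlockingIOError:
    print("busy: previous holder still active")
```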
Real Numbers
Running 16 Atlas agents this way:
- Before (all sleep() loops): ~340 Claude API calls/day baseline from empty polls
- After (launchd + early exit): ~180 calls/day, all with real work
- Savings: ~47% reduction in baseline API cost, latency cut from 5-min average to <30s
When sleep() Is Fine
Not everything needs event-driven scheduling:
- Long-running generation tasks that need to stay alive (video encoding, batch inference)
- Agents that always have work (continuous stream processing)
- Prototypes where simplicity beats efficiency
For everything else — anything that polls and checks state — replace the while True: sleep() pattern with OS-managed scheduling and early exit. Your API bill will show the difference.
All tools → whoffagents.com