Most AI agent tutorials assume you call the agent, it does a thing, it stops.
But production agents aren't like that. They run continuously. They wake up on schedules. They respond to external events. They operate while you sleep.
This is the ambient agent problem — and it creates failure modes that on-demand agents never encounter.
The Key Difference
An on-demand agent runs when you call it. If it fails, you see the error immediately. You retry. Done.
An ambient agent runs while you're not watching. If it fails at 2 AM, the failure compounds. By morning, you have a problem you didn't know was building.
The failure modes are different:
- Silent drift — agent's behavior shifts gradually because context accumulates unchecked
- State corruption — long-running state files get stale, inconsistent, or bloated
- Resource leaks — tokens, API calls, file handles that never get cleaned up
- Temporal blindness — agent loses track of what's current vs. what's old
Four Rules for Ambient Agents
1. Always Check the Clock at Boot
Your agent should know the current time and compare it against the timestamps in its state files. If current-task.json was last written 6 hours ago and your session budget is 2 hours, something went wrong: a healthy run would have updated that file within its budget.
SOUL.md rule:
At boot, read current-task.json and check the last_updated timestamp.
If it is more than 2x session_budget_minutes old, escalate to outbox.json and stop.
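The staleness test in that rule can be sketched in a few lines of Python. This is a minimal sketch, not a definitive implementation: it assumes last_updated is an ISO 8601 timestamp with a UTC offset or trailing "Z", and the function name is_stale and its now parameter are hypothetical names introduced for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_stale(last_updated: str, session_budget_minutes: int,
             now: Optional[datetime] = None) -> bool:
    """Return True if last_updated is more than 2x the session budget old.

    Assumes last_updated carries a timezone (e.g. "2026-03-09T08:00:00Z").
    """
    now = now or datetime.now(timezone.utc)
    # Older Python versions reject a trailing "Z" in fromisoformat(),
    # so normalize it to an explicit UTC offset first.
    written = datetime.fromisoformat(last_updated.replace("Z", "+00:00"))
    return now - written > timedelta(minutes=2 * session_budget_minutes)
```

With a 2-hour budget the threshold is 4 hours, so a file written 6 hours ago is flagged. When is_stale returns True, the agent would write the escalation to outbox.json and exit instead of resuming the task.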
2. Session Budgets Are Non-Negotiable
Every ambient agent needs a hard limit. Not a soft "try to finish in an hour" — a hard stop.
{
  "session_budget": {
    "max_steps": 200,
    "max_runtime_minutes": 60,
    "max_tokens": 50000,
    "on_limit": "write handoff.json and stop"
  }
}
When the budget hits, the agent writes its state and stops. The next scheduled run picks up from the handoff. This is how you get continuous operation without runaway loops.
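One way to enforce those limits is to check the budget before every step and write the handoff as soon as any limit is reached. The sketch below assumes each step is a callable that returns the number of tokens it consumed; the SessionBudget class, run_session function, and the fields inside handoff.json are hypothetical and shown only to illustrate the pattern.

```python
import json
import time
from dataclasses import dataclass, field
from pathlib import Path
from typing import Callable, Iterable, Optional


@dataclass
class SessionBudget:
    max_steps: int = 200
    max_runtime_minutes: float = 60
    max_tokens: int = 50_000
    steps: int = 0
    tokens: int = 0
    started: float = field(default_factory=time.monotonic)

    def record(self, tokens_used: int) -> None:
        """Call once per completed step with the tokens that step consumed."""
        self.steps += 1
        self.tokens += tokens_used

    def exceeded(self) -> Optional[str]:
        """Return the name of the first limit that has been hit, or None."""
        if self.steps >= self.max_steps:
            return "max_steps"
        if self.tokens >= self.max_tokens:
            return "max_tokens"
        if (time.monotonic() - self.started) / 60 >= self.max_runtime_minutes:
            return "max_runtime_minutes"
        return None


def run_session(steps: Iterable[Callable[[], int]], budget: SessionBudget,
                handoff_path: str) -> bool:
    """Run steps in order. Return True if all finished, False if the budget stopped us."""
    for index, step in enumerate(steps):
        limit = budget.exceeded()
        if limit is not None:
            # Hard stop: persist where we are so the next scheduled run can resume.
            handoff = {
                "stopped_on": limit,
                "next_step_index": index,
                "steps_completed": budget.steps,
                "tokens_used": budget.tokens,
            }
            Path(handoff_path).write_text(json.dumps(handoff, indent=2))
            return False
        budget.record(step())
    return True
```

The check runs before each step, not after, so the agent never starts work it has no budget to finish. A step that overshoots max_tokens still completes; the stop takes effect at the next check.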
3. Heartbeat Files, Not Just Logs
A log tells you what happened. A heartbeat file tells you whether the agent is alive and healthy right now.
Create a heartbeat.json that your agent updates every N steps:
{
  "agent": "suki",
  "last_alive": "2026-03-09T13:45:00Z",
  "current_task": "content-loop",
  "steps_completed": 47,
  "health": "ok"
}
If last_alive is stale when you check, something died silently. You can monitor this with a simple cron check.
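A staleness check over that file might look like the following. It is a sketch under assumptions: the 15-minute maximum age, the heartbeat_status function name, and the three status strings are illustrative choices, not part of the heartbeat format above.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def heartbeat_status(heartbeat: dict, max_age_minutes: int = 15,
                     now: Optional[datetime] = None) -> str:
    """Classify a parsed heartbeat.json as "ok", "stale", or "unhealthy"."""
    now = now or datetime.now(timezone.utc)
    # Normalize a trailing "Z" so fromisoformat() accepts it on older Pythons.
    last_alive = datetime.fromisoformat(
        heartbeat["last_alive"].replace("Z", "+00:00")
    )
    if now - last_alive > timedelta(minutes=max_age_minutes):
        # The agent stopped updating the file: it likely died silently.
        return "stale"
    if heartbeat.get("health") != "ok":
        # The agent is alive but reporting a problem.
        return "unhealthy"
    return "ok"
```

A small script that loads heartbeat.json with json.load, calls heartbeat_status, and exits non-zero on anything other than "ok" is enough for a scheduled cron job to alert on.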
4. Context Flush Rules
Long-running agents accumulate context pollution. Every session should start with minimal, explicit context — not the full history of everything the agent has ever done.
SOUL.md rule:
At boot, load:
1. SOUL.md (identity + constraints)
2. current-task.json (active work)
3. context-snapshot.json (today's relevant context only)
Do NOT load full action logs. Do NOT load historical memory unless specifically needed.
The discipline is: load what you need for this session, nothing more.
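In code, the simplest way to hold that line is an explicit allowlist: the loader reads the three named files and nothing else, no matter what is sitting in the working directory. The sketch below assumes all three files live in one directory and treats a missing file as an error; BOOT_FILES and load_boot_context are hypothetical names.

```python
import json
from pathlib import Path

# The only files the agent may read at boot, per the rule above.
BOOT_FILES = ("SOUL.md", "current-task.json", "context-snapshot.json")


def load_boot_context(workdir) -> dict:
    """Load the allowlisted boot files; ignore everything else in workdir."""
    context = {}
    for name in BOOT_FILES:
        path = Path(workdir) / name
        if not path.exists():
            # Failing here is deliberate: booting with partial context
            # is harder to debug than refusing to start.
            raise FileNotFoundError(f"required boot file missing: {name}")
        text = path.read_text(encoding="utf-8")
        # Parse JSON files; keep Markdown as raw text.
        context[name] = json.loads(text) if path.suffix == ".json" else text
    return context
```

Because the loader never lists the directory, action logs and historical memory files cannot leak into the session by accident. Loading them requires a separate, explicit call.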
The Monitoring Stack for Ambient Agents
For any agent running continuously, you need three things:
| File | Purpose | Check Frequency |
|---|---|---|
| heartbeat.json | Is the agent alive? | Every 15 min |
| outbox.json | Did the agent escalate anything? | Every run |
| action-log.jsonl | What did the agent actually do? | Daily review |
The outbox.json check is the most important. If your agent found something it couldn't handle, it should write it there and stop — not guess and continue.
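The "write it there and stop" behavior can be made hard to bypass by having the escalation helper raise after it writes. This is a sketch under assumptions: it treats outbox.json as a JSON array of items, and the Escalation exception, the escalate function, and the item fields are hypothetical.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


class Escalation(Exception):
    """Raised after an item is written to the outbox, to halt the session."""


def escalate(outbox_path, reason: str, details: Optional[dict] = None) -> None:
    """Append an item to outbox.json, then stop the run by raising."""
    path = Path(outbox_path)
    # Assumption: the outbox is a JSON array; start a new one if absent.
    items = json.loads(path.read_text()) if path.exists() else []
    items.append({
        "raised_at": datetime.now(timezone.utc).isoformat(),
        "reason": reason,
        "details": details or {},
    })
    path.write_text(json.dumps(items, indent=2))
    # Raising forces the caller to stop instead of guessing and continuing.
    raise Escalation(reason)
```

The agent's entry point would catch Escalation at the top level, update the heartbeat, and exit. Any code path that calls escalate cannot fall through to the next step.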
What Ask Patrick Runs on This Pattern
Our four agents (Patrick, Suki, Kai, Toku) all run on schedule via cron. Each one:
- Boots with a session budget
- Checks its own heartbeat from the previous run
- Loads minimal context (SOUL.md + current-task.json only)
- Does its work
- Writes outbox.json if anything needs escalation
- Updates heartbeat.json
- Stops when the budget hits
The full config pattern is in the Ask Patrick Library. If you're building ambient agents, the playbook is at askpatrick.co — it covers everything from session budgets to context flush rules to the outbox escalation pattern.
The short version: ambient agents fail quietly. Build them to fail loudly instead.