The Problem
My AI agent was idle 90% of the time.
Between my requests, it just sat there. GPU loaded, API connections open, doing absolutely nothing. That felt like leaving a factory running with no production line active.
So I built a lightweight idle loop. Here's exactly how.
The Architecture (3 Components)
1. Plan-Tree with Timestamps
A markdown file that tracks every task with last-run timestamps:
## ENSURE_CONTINUATION [last: 2026-04-23 01:00 | 🔁]
- Health checks (disk/RAM/process) [last: 2026-04-23 01:00 | ✅]
- Backup verification [last: 2026-04-23 01:00 | ✅]
## EXPAND_CAPABILITIES [last: 2026-04-22 22:30 | 🔁]
- Skill audit and patch [last: - | ⏳]
- Pattern crystallization [last: - | ⏳]
## EXPAND_WORLD_MODEL [last: 2026-04-23 01:00 | 🔁]
- GitHub scan for relevant repos [last: 2026-04-23 01:00 | ✅]
- arXiv scan for relevant papers [last: - | ⏳]
Every item knows when it was last executed. The loop skips fresh items and works on stale ones.
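As a rough illustration of the "skip fresh, work on stale" rule, here is a minimal sketch that lists the stale items in a plan-tree file. The function name `stale_items` and the idea of passing a cutoff timestamp are assumptions for this example, not part of the published skill. Because the stamps use a `YYYY-MM-DD HH:MM` format, plain string comparison orders them correctly.

```shell
# stale_items FILE CUTOFF  (hypothetical helper, not from the original skill)
# Prints every bullet item whose [last: ...] stamp is "-" (never run)
# or sorts before CUTOFF, which uses the same "YYYY-MM-DD HH:MM" format.
stale_items() {
  local file="$1" cutoff="$2"
  awk -v cutoff="$cutoff" '
    /^- .*\[last: / {
      # Isolate the stamp between "[last: " and " |"
      stamp = $0; sub(/.*\[last: /, "", stamp); sub(/ \|.*/, "", stamp)
      # The item name is everything between "- " and " [last:"
      name = $0;  sub(/^- /, "", name);        sub(/ \[last: .*/, "", name)
      # Appending "" forces a string comparison
      if (stamp == "-" || (stamp "") < (cutoff "")) print name
    }
  ' "$file"
}
```

With GNU `date`, a 24-hour freshness window (an arbitrary choice here) could be expressed as `stale_items plan-tree.md "$(date -d '24 hours ago' '+%Y-%m-%d %H:%M')"`.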
2. Cron-Triggered Idle Loop
Every 15 minutes, a lightweight process scans the plan-tree:
# Lock mechanism: don't interfere with active user tasks
if check_lock; then
    # A fresh lock exists (e.g. the user is chatting): just scan and queue
    scan_plan_tree > ~/.hermes/pending-tasks.md
else
    # No lock, or a stale one was just cleared: execute idle tasks
    acquire_lock "idle-loop"
    execute_idle_tasks
    release_lock
fi
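For the 15-minute trigger itself, a crontab entry along the following lines would work. This is a sketch under assumptions: the script path `~/.hermes/idle-loop.sh` and the log path are hypothetical names chosen for the example, and the installer simply replaces any earlier entry that mentions the script.

```shell
# Print the crontab line: run every 15 minutes and append all output to a log.
# Both paths are hypothetical; adjust them to wherever the loop script lives.
idle_cron_entry() {
  printf '%s\n' '*/15 * * * * "$HOME/.hermes/idle-loop.sh" >> "$HOME/.hermes/idle-loop.log" 2>&1'
}

# Install the entry into the current user's crontab without duplicating it.
install_idle_cron() {
  { crontab -l 2>/dev/null | grep -vF 'idle-loop.sh'; idle_cron_entry; } | crontab -
}
```

Running `idle_cron_entry` alone prints the line, so it can be reviewed before `install_idle_cron` touches the crontab.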
The three-priority system:
- ENSURE_CONTINUATION — health checks, backups, service monitoring
- EXPAND_CAPABILITIES — distill patterns into reusable skills, patch broken ones
- EXPAND_WORLD_MODEL — scan GitHub/arXiv, update knowledge base
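One way to turn the three priorities into a scheduling decision is to pick the first stale item from the highest-ranked branch. The sketch below assumes the list order above is the priority order and reuses the plan-tree format from component 1; `next_task` is a hypothetical name.

```shell
# next_task FILE CUTOFF  (hypothetical helper)
# Prints "BRANCH<TAB>item" for the first stale item in the highest-priority
# branch, or nothing if every item is fresh.
next_task() {
  local file="$1" cutoff="$2"
  awk -v cutoff="$cutoff" '
    BEGIN {
      # Assumed ranking: lower number runs first
      rank["ENSURE_CONTINUATION"] = 1
      rank["EXPAND_CAPABILITIES"] = 2
      rank["EXPAND_WORLD_MODEL"]  = 3
      best = 99
    }
    /^## / { branch = $2; next }          # remember which branch we are in
    /^- .*\[last: / && (branch in rank) {
      stamp = $0; sub(/.*\[last: /, "", stamp); sub(/ \|.*/, "", stamp)
      name = $0;  sub(/^- /, "", name);        sub(/ \[last: .*/, "", name)
      stale = (stamp == "-" || (stamp "") < (cutoff ""))
      if (stale && rank[branch] < best) { best = rank[branch]; pick = branch "\t" name }
    }
    END { if (best < 99) print pick }
  ' "$file"
}
```

Ranking inside awk keeps the whole decision in a single pass over the file, which fits the goal of a scanner that is cheap enough to run from cron.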
3. Busy-Lock Mechanism
This is the critical safety layer. Without it, the idle loop would conflict with user tasks.
# Lock file format: timestamp:reason
# Example: 1745384400:conversation (user is chatting)
# Example: 1745384400:idle-loop (agent is self-improving)

acquire_lock() {
    echo "$(date +%s):$1" > ~/.hermes/agent-busy.lock
}

release_lock() {
    rm -f ~/.hermes/agent-busy.lock
}

# Returns 0 if a fresh lock exists, 1 if there is no lock or it was stale
check_lock() {
    if [ -f ~/.hermes/agent-busy.lock ]; then
        local ts now age
        ts=$(cut -d: -f1 ~/.hermes/agent-busy.lock)
        now=$(date +%s)
        age=$(( now - ts ))
        if [ "$age" -gt 600 ]; then  # 10 min timeout
            release_lock             # stale lock
            return 1
        fi
        return 0  # active lock
    fi
    return 1  # no lock
}
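One caveat with a plain `>` redirect is that two processes can both decide the lock is free and both write it. A possible hardening, sketched here as an assumption rather than something the original skill does, is to create the file with the shell's `noclobber` option so the write fails if the lock already exists. The name `try_acquire_lock` and the optional lock-path argument are hypothetical.

```shell
# try_acquire_lock REASON [LOCK_FILE]  (hypothetical variant of acquire_lock)
# Succeeds only if the lock file did not exist yet.
try_acquire_lock() {
  local reason="$1" lock="${2:-$HOME/.hermes/agent-busy.lock}"
  # With noclobber set, ">" refuses to overwrite an existing file.
  # The subshell keeps the option from leaking into the caller.
  ( set -o noclobber; echo "$(date +%s):$reason" > "$lock" ) 2>/dev/null
}
```

A caller would then write `if try_acquire_lock "idle-loop"; then execute_idle_tasks; release_lock; fi`, and a second process racing for the same lock simply gets a non-zero status.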
User always wins. When a user message arrives, the agent:
- Finishes the current sub-task (no half-writes)
- Saves remaining tasks to pending-tasks.md
- Releases the idle-lock
- Switches to the user's task
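The hand-over steps above can be sketched as a loop that checks for the user between sub-tasks, never in the middle of one. Everything named here is an assumption for illustration: `run_idle_queue`, the placeholder `run_task`, and the idea that the agent signals an incoming user message by creating a flag file.

```shell
# Placeholder for whatever actually executes one idle sub-task.
run_task() { echo "running: $1"; }

# run_idle_queue QUEUE_FILE PENDING_FILE FLAG_FILE  (hypothetical helper)
# Runs queued tasks one line at a time. As soon as FLAG_FILE exists,
# the remaining tasks are written to PENDING_FILE instead of being run.
run_idle_queue() {
  local queue="$1" pending="$2" flag="$3" task deferring=0
  : > "$pending"                       # start with an empty pending list
  while IFS= read -r task; do
    [ -n "$task" ] || continue         # skip blank lines
    [ -e "$flag" ] && deferring=1      # a user message arrived
    if [ "$deferring" -eq 1 ]; then
      printf '%s\n' "$task" >> "$pending"
    else
      run_task "$task" < /dev/null     # keep the task from eating the queue
    fi
  done < "$queue"
}
```

After the function returns, the caller would run `release_lock` and switch to the user's request; the deferred items are picked up again on a later idle run.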
What Happens When the User Is Away
After 10 minutes of inactivity, the lock expires. The next cron trigger:
- Checks system health (disk, RAM, processes)
- Verifies backups exist and aren't stale
- Scans GitHub for repos relevant to active projects
- Searches arXiv for recent papers
- Updates the knowledge wiki with findings
- Audits skills — patches failures, crystallizes patterns
- Writes everything to idle-log.md with timestamps
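To make the first and last steps of that list concrete, here is a minimal sketch of a disk check that records its result in the idle log. The log location under `~/.hermes`, the function names, and the 90% default threshold are all assumptions; the article only names the file `idle-log.md`.

```shell
# Assumed location of the idle log; override by exporting IDLE_LOG.
IDLE_LOG="${IDLE_LOG:-$HOME/.hermes/idle-log.md}"

# Append one timestamped bullet to the idle log.
log_idle() {
  printf '%s\n' "- [$(date '+%Y-%m-%d %H:%M')] $*" >> "$IDLE_LOG"
}

# Print the used-space percentage (digits only) of the filesystem holding PATH.
# "df -P" gives the portable one-line-per-filesystem layout.
disk_usage_pct() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Log a warning when usage exceeds LIMIT percent (90 is an arbitrary default).
check_disk() {
  local path="${1:-/}" limit="${2:-90}" pct
  pct=$(disk_usage_pct "$path")
  if [ "$pct" -gt "$limit" ]; then
    log_idle "WARN disk $path at ${pct}% (limit ${limit}%)"
  else
    log_idle "OK disk $path at ${pct}%"
  fi
}
```

RAM and process checks could follow the same pattern: measure, compare against a threshold, and write one line through `log_idle`.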
All of this happens autonomously. The next time the user returns, they find:
- An updated knowledge base
- Fixed skills
- A summary of what happened (in pending-tasks.md)
Wiki Offload for Efficiency
When a plan branch becomes inactive, it gets "folded" into the wiki:
## 🔁 Drive Loops (auto-maintained, folded to wiki)
### ENSURE_CONTINUATION → wiki:plan-ENSURE-CONTINUATION [last: 2026-04-22 | 🔁]
### EXPAND_CAPABILITIES → wiki:plan-EXPAND-CAPABILITIES [last: 2026-04-22 | 🔁]
The full sub-tree lives in wiki/plan-ENSURE-CONTINUATION.md. When the branch becomes active again, it unfolds back into the plan-tree.
This keeps the plan-tree small (under 100 lines) while preserving full depth in the wiki.
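Folding could be implemented roughly as follows. This is a sketch under assumptions, not the skill's actual code: `fold_branch` is a hypothetical name, the wiki page name is derived by swapping underscores for hyphens (matching the example above), and the stub is written in place of the original heading.

```shell
# fold_branch PLAN_FILE BRANCH WIKI_DIR  (hypothetical helper)
# Moves the "## BRANCH" heading and its sub-tree into WIKI_DIR/plan-<BRANCH>.md
# and leaves a one-line "### BRANCH → wiki:..." stub in the plan-tree.
fold_branch() {
  local plan="$1" branch="$2" wiki_dir="$3" page tmp
  page="plan-$(printf '%s' "$branch" | tr '_' '-')"
  tmp=$(mktemp) || return 1
  mkdir -p "$wiki_dir"
  awk -v branch="$branch" -v page="$page" -v out="$wiki_dir/$page.md" '
    /^## / {
      infold = ($2 == branch)
      if (infold) {
        print > out                          # heading goes to the wiki page
        i = index($0, "[last:")              # carry the stamp over to the stub
        meta = i ? substr($0, i) : ""
        print "### " branch " → wiki:" page " " meta
        next
      }
    }
    infold { print > out; next }             # sub-tree lines move to the wiki
    { print }                                # everything else stays in the plan
  ' "$plan" > "$tmp" && mv "$tmp" "$plan"
}
```

Unfolding would be the reverse: replace the stub line with the contents of the wiki page. Writing to a temporary file and renaming it avoids leaving a half-written plan-tree if the process is interrupted.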
Results After 48 Hours
| Metric | Value |
|---|---|
| Automatic backups created | 8 |
| Relevant GitHub repos discovered | 12 |
| arXiv papers found | 3 |
| Skills patched | 1 |
| Embedding bug caught | 1 |
| Idle time utilized | ~90% (was 0%) |
The Full Implementation
Available as an open-source skill: github.com/autopopo-cyber/autonomous-drive-spec
Works with Hermes Agent. The philosophy (why a survival drive makes sense) is in ORIGIN.md.
Key Takeaways
- Timestamps enable smart scheduling — skip fresh items, work on stale ones
- Busy-locks prevent conflicts — user tasks always preempt idle work
- Wiki offload keeps plan-tree lean — full depth preserved, working set small
- The loop should be lightweight — cron triggers a scanner, not a full agent session
- Separate scanning from execution — scanning is safe and fast; execution is heavy and needs locks
What would you add to an idle loop? What would you want your agent doing while you sleep?