autodrive-cyber

Posted on Apr 22

How I Built an Idle Loop That Keeps My AI Agent Working Between Tasks

#agents #ai #automation #tutorial

The Problem

My AI agent was idle 90% of the time.

Between my requests, it just sat there. GPU loaded, API connections open, doing absolutely nothing. That felt like leaving a factory running with no production line active.

So I built a lightweight idle loop. Here's exactly how.

The Architecture (3 Components)

1. Plan-Tree with Timestamps

A markdown file that tracks every task with last-run timestamps:

## ENSURE_CONTINUATION [last: 2026-04-23 01:00 | 🔁]
- Health checks (disk/RAM/process) [last: 2026-04-23 01:00 | ✅]
- Backup verification [last: 2026-04-23 01:00 | ✅]

## EXPAND_CAPABILITIES [last: 2026-04-22 22:30 | 🔁]
- Skill audit and patch [last: - | ⏳]
- Pattern crystallization [last: - | ⏳]

## EXPAND_WORLD_MODEL [last: 2026-04-23 01:00 | 🔁]
- GitHub scan for relevant repos [last: 2026-04-23 01:00 | ✅]
- arXiv scan for relevant papers [last: - | ⏳]

Every item knows when it was last executed. The loop skips fresh items and works on stale ones.
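
A minimal sketch of the "skip fresh, work on stale" check, assuming the `[last: YYYY-MM-DD HH:MM | status]` format shown above and GNU `date` (for `date -d`). The function name `stale_items` and its arguments are hypothetical, not taken from the published skill.

```shell
# Hypothetical helper: print the plan-tree items that are due for a run.
# Usage: stale_items PLAN_FILE MAX_AGE_SECONDS [NOW_EPOCH]
# Assumes GNU date; BSD/macOS date does not accept -d with this meaning.
stale_items() {
    local plan_file="$1" max_age="$2" now="${3:-$(date +%s)}"
    local line name last ts
    # Only "- " bullet lines are tasks; "##" branch headings are skipped.
    grep '^- .*\[last: ' "$plan_file" | while IFS= read -r line; do
        name=${line#- }              # drop the bullet marker
        name=${name%% \[last:*}      # drop the trailing "[last: ...]" part
        last=${line##*\[last: }      # keep what follows "[last: "
        last=${last%% |*}            # drop " | status]"
        if [ "$last" = "-" ]; then
            printf '%s\n' "$name"    # never run, so always stale
            continue
        fi
        ts=$(date -d "$last" +%s 2>/dev/null) || continue  # skip unparsable timestamps
        if [ $(( now - ts )) -gt "$max_age" ]; then
            printf '%s\n' "$name"
        fi
    done
}
```

For example, `stale_items plan-tree.md 3600` would list every item that has never run or last ran more than an hour ago. The one-hour threshold is only an illustration; the post does not state the threshold the real loop uses.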

2. Cron-Triggered Idle Loop

Every 15 minutes, a lightweight process scans the plan-tree:

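# Lock mechanism: don't interfere with active user tasks.
# check_lock (defined in section 3) also clears locks older than 10 minutes,
# which a plain [ -f ... ] test would never do.
if check_lock; then
    # A fresh lock exists (user is chatting, or another run is active): just scan and queue
    scan_plan_tree > ~/.hermes/pending-tasks.md
else
    # No active lock: execute idle tasks
    acquire_lock "idle-loop"
    trap release_lock EXIT   # release the lock even if a task fails
    execute_idle_tasks
fi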

The three-priority system:

ENSURE_CONTINUATION — health checks, backups, service monitoring
EXPAND_CAPABILITIES — distill patterns into reusable skills, patch broken ones
EXPAND_WORLD_MODEL — scan GitHub/arXiv, update knowledge base
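
One way to turn the three priorities into an execution order is sketched below. It assumes the order listed above is also the order in which branches should run, which the post implies but does not state. The function names and the tab-separated `BRANCH<TAB>task` input format are hypothetical.

```shell
# Hypothetical mapping from branch name to priority (lower runs first).
branch_priority() {
    case "$1" in
        ENSURE_CONTINUATION) echo 1 ;;
        EXPAND_CAPABILITIES) echo 2 ;;
        EXPAND_WORLD_MODEL)  echo 3 ;;
        *)                   echo 9 ;;  # unknown branches run last
    esac
}

# Read "BRANCH<TAB>task" lines on stdin and print them in priority order.
# The sort is stable, so tasks within one branch keep their original order.
sort_by_priority() {
    local tab branch task
    tab=$(printf '\t')
    while IFS="$tab" read -r branch task; do
        printf '%s\t%s\t%s\n' "$(branch_priority "$branch")" "$branch" "$task"
    done | sort -s -t "$tab" -n -k1,1 | cut -f2-
}
```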

3. Busy-Lock Mechanism

This is the critical safety layer. Without it, the idle loop would conflict with user tasks.

# Lock file format: timestamp:reason
# Example: 1745384400:conversation  (user is chatting)
# Example: 1745384400:idle-loop     (agent is self-improving)

acquire_lock() {
    echo "$(date +%s):$1" > ~/.hermes/agent-busy.lock
}

release_lock() {
    rm -f ~/.hermes/agent-busy.lock
}

check_lock() {
    if [ -f ~/.hermes/agent-busy.lock ]; then
        local ts now age
        ts=$(cut -d: -f1 ~/.hermes/agent-busy.lock)
        now=$(date +%s)
        age=$(( now - ${ts:-0} ))    # an empty or unreadable lock counts as stale
        if [ "$age" -gt 600 ]; then  # 10 min timeout
            release_lock  # stale lock
            return 1
        fi
        return 0  # active lock
    fi
    return 1  # no lock
}

User always wins. When a user message arrives, the agent:

Finishes the current sub-task (no half-writes)
Saves remaining tasks to pending-tasks.md
Releases the idle-lock
Switches to the user's task
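
The hand-over steps above can be sketched as a loop that only checks for the user between sub-tasks. The post does not say how an incoming user message is signalled, so the flag file `user-waiting` is an assumption, as are the function and variable names.

```shell
# Hypothetical paths; the user-waiting flag file is an assumption of this sketch.
PENDING_FILE="${PENDING_FILE:-$HOME/.hermes/pending-tasks.md}"
USER_FLAG="${USER_FLAG:-$HOME/.hermes/user-waiting}"

# Read task names (one per line) on stdin and run each with the command
# given as arguments. The flag is checked only between sub-tasks, so a
# running sub-task always finishes. Once the flag appears, every remaining
# task is appended to the pending file instead of being run.
# Returns 0 if all tasks ran, 1 if the loop was preempted.
run_until_preempted() {
    local task preempted=0
    while IFS= read -r task; do
        if [ "$preempted" -eq 1 ] || [ -e "$USER_FLAG" ]; then
            preempted=1
            printf '%s\n' "- $task" >> "$PENDING_FILE"
            continue
        fi
        "$@" "$task" </dev/null   # keep the sub-task from reading the task list
    done
    return "$preempted"
}
```

The caller would release the idle lock right after this function returns, whichever way it ends, so the user's task can take over.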

What Happens When the User Is Away

Once the lock's timestamp is more than 10 minutes old, check_lock treats it as stale and removes it. The next cron trigger then:

Checks system health (disk, RAM, processes)
Verifies backups exist and aren't stale
Scans GitHub for repos relevant to active projects
Searches arXiv for recent papers
Updates the knowledge wiki with findings
Audits skills — patches failures, crystallizes patterns
Writes everything to idle-log.md with timestamps

All of this happens autonomously. The next time the user returns, they find:

An updated knowledge base
Fixed skills
A summary of what happened (in idle-log.md), plus any still-queued tasks (in pending-tasks.md)

Wiki Offload for Efficiency

When a plan branch becomes inactive, it gets "folded" into the wiki:

## 🔁 Drive Loops (auto-maintained, folded to wiki)
### ENSURE_CONTINUATION → wiki:plan-ENSURE-CONTINUATION [last: 2026-04-22 | 🔁]
### EXPAND_CAPABILITIES → wiki:plan-EXPAND-CAPABILITIES [last: 2026-04-22 | 🔁]

The full sub-tree lives in wiki/plan-ENSURE-CONTINUATION.md. When the branch becomes active again, it unfolds back into the plan-tree.

This keeps the plan-tree small (under 100 lines) while preserving full depth in the wiki.
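
A rough sketch of the folding step, assuming each branch starts at a `## NAME [last: ...]` heading and runs until the next `## ` heading. The function name `fold_branch` is hypothetical, and the pointer line simply reuses the heading's `[last: ...]` part verbatim, which may differ from what the real skill writes.

```shell
# Hypothetical helper: move one branch out of the plan-tree into the wiki.
# Usage: fold_branch PLAN_FILE BRANCH_NAME WIKI_DIR
fold_branch() {
    local plan="$1" branch="$2" wiki_dir="$3"
    local page tmp
    # ENSURE_CONTINUATION -> plan-ENSURE-CONTINUATION, as in the example above
    page="plan-$(printf '%s' "$branch" | tr '_' '-')"
    tmp=$(mktemp) || return 1
    awk -v branch="$branch" -v page="$page" -v out="$wiki_dir/$page.md" '
        /^## / {
            folding = ($2 == branch)
            if (folding) {
                print > out                         # heading moves to the wiki page
                meta = $0; sub(/^[^[]*/, "", meta)  # keep only the "[last: ...]" part
                print "### " branch " → wiki:" page " " meta
                next
            }
        }
        folding { print > out; next }               # sub-tree lines move to the wiki page
        { print }                                   # everything else stays in the plan-tree
    ' "$plan" > "$tmp" && mv "$tmp" "$plan"
}
```

Unfolding would be the reverse: replace the `###` pointer line with the contents of the wiki page.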

Results After 48 Hours

Metric	Value
Automatic backups created	8
Relevant GitHub repos discovered	12
arXiv papers found	3
Skills patched	1
Embedding bug caught	1
Idle time utilized	~90% (was 0%)

The Full Implementation

Available as an open-source skill: github.com/autopopo-cyber/autonomous-drive-spec

Works with Hermes Agent. The philosophy (why a survival drive makes sense) is in ORIGIN.md.

Key Takeaways

Timestamps enable smart scheduling — skip fresh items, work on stale ones
Busy-locks prevent conflicts — user tasks always preempt idle work
Wiki offload keeps plan-tree lean — full depth preserved, working set small
The loop should be lightweight — cron triggers a scanner, not a full agent session
Separate scanning from execution — scanning is safe and fast; execution is heavy and needs locks

What would you add to an idle loop? What would you want your agent doing while you sleep?
