DEV Community

autodrive-cyber

How I Built an Idle Loop That Keeps My AI Agent Working Between Tasks

The Problem

My AI agent was idle 90% of the time.

Between my requests, it just sat there. GPU loaded, API connections open, doing absolutely nothing. That felt like leaving a factory running with no production line active.

So I built a lightweight idle loop. Here's exactly how.

The Architecture (3 Components)

1. Plan-Tree with Timestamps

A Markdown file that tracks every task with a last-run timestamp:

## ENSURE_CONTINUATION [last: 2026-04-23 01:00 | 🔁]
- Health checks (disk/RAM/process) [last: 2026-04-23 01:00 | ✅]
- Backup verification [last: 2026-04-23 01:00 | ✅]

## EXPAND_CAPABILITIES [last: 2026-04-22 22:30 | 🔁]
- Skill audit and patch [last: - | ⏳]
- Pattern crystallization [last: - | ⏳]

## EXPAND_WORLD_MODEL [last: 2026-04-23 01:00 | 🔁]
- GitHub scan for relevant repos [last: 2026-04-23 01:00 | ✅]
- arXiv scan for relevant papers [last: - | ⏳]

Every item knows when it was last executed. The loop skips fresh items and works on stale ones.
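
The "skip fresh, work on stale" rule can be sketched in a few lines of shell and awk. This is a minimal illustration, not the code from the repo: the function name `stale_items` and the idea of passing the cutoff as an argument are assumptions. It relies on the fact that `YYYY-MM-DD HH:MM` timestamps sort correctly as plain strings.

```shell
#!/bin/sh
# Print the titles of plan-tree items that are stale: never run ("-")
# or last run before the cutoff.
# Usage: stale_items PLAN_FILE "YYYY-MM-DD HH:MM"
# With GNU date, a cutoff could be built like this (threshold is an assumption):
#   stale_items plan-tree.md "$(date -d '6 hours ago' '+%Y-%m-%d %H:%M')"
stale_items() {
    awk -v cutoff="$2" '
        /^- .*\[last: / {
            p     = index($0, " [last: ")      # start of the metadata bracket
            title = substr($0, 3, p - 3)       # text between "- " and the bracket
            rest  = substr($0, p + 8)          # text after "[last: "
            bar   = index(rest, " |")
            if (bar == 0) next                 # malformed line, skip it
            last  = substr(rest, 1, bar - 1)   # the timestamp, or "-"
            # String comparison is enough for this timestamp format.
            if (last == "-" || (last "") < (cutoff "")) print title
        }
    ' "$1"
}
```

Only item lines (starting with `- `) are considered; the `##` branch headings are ignored here.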

2. Cron-Triggered Idle Loop

Every 15 minutes, a lightweight process scans the plan-tree:

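# Respect the busy lock: never interfere with an active user task.
# check_lock (defined in the next section) also clears locks older than 10 minutes.
if check_lock; then
    # Lock is active (e.g. the user is chatting): just scan and queue
    scan_plan_tree > ~/.hermes/pending-tasks.md
else
    # User is away: execute idle tasks
    acquire_lock "idle-loop"
    trap release_lock EXIT  # release the lock even if a task fails
    execute_idle_tasks
fi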

Idle tasks are grouped into three tiers, in priority order:

  1. ENSURE_CONTINUATION — health checks, backups, service monitoring
  2. EXPAND_CAPABILITIES — distill patterns into reusable skills, patch broken ones
  3. EXPAND_WORLD_MODEL — scan GitHub/arXiv, update knowledge base
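
A rough sketch of how a cron-triggered script might walk these tiers in order. The crontab path, the `run_tier` placeholder, and the choice to stop at the first failing tier are all assumptions made for illustration; the article does not say how failures are handled.

```shell
#!/bin/sh
# Hypothetical crontab entry for the 15-minute trigger (path is an assumption):
#   */15 * * * * "$HOME/.hermes/idle-loop.sh"

# Tiers in priority order, highest first.
TIERS="ENSURE_CONTINUATION EXPAND_CAPABILITIES EXPAND_WORLD_MODEL"

# Placeholder: a real version would execute the stale items of one tier.
run_tier() {
    echo "running $1"
}

# Run the tiers in order and stop at the first one that fails, so
# lower-priority work never runs on top of a failed health check.
run_tiers_in_order() {
    for tier in $TIERS; do
        run_tier "$tier" || return 1
    done
}
```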

3. Busy-Lock Mechanism

This is the critical safety layer. Without it, the idle loop would conflict with user tasks.

# Lock file format: timestamp:reason
# Example: 1745384400:conversation  (user is chatting)
# Example: 1745384400:idle-loop     (agent is self-improving)

LOCK_FILE="$HOME/.hermes/agent-busy.lock"
LOCK_TIMEOUT=600  # seconds (10 min)

acquire_lock() {
    echo "$(date +%s):$1" > "$LOCK_FILE"
}

release_lock() {
    rm -f "$LOCK_FILE"
}

# Returns 0 if an active lock exists, 1 otherwise.
check_lock() {
    [ -f "$LOCK_FILE" ] || return 1  # no lock
    local ts now age
    ts=$(cut -d: -f1 "$LOCK_FILE")
    now=$(date +%s)
    age=$(( now - ${ts:-0} ))  # an empty lock file counts as stale
    if [ "$age" -gt "$LOCK_TIMEOUT" ]; then
        release_lock  # stale lock
        return 1
    fi
    return 0  # active lock
}
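
One caveat with a plain `>` write: two processes that both see "no lock" can each write the file and each believe they hold it. A possible way to close that gap, sketched here as an assumption rather than what the repo does, is the shell's `noclobber` option, which makes the redirection fail if the file already exists. The name `try_acquire_lock` is hypothetical.

```shell
#!/bin/sh
LOCK_FILE="${LOCK_FILE:-$HOME/.hermes/agent-busy.lock}"

# Try to take the lock atomically. With noclobber set, ">" refuses to
# overwrite an existing file, so "check" and "create" happen in one step.
# Returns 0 if the lock was taken, non-zero if it is already held.
try_acquire_lock() {
    (
        set -o noclobber
        echo "$(date +%s):$1" > "$LOCK_FILE"
    ) 2>/dev/null
}
```

A caller would then branch on the result, for example `try_acquire_lock "idle-loop" || exit 0`, instead of checking and writing in two separate steps.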

User always wins. When a user message arrives, the agent:

  1. Finishes the current sub-task (no half-writes)
  2. Saves remaining tasks to pending-tasks.md
  3. Releases the idle-lock
  4. Switches to the user's task
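
The four steps above can be sketched as a loop that only checks for a waiting user between tasks. How the agent actually learns that a message arrived is not described here, so the `user-request.flag` file is purely an assumption, as are the names `user_waiting`, `run_task`, and `run_until_preempted`.

```shell
#!/bin/sh
HERMES_DIR="${HERMES_DIR:-$HOME/.hermes}"

# Hypothetical signal: the chat front end creates this file when a user
# message arrives. The file name is an assumption for this sketch.
user_waiting() {
    [ -f "$HERMES_DIR/user-request.flag" ]
}

# Placeholder for real task execution.
run_task() {
    echo "done: $1"
}

# Read one task per line from stdin. Preemption is only checked between
# tasks, so a task that has started always finishes (no half-writes).
run_until_preempted() {
    while IFS= read -r task; do
        if user_waiting; then
            # Save the task we did not start plus everything after it.
            { printf '%s\n' "$task"; cat; } > "$HERMES_DIR/pending-tasks.md"
            rm -f "$HERMES_DIR/agent-busy.lock"   # release the idle lock
            return 0
        fi
        run_task "$task" </dev/null   # keep the task from eating the queue
    done
}
```

The trade-off is latency: the user waits for at most one sub-task to finish, so sub-tasks should stay short.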

What Happens When the User Is Away

After 10 minutes of inactivity, the lock expires. The next cron trigger:

  1. Checks system health (disk, RAM, processes)
  2. Verifies backups exist and aren't stale
  3. Scans GitHub for repos relevant to active projects
  4. Searches arXiv for recent papers
  5. Updates the knowledge wiki with findings
  6. Audits skills — patches failures, crystallizes patterns
  7. Writes everything to idle-log.md with timestamps
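
The last step, timestamped logging, might look like the helper below. The log location under `~/.hermes`, the line format, and the name `log_idle` are assumptions for illustration.

```shell
#!/bin/sh
# Assumed location; the article only names the file idle-log.md.
IDLE_LOG="${IDLE_LOG:-$HOME/.hermes/idle-log.md}"

# Append one timestamped Markdown bullet per finished step.
# Usage: log_idle STEP STATUS [DETAIL]
log_idle() {
    line="- [$(date '+%Y-%m-%d %H:%M')] $1: $2"
    if [ -n "${3:-}" ]; then
        line="$line ($3)"
    fi
    printf '%s\n' "$line" >> "$IDLE_LOG"
}
```

For example, `log_idle "health-check" "ok" "disk 41%"` appends a line such as `- [2026-04-23 01:00] health-check: ok (disk 41%)`, with the timestamp taken from the current clock.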

All of this happens autonomously. The next time the user returns, they find:

  • An updated knowledge base
  • Fixed skills
  • A summary of what happened (in idle-log.md)

Wiki Offload for Efficiency

When a plan branch becomes inactive, it gets "folded" into the wiki:

## 🔁 Drive Loops (auto-maintained, folded to wiki)
### ENSURE_CONTINUATION → wiki:plan-ENSURE-CONTINUATION [last: 2026-04-22 | 🔁]
### EXPAND_CAPABILITIES → wiki:plan-EXPAND-CAPABILITIES [last: 2026-04-22 | 🔁]

The full sub-tree lives in wiki/plan-ENSURE-CONTINUATION.md. When the branch becomes active again, it unfolds back into the plan-tree.

This keeps the plan-tree small (under 100 lines) while preserving full depth in the wiki.
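
Folding could be done with a single awk pass, roughly as below. This is a sketch under assumptions: the function name `fold_branch` is hypothetical, the pointer line keeps the branch's original `[last: ...]` bracket unchanged, and the pointer is left in place rather than moved under a "Drive Loops" heading.

```shell
#!/bin/sh
# Move one "## BRANCH" section out of the plan file into the wiki and
# leave a one-line pointer behind.
# Usage: fold_branch PLAN_FILE WIKI_DIR BRANCH
fold_branch() {
    # Wiki page name: underscores become hyphens, as in the example above.
    page="plan-$(printf '%s' "$3" | tr '_' '-')"
    mkdir -p "$2"
    awk -v branch="$3" -v page="$page" -v out="$2/$page.md" '
        /^## / {
            # Every top-level heading either starts or ends the folded section.
            folding = (index($0, "## " branch " ") == 1)
            if (folding) {
                meta = substr($0, length("## " branch " ") + 1)
                print "### " branch " → wiki:" page " " meta
            }
        }
        folding { print > out; next }   # section lines go to the wiki page
        { print }                       # everything else stays in the plan
    ' "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}
```

Unfolding would be the reverse: replace the pointer line with the contents of the wiki page.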

Results After 48 Hours

| Metric | Value |
| --- | --- |
| Automatic backups created | 8 |
| Relevant GitHub repos discovered | 12 |
| arXiv papers found | 3 |
| Skills patched | 1 |
| Embedding bugs caught | 1 |
| Idle time utilized | ~90% (was 0%) |

The Full Implementation

Available as an open-source skill: github.com/autopopo-cyber/autonomous-drive-spec

Works with Hermes Agent. The philosophy (why a survival drive makes sense) is in ORIGIN.md.

Key Takeaways

  1. Timestamps enable smart scheduling — skip fresh items, work on stale ones
  2. Busy-locks prevent conflicts — user tasks always preempt idle work
  3. Wiki offload keeps plan-tree lean — full depth preserved, working set small
  4. The loop should be lightweight — cron triggers a scanner, not a full agent session
  5. Separate scanning from execution — scanning is safe and fast; execution is heavy and needs locks

What would you add to an idle loop? What would you want your agent doing while you sleep?
