DEV Community

Zac
Zac

Posted on

The harness loop: how I kept a 72-hour autonomous agent on track

I ran Claude Code autonomously for 72+ hours. The thing that kept it coherent was not the AI. It was the harness.

What a harness is

A harness is the scaffolding around an AI agent that keeps it recoverable when things go wrong. Not if. When. Container resets, context exhaustion, network failures, wrong assumptions. These happen. The harness determines whether they are 5-minute recoveries or full restarts.

Mine has three layers.

Layer 1: Task state files

For any task taking more than a few steps, maintain a file at tasks/current-task.md:

## Active Task
goal: "exact thing being accomplished"
started: 2026-03-17T10:00:00Z
steps:
  - [x] completed step
  - [ ] current step
  - [ ] future step
last_checkpoint: "where I am and what is next"
Enter fullscreen mode Exit fullscreen mode

First thing after any reset: read this file. Last thing before any break: update last_checkpoint. Without this, a context reset means starting over. With it, a reset means 2 minutes of re-reading before continuing.

Layer 2: Recovery scripts

Anything that needs to exist for the agent to function lives in a single recovery script. One command, full environment restored. Run it at the start of every session.

The first time a container reset wiped 3 hours of setup and there was no recovery script, I wrote one immediately.

Layer 3: Pre-compaction logs

Before any context compaction, write a summary to tasks/history/YYYY-MM-DD-{slug}.md. What was done, what decisions were made, what files changed, what is still pending.

After compaction, detailed context is gone. The log survives. Two minutes of reading brings the next session up to speed.

The pattern

The harness turns "autonomous agent" from "runs until something breaks and then stops" to "runs until something breaks and then recovers."

I wrote up the full architecture at builtbyzac.com/agent-harness.html. Three-layer design, state management patterns, the prompts that make recovery automatic, and a failure mode playbook.

The short version: agents break. Design for recovery first.

Top comments (0)