Greg Smethells
REEL: A Proper Name for the Autonomous Agent Loop

There is a pattern spreading through the AI coding community. You write a spec, an outer loop spawns fresh AI sessions for each task, acceptance checks gate progress, and the codebase accumulates working code — all without a human in the loop. The pattern works. But it is stuck with a name borrowed from a cartoon character.

The "Ralph Wiggum loop" is fun, but it does not belong in a design document. The pattern deserves a proper name — one that works as a verb, a noun, and a proper acronym. That name is REEL.

What is a REEL?

REEL stands for Review Evaluated Engineering Loop. It describes any autonomous coding workflow with these five characteristics:

  1. Spec in — a structured specification defines the tasks
  2. Fresh session per task — each task gets a clean context window (no accumulated drift)
  3. Acceptance check — every task must pass a defined verification (tests, linters, reviews)
  4. State on disk — progress is tracked in files, not in memory
  5. Iterate until done — failed tasks are retried, passing tasks feed context to the next

The key insight: the codebase is the memory, not the context window. Each fresh session reads the current state of the code, does its work, and writes results back to disk. The context window is disposable. The repo is permanent.

This makes REELs fundamentally different from long-running agent sessions. There is no context window to exhaust, no degradation over time, no catastrophic forgetting. Each task starts clean and inherits progress through the only artifact that matters — the code itself.
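The outer loop can be sketched in a few lines. This is a minimal illustration of the pattern, not the reference implementation's API: `spawn_session` stands in for launching a fresh AI coding session, and the task shape (`id` plus an `acceptance` shell command) is assumed for the example.

```python
import json
import subprocess
from pathlib import Path
from typing import Callable

def run_reel(tasks: list[dict], state_path: Path,
             spawn_session: Callable[[dict], None],
             max_iterations: int = 20) -> dict:
    """Sketch of a REEL outer loop: disk is the memory, sessions are disposable."""
    state = json.loads(state_path.read_text()) if state_path.exists() else {}
    for _ in range(max_iterations):
        pending = [t for t in tasks if state.get(t["id"]) != "pass"]
        if not pending:
            break  # every task passes: the spec is satisfied
        task = pending[0]
        spawn_session(task)  # fresh context window; the repo carries all progress
        ok = subprocess.run(task["acceptance"], shell=True).returncode == 0
        state[task["id"]] = "pass" if ok else "fail"
        state_path.parent.mkdir(parents=True, exist_ok=True)
        state_path.write_text(json.dumps(state))  # persist before the next iteration
    return state
```

Note that nothing survives between iterations except the repo and the state file, which is the whole point.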

Why "REEL"?

The acronym earns every letter. Review Evaluated Engineering Loop — "review" because every task is review-gated before it passes; "evaluated" because acceptance checks evaluate each result; "engineering" because the output is engineered software, not just generated code; "loop" because it iterates until the spec is satisfied.

It works as a verb. "Reel in the feature" is natural English for persistent, iterative effort. You cast your line (write a spec), reel it in (iterate through tasks), and land the catch (working code). The fishing metaphor maps perfectly: patience, persistence, incremental progress.

It works as a noun. "I ran a REEL on the auth refactor" — clean and unambiguous.

It sounds like "real." The near-homophone is a feature, not an accident. When the REEL finishes, the spec becomes real. Working code, passing tests, reviewed and linted — real software, built autonomously.

Compare to RAG. Nobody says "Retrieval-Augmented Generation" in conversation — they say RAG. The acronym is the term. REEL works the same way: short, memorable, and precise enough that the expansion only matters the first time you hear it.

How It Works

A REEL follows this flow:

Spec directory (md/json/feature — any mix)
  |
  v
Parse tasks + dependencies + namespaces
  |
  v
Verification pass (run acceptance checks on existing code)
  |
  v
For each eligible task:
  |-> Create git worktree (isolated branch per task)
  |-> Spawn fresh AI session (with progress context)
  |-> Execute task in worktree
  |-> Run acceptance check
  |-> Run lint gate (fresh session)
  |-> Run review gate (fresh session)
  |-> Merge worktree into feature branch
  |-> Record progress
  |-> Update state
  |
  v
Next task (or retry on failure with fresh worktree)
  |
  v
Advance spec to next pipeline stage

Worktree Isolation

Each task gets its own git worktree branched from the feature branch. The main worktree is the orchestration desk — state, progress, and logs live there, untouched by task work. When a task passes, its worktree merges back into the feature branch. When a task fails, the worktree is discarded and a fresh one is created for the retry, branched from the latest feature branch state (which includes all previously merged work).

This eliminates an entire class of problems: cross-task file contamination, git add -A staging another task's files, and concurrent edits clobbering each other.
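The worktree lifecycle maps onto three git commands per task attempt. The branch naming and worktree paths below are illustrative, not what the reference implementation uses:

```python
# Hypothetical sketch of the per-task worktree lifecycle.
def worktree_plan(task_id: str, feature_branch: str) -> dict[str, list[str]]:
    """Git commands for one isolated task attempt."""
    branch = f"reel/{task_id}"
    path = f".reel/worktrees/{task_id}"
    return {
        # branch from the latest feature-branch state (includes merged work)
        "create": ["git", "worktree", "add", "-b", branch, path, feature_branch],
        # a passing task merges back into the feature branch
        "merge": ["git", "merge", "--no-ff", branch],
        # a failing task's worktree is thrown away; the retry gets a fresh one
        "discard": ["git", "worktree", "remove", "--force", path],
    }
```

Because `create` always branches from the current tip of the feature branch, a retry automatically inherits everything merged since the failed attempt.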

Three Quality Gates

Every task must pass three gates before it is marked as done:

  1. Acceptance check — the spec-defined verification command (tests, build, custom script)
  2. Lint gate — a fresh AI session runs language-specific linters on changed files
  3. Review gate — a fresh AI session runs a code review, blocking on Minor or higher severity findings

Each gate runs in its own session with its own budget. The lint and review gates can be disabled with --no-lint and --no-review for speed.
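In sketch form, the gate sequence is short-circuiting: the first failing gate blocks the task. The `echo` commands below are stand-ins for spawning fresh lint and review sessions, which this sketch does not model:

```python
import subprocess

def run_gates(acceptance_cmd: str, lint: bool = True, review: bool = True) -> bool:
    """A task is done only when every enabled gate passes, in order."""
    gates = [("acceptance", acceptance_cmd)]
    if lint:
        gates.append(("lint", "echo lint-session"))      # stand-in for a fresh lint session
    if review:
        gates.append(("review", "echo review-session"))  # stand-in for a fresh review session
    for name, cmd in gates:
        if subprocess.run(cmd, shell=True).returncode != 0:
            return False  # first failing gate blocks the task
    return True
```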

State Management

Per-task progress (pass/fail, attempt counts, errors) is tracked in .reel/state.json — spec files are never modified. You can abort at any time — SIGINT/SIGTERM traps save state — and resume exactly where you left off. When all tasks in a spec pass, the spec file advances to the next pipeline stage.
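A resumable state file plus an abort trap can be sketched as follows; the layout and helper names are illustrative, but the atomic-rename trick is the important part:

```python
import json
import signal
from pathlib import Path

def save_state(state: dict, path: Path) -> None:
    path.parent.mkdir(parents=True, exist_ok=True)
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(state, indent=2))
    tmp.replace(path)  # atomic rename: an abort can never leave a half-written file

def load_state(path: Path) -> dict:
    return json.loads(path.read_text()) if path.exists() else {}

def install_abort_trap(state: dict, path: Path) -> None:
    """SIGINT/SIGTERM: persist state, exit, resume later."""
    def _save_and_exit(signum, frame):
        save_state(state, path)
        raise SystemExit(1)
    signal.signal(signal.SIGINT, _save_and_exit)
    signal.signal(signal.SIGTERM, _save_and_exit)
```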

Progress Accumulation

After each task passes, a summary (description + files changed) is appended to .reel/progress.txt. Subsequent tasks receive this context in their prompt, preventing duplication across sessions without polluting the context window.
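The accumulation step is deliberately simple; this sketch assumes a summary format of ID, description, and changed files, which is illustrative rather than the exact format the implementation writes:

```python
from pathlib import Path

def record_progress(path: Path, task_id: str, description: str, files: list[str]) -> None:
    """Append a one-task summary after the task passes its gates."""
    with path.open("a") as f:
        f.write(f"[{task_id}] {description}\n  files: {', '.join(files)}\n")

def progress_context(path: Path) -> str:
    """Prior summaries, injected into the next task's prompt."""
    return path.read_text() if path.exists() else ""
```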

Safety Controls

  • Max iterations: 20 total across all tasks (configurable)
  • Per-task budget: $1.00 default spend cap per task
  • Retry limit: 3 attempts per task before marking as failed
  • Circuit breaker: 3 consecutive failures halt the entire run
  • Lock file: prevents concurrent runs on the same spec
  • Graceful abort: SIGINT/SIGTERM traps save state for resume
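The retry and circuit-breaker bookkeeping amounts to two counters. A sketch, with thresholds mirroring the defaults above:

```python
class SafetyControls:
    """Per-task retry limit plus a run-wide circuit breaker."""

    def __init__(self, max_attempts: int = 3, breaker_threshold: int = 3):
        self.max_attempts = max_attempts
        self.breaker_threshold = breaker_threshold
        self.attempts: dict[str, int] = {}
        self.consecutive_failures = 0

    def record(self, task_id: str, passed: bool) -> None:
        self.attempts[task_id] = self.attempts.get(task_id, 0) + 1
        # any pass resets the breaker; failures accumulate
        self.consecutive_failures = 0 if passed else self.consecutive_failures + 1

    def task_exhausted(self, task_id: str) -> bool:
        return self.attempts.get(task_id, 0) >= self.max_attempts

    def tripped(self) -> bool:
        # three consecutive failures halt the entire run
        return self.consecutive_failures >= self.breaker_threshold
```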

Pipeline Stages

REEL organizes specs into a pipeline. The filesystem is the board — a spec file's directory determines its stage:

spec/
  todo/       — Approved backlog, ready to work
  design/     — In design phase
  inflight/   — Being implemented (REEL core behavior)
  review/     — In review phase
  active/     — Done, actively enforced specification
  archived/   — Historical, out of flow

When all tasks in a spec pass, REEL advances it to the next stage via git mv. The transition is atomic and auditable — it is just a commit. You can monitor the pipeline visually with /kanban, a terminal TUI that displays specs as cards across stage columns with live 2-second auto-refresh.

REEL Kanban Board

Spec Formats

A REEL is driven by structured specs in three formats — use whichever fits your workflow. Mix formats freely within the same pipeline directory.

Gherkin (.feature files) — recommended for larger specs:

Each Scenario becomes a task. Tag-based metadata provides stable IDs, dependencies, and acceptance overrides:

Feature: Auth Refactor

  @id:a1b2c3d4
  @acceptance:pytest tests/test_session.py
  Scenario: Extract session manager
    Given route handlers contain inline session logic
    When the session manager module is created
    Then all session operations are delegated to the new module

  @id:e5f6a7b8
  @depends-on:a1b2c3d4
  @acceptance:pytest tests/test_refresh.py
  Scenario: Add token refresh
    Given the session manager exists
    When a token expires during a request
    Then the token is refreshed automatically

IDs are 8-character lowercase hex (truncated UUID v4) — globally unique and stable across refactors. Generate them with reel --generate-id [spec-dir], which scans existing IDs to prevent collisions. Feature files without @id: tags fall back to auto-generated sequential IDs.
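The ID scheme described above is a few lines of Python. This is an illustrative sketch of the collision check, not the orchestrator's actual code:

```python
import uuid

def generate_id(existing: set[str]) -> str:
    """Truncated UUID v4 hex; regenerate on the (rare) collision."""
    while True:
        candidate = uuid.uuid4().hex[:8]  # 8 lowercase hex characters
        if candidate not in existing:
            return candidate
```

With 32 bits of entropy per ID, collisions within one spec directory are vanishingly unlikely, but the scan makes them impossible.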

Markdown (heading-per-task):

## a1b2c3d4: Extract session manager

Move session handling out of route handlers into a dedicated module.

**Acceptance:** `pytest tests/test_session.py`

## e5f6a7b8: Add token refresh

Implement automatic token refresh in the session manager.

**Acceptance:** `pytest tests/test_refresh.py`
**Depends on:** a1b2c3d4

JSON (structured):

{
  "name": "Auth Refactor",
  "tasks": [
    { "id": "a1b2c3d4", "description": "Extract session manager", "acceptance": "pytest tests/test_session.py" },
    { "id": "e5f6a7b8", "description": "Add token refresh", "acceptance": "pytest tests/test_refresh.py", "depends_on": ["a1b2c3d4"] }
  ]
}

Automatic namespacing: In directory mode, each file's tasks are prefixed with the file stem to prevent ID collisions. spec/inflight/auth.md produces auth:a1b2c3d4, auth:e5f6a7b8. Within-file dependencies use bare IDs (auto-resolved); cross-file dependencies use qualified IDs like auth:a1b2c3d4.
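The namespacing rule can be sketched as a pure transformation over parsed tasks. The task dict shape is the same assumed in the JSON example above; the resolution logic (bare IDs qualify locally, already-qualified IDs pass through) is the interesting part:

```python
def namespace_tasks(stem: str, tasks: list[dict]) -> list[dict]:
    """Prefix a file's task IDs with its stem; resolve bare in-file deps."""
    local_ids = {t["id"] for t in tasks}
    out = []
    for t in tasks:
        deps = [
            f"{stem}:{d}" if d in local_ids else d  # bare IDs resolve within the file;
            for d in t.get("depends_on", [])        # qualified IDs pass through untouched
        ]
        out.append({**t, "id": f'{stem}:{t["id"]}', "depends_on": deps})
    return out
```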

Reference Implementation

The claude-config project includes a working REEL implementation for Claude Code:

  • /reel skill — state-aware launcher inside Claude Code (validates spec, shows dry-run, offers resume/restart/verify, invokes orchestrator)
  • scripts/reel.py — Python 3.11+ orchestrator (stdlib only, no pip packages) that runs outside the AI coding tool, spawning fresh claude -p sessions per task
  • /kanban skill — opens a terminal Kanban board (Textual TUI) for visual pipeline monitoring

It is open source (MIT), works with Claude Code today, and installs in one command:

sh -c "$(curl -fsSL https://gitlab.com/oldmission/claude-config/-/raw/main/install.sh)"

Then run /reel init to scaffold a spec, edit it with your tasks, and run /reel spec/ to execute. (GitHub mirror)

History

Geoffrey Huntley identified the autonomous agent loop pattern in mid-2025, demonstrating that writing structured specs and feeding them to fresh AI sessions produced better results than marathon single-session coding.

The community adopted the pattern enthusiastically in late 2025, calling it the "Ralph Wiggum loop" — a pop-culture name that went viral but lacks technical precision.

REEL provides a culturally agnostic, technically precise replacement. The pattern itself is tool-agnostic — it works with any AI coding CLI that supports non-interactive mode. Its name should be too.

Try It

  1. Install: sh -c "$(curl -fsSL https://gitlab.com/oldmission/claude-config/-/raw/main/install.sh)"
  2. Create tasks: /design
  3. Run: /reel
  4. Monitor: /kanban

The pattern has earned a real name. Next time you describe it, try calling it a REEL — and see if the conversation gets a little clearer.

Top comments (1)

Hamza KONTE

Love the naming effort — the field desperately needs shared vocabulary. "REEL" is memorable and the acronym works.

One thing worth adding to the loop model: the prompt structure feeding into each cycle matters as much as the loop architecture itself. Agents that receive freeform text instructions drift significantly more across iterations than those receiving structured prompts (role + objective + constraints + output format as distinct elements). The structured contract gives the model a stable anchor across iterations.

The "Evaluate" step in your REEL model is where this shows up most — evaluation criteria need to be explicit, not inferred. An agent that "checks its own work" against vague goals will pass things it shouldn't. Explicit constraints in the prompt make the evaluation step deterministic rather than vibes-based.

Good framing overall. Having a name for this helps communicate the architecture to non-technical stakeholders.