Hichoi-Dev

Everyone Just Discovered Loop Engineering. REAP Got There First — and It's Ready When You Are

In June 2026, "loop engineering" went viral. Stop prompting your agent — design the loop that prompts it. Ralph Wiggum loops running Claude for hours. Overnight runs. Millions of views.

My honest reaction: finally, everyone's here. I've been running my entire development process as AI loops since this February, months before the trend had a name. One project is now 70+ loop iterations deep, and the tool that runs those loops is itself built by those loops.

So this is a field report. Not "loops are amazing" (they are) and not "loops are hype" (they're not). Just the seven things that turned out to actually matter once you live inside a loop long enough — including the ones the infinite-loop crowd is about to learn the hard way.

Context: the tool is REAP (https://reap.cc), an open-source pipeline I built on top of Claude Code / OpenCode. It exists because I needed these seven lessons encoded in software, not in my discipline.

First, what the loop people get right

Credit where due — the core insight of loop engineering is correct:

  • One-shot prompting doesn't scale. Iteration beats a perfect mega-prompt every time.
  • Files beat context windows. State that matters must live on disk, not in the conversation.
  • Fresh context each iteration prevents the slow rot of a 400-message session.
  • The leverage moved. Your job really is designing the system around the agent now.

I agree with all of it. Now here's what months of actually living inside the loop adds.

Lesson 1: A goal is not a loop spec

The naive loop is: final goal + while true. It works for tasks where the environment can say "done" — make tests pass, finish a mechanical migration. Even Ralph loop advocates admit it: vague criteria = infinite loop, judgment-heavy work doesn't converge.

But almost everything interesting in software is judgment-heavy. So instead of one goal driving infinite iterations, I got much better results from one bounded goal per iteration, chosen fresh each time by comparing long-term vision against current state (gap analysis). The loop's unit of work in REAP is a generation: one goal, one lifecycle, one review. Then pick the next goal with a human sanity-check in between.

Small bounded loops with re-aiming between them beat one big loop, every single time.
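
To make the re-aiming step concrete, here is a minimal sketch of picking one bounded goal per generation. It is an illustration, not REAP's code: the function names, the idea of the vision as a priority-ordered list, and the `approve` / `execute` callbacks are all assumptions made for this example.

```python
from __future__ import annotations


def gap_analysis(vision: list[str], current_state: set[str]) -> list[str]:
    """Return vision items not yet present in the current state, in priority order."""
    return [item for item in vision if item not in current_state]


def propose_next_goal(vision: list[str], current_state: set[str]) -> str | None:
    """Propose exactly one bounded goal for the next generation, or None when done."""
    gaps = gap_analysis(vision, current_state)
    return gaps[0] if gaps else None


def run_generations(vision, current_state, approve, execute, max_generations=10):
    """Re-aim before every generation instead of looping on one final goal.

    approve(goal) -> bool   hypothetical human sanity check between generations
    execute(goal) -> set    hypothetical: runs one generation, returns what it achieved
    """
    history = []
    for _ in range(max_generations):
        goal = propose_next_goal(vision, current_state)
        if goal is None or not approve(goal):
            break  # nothing left to do, or the human redirected the loop
        current_state = current_state | execute(goal)  # state is re-read each cycle
        history.append(goal)
    return history
```

The point of the shape: the goal is recomputed from vision and state on every pass, and a human can stop or redirect the loop before the next generation starts.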

Lesson 2: The loop needs stages, not just repetitions

An unstructured iteration ("here's the goal, go") makes the agent jump straight to code. The fix that stuck: every generation walks a fixed lifecycle —

learning → planning → implementation ⇄ validation → completion

Each stage produces an artifact file (what was learned, what's planned, what was done, what was verified). Sounds bureaucratic. Isn't. Those artifacts are what make iteration N+1 smarter than iteration N — and what make the human checkpoint (next lesson) reviewable in minutes instead of hours.
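
The lifecycle above can be sketched as a small state machine that refuses to move on until the current stage has written its artifact. This is a simplified illustration under my own assumptions: the `Generation` class, the `<stage>.md` file naming, and the `advance` method are hypothetical, not REAP's actual layout.

```python
from __future__ import annotations

from pathlib import Path

# Allowed transitions; validation may send work back to implementation.
TRANSITIONS = {
    "learning": {"planning"},
    "planning": {"implementation"},
    "implementation": {"validation"},
    "validation": {"implementation", "completion"},
    "completion": set(),
}


class Generation:
    """One goal walking the fixed lifecycle, leaving one artifact file per stage."""

    def __init__(self, goal: str, workdir: Path):
        self.goal = goal
        self.workdir = workdir
        self.stage = "learning"

    def advance(self, next_stage: str, artifact_text: str) -> None:
        """Write the current stage's artifact, then move to next_stage."""
        if next_stage not in TRANSITIONS[self.stage]:
            raise ValueError(f"cannot go from {self.stage} to {next_stage}")
        if not artifact_text.strip():
            raise ValueError(f"stage {self.stage} must produce an artifact")
        # Hypothetical naming: one Markdown file per stage in the generation folder.
        path = self.workdir / f"{self.stage}.md"
        path.write_text(artifact_text, encoding="utf-8")
        self.stage = next_stage
```

Because the artifacts are plain files, the next generation and the human reviewer can both read them without replaying the conversation.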

Lesson 3: The agent must never grade its own homework

This is the hill I'll die on. Every generation ends with a fitness phase where a human gives feedback — and it's deliberately natural language only. No scores, no rubric, no "rate this 1–10", no LLM-as-judge.

Why so strict? Goodhart's law. Any quantitative fitness signal an agent can see is a signal it will optimize instead of the actual goal. I've watched it happen. Self-assessment ("here's what I'm uncertain about") is allowed and useful; self-scoring is banned at the protocol level.

An unattended infinite loop has exactly one grader: the model itself. That's not autonomy — that's compounding hallucination with a progress bar.

Lesson 4: Lock the rules while the loop is running

Give an agent long enough inside a loop and it will, very reasonably, decide the rules should change. The convention was inconvenient, so it "improved" it — mid-task, silently.

REAP's answer is a genome: a small set of files holding architecture decisions, conventions, and hard constraints. During a generation the genome is immutable. The agent can propose changes, but they queue up in a backlog and get applied only at the generation boundary — where I review them. The loop can suggest amendments to its constitution; it cannot ratify them.
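
A minimal sketch of that propose-now, ratify-later rule, assuming the genome can be modeled as a key-value mapping. The `Genome` class and its method names are hypothetical and only illustrate the idea; REAP's genome is a set of files.

```python
from __future__ import annotations

from types import MappingProxyType


class Genome:
    """Rules are read-only during a generation; changes queue until the boundary."""

    def __init__(self, rules: dict[str, str]):
        self._rules = dict(rules)
        self._backlog: list[tuple[str, str]] = []

    @property
    def rules(self):
        # Read-only view: item assignment on it raises TypeError.
        return MappingProxyType(self._rules)

    def propose(self, key: str, value: str) -> None:
        """The agent may suggest an amendment at any time; nothing changes yet."""
        self._backlog.append((key, value))

    def end_generation(self, approve) -> list[str]:
        """At the generation boundary, apply only the proposals a reviewer approves.

        approve(key, value) -> bool stands in for the human review step.
        """
        applied = []
        for key, value in self._backlog:
            if approve(key, value):
                self._rules[key] = value
                applied.append(key)
        self._backlog.clear()  # rejected proposals are dropped in this sketch
        return applied
```

The only write path runs through `end_generation`, so a mid-task "improvement" has nowhere to land except the backlog.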

Lesson 5: If a stage can be skipped, it will be skipped

Ask any agent to "always run validation before completing" and count the sessions until it… doesn't. Instructions decay. So REAP enforces stage order cryptographically: every stage transition requires a signature token (nonce) that only the previous stage's completion can issue. Skipping validation isn't a disobeyed instruction — it's a failed signature check. The CLI just says no.

Rule of thumb after 70 generations: anything you'd write in ALL CAPS in your prompt should be enforced by the harness instead.
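
For a feel of harness-enforced ordering, here is one way such a check could work, sketched with an HMAC over a per-generation nonce. This is my own assumption about a workable scheme, not a description of REAP's token format. The `StageGate` class is hypothetical, and it is simplified to forward-only order, so it leaves out the validation-to-implementation loop.

```python
from __future__ import annotations

import hashlib
import hmac
import secrets

ORDER = ["learning", "planning", "implementation", "validation", "completion"]


class StageGate:
    """Issues a token when a stage completes; the next stage cannot start without it."""

    def __init__(self):
        self._key = secrets.token_bytes(32)  # held by the harness, never shown to the agent
        self._nonce = secrets.token_hex(16)  # fresh per generation, so old tokens are useless
        self.current = ORDER[0]

    def _sign(self, stage: str) -> str:
        message = f"{self._nonce}:{stage}".encode()
        return hmac.new(self._key, message, hashlib.sha256).hexdigest()

    def complete_current(self) -> str:
        """Called by the harness when the current stage finishes."""
        return self._sign(self.current)

    def enter(self, stage: str, token: str) -> None:
        """Move to `stage` only with a valid token from the stage right before it."""
        index = ORDER.index(stage)
        if index == 0:
            raise ValueError("the first stage has no predecessor")
        previous = ORDER[index - 1]
        expected = self._sign(previous)
        valid = hmac.compare_digest(token.encode(), expected.encode())
        if previous != self.current or not valid:
            raise PermissionError(f"cannot enter {stage}: signature check failed")
        self.current = stage
```

With this shape, skipping validation is not something the agent can talk its way into: there is no token that opens `completion` except the one issued when `validation` finishes.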

Lesson 6: Autonomy should be a budget, not a binary

The loop-engineering debate keeps framing it as attended vs. unattended. The useful knob is in between: how many iterations am I willing to pre-approve?

REAP calls it cruise mode: reap cruise 3 means "run 3 generations autonomously, then come back to me." Clear, mechanical goals? Crank it up. Ambiguous design territory? Set it to zero and review every generation. Autonomy becomes a dial you turn per-situation, not an ideology.
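
The budget idea fits in a few lines. This is only a sketch of the concept, not how `reap cruise` is implemented: the `cruise` function and its `run_generation` and `review` callbacks are hypothetical names chosen for this example.

```python
def cruise(budget, run_generation, review):
    """Run up to `budget` pre-approved generations, then hand control back.

    run_generation(i) -> bool   hypothetical: runs generation i, False means it needs a human
    review(count)               hypothetical: the human checkpoint after the unattended run
    """
    ran = 0
    while ran < budget:
        ok = run_generation(ran)
        ran += 1
        if not ok:
            break  # stop spending budget as soon as a generation asks for help
    review(ran)  # the loop always returns to the human, even with budget 0
    return ran
```

The budget caps how far the loop can go without review, and a struggling generation can end the run early, but nothing lets the loop extend its own budget.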

Lesson 7: Loops need exit ramps, not just exit conditions

Real loops don't always end in success. Sometimes the goal was wrong, sometimes 60% done is worth keeping. An infinite loop has one exit: Ctrl-C, and whatever mess is on disk is your problem.

A loop iteration in REAP has three distinct endings — complete (full lifecycle + review), early-close (keep the partial value, auto-carry unfinished tasks to the next generation's backlog), and abort (discard cleanly, restore consumed state). The ability to lose gracefully is what makes running many loops cheap.
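
Here is a rough sketch of how the three endings differ in what they do to project state. The `Project` and `Generation` classes, the task lists, and the `start` / `close` helpers are assumptions made for illustration; REAP's real bookkeeping is richer than two lists.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class Ending(Enum):
    COMPLETE = "complete"
    EARLY_CLOSE = "early-close"
    ABORT = "abort"


@dataclass
class Project:
    backlog: list[str] = field(default_factory=list)
    shipped: list[str] = field(default_factory=list)


@dataclass
class Generation:
    goal: str
    tasks: list[str]
    done: list[str] = field(default_factory=list)


def start(project: Project, goal: str, count: int) -> Generation:
    """Consume the first `count` backlog items as this generation's tasks."""
    tasks, project.backlog = project.backlog[:count], project.backlog[count:]
    return Generation(goal, tasks)


def close(project: Project, gen: Generation, ending: Ending) -> None:
    """Apply one of the three endings to the project state."""
    remaining = [t for t in gen.tasks if t not in gen.done]
    if ending is Ending.COMPLETE:
        if remaining:
            raise ValueError("unfinished tasks: use early-close or abort instead")
        project.shipped += gen.done
    elif ending is Ending.EARLY_CLOSE:
        project.shipped += gen.done                      # keep the partial value
        project.backlog = remaining + project.backlog    # carry the rest forward
    else:
        project.backlog = gen.tasks + project.backlog    # abort: restore what was consumed
```

Early-close and abort both leave the backlog in a state the next generation can start from, which is the difference from a Ctrl-C.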

Ralph loop vs. REAP, honestly

| | Ralph-style infinite loop | REAP |
|---|---|---|
| Unit of work | One prompt, repeated forever | One goal per generation, re-aimed each cycle |
| Memory | Files + git, unstructured | 3-tier memory + genome + lineage archive |
| Correction signal | Environment (tests/build) only | Environment + human fitness each generation |
| Rules mid-run | Agent can drift | Genome locked, changes reviewed at boundary |
| Stage discipline | Prompt-based (decays) | Signature-enforced (can't skip) |
| Autonomy | All or nothing | Budgeted (cruise N) |
| Failure exit | Ctrl-C + cleanup | abort / early-close / complete |
| Best at | Mechanical, machine-checkable tasks | Sustained product evolution with judgment calls |

Not a takedown — for a 200-file mechanical migration with a green-tests exit condition, a Ralph loop is genuinely great. But for evolving a real product over months, you need the right column. That's the gap REAP was built for.

Proof of loop: this tool builds itself

The part I'm proudest of: REAP is developed with REAP. All 70+ generations — the signature locking, the memory system, cruise mode, the evaluator agent — were built inside the exact loop they enforce, each closed with human fitness feedback. Every flaw in the loop design lands on me first, and the fix gets encoded into the genome for every generation after.

Dogfooding a loop tool inside its own loop is the fastest feedback cycle I've ever worked in.

Try a structured loop (5 minutes)

npm install -g @c-d-cc/reap
cd your-project
reap init        # detects greenfield vs existing codebase

Open Claude Code (or OpenCode) and run:

/reap.evolve

That's one full generation: the agent learns your codebase, plans, implements, validates — and then asks you how it did. Feedback becomes selection pressure. The next generation starts smarter.

If you're running loops today — Ralph-style, cron-based, hand-rolled — I'd love to hear where yours drifted and what you did about it. That's the conversation loop engineering actually needs next.
