TL;DR: a loop is memory + rules + verify + schedule. Most agent setups have three of the four. The one they skip is a separate verifier, and skipping it means the agent that does the work also grades the work. That is the difference between self-correcting and self-confirming.
This piece is from Code Meet AI, my newsletter where I document how I actually use AI to ship real products, receipts included. New guides land there first.
I run background AI agents on a schedule. For months I was quietly proud of the setup, until I lined it up against what a real loop actually requires and found I had skipped the one part that makes it trustworthy. If you have any agent running unattended, this is the question I had to ask myself, and you should ask it too.
The ladder, fast
Prompt engineering tuned the words. Context engineering got the right things into the window. Harness engineering is the scaffolding around the model: tools, rules, sandboxes (I mapped the harness floor here). Loop engineering is the floor above: the harness running on a timer, spawning helpers, feeding itself. Each layer wrapped the last. Nothing got replaced; the edge just climbed a floor.
A loop has four parts
Memory it reads before acting and writes after. Rules it cannot break. A verifier that confirms the output before it counts. A schedule that triggers the cycle. Memory, rules, verify, schedule.
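Here is a minimal sketch of one cycle with all four parts, in Python. Every name in it (`Loop`, `run_cycle`, the `worker` and `verifier` callables) is hypothetical, a stand-in for whatever your stack uses, and the schedule is assumed to be an external trigger such as a cron entry that calls `run_cycle`.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Loop:
    # All names here are illustrative assumptions, not a real library.
    worker: Callable[[str, List[str]], str]  # does the work: (goal, memory) -> output
    verifier: Callable[[str, str], bool]     # separate checker: (output, spec) -> bool
    rules: List[Callable[[str], bool]]       # each rule returns True if the output is allowed
    memory: List[str] = field(default_factory=list)

    def run_cycle(self, goal: str, spec: str) -> bool:
        """Run one cycle. A scheduler (cron, a timer) is assumed to call this."""
        notes = list(self.memory)            # memory: read before acting
        output = self.worker(goal, notes)

        # Rules: hard limits the output may not break.
        if not all(rule(output) for rule in self.rules):
            self.memory.append(f"rejected by rules: {goal}")
            return False

        # Verify: the output only counts if the separate checker says so.
        if not self.verifier(output, spec):
            self.memory.append(f"failed verification: {goal}")
            return False

        self.memory.append(f"done: {goal}")  # memory: write after acting
        return True


# Toy stand-ins so the sketch runs end to end.
loop = Loop(
    worker=lambda goal, notes: f"summary of {goal}",
    verifier=lambda output, spec: spec in output,
    rules=[lambda output: "rm -rf" not in output],
)
ok = loop.run_cycle("changelog", spec="changelog")
```

The point of the shape is that `worker` and `verifier` are two separate callables. If you pass the same agent in both slots, the structure still runs, but the loop is back to grading its own work.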
Where I told on myself
My setup has three indexing layers per project: a curated memory bank, a structural index of what imports what, and a temporal recall layer (the full second-brain setup is here). Background agents run on a schedule on top of that.
Count it against the four parts. Memory, yes. Rules, yes. Schedule, yes. That is three. The fourth is verify, and I did not have it.
I never built a separate checker. The loop produced work and then trusted its own work. No second model, no sub-agent whose only job is to read the output and return true or false. The agent that does the work also decides the work is fine. In any human process you would flag that conflict of interest immediately. In an automated one it is somehow the default.
That is the whole point. Memory plus rules plus schedule gives you an agent that runs a lot. It does not give you an agent you can trust to run unattended. Verification is the difference between self-correcting and self-confirming.
What verification actually means
Not the agent re-reading its own output and saying "looks good." A separate checker: a sub-agent or a second model, given the acceptance criteria of the goal, whose only output is true or false against a defined end-state. It does not see the reasoning that produced the work, it sees the work and the spec. That independence is the entire value. If you build one thing this week, build this, because it is what makes the rest safe to leave alone.
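A minimal sketch of that independence, in Python. The names (`WorkResult`, `verify`, `accept`, `has_version_field`) are hypothetical, and plain predicate functions stand in for the sub-agent or second model you would actually call. What matters is the signature: the checker receives the output and the acceptance criteria, and nothing else.

```python
import json
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class WorkResult:
    output: str     # the artifact the worker produced
    reasoning: str  # the worker's own justification; never shown to the checker


# An acceptance criterion is a predicate over the output alone.
Criterion = Callable[[str], bool]


def verify(output: str, criteria: List[Criterion]) -> bool:
    """Return True only if every acceptance criterion holds for the output."""
    if not criteria:
        # No defined end-state means nothing can be confirmed.
        raise ValueError("verifier needs at least one acceptance criterion")
    return all(check(output) for check in criteria)


def accept(result: WorkResult, criteria: List[Criterion]) -> bool:
    # Pass the work and the spec only; result.reasoning is deliberately dropped.
    return verify(result.output, criteria)


def has_version_field(output: str) -> bool:
    """Example criterion: output is a JSON object with a 'version' key."""
    try:
        data = json.loads(output)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and "version" in data


criteria = [has_version_field]
good = WorkResult(output='{"version": "1.2.0"}', reasoning="Followed the spec.")
bad = WorkResult(output='{"name": "app"}', reasoning="Looks good to me.")

print(accept(good, criteria))  # True
print(accept(bad, criteria))   # False
```

The second result is the case that matters: the worker's reasoning says "Looks good to me," and the checker never reads it. Swapping the predicates for a model call keeps the same contract, as long as that call is given only the output and the spec.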
The governance angle most people miss
The loop does not know what you are using it for. Two teams build the identical loop and get opposite results. One uses it to move faster on work they understand deeply. The other uses it to avoid understanding the work at all. Same files, same cron line, same checker, opposite outcomes. The responsibility moved up a floor alongside the capability.
So, the one question: who checks the output? If the answer is "the same agent that wrote it," you do not have a loop. You have a head start.
I packaged the skeleton I am using to close my own gap: the memory bank, the rules file, a goal with a verifiable end-state, the checker sub-agent, the schedule, and guardrails so it cannot run off a cliff. It is free (name your price): the Loop Engineering Starter Kit. The deeper write-up, with the diagrams, is on Code Meet AI.
I lean on these loops daily building Wire AI, my generative onboarding SDK for React Native, where unverified agent output would land in front of real users.
I write Code Meet AI: real AI workflows, real numbers, from someone shipping with it every day.
How are you handling verification in your own agent setups? I am still wiring mine in and would take notes from anyone further along.

