DEV Community

Ian Johnson

In-the-Loop to On-the-Loop: How I Stopped Micromanaging My AI Agent

I Was the Bottleneck

For the first two months of this project, I used Claude Code with auto-approve turned off. Every file edit, every terminal command, every change: I reviewed it before it executed.

I read every line. I made inline corrections. I gave real-time direction: "No, use the repository pattern here." "That's the wrong role check." "We use UserFactory::admin(), not User::factory()."

I was pair programming with an AI agent. Except I was the worst kind of pair: the one who grabs the keyboard every 30 seconds.

The results were good. The code was clean. But I was doing most of the thinking and half the typing. The agent was a fancy autocomplete with better suggestions. I wasn't getting the leverage I'd hoped for.

The Realization

I read an article about "on-the-loop" versus "in-the-loop" human-AI collaboration. The framing clicked immediately:

In-the-loop: You're inside the agent's decision cycle. You approve every action. You're a required step in every operation. The agent can't sneeze without your permission.

On-the-loop: You set the direction, define the constraints, and review the output. The agent operates autonomously within those constraints. You step in when something goes off track, not for every keystroke.

In-the-loop is micromanagement. On-the-loop is management.

The problem was obvious: I was micromanaging because I didn't trust the agent to do the right thing. And I didn't trust the agent because there was nothing forcing it to do the right thing.

The Prerequisites

On-the-loop only works if the agent's environment constrains it toward correct output. Without guardrails, autonomy produces slop.

Look at what we'd built over the previous two months:

| Guardrail | What It Constrains |
| --- | --- |
| 2,700+ tests | Behavioral correctness |
| Pint | PHP code style |
| Psalm | PHP type safety |
| Prettier | JS/CSS formatting |
| ESLint | React/TypeScript patterns |
| TypeScript | Frontend type safety |
| Pre-commit hooks | Issues caught before commit |
| CI pipeline | Final verification gate |
| Actions pattern | Where business logic lives |
| Service contracts | How integrations are structured |
| Policies | How authorization works |
| Conventional commits | How changes are described |
| Trunk-based dev | How changes are delivered |

Each guardrail narrows the space of "valid output." Together, they create a corridor. The agent can move freely within that corridor, but it can't easily wander off into the weeds.

This is why stages 1–3 came first. You can't go on-the-loop with an agent on a codebase that has no tests, no linting, and no architectural patterns. That's not delegation; that's negligence.

The Switch

I turned on auto-approve for file edits. I started giving Claude higher-level instructions:

Before (in-the-loop):

"Create a new file at app/Actions/Orders/CreateOrderAction.php. Add a constructor that injects NotificationInterface and AnalyticsService. Add an execute method that takes a CreateOrderRequest and User..."

After (on-the-loop):

"Extract the order creation logic from OrdersController::store() into a CreateOrderAction following the existing Action pattern."

The agent looks at the existing Actions. It sees the pattern. It creates the class, the Result DTO, wires up the controller. It runs make lint and make test. Everything passes. I review the diff. It's correct.
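For readers unfamiliar with the Action pattern the agent is imitating, here's a minimal sketch. The project's version is PHP/Laravel; this transposes the shape into TypeScript, and every name here (the Result DTO, the service interface) is illustrative, not the project's actual code:

```typescript
// Result DTO: the Action returns a value object, never a raw model.
class CreateOrderResult {
  constructor(
    public readonly orderId: string,
    public readonly notified: boolean,
  ) {}
}

// Dependencies arrive through the constructor, mirroring the
// constructor-injection preference mentioned later in the post.
interface NotificationService {
  orderCreated(orderId: string): boolean;
}

class CreateOrderAction {
  constructor(private readonly notifications: NotificationService) {}

  // One execute() method per Action: a single, obvious entry point.
  execute(customerId: string, items: string[]): CreateOrderResult {
    if (items.length === 0) {
      throw new Error("An order needs at least one item");
    }
    const orderId = `${customerId}-${items.length}`; // stand-in for persistence
    const notified = this.notifications.orderCreated(orderId);
    return new CreateOrderResult(orderId, notified);
  }
}
```

The controller shrinks to "validate the request, call the Action, return the Result" — which is exactly why "follow the existing Action pattern" is enough instruction for the agent.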

I went from dictating code to reviewing code. My throughput doubled. Maybe more. And the code quality stayed the same because the guardrails enforced it.

What On-the-Loop Looks Like Day to Day

A typical session now:

  1. I give direction. "Implement the Guide SPA page. Here's the Jira ticket."
  2. Claude reads the harness. It checks resources/js/spa/CLAUDE.md to understand the SPA patterns. It checks app/Http/Controllers/Api/CLAUDE.md for API conventions. It checks tests/CLAUDE.md for testing patterns.
  3. Claude writes failing tests. Following TDD, it writes the test descriptions and presents them to me.
  4. I review the test specs. This takes 2 minutes. I approve or adjust.
  5. Claude implements. It builds the API endpoint, the React page, the interim wrapper if needed. It runs lint and tests.
  6. I review the diff. This takes 5–10 minutes. I'm reading code, not writing it. I'm looking for architectural missteps, not formatting issues.
  7. If it's good, we commit and push. Claude watches the CI run and alerts me to any failures.

The critical shift: I'm reviewing output, not directing input. I'm checking "did the agent follow the patterns?" not "write this line of code."

Custom Skills: Codifying the Workflow

As the on-the-loop workflow matured, I noticed I was giving Claude the same high-level instructions repeatedly. "Here's a Jira ticket. Read it. Write tests. Implement. Run lint and tests. Open a PR." So I codified these into reusable Claude Code skills: slash commands that encode the full workflow.

/implement-jira-card takes a Jira issue key, pulls the requirements, writes failing tests using TDD, implements the smallest change to make them pass, runs the quality gates, and prepares a PR, all following the harness patterns.

/implement-change does the same thing for ad-hoc changes that don't have a Jira ticket. You describe the change, and the skill drives the TDD workflow: write failing tests, get approval on the test descriptions, implement, verify.

These skills are the on-the-loop workflow made executable. Instead of explaining the process each session, I type a slash command and review the output. The skill encodes the sequence I'd otherwise repeat manually: read the context, write tests first, implement in small steps, run the quality gates, ship it.

The skills don't replace judgment. I still review test descriptions before implementation and review the final diff. But they eliminate the ceremony of setting up each task and ensure the TDD workflow is followed consistently.
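Claude Code custom slash commands are markdown files under `.claude/commands/`. A trimmed, hypothetical version of `/implement-change` might look like this (the actual skill files aren't shown in this post):

```markdown
<!-- .claude/commands/implement-change.md — hypothetical, trimmed -->
Implement the following change using strict TDD: $ARGUMENTS

1. Read the relevant CLAUDE.md harness files before writing any code.
2. Write failing tests first; present the test descriptions for approval.
3. Implement the smallest change that makes the tests pass.
4. Run `make lint` and `make test`; fix anything that fails.
5. Prepare a conventional-commit message and open a PR.
```

The `$ARGUMENTS` placeholder receives whatever you type after the slash command, so the whole workflow launches from one line.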

The Feedback Protocol

Sometimes the agent gets something wrong. When it does, I don't just fix the instance. I fix the guidance.

This is the feedback protocol:

  1. The agent produces something incorrect or suboptimal
  2. I identify the pattern — is this a one-off mistake, or a gap in the harness?
  3. If it's a gap, I update the relevant CLAUDE.md file — add the rule, the example, the anti-pattern
  4. The harness reloads — Claude re-reads the updated guidance
  5. The correction applies to all future work — not just this instance

Example: Claude was putting notification logic directly in API controllers instead of using the notification service. I added this to the Services harness:

"Do not call the chat webhook or external APIs directly from controllers — use the appropriate service."

It never made that mistake again. Across any controller. The harness is a living document that accumulates lessons learned.
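In harness terms, a correction like that is usually worth capturing as a rule plus an anti-pattern. A hypothetical excerpt of what the updated file could look like (the post quotes only the one-line rule, so the surrounding structure here is illustrative):

```markdown
<!-- app/Services/CLAUDE.md — hypothetical excerpt -->
## Notifications

- Do NOT call the chat webhook or external APIs directly from controllers.
- Route all outbound notifications through the notification service.

Anti-pattern: an HTTP client call to the webhook URL inside a controller.
Correct: inject the notification service and call its typed method.
```

The rule-plus-anti-pattern shape matters: the rule tells the agent what to do, and the anti-pattern gives it a concrete signature to recognize and avoid.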

The Agentic Flywheel

The feedback protocol above is manual: I notice a gap, I update the harness, the agent reloads. That's how it started. But the current system goes a step further.

In the project's main CLAUDE.md, two instructions turn the harness from a static document into a self-improving system:

The Change Approval Flow:

  1. Show diff — present all changes and ask for feedback
  2. If feedback given — update the relevant CLAUDE.md harness file to capture the pattern/practice, reload it, then re-apply the changes
  3. If approved — run all pre-commit checks
  4. If checks pass — commit and push

The Feedback Protocol:

All feedback about code quality, patterns, or practices follows this loop:

  1. Update the appropriate CLAUDE.md harness file to capture, define, or refine the pattern
  2. Reload that harness file into context
  3. Re-attempt the change using the updated guidance

Read that carefully. The agent doesn't just follow the harness. It writes to the harness. When I give feedback on a diff, the agent's next action isn't to fix the code. It's to update the harness file that governs that area, reload the updated guidance, and then re-apply the change under the new rules.

This is the difference between a harness and a flywheel.

A harness is guidance the agent reads. I write the rules. The agent follows them. When the rules are wrong or incomplete, I update them manually.

A flywheel is guidance the agent reads and writes. I give feedback. The agent encodes that feedback into the harness. The next task benefits from the updated harness. That task generates new feedback. The harness evolves again.

Feedback → Agent updates harness → Better output → Less feedback → Repeat

Every review cycle makes the next cycle faster. The corrections compound. Early in the project, I was giving feedback on almost every diff: "use the service, not a direct call," "that's the wrong factory method," "authorization goes in the policy, not the controller." Each correction became a harness rule. Each rule eliminated a class of future mistakes.

Three months in, most diffs need zero feedback. The flywheel has accumulated enough guidance that the agent produces correct output on the first pass for the majority of tasks. My reviews shifted from "this is wrong, fix it" to "approved."

The flywheel has a second-order effect: it forces me to give precise, pattern-level feedback instead of one-off corrections. "Fix this line" doesn't help the harness. "Notification logic belongs in the notification service, not controllers" does. The mechanism shapes the feedback toward reusable rules, which is exactly what you want.

This is also why the harness is distributed across multiple CLAUDE.md files: one per architectural boundary (Actions, Services, Controllers, Tests, SPA, etc.). When the agent updates the harness, it updates the specific file governing that area. The feedback lands where future work will read it.

The flywheel isn't magic. It requires two things: feedback that's worth encoding, and an engineer who reviews diffs carefully enough to generate that feedback. But given those inputs, the system gets better automatically. The agent does the mechanical work of updating documentation, reloading context, and re-applying changes. You just say what's wrong.

The Trust Equation

On-the-loop trust has a formula:

Trust = Tests + Linting + CI + Architectural Patterns + Harness Guidance

If any component is missing, trust drops:

  • No tests? You can't verify the agent's output is correct.
  • No linting? The output might be inconsistent or buggy in ways tests don't catch.
  • No CI? You're trusting local runs that might not match production.
  • No patterns? The agent invents its own, and they'll be inconsistent.
  • No harness? The agent doesn't know your conventions.

All five together? You can hand the agent a Jira ticket and review the PR an hour later.

The Curator Mindset

My role shifted from writer to curator. I don't write most of the code anymore. I:

  • Define the patterns — through architecture and harness files
  • Review the test specs — through TDD red-phase tests
  • Review the output — diffs, not keystrokes
  • Update the harness — when the agent drifts or when patterns evolve
  • Make strategic decisions — what to build, in what order, with what tradeoffs

Here's the thing I didn't anticipate: building the harness forced me to articulate my own preferences. Why do I prefer constructor injection? Why Result DTOs instead of returning models? Why one execute() method per Action? I had reasons for all of these — years of experience, books I'd read, mistakes I'd made — but they lived in my head. The harness made them explicit.

This is a different kind of engineering. It's more like managing a team than writing code solo. You're responsible for the quality of the output, but you're not doing the mechanical work. You're the curator of design, guardrails, and documentation.

It's also more fun. I spend my time on the interesting problems (architecture, domain logic, strategic decisions) and let the agent handle the implementation details that follow established patterns.

The Results

I don't have hard before/after metrics (this isn't an A/B test), but the trajectory is clear:

  • 258 commits in ~3 months — roughly 3 commits per day
  • 145 PRs merged — consistent, steady output
  • 2,700+ PHP tests, growing Vitest suite — quality gates that hold
  • 6 major features migrated to React SPA — shipped to production via interim wrappers
  • 17 refactoring commits — continuous architectural improvement
  • Zero big-bang rewrites — incremental progress throughout

The codebase went from "legacy monolith with no tests" to "well-structured application with automated quality gates and a dual-frontend migration in progress." In three months. With one engineer and an AI agent.

Infrastructure Is a Guardrail Too

It's easy to think of guardrails as code quality tools — tests, linters, static analysis. But the infrastructure decisions are guardrails in their own right.

Docker ensures every environment is identical. The Makefile provides a single interface for every operation. Redis-backed queues isolate background jobs (CRM sync, notifications, calculations) from the request cycle. Separate queue names mean a third-party API outage doesn't back up critical notifications.

The agent doesn't need to know how Docker networking works or why the CRM sync runs on its own queue. It just needs make test to pass and make lint to be clean. The infrastructure absorbs complexity so the agent (and you) don't have to think about it.

This is "pull complexity downward" in action. Simple interfaces. Complex implementations hidden behind them.
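As a sketch of that single interface — targets and commands here are illustrative, not the project's actual Makefile:

```makefile
# Hypothetical Makefile: one verb per operation, Docker details hidden.
.PHONY: lint test

lint:  ## Run all formatters and static analysis
	docker compose exec app ./vendor/bin/pint --test
	docker compose exec app ./vendor/bin/psalm
	npm run lint

test:  ## Run the full PHP and JS test suites
	docker compose exec app php artisan test
	npm run test
```

The agent (and any new engineer) only ever learns two verbs. Which tools run behind them can change without changing the interface.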

What Could Go Wrong

On-the-loop isn't a silver bullet. There are failure modes:

The harness is wrong. If your CLAUDE.md files encode bad patterns, the agent will faithfully reproduce bad patterns. The harness is as good as your engineering judgment.

The tests don't cover the right things. If your tests verify implementation details instead of behavior, the agent can pass all tests while doing the wrong thing.

You stop reviewing. On-the-loop doesn't mean no-loop. You still review diffs. You still verify the output makes sense. You just do it at a higher level.

You skip the prerequisites. If you try to go on-the-loop without tests, linting, and CI, you'll get fast slop instead of slow slop. Still slop.

The Takeaway

The path from in-the-loop to on-the-loop:

  1. Build the guardrails first. Tests, linting, CI, clear architecture.
  2. Establish patterns. Actions, Services, Policies — consistent, repeatable, verifiable.
  3. Create the harness. CLAUDE.md files that encode your conventions.
  4. Start delegating. Give higher-level instructions. Review output, not input.
  5. Update the harness continuously. Every correction is a lesson the harness absorbs.
  6. Trust the system, not the agent. You trust the guardrails. The agent just happens to work within them.

The agent didn't get smarter. The environment got smarter. That's the difference.
