Thesis: You can use AI coding agents on real production codebases and get predictable, high-quality results if you treat the agent like a junior engineer who needs guardrails, not a magic wand.
Audience: Mid-level and senior software engineers curious about AI-assisted development but skeptical of the hype.
Series Arc
The story follows a real Laravel + React codebase over roughly three months and 258 commits, from a legacy monolith with no tests to a well-structured application with automated quality gates, a React SPA migration in progress, and an AI agent that reliably ships production code with minimal supervision.
Stage 1: The Foundation — Tests First, Everything Else Second
Post 1: "Before You Let an Agent Touch Your Code, Write the Tests"
Why tests are the prerequisite for AI-assisted development, not an afterthought. How wrapping a legacy codebase in characterization tests creates the safety net that makes everything else possible. Covers the move from SQLite to MySQL in tests, the UserFactory pattern, and why test infrastructure is the highest-leverage investment.
- The codebase before: no tests, no linting, no CI
- Why tests come first (they're the reins, not the saddle)
- Characterization tests for legacy code
- The `UserFactory` facade pattern
- TDD as a communication protocol with agents
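To make the idea concrete, here is a minimal characterization-test sketch. The project itself is PHP/Laravel; this TypeScript version, with a made-up `legacyDiscount` rule, just illustrates the technique: pin down what the code *currently* does, bugs included, before letting an agent refactor it.

```typescript
// Hypothetical legacy pricing rule. We don't fully understand it,
// and that's the point: characterization tests capture observed
// behavior, not desired behavior.
export function legacyDiscount(totalCents: number, role: string): number {
  // Existing quirks, warts and all: admins always get 2000 off,
  // and the bulk discount silently skips admins entirely.
  if (role === "admin") return totalCents - 2000;
  if (totalCents > 10000) return totalCents - 1000;
  return totalCents;
}

// If an agent's refactor changes any of these outcomes, the suite
// fails and the change gets reviewed instead of silently shipped.
export function characterize(): void {
  console.assert(legacyDiscount(20000, "admin") === 18000, "admin discount");
  console.assert(legacyDiscount(20000, "user") === 19000, "bulk discount");
  console.assert(legacyDiscount(5000, "user") === 5000, "no discount");
}
```

Whether the admin/bulk interaction is a bug is a judgment call for later; the test's only job is to freeze today's behavior so refactoring becomes safe.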
Post 2: "Linting, Static Analysis, and the Pre-Commit Hook That Saved My Sanity"
Adding Pint, Psalm, Prettier, ESLint, and pre-commit hooks: not busywork, but machine-checkable standards the agent can verify its own work against. Why `make lint` matters more than code review when an agent is writing the code.
- The tooling stack: Pint, Psalm, Prettier, ESLint, TypeScript
- Pre-commit hooks as automated code review
- CI as the final gate
- Why agents need checkable standards, not style guides
- The compound effect: each tool narrows the failure space
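The `make` targets referenced throughout the series might look something like this (a sketch only: the tool invocations are typical defaults, and the JS test runner is an assumption, not confirmed by the posts):

```make
.PHONY: lint test test-js

lint:
	vendor/bin/pint --test            # PHP formatting, check-only mode
	vendor/bin/psalm                  # PHP static analysis
	npx prettier --check resources/   # JS/TS formatting
	npx eslint resources/js           # JS/TS linting

test:
	php artisan test                  # backend suite against MySQL

test-js:
	npx vitest run                    # SPA unit tests (runner assumed)
```

The point is that one memorable command encodes the whole standard: the agent runs `make lint` after every change, and the pre-commit hook and CI run the same targets, so "passing" means the same thing everywhere.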
Stage 2: The Refactoring — Making Change Safe
Post 3: "Traits to Services: Refactoring for Testability (and for Agents)"
How extracting six traits into service classes with contracts created clean boundaries the agent could work within. The strategy: plan all six, execute one at a time, keep the app running in production throughout.
- The trait problem: global state, hidden dependencies, untestable
- Contract-first design: interfaces before implementations
- The extraction sequence (Chat, CRM, OCR, Document Conversion, External API, Calculator)
- Each extraction behind enough abstraction to keep production running
- Why clear boundaries help agents more than documentation
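A minimal sketch of the contract-first shape. The project defines PHP interfaces bound in Laravel's service container; this TypeScript version, with a hypothetical OCR service, shows the same boundary:

```typescript
// Contract-first: callers depend on this interface, never on a
// vendor SDK or a trait's hidden global state.
export interface OcrService {
  extractText(documentPath: string): string;
}

// One implementation behind the boundary. Swapping vendors, or
// substituting a fake in tests, never touches the call sites.
export class FakeOcrService implements OcrService {
  constructor(private readonly canned: Record<string, string>) {}
  extractText(documentPath: string): string {
    return this.canned[documentPath] ?? "";
  }
}

// A caller written against the contract, not the implementation.
export function firstWords(ocr: OcrService, path: string): string {
  return ocr.extractText(path).slice(0, 20);
}
```

That boundary is exactly what the agent needs: "implement `OcrService`" is a bounded, verifiable task in a way that "untangle this trait" never is.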
Post 4: "Actions, Policies, and the Art of Obvious Code"
Extracting controller logic into Action classes with Result DTOs, and adding Laravel Policies for authorization. Why making the architecture explicit and boring is the best thing you can do for an AI agent.
- Fat controllers → thin controllers + Actions
- The Action pattern: one `execute()`, one Result DTO
- Laravel Policies for authorization (replacing inline checks)
- Role-scoped query builders
- How this accidentally created the perfect migration bridge (web → API)
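The pattern itself is language-agnostic; here is a TypeScript sketch with a hypothetical `CreateInvoiceAction` (the project's Actions are PHP classes, so names and shapes here are illustrative):

```typescript
// One explicit Result DTO: expected failures are values, not exceptions.
export type Result<T> =
  | { ok: true; value: T }
  | { ok: false; error: string };

export interface CreateInvoiceInput {
  customerId: number;
  amountCents: number;
}

// One Action, one execute(). A thin web controller and a thin API
// controller can both call this and just map the Result to a response,
// which is what makes it a bridge for the Blade-to-SPA migration.
export class CreateInvoiceAction {
  execute(input: CreateInvoiceInput): Result<{ id: number }> {
    if (input.amountCents <= 0) {
      return { ok: false, error: "Amount must be positive." };
    }
    // Persistence elided; a fixed id stands in for the created record.
    return { ok: true, value: { id: 1 } };
  }
}
```

"Obvious and boring" is the feature: an agent asked to add a use case has exactly one shape to copy.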
Stage 3: The Migration — Two Frontends, One Source of Truth
Post 5: "No Big-Bang Rewrites: Running Two Frontends Without Losing Your Mind"
The strategy for migrating from Blade to React without ever stopping feature delivery. The two-frontend architecture, environment gating, and the interim wrapper pattern that lets SPA pages ship to production inside legacy Blade shells.
- Why big-bang rewrites fail (and what to do instead)
- The two-path architecture: Legacy (Blade) + SPA (React)
- Environment gating: SPA only in local/staging/test
- The Interim wrapper pattern: SPA components inside Blade shells
- Feature flags and analytics for gradual rollout
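A sketch of the gating decision (helper names are hypothetical; in the project the check happens server-side when deciding which view to render):

```typescript
// The SPA only renders in environments where it's safe to iterate.
const SPA_ENVIRONMENTS = ["local", "staging", "test"];

export function spaEnabled(appEnv: string): boolean {
  return SPA_ENVIRONMENTS.includes(appEnv);
}

// Per page, the interim wrapper picks a frontend: in gated environments
// it mounts the React component inside the legacy Blade shell; in
// production it falls back to the existing Blade view. Both paths stay
// deployable the whole time, so there is never a big-bang cutover.
export function frontendFor(appEnv: string): "spa" | "legacy" {
  return spaEnabled(appEnv) ? "spa" : "legacy";
}
```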
Post 6: "Trunk-Based Development with Short-Lived Branches"
How trunk-based development, conventional commits, small PRs, and CI/CD create the fast feedback loop that makes AI-assisted development practical. Why the branch should live for hours, not days.
- Trunk-based development: why and how
- Conventional commits as a communication protocol
- The CI pipeline: build → lint → test → deploy
- Forge deployment via webhook
- Small batches, continuous integration, always releasable
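The pipeline stages above could be sketched as a GitHub Actions-style workflow excerpt (illustrative only: job layout, secret name, and commands are assumptions, not the project's actual config):

```yaml
# build → lint → test → deploy, with deploy gated to trunk.
jobs:
  quality:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: composer install && npm ci
      - run: npm run build   # build
      - run: make lint       # Pint, Psalm, Prettier, ESLint
      - run: make test       # backend suite
      - run: make test-js    # SPA suite

  deploy:
    needs: quality
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    steps:
      # Forge pulls main and runs its deploy script when this fires.
      - run: curl -fsS -X POST "${{ secrets.FORGE_DEPLOY_WEBHOOK }}"
```

Because branches live for hours, this pipeline runs many times a day, and every green run on `main` is a release candidate.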
Stage 4: The Harness — From Micromanaging to Managing
Post 7: "In-the-Loop to On-the-Loop: How I Stopped Micromanaging My AI Agent"
The turning point. Moving from approving every line to trusting the guardrails. What "on-the-loop" means in practice, why it requires everything from stages 1–3, and how the CLAUDE.md harness files made it work.
- In-the-loop: reviewing every line, giving real-time direction
- The frustration: micromanaging defeats the purpose
- On-the-loop: setting direction, reviewing output, trusting guardrails
- Why this only works with tests + linting + CI + clear architecture
- The feedback loop: harness → agent output → review → update harness
Post 8: "Building the Agent Harness: Subdirectory CLAUDE.md Files"
The technical deep-dive on the harness system. Why one big instruction file doesn't work, how subdirectory CLAUDE.md files control context loading, and what goes in each one. Includes real examples from the codebase.
- The context window problem: one big file blows up
- Subdirectory `CLAUDE.md` files: lazy-loaded, scoped guidance
- What each harness file covers (Actions, Services, Tests, SPA, etc.)
- The harness table: mapping areas to guidance files
- The feedback protocol: update the harness, reload, re-apply
- How the harness checks its own work (`make lint`, `make test`, `make test-js`)
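Based on the description above, a subdirectory harness file might look roughly like this (contents are illustrative, not copied from the project):

```markdown
# app/Actions/CLAUDE.md — loaded only when working in this directory

- One Action per use case: a single `execute()` returning a Result DTO.
- Expected failures are failure Results, never exceptions.
- Authorization lives in Policies, not inside Actions.
- Write the failing test first, then implement.
- Before committing, verify with `make lint` and `make test`.
```

Because the file sits next to the code it governs, the agent only pays the context cost when it actually touches that area, which is what keeps nine small files workable where one big one wasn't.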
Post 9: "The Curator's Role: Managing a Codebase With an Agent"
The philosophical methodology. Your role shifts from writing every line to curating the repository and the agent. How Modern Software Engineering principles (Dave Farley) map perfectly to AI-assisted development. Why every project's harness is different because you're codifying your judgment. The engineer's role in the age of agents.
- Simplicity wins: nine Markdown files, not a custom AI platform
- Guardrails first — the case for doing this work up front
- You're codifying yourself: the harness is your preferences made explicit
- Every project is different: why generic AI advice falls short
- The engineer's three roles: curator of design, guardrails, and documentation
- Modern Software Engineering: optimize for feedback, work in small batches, empiricism over dogma
- The harness as a living document (it evolves with every review)
- Results: velocity, quality, confidence
Post 10: "Custom Skills: The End-to-End Workflow Made Executable"
How two custom slash commands turned a repeatable workflow into a consistent, end-to-end process — from Jira ticket to merged PR, with TDD and harness feedback at every step. Why automating the ceremony lets you focus on the judgment calls.
- The repetition problem: typing the same instructions every session
- Skills as Markdown files: plain English workflows in `.claude/skills/`
- Two skills: `/implement-jira-card` (from Jira) and `/implement-change` (ad-hoc)
- The eight-phase workflow: scope → requirements → plan → branch → TDD → commit → CI review → refactor
- Eight feedback checkpoints: on-the-loop made concrete and repeatable
- Harness feedback built into every checkpoint
- TDD as a structural phase, not a preference
- Separating ceremony from judgment: automate the sequence, keep the decisions
- Structure scales, discipline doesn't
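As a sketch, the eight-phase workflow might read something like this inside a skill file (an illustrative paraphrase; the project's actual `.claude/skills/` files are not reproduced here):

```markdown
# Illustrative skill excerpt — phase wording is paraphrased

1. Scope: restate the ticket and list files likely to change. **Stop for approval.**
2. Requirements: turn the ticket into acceptance criteria. **Stop for approval.**
3. Plan: propose implementation steps. **Stop for approval.**
4. Branch: create a short-lived branch with a conventional name.
5. TDD: per criterion, write a failing test, then make it pass.
6. Commit: conventional commits, one logical change each.
7. CI review: open the PR, watch the pipeline, fix failures.
8. Refactor: tidy with tests green; fold lessons back into the harness.
```

The checkpoints are where on-the-loop supervision actually happens; everything between them is ceremony the skill can run unattended.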
Gist Strategy
Each post links to 1–3 GitHub gists showing real code from the project:
- Harness files (`CLAUDE.md` examples)
- Action class + Result DTO pattern
- Service contract + implementation
- Test patterns (`UserFactory`, API tests)
- CI workflow excerpts
- Interim wrapper component
- Pre-commit hook configuration
- Makefile targets