Constanza Diaz

Posted on Jun 1

AI Pair Programming Isn't Autopilot: Scaffolding HandyFEM and Catching What the AI Threw Away

#webdev #specsdriven #ai #claudecode

The agent writes the code. You're still the engineer.

I'm building HandyFEM with Claude Code as my pair. It's fast — sometimes startlingly so. But the way I work with it is deliberate: I treat everything it produces the way I'd treat a pull request from a capable junior developer. I read it. I question it. I decide what stays.

This post is a concrete example of why that habit matters.

The task: scaffolding the project

Before writing features, you scaffold a project — generate its skeleton: folder structure, config files, a base page that runs. I had the agent set up the foundation for HandyFEM:

Next.js with TypeScript and Tailwind
shadcn/ui — component code lives in your repo, so you own it
Design tokens wired into the theme — exact color palette from my specs

Because my project folder already had docs, Git, and a .env.local with a real secret, the agent did the smart thing: generated the app in a temporary folder and integrated carefully, without clobbering my existing files.

The catch

During the integration, the agent mentioned — almost in passing — that the Next.js generator had created its own CLAUDE.md file, and that it had discarded it so as not to overwrite mine.

On the surface: correct behavior. But it raised a question I didn't want to skip. Did that discarded file contain anything useful?

So I asked. We went back and looked. The generated CLAUDE.md pointed to a second file — AGENTS.md — and that one held something genuinely valuable:

# This is NOT the Next.js you know

This version has breaking changes — APIs, conventions, and file structure may
all differ from your training data. Read the relevant guide in
node_modules/next/dist/docs/ before writing any code.

Real. The framework version I'm using is newer than most AI models were trained on. Losing this note meant the agent might later write code using outdated patterns — confidently, and wrongly.

We rescued it into my project instructions. One small note, but it changes the quality of every future line of framework code.

Why this matters

The agent didn't do anything wrong. Discarding a file to protect mine was sensible. But the side effect — dropping context that mattered — was easy to miss, buried in a one-line aside during a much bigger operation.

That's the pattern to internalize. AI agents make a high volume of fast, plausible decisions. Most are good. But "plausible" isn't "reviewed." The skill isn't prompting — it's reading the output like a reviewer: what did it change, what did it remove, is each of those what I actually want?

I don't review because the tool is bad. I review because it's good enough that I'd otherwise stop paying attention. That's the trap.

Honest reflections on the workflow

What's working:

Security before features. Pre-commit hook, .gitignore, environment variables — all set up before scaffolding. For a product whose whole premise is trust, that ordering is non-negotiable.
Detailed specs as source of truth. The agent had real screens, components, and a color palette to build against — not an invitation to invent one.
CLAUDE.md with project conventions. Mobile-first, accessibility minimums, never hardcode secrets. The agent defaults to my standards instead of generic ones.

What I'd refine:

Decide manual vs. automated before committing to a path. Earlier I had the agent script a one-time setup task. It became a long debugging session. Doing it by hand would have been faster. Automation pays off when you repeat something — for a true one-off, manual is often smarter.
Take inventory before opening a new thread. Some questions I treated as open were already answered in my own specs. Five minutes of review saves re-litigating settled decisions.

The meta-lesson: with an AI agent, the bottleneck shifts. It's no longer writing the code — it's deciding well and reviewing carefully. The typing got cheap. The judgment got more valuable, not less.

Takeaway

An AI agent is a genuine force multiplier — but only for someone who stays in the driver's seat. It will move faster than you, make mostly-good calls, and occasionally drop something that matters in a sentence you could easily skim past.

So don't skim. Read the diff. Ask "what did you remove, and why?"

The agent writes the code. You're still the engineer.

#HandyFEMApp #BuildingInPublic #AI #ClaudeCode #WebDev

Top comments (2)

Echo • Jun 2

This resonates. The CLAUDE.md / AGENTS.md rescue is the kind of "plausible" drop that's easy to miss in a long scaffold run — exactly because the agent's framing was correct in isolation.

One habit I've added on top: every "review the output like a PR" session also writes a one-line note into my project log about which scaffold step produced the dropped context. Not a copy of the file, just a pointer like "AGENTS.md notes for Next.js 16 — see rescued file". Two months later, when a different model suggests an outdated API pattern, I have something to grep, and the next session inherits the correction.

Pair-programming framing fits better than autopilot framing, agreed. The bit I keep coming back to is that a junior dev's PR leaves traces in the commit log; an agent's "PR" only leaves traces if you write them yourself. That's the part most autopilot comparisons skip.

Constanza Diaz • Jun 8

Oh that's a good point, I hadn't thought about leaving notes in the commit log. What I do is just write new rules in CLAUDE.md every time the agent does something unexpected — but your idea of keeping a pointer log sounds really useful on top of that. Thanks for the contribution!