$0.08 and 3,500 Lines: The Complete Failure of a Deterministic Agent Harness

#ai #agents #engineering #productivity

I have a theory about why agent suggestions land so heavy.

It's not that the suggestions are good. Half of them are terrible — wrong approach, wrong abstraction, wrong thing to build entirely. But they have impact. A colleague says "maybe we should add a reconciliation layer" and you nod and continue. An agent says "a reconciliation layer with idempotency keys and a ledger table" and you're already opening a new repo.

Same suggestion. Different weight.

The confidence gap

Agents don't hedge. They don't trail off. They don't say "I'm not sure about this but." They say "here's what you should build" with the same certainty they use to say "the sky is blue."

A person who sounds that confident is either an expert or a fool. We know which one we're listening to. An agent that sounds that confident could be either — and that ambiguity works in its favour. The articulate wrong answer beats the hesitant right one every time.

Speed as a trap

The agent gave you a file structure, three migration scripts, and a Dockerfile in one response. You can push a PR in ten minutes. The cost of trying is zero.

Except it's not zero. It's attention. It's context. It's the next six weeks of your life explaining to people why this exists. It's the maintenance load of something nobody asked for that now has tests, docs, and a deployment pipeline because it was easier to ship than to evaluate.

The cost of building is never zero. But the agent made the first 80% free, and that's the part your brain sees.

Responsibility diffusion

When you build from your own idea and it fails, that's on you. The whole thing. The meeting where you pitch it, the architecture you chose, the reason it existed.

When you build from the agent's idea and it fails, the failure feels different. You were just executing. The agent suggested it. You followed the recommendation. It's not your fault the recommendation was bad.

This is a lie your brain tells itself to avoid the sting. But it's a convenient lie, and we take it.

Why it keeps happening

Because there's no feedback loop.

The agent doesn't know you deleted its code. It doesn't know that thing you built has zero users. It doesn't feel the maintenance. Next session, when the context is fresh, it will suggest the same pattern again — because the pattern is correct in theory. The theory never experiences reality.

So every fresh session is a new pitch. The same idea, the same confident delivery, the same "this looks right" feeling. The agent doesn't learn. The veto gate has to be you — and we're bad at vetoing confident-sounding things.

What I do now

I started asking questions. Who needs this? Has anyone asked for it? What if I do nothing?

It's not a perfect filter. But it catches the worst ones — the ones where the agent was so articulate that I almost forgot to check whether I wanted the thing.

DEV Community