Ashley Childress

Posted on Jun 30

The Spec Was Never the Good Part

#ai #discuss #programming #productivity

Design as dialogue over batch thinking

🦄 It's been a while since I've written anything here—mostly because a topic will cross my mind and bore me before I ever finish it. So this one I handed to Claude as a test to see whether the workflow that carries my code-planning holds up in the writing phase too. Spoiler: if you're reading this, that means it already worked.

We hired a thought partner and handed it a punch list 🪧

Here's the workflow we all agreed was the grown-up one:

Write a spec.
Hand it to the agent.
Let it build against the doc.
Review the diff when it's done.

Spec Kit scaffolds the whole thing for you, and Kiro builds its entire flow around it—type a prompt, get a spec.md, get a plan, get code. It's clean and traceable, it looks like the opposite of vibe-coding, and that's exactly why it's so easy to sell.

The problem isn't specs. The problem is batch thinking cosplaying as design.

That kind of thinking treats the model like a vending machine: you punch in a spec, a feature drops out the bottom, and the only thing left to wonder is whether you pressed the right buttons. But that's not where the model is actually good. It's good earlier, back when the idea is still fuzzy and half-formed, and you hand it over so AI can start pulling the idea apart with you. It points out the case you didn't think of, or tells you which of your three plans is going to bite you later.

We took a tool that's genuinely good at reasoning and put it to work typing.

The step we skipped 🪜

There's a step between "I have a problem" and "build this feature," and it's a conversation. A conversation in real time with something that pushes back against your original thought. That's where you actually find the problems: the null input, the "wait, what happens if two of these fire at once" that you rarely think to ask when it's still cheap enough to fix. Skip that step and you've skipped the part that mattered most.

A notification setting sounds simple until quiet hours, account-level defaults, per-project overrides, and "send me critical alerts anyway" all disagree—and the generated spec quietly crowns one of them king.

That's exactly what generating a spec does when you treat it as the thinking instead of the record of thought. It writes everything down in one shot—before you've hit any of the hard parts—and then everyone treats the thinking as done because there's a shiny new file that says it is. Planning in chat forces you to argue. A generated spec just takes dictation.

And sure—a human arguing with you is better. But an AI that pushes back beats a generated document that just nods happily in Markdown.

Where the skip shows up 🪞

A model doesn't decide fifty points independently. It commits to a path up front and then writes everything after it to match. So by the time that spec doc gets to you for review, one wrong assumption near the top has already worked its way through everything under it, and you're not really reviewing fifty decisions—you're reviewing one decision, fifty times over.

In chat, you hit the forks one at a time, in real time. The model picks a direction early, you watch it head somewhere dumb, and you redirect right there—before the next ten decisions get built on top of the wrong one. It's the same call you'd eventually catch in a review, except now it's the only thing in front of you instead of point three of fifty you skimmed past.

A batch spec bakes the error in. A conversation corrects it at the branch.

It won't fight you unless you make it 🥊

I almost left this part out, because it's already baked into every one of my chats—so much a default for me that it didn't occur to me anyone needed it spelled out. But that instinct isn't free. Left to its defaults, every model is a yes-machine. It'll happily validate your worst idea and build a beautiful spec around it, because agreeing is easier than arguing. And you don't get an opponent by accident. You get one on purpose.

I figured my setup already had it covered. My CLAUDE.md has a line I've been quietly proud of:

Push back when wrong. Collaborator, not yes-machine.

Then I actually went back and read it, and every rule in that file was reactive—push back when wrong, be loud when you know you're right. All of it only fires once there's already a wrong answer sitting there to argue against. Nothing in there told the model to fight me while the design was still up in the air, while the plan hadn't failed yet because it hadn't even been built. The habit lived in how I actually work, not in anything I'd written down—I'd been taking credit for a default I never put in the file.

So I wrote it—one bounded adversarial rule:

## Adversarial Thinking

- Role during planning: opponent, not stenographer. Challenge undecided designs before implementation, not after failure.
- Raise one objection at a time. Select the highest-risk assumption—the one that invalidates the most if wrong. Never enumerate objections.
- TRIGGER on: edge cases, irreversible or high-cost choices, hidden coupling, ambiguous or underspecified requirements. SKIP: trivia, low-cost reversible choices, settled matters of taste.
- After raising an objection, wait for the user's response before raising the next. Treat each answer as input that updates the plan.
- STOP when the user makes a decision and names the tradeoff. Do not reopen a settled decision.
- EXCEPTION to STOP: if a new decision reverts or contradicts an earlier settled one, flag the conflict explicitly before continuing.
- Do not produce objections to signal rigor. Do not bikeshed. Do not default to disagreement.

The bound is the part that keeps it usable, because without a stop condition "adversarial" just turns into "exhausting," and a model that re-litigates every settled call is about as useless as one that agrees with everything. Same problem, worse mood.

Think first, ship anyway 🪶

Specs are contracts. They just aren't always the right place to do the thinking.

This is a single-player argument. One developer who owns the design, holding the whole problem in their own head and a single conversation. It's not a twelve-service migration or a four-team handoff, where the doc exists because no single brain can hold the whole thing and people need one place to hash it out.

For the work that does fit inside a conversation, though, the fix isn't to ban specs or write more of them—it's to put the argument back in front:

Use the model for the part it's actually good at.
Fight the design out before you name the feature.
Let the spec fall out as a byproduct, not as a stand-in for the thinking it was supposed to capture.

The discipline a generated spec.md is supposed to buy you? You get it for free just by refusing to let the model agree with you too early.

Which leaves the one question I don't actually have an answer for, so I'll hand it to you instead of faking one: how does this scale past a single person? At team size the spec isn't just a build target—it's the thing everybody who wasn't in the chat still has to agree on. I know how to make the model fight me. I haven't figured out how to make a conversation do the job of a contract. If you've cracked that part, I want to hear about it.

Because the good part was never in a tidy document. It was the conversation we had along the way.

🛡️ Argued Into Existence

This post got pressure-tested by the exact setup it argues for—an AI I told, in writing, to stop agreeing with me, and then watched actually do it. It caught two weak points, argued over the shape, and then had the nerve to help write the disclaimer. Rude, but useful. Most of the words are AI, but the opinions are mine.

Top comments (14)

meow.hair • Jul 3

Great article, Ashley! I really appreciate your perspective on using AI as a "thought partner" rather than a vending machine. It’s a crucial distinction that many overlook. Don't let the noise distract you; your insights on system design and AI workflows are highly valuable. Keep up the excellent work!
🍃🧊🤍🌊

Daniel Nwaneri • Jun 30

Ashley, "you're not reviewing 50 decisions, you're reviewing one decision, fifty times over" is exactly the failure mode spec-writer was built to interrupt. the skill forces the adversarial conversation into the spec generation step itself . it won't produce a task breakdown until ambiguity gets resolved out loud first.

your scaling question is the real one tho. single-player adversarial thinking works because you're the only person who has to live with the tradeoff. at team size, the conversation has to produce something a person who wasn't in the room can audit later not just a clean spec but a record of which objections got raised and why they got resolved the way they did. the spec becomes less "here's what to build" and more "here's the fight that produced this, in case you need to reopen it."

haven't fully solved it either. but I think the answer is closer to a decision log than a better spec format.

Ashley Childress • Jun 30

Honestly, a decision log is a much closer right answer—it's basically the ADRs I've been using some. A product spec was never meant to be the only document you work from, so it was never going to fully solve the AI problem either. As long as the argument keeps forcing the kind of clear thinking AI won't do on its own, then you're already ahead.

Daniel Nwaneri • Jun 30

Ashley, ADRs is the right name for it. didn't think to call it that but that's exactly the shape . a record of the argument, not just the outcome.

Mykola Kondratiuk • Jul 4

disagree - the spec is the good part. that’s where you find out what you actually want. hand it off and you skip the real thinking. the agent gets a list, you get an output you didn’t earn.

Ashley Childress • Jul 4

hand it off and you skip the real thinking.
You just summed up this post in one sentence 😆

Mykola Kondratiuk • Jul 4

haha appreciate it - that line came out of losing a sprint to a spec that looked done but wasn't. sometimes the lesson just writes itself.

mote • Jul 5

The "batch spec bakes the error in, a conversation corrects it at the branch" line is the real insight here. It maps directly to how version control works â you can write a perfect commit message after the fact, but the commit history already records the actual decision tree. The spec-as-dictation problem is that you're generating the commit message before the actual decisions have been made.

This is also why agent planning tools that output a fully-specified spec before any code is written tend to produce overconfident documents. The model has committed to a path, and now everything downstream bends to justify it. That's not a failure of the model â it's a structural problem with treating generation as planning.

The CLAUDE.md reactive-vs-bounded rule distinction is sharp. Reactive rules ("push back when wrong") only fire after a mistake exists. Bounded adversarial rules ("fight me while the design is still up in the air") operate before the cost of being wrong is baked in. That difference is the difference between a collaborator and a very articulate rubber duck.

One thing I keep running into: how do you give an agent enough adversarial structure to push back meaningfully without making it so adversarial that every conversation becomes a negotiation? There's a calibration problem there that doesn't seem to have a clean answer yet.

Ashley Childress • Jul 5

I agree completely with your calibration point and I run into that same thing, too. I think it has more to do with the agents' system instructions that force it to be a "helpful assistant". Overwriting the big AI agents default coding direction is more difficult than if those ideas went straight to the LLM. The problem is that it's so good at everything else it's hard to justify building one just for planning. At least that's a project I have not seriously considered yet. I keep hoping providers will drop that line from the list instead. 😆

Theo Valmis • Jul 2

Provocative title, and there's a real point under it. The spec was never the deliverable, it was a forcing function to make you think before you built. What changed is that agents can now execute a vague spec instantly, which strips out the friction that used to expose the holes. So the spec matters more, not less, but for a different reason: it's now the main place you get to be wrong cheaply, before the agent turns the ambiguity into a thousand lines you have to unwind. The good part was never the document, it was being forced to decide.

Sloan the DEV Moderator • Jun 30

Hey, this article appears to have been generated with the assistance of ChatGPT or possibly some other AI tool.

We allow our community members to use AI assistance when writing articles as long as they abide by our guidelines. Please review the guidelines and edit your post to add a disclaimer.

Failure to follow these guidelines could result in DEV admin lowering the score of your post, making it less visible to the rest of the community. Or, if upon review we find this post to be particularly harmful, we may decide to unpublish it completely.

We hope you understand and take care to follow our guidelines going forward!

Ashley Childress • Jun 30

Sloan is missing the attribution 😆 AI generated that, too!

Edu Peralta • Jul 4

This matches what I've seen building a tool that runs coding agents side by side. The design conversation is where the good decisions get made, but I've watched an agent take a chat that landed on validate at the boundary, trust internal calls, and still add a defensive check three layers deep because the token window forgot the earlier argument. The spec at least gives you a paper trail to check the diff against, even if it wasn't where the thinking happened. I don't think that makes spec first right, just that the conversation needs to survive contact with the actual generated code, not just end when everyone nods.

Ashley Childress • Jul 5

I agree. I'm not saying the spec isn't important and it should always be the document that comes out of the chat. What I'm trying to say is don't skip the back-and-forth design part of the chat when you do it. Because the agent will absolutely take an idea and run with it to output a document that satisfies the ask using the path of least resistance. The more you interrupt that path, the better the spec becomes.

View full discussion (14 comments)