I adopted Spec-Driven Development expecting the AI to stop making mistakes. It wasn't that simple.
Polished specs, configured skills, a smooth process. Even so, every time I asked the AI to "build a screen," the result came back muddled: business rules ignored, inconsistencies everywhere.
What bothered me was that I couldn't blame the tool or the spec. The method was right; it was just operating at the wrong granularity.
Because "build a screen" isn't a single task. Even when I broke it into multiple tasks, with planning and execution, running the full SDD cycle, it still carried three coupled decisions:
- how the interface should look,
- which rules the client's domain allows,
- and how the API should expose the data.
Three distinct decisions, scattered across tasks but never truly separated. And that's the part that took me a while to see: SDD organized my work, but it didn't decouple my decisions. The model behaved the way it always had, resolving all three at once, with full confidence. It looked correct until the first review, when the inconsistencies, the ignored business rules, and the rework showed up.
The turning point
Those three decisions aren't one. They're independent layers, and they should be decoupled. The presentation layer is one. The domain model is another. The backend is another. You can't resolve all three in the same step.
I think of it like assembling furniture: nobody sands, assembles, and paints at the same time. Each stage has a single objective, and that's what makes it executable.
Out of this came an approach that has served me remarkably well: layered SDD. Each layer gets its own spec with a single objective, one that deliberately ignores everything outside its scope.
One important clarification before the workflow: modeling the domain is not building the backend. Domain modeling means describing the client's entities, business rules, and relationships. It's an exercise in understanding, independent of API, persistence, or infrastructure. The backend comes much later.
The workflow I use today, layer by layer
1. I design the interface. This is the most counterintuitive step, and also the most powerful one. I build a UI flow with sample data, meaning hardcoded placeholder values living in the components themselves, just enough for the interface to render and the UX flow to be visible. No contract, no data layer, nothing structured behind it. This is deliberate: at this step there is no data model yet. I design the best possible interface, as if the current system didn't exist. And yes, I do this knowing I'll refactor later. That's fine. Here I want to capture the ceiling of the UX, not immediate feasibility. And it doesn't have to be the complete frontend in the first prompt: an initial UI is enough to start.
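To make this step concrete, here is a minimal, framework-free sketch of what "hardcoded sample data living in the components" means. The invoice-approval screen, its fields, and the sample values are all my own illustrative assumptions, not anything from the article's project; the point is only that nothing behind the render function is structured.

```typescript
// Step 1 sketch: a UI flow backed by nothing but placeholder values.
// "Invoice approval" and every field below are hypothetical examples.

// Sample data lives directly in the component, just enough to render.
const SAMPLE_INVOICES = [
  { id: "INV-001", customer: "Acme Corp", total: "$1,200.00", status: "Pending" },
  { id: "INV-002", customer: "Globex", total: "$845.50", status: "Approved" },
];

// A framework-free "component": it renders the UX flow as text so the
// shape of the screen can be reviewed before any contract or backend exists.
function renderInvoiceApprovalScreen(): string {
  const rows = SAMPLE_INVOICES
    .map(inv => `${inv.id} | ${inv.customer} | ${inv.total} | ${inv.status}`)
    .join("\n");
  return `Invoices awaiting review\n${rows}\n[Approve] [Reject] [Request changes]`;
}

console.log(renderInvoiceApprovalScreen());
```

In a real project this would be a React or Vue component, but the principle is the same: the placeholder values are disposable, and the only deliverable is the visible flow.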
2. I model the data. I do the opposite: I set the interface aside. I take the data the UI consumes and model it on its own, the entities, business rules, and relationships, fully decoupled from the presentation, without ever touching the UI to do it. I don't think about which field appears where, only about the consistency of the data model itself.
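Continuing the same hypothetical invoice example, this is what "modeling the data on its own" might look like: entities, relationships, and a business rule expressed with no reference to any screen. The statuses and the rule are illustrative assumptions.

```typescript
// Step 2 sketch: the domain modeled in isolation, ignoring the UI entirely.

type InvoiceStatus = "draft" | "pending" | "approved" | "rejected";

interface Customer {
  id: string;
  name: string;
}

interface Invoice {
  id: string;
  customerId: string;   // a relationship, not a display field
  totalCents: number;   // money as integer cents, not a formatted string
  status: InvoiceStatus;
}

// A business rule stated against the model, not against any screen:
// only pending invoices can be approved.
function canApprove(invoice: Invoice): boolean {
  return invoice.status === "pending";
}
```

Notice the deliberate mismatches with the UI sketch: the model holds `customerId` and `totalCents`, while the screen showed a customer name and a formatted total. Those mismatches are exactly what the next step exists to surface.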
3. I build the comparison plan. With both layers ready, I put one against the other and map the gaps in both directions: where the UI requires something the data model doesn't support, and where the model exposes something the UI doesn't consume. This is where the real trade-offs become explicit, named, and documented, instead of ambushing me during implementation.
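The comparison itself can be as simple as two field lists diffed in both directions. The field names here come from the hypothetical invoice example above, not from any real project; in practice this step is usually a document, but the mechanics are the same.

```typescript
// Step 3 sketch: map the gaps between the two layers, in both directions.

const uiFields = ["id", "customer", "total", "status"];           // what the UI consumes
const modelFields = ["id", "customerId", "totalCents", "status"]; // what the model exposes

// UI requires something the model doesn't support:
const uiNeedsButModelLacks = uiFields.filter(f => !modelFields.includes(f));
// Model exposes something the UI never consumes:
const modelHasButUiIgnores = modelFields.filter(f => !uiFields.includes(f));

console.log(uiNeedsButModelLacks); // ["customer", "total"]
console.log(modelHasButUiIgnores); // ["customerId", "totalCents"]
```

Each entry in those two lists is a named trade-off: does the API join in the customer name, or does the UI fetch it separately? Does formatting live in the backend or the frontend? Writing the gap down forces the decision now instead of mid-implementation.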
4. I refactor the UI against the real data model. I rewrite the interface based on what the data model actually supports. In the traditional workflow, this adjustment becomes that end-of-project "cleanup." Here it's a planned step, a design decision, not rework.
5. I build the mock layer. Now I implement mocks, and here the term is precise: mocks that reproduce the contract and behavior of the real API. Note that this is only possible after step 4, because a mock presupposes that a contract exists to be imitated, and that contract only came into being after the modeling and the comparison. This layer lets me validate complete flows without depending on a backend, iterating, including iterating with AI, in short cycles.
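A sketch of what "reproduce the contract and behavior" means, still using the hypothetical invoice contract. The key property is that the mock imitates behavior, not just shapes: approving an invoice actually moves it out of the pending list, so a complete flow can be exercised.

```typescript
// Step 5 sketch: a mock that implements the same contract the real API will.
// The InvoiceApi interface stands in for the contract produced by steps 2-4.

interface InvoiceDto {
  id: string;
  customerName: string;
  totalCents: number;
  status: "pending" | "approved" | "rejected";
}

interface InvoiceApi {
  listPending(): Promise<InvoiceDto[]>;
  approve(id: string): Promise<InvoiceDto>;
}

class MockInvoiceApi implements InvoiceApi {
  private invoices: InvoiceDto[] = [
    { id: "INV-001", customerName: "Acme Corp", totalCents: 120000, status: "pending" },
  ];

  async listPending(): Promise<InvoiceDto[]> {
    return this.invoices.filter(i => i.status === "pending");
  }

  // Behavior, not just shape: state changes persist across calls.
  async approve(id: string): Promise<InvoiceDto> {
    const inv = this.invoices.find(i => i.id === id);
    if (!inv) throw new Error(`unknown invoice ${id}`);
    inv.status = "approved";
    return inv;
  }
}
```

Because the mock honors the contract, every flow validated against it (including in short AI iteration cycles) stays valid when the real backend arrives.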
6. I build the real backend. Only now does the definitive implementation come in, backed by an already validated data model and an already stabilized interface. No discovering requirements halfway through.
7. I do the E2E integration. I wire the layers together and validate the flow end to end.
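The last two steps can be sketched together: because the real client implements the same contract as the mock, the screen code never changes, and integration is reduced to swapping which implementation is injected. The `RealInvoiceApi` class and its endpoint are hypothetical placeholders.

```typescript
// Steps 6-7 sketch: the real client shares the mock's contract, so E2E
// integration only swaps the binding, never the screen code.

interface InvoiceApi {
  listPending(): Promise<{ id: string; status: string }[]>;
}

class MockInvoiceApi implements InvoiceApi {
  async listPending() {
    return [{ id: "INV-001", status: "pending" }];
  }
}

class RealInvoiceApi implements InvoiceApi {
  constructor(private baseUrl: string) {}
  async listPending(): Promise<{ id: string; status: string }[]> {
    // In the real client this would call GET {baseUrl}/invoices?status=pending.
    // Left unimplemented here so the sketch runs without a server.
    throw new Error(`not wired in this sketch (${this.baseUrl})`);
  }
}

// The screen depends only on the contract; the environment decides
// which implementation is injected.
async function loadScreen(api: InvoiceApi): Promise<number> {
  const pending = await api.listPending();
  return pending.length;
}
```

This is the payoff of step 5's discipline: the E2E pass validates wiring and data, not a rewrite of the UI.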
The whole process is iterative: an initial UI, an initial data model, a UI v2, a data model v2, and so on. Each pass through the layers is tighter than the last.
Why this works so well with AI
In my experience, the reason is simple: AI models perform much better with small, well-defined problems. When I coupled UX, business rules, and architecture into the same scope, too much ambiguity was left over, and the model filled that ambiguity with assumptions. Decomposing into layers cut that noise drastically: each spec became narrow enough for the model to get it right.
This may sound counterintuitive. The common wisdom is "the more specific context, the better," so why not feed the model everything at once? Because that level of specificity can only come after you have something concrete in hand.
When you're building a feature, you need to see the bottlenecks, and bottlenecks don't reveal themselves up front. They surface when you approach the feature from different angles, build out different flows, and confront them against each other. The layers are exactly that confrontation: each one produces something real, and putting them side by side is what exposes the constraints you would otherwise discover too late.
It seems slower, but it isn't. The method doesn't add new steps, it just moves decisions earlier, decisions that used to come back as rework at the end of the cycle. The "cleanup" doesn't disappear, it becomes part of the plan.
I've been using this approach even with teams at an S&P 500 company, and the result has been consistent: more predictability, less chaotic refactoring, and less friction than with traditional Spec-Driven Development.
I don't think it's a silver bullet. But for me, it changed the way I work with AI in software development for good.
How about you? How are you structuring AI-assisted development today?