Hichoi-Dev

Why Spec-Driven Development Fails, and a Better Way to Structure AI Development

Spec-Driven Development (SDD) promised a revolution: write detailed specifications, hand them to AI, and watch working software appear. It's a compelling vision. But after months of real-world usage, the community's verdict is increasingly clear — SDD's core assumption is flawed.

That doesn't mean it was a waste. SDD opened an important conversation about how we should structure AI-assisted development. But the answer isn't what SDD proposed.

What SDD Gets Right

Let's give credit where it's due.

SDD recognized a real problem: "prompt and pray" development doesn't scale. When you're building anything beyond a toy project, you need structure. You need context. You need a way to communicate intent to an AI agent that goes beyond "build me an auth system." In short, you need a workflow for the entire development process.

GitHub's Spec Kit formalized this insight into a practical workflow: Specify → Plan → Tasks → Implement. The idea of giving AI agents structured context rather than ad-hoc prompts was genuinely valuable. The GitHub blog post introducing Spec Kit sparked important discussions across the industry.

SDD's contribution to AI-assisted development is real — it established that structure matters.

The Core Problem: Specs Are Non-Deterministic

Here's where SDD breaks down fundamentally.

SDD treats specifications as "the single source of truth" — the authoritative document from which AI generates code. But this assumption collides with a basic reality of LLMs: the mapping from spec to code is non-deterministic.

The same specification, given to the same model on different days, produces different implementations. Different architectural choices. Different data structures. Different error handling patterns. As Martin Fowler's team observed, "Because of the non-deterministic nature of this technology, there will always remain a very non-negligible probability that it does things that we don't want."

This means a specification cannot be a source of truth in the way source code can. Source code is deterministic — compile it twice, you get the same binary. A spec is a suggestion. A well-structured, detailed suggestion, but a suggestion nonetheless.
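One way to see why the same spec can yield different designs is to look at sampling in miniature. The sketch below is a toy, not a real model: the candidate designs and their scores are invented for illustration, and `pick_design` and `pick_greedy` are hypothetical helpers. It only shows that when several options score almost equally, sampling with a non-zero temperature can land on any of them.

```python
import math
import random

# Hypothetical scores a model might assign to three plausible designs
# for the same spec line. The names and numbers are invented.
CANDIDATES = {"session cookies": 2.0, "JWT": 1.9, "OAuth proxy": 1.8}


def softmax(scores, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def pick_design(seed, temperature=1.0):
    """Sample one design, the way a decoder with temperature > 0 samples."""
    rng = random.Random(seed)
    names = list(CANDIDATES)
    weights = softmax(list(CANDIDATES.values()), temperature)
    return rng.choices(names, weights=weights, k=1)[0]


def pick_greedy():
    """Always take the top-scoring design, with no randomness involved."""
    return max(CANDIDATES, key=CANDIDATES.get)


# Same "spec", fifty independent runs.
sampled = {pick_design(seed) for seed in range(50)}
greedy = {pick_greedy() for _ in range(50)}

print("designs seen when sampling:", sorted(sampled))
print("designs seen when greedy:  ", sorted(greedy))
```

The greedy path always returns one design in this toy, while the sampled path spreads across several. Real systems add further sources of variation, so this is an illustration of the mechanism rather than a model of any particular LLM.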

The Waterfall Redux

If the spec-to-code mapping is non-deterministic, what does SDD actually become in practice?

It becomes Waterfall.

Think about it:

  1. Big Design Up Front: Write exhaustive specifications before any code exists
  2. Sequential phases: Spec → Plan → Tasks → Implementation, each phase completing before the next begins
  3. Assumption that planning eliminates uncertainty: If we document everything thoroughly enough, execution becomes mechanical

We learned decades ago that this doesn't work. The Agile Manifesto wasn't a theoretical exercise — it was a response to real project failures caused by exactly this approach. Software development is fundamentally a process of discovery through building. Requirements change. Assumptions prove wrong. Edge cases emerge only during implementation.

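Colin Eberhardt's hands-on test of Spec Kit put numbers to this problem: 33 minutes and 2,577 lines of markdown to produce 689 lines of code, versus 8 minutes and 1,000 lines using iterative prompting. Generation time alone is about 4x slower (33 / 8 ≈ 4.1); once the time needed to read and review 2,577 lines of markdown is added, the overall slowdown is closer to 10x, with no improvement in code quality.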

Despite the detailed specification process, which is supposed to prevent exactly this kind of bug, the result still contained a "trivial and clumsy mistake" (a missing variable population). The author's conclusion: SDD is "an interesting concept" but "not a viable process, at least not in its purest form."
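Taking only the figures quoted above, the ratios can be worked out directly. This is plain arithmetic on those numbers; it covers agent run time only, since no review-time figure is given here.

```python
# Figures as quoted in the text above.
sdd_minutes, sdd_markdown_lines, sdd_code_lines = 33, 2577, 689
iterative_minutes, iterative_lines = 8, 1000

# How much longer the spec-driven run took (agent time only).
time_ratio = sdd_minutes / iterative_minutes

# Lines of markdown written (and to be reviewed) per line of code produced.
markdown_per_code_line = sdd_markdown_lines / sdd_code_lines

# Output per minute for each approach.
sdd_code_per_minute = sdd_code_lines / sdd_minutes
iterative_lines_per_minute = iterative_lines / iterative_minutes

print(f"time ratio:             {time_ratio:.1f}x")
print(f"markdown per code line: {markdown_per_code_line:.1f}")
print(f"SDD code lines/min:     {sdd_code_per_minute:.1f}")
print(f"iterative lines/min:    {iterative_lines_per_minute:.1f}")
```

The markdown-to-code ratio is the more telling number: every line of generated code came with several lines of specification that someone has to read.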

Why Specifications Drift

The deeper issue is that specifications and code inevitably diverge.

In SDD, you're supposed to update the spec when reality doesn't match the plan. But in practice:

  • Spec drift: AI makes architectural choices not anticipated by the spec
  • Implicit context: Each implementation iteration accumulates undocumented decisions
  • Reactive documentation: Specs become records of what happened rather than guides for what should happen
  • Double review burden: You review the spec, then review the code, then check they match

As the Marmelab analysis noted, developers end up "spending most of their time reading long Markdown files" rather than solving actual problems. The overhead doesn't pay for itself.

The Real Question

SDD asked the right question: "How should we structure AI-assisted development?"

But it gave the wrong answer: "With exhaustive upfront specifications."

The right answer, I believe, is the same answer the software industry arrived at decades ago: iterative development with accumulated learning.

Instead of treating specifications as the source of truth, treat evolved knowledge as the source of truth. Instead of planning everything upfront, plan one iteration at a time. Instead of assuming AI will perfectly execute a spec, build feedback loops that capture what AI actually does and feed it back into the next iteration.

The pattern that works isn't Waterfall with AI — it's Agile with AI.

A Different Approach

This is what motivated me to build REAP (Recursive Evolutionary Autonomous Pipeline). Rather than treating development as a spec-to-code translation, REAP structures AI-assisted development as an evolutionary process — closer to how experienced developers actually work.

How REAP Works

Development happens in Generations. Each generation carries one focused goal through a 5-stage lifecycle:

Objective → Planning → Implementation → Validation → Completion

This isn't just a linear pipeline. Each stage has gates, and stages can regress — if validation fails, you loop back to implementation with the failure context preserved. This mirrors the real-world "build → test → fix → test again" cycle that SDD's sequential model ignores.
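The gated lifecycle described above can be sketched as a small state machine. This is a minimal illustration, not REAP's implementation: the `Generation` class, its `advance` method, and the rule that any failed gate regresses exactly one stage are assumptions made for the example. Only the validation-to-implementation regression is stated in the text.

```python
from dataclasses import dataclass, field

# Stage names taken from the lifecycle above.
STAGES = ["objective", "planning", "implementation", "validation", "completion"]


@dataclass
class Generation:
    """Hypothetical model of one generation moving through the stages."""
    goal: str
    stage: str = "objective"
    failure_context: list = field(default_factory=list)

    def advance(self, gate_passed, note=""):
        """Move forward when the gate passes; otherwise step back one stage
        and keep the failure note so the next attempt can see it."""
        index = STAGES.index(self.stage)
        if gate_passed:
            if index < len(STAGES) - 1:
                self.stage = STAGES[index + 1]
        else:
            self.failure_context.append((self.stage, note))
            if index > 0:
                self.stage = STAGES[index - 1]
        return self.stage


gen = Generation("Implement user authentication")
history = [
    gen.advance(True),                        # objective -> planning
    gen.advance(True),                        # planning -> implementation
    gen.advance(True),                        # implementation -> validation
    gen.advance(False, "login test fails"),   # validation -> implementation
    gen.advance(True),                        # implementation -> validation
    gen.advance(True),                        # validation -> completion
]
print(history)
print(gen.failure_context)
```

Keeping the failure note on the generation object is the point of the sketch: the retry starts with the reason for the regression rather than from a blank prompt.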

The Genome: Knowledge That Evolves

Where SDD puts specifications at the center, REAP puts a Genome at the center — a living record stored in .reap/genome/:

  • principles.md — Architecture decisions with rationale (ADR-style)
  • conventions.md — Development rules and enforced standards
  • constraints.md — Technical choices and validation commands
  • domain/ — Business rules that can't be derived from code

The Genome isn't written once and forgotten. It evolves across generations. When you discover something during implementation that contradicts the Genome, you log it as a backlog item. At the end of each generation, discoveries are reviewed and the Genome is updated. Over time, the Genome becomes an increasingly accurate map of your project — not a spec that drifts from reality.
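The discovery, backlog, and review cycle can be pictured with a few lines of code. The `Discovery` and `Genome` classes below are hypothetical and exist only to illustrate the flow; REAP's real data model may look quite different. The file names come from the list above.

```python
from dataclasses import dataclass, field


@dataclass
class Discovery:
    """A backlog item logged when implementation contradicts the Genome."""
    genome_file: str          # e.g. "conventions.md"
    note: str
    accepted: bool = False    # decided at the end-of-generation review


@dataclass
class Genome:
    """Hypothetical in-memory stand-in for the files in .reap/genome/."""
    entries: dict = field(default_factory=dict)  # file name -> list of notes

    def evolve(self, backlog):
        """Fold accepted discoveries into the genome and return the rest."""
        remaining = []
        for item in backlog:
            if item.accepted:
                self.entries.setdefault(item.genome_file, []).append(item.note)
            else:
                remaining.append(item)
        return remaining


genome = Genome()
backlog = [
    Discovery("conventions.md", "Use async handlers for I/O", accepted=True),
    Discovery("constraints.md", "Consider switching test runner"),
]
remaining = genome.evolve(backlog)

print(genome.entries)
print([item.note for item in remaining])
```

Items that are not accepted stay in the backlog instead of being dropped, so a discovery can be revisited in a later generation.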

What Makes It Different from SDD

| | SDD | REAP |
|---|---|---|
| Source of truth | Specification document | Evolved Genome + source code |
| Planning scope | Entire project upfront | One generation at a time |
| When plans break | Spec drift → update spec → regenerate | Discovery → backlog → evolve Genome |
| Validation | Spec compliance | Actual tests, type checks, builds |
| Knowledge persistence | Specs (static) | Genome (evolving) + Lineage (history) |
| Context for AI | Spec document | Genome + generation state (auto-injected) |

Context That Persists

Every time you start an AI session in a REAP project, the SessionStart hook automatically injects the Genome, current generation state, and workflow rules into the AI's context. The AI doesn't start from zero — it starts with your project's accumulated knowledge.
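Conceptually, the injection step amounts to reading the Genome files and prepending them to the session. The sketch below is not REAP's hook code: `build_session_context` is a hypothetical function, the output layout is invented, and treating `domain/` as a folder of `.md` files is an assumption. Only the file names under `.reap/genome/` come from the article.

```python
import tempfile
from pathlib import Path

# File names from the Genome layout described earlier.
GENOME_FILES = ["principles.md", "conventions.md", "constraints.md"]


def build_session_context(project_root, generation_state):
    """Concatenate the genome files and the generation state into one preamble."""
    genome_dir = Path(project_root) / ".reap" / "genome"
    sections = []
    for name in GENOME_FILES:
        path = genome_dir / name
        if path.is_file():
            text = path.read_text(encoding="utf-8").strip()
            sections.append(f"## {name}\n{text}")
    # Assumption: domain rules live as Markdown files under domain/.
    domain_dir = genome_dir / "domain"
    if domain_dir.is_dir():
        for path in sorted(domain_dir.glob("*.md")):
            text = path.read_text(encoding="utf-8").strip()
            sections.append(f"## domain/{path.name}\n{text}")
    sections.append(f"## generation\n{generation_state}")
    return "\n\n".join(sections)


# Usage with a throwaway project directory.
with tempfile.TemporaryDirectory() as root:
    genome_dir = Path(root) / ".reap" / "genome"
    (genome_dir / "domain").mkdir(parents=True)
    (genome_dir / "principles.md").write_text("Prefer small modules.\n", encoding="utf-8")
    (genome_dir / "domain" / "billing.md").write_text("Invoices are immutable.\n", encoding="utf-8")
    context = build_session_context(root, "stage: implementation")

print(context)
```

Missing files are skipped rather than treated as errors, which lets a young project start with a partial Genome.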

This solves SDD's "spec drift" problem at the root. The Genome stays in sync with reality because it's updated as part of the development process, not maintained as a separate artifact.

Try It

npm install -g "@c-d-cc/reap"
reap init my-project
# Open Claude Code or OpenCode
> /reap.start
> /reap.evolve "Implement user authentication"

REAP supports multiple AI agents — Claude Code and OpenCode today, with an extensible adapter system for adding more.

GitHub | Documentation | npm


What's your experience with spec-driven development? Have you found structure that works for AI-assisted development? I'd love to hear in the comments.

