<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Tisha Chawla</title>
    <description>The latest articles on DEV Community by Tisha Chawla (@tisha_chawla).</description>
    <link>https://dev.to/tisha_chawla</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3960087%2F461d1521-802a-4dcb-b11e-7f2a7d88b7e6.png</url>
      <title>DEV Community: Tisha Chawla</title>
      <link>https://dev.to/tisha_chawla</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/tisha_chawla"/>
    <language>en</language>
    <item>
      <title>Spec-Driven Development: When Structure Helps and When It Becomes Tax</title>
      <dc:creator>Tisha Chawla</dc:creator>
      <pubDate>Mon, 01 Jun 2026 12:01:02 +0000</pubDate>
      <link>https://dev.to/tisha_chawla/spec-driven-development-when-structure-helps-and-when-it-becomes-tax-1f66</link>
      <guid>https://dev.to/tisha_chawla/spec-driven-development-when-structure-helps-and-when-it-becomes-tax-1f66</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Disclosure:&lt;/strong&gt; I work at Microsoft. The views here are my own, and I've kept the tool comparisons evidence-based.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  1. The Ambiguity Tax
&lt;/h2&gt;

&lt;p&gt;Every vague requirement you hand an AI coding agent gets paid for later: in rework, in drift, in three files that each solved a slightly different version of the problem you never fully stated. I call this the &lt;strong&gt;ambiguity tax&lt;/strong&gt;, the compounding cost of letting an automated loop run on under-specified intent. A human engineer fills gaps with judgment and a quick Slack message; an agent fills them with confident guesses and then builds on those guesses at machine speed. By the time you read the diff, the misunderstanding is load-bearing.&lt;/p&gt;

&lt;p&gt;Spec-driven development (SDD) is, at its core, a strategy for paying this tax up front when it's cheap, instead of at review time when it's expensive. But there's a second tax most SDD advocates never mention, and it's the more interesting one.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. First, Define the Artifact
&lt;/h2&gt;

&lt;p&gt;Before the philosophy, the noun. A &lt;strong&gt;spec&lt;/strong&gt;, in this context, is not a Word document handed down from a product manager. It's a &lt;strong&gt;versioned, reviewable artifact that carries engineering intent into the agent's context&lt;/strong&gt;: a file (or set of files) that lives in the repo, moves through code review, and constrains what the agent generates. That's the whole shift. Intent moves out of ephemeral chat history and into something you can diff, comment on, and roll back.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. What SDD Actually Means
&lt;/h2&gt;

&lt;p&gt;Spec-driven development is the practice of making the spec, not the conversation, the primary unit of engineering work when collaborating with an AI agent. Instead of "prompt, code, fix, prompt again," you get "spec, plan, tasks, code, verify against spec." The artifact is the source of truth and the chat is just how you edit it. This sounds like a pure win. It isn't, which brings us to the tradeoff.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. The Core Tradeoff
&lt;/h2&gt;

&lt;p&gt;SDD lives between two failure modes. Too little structure produces the ambiguity tax: the agent guesses, drifts, and fragments. Too much structure produces what I'll call the &lt;strong&gt;Law of Surplus Structure&lt;/strong&gt;: every extra rule consumes the agent's finite reasoning budget, whether or not it reduces uncertainty. The entire craft of SDD is finding the floor of that curve, enough structure to kill ambiguity, not so much that you're burning tokens to enforce ceremony. Hold that U-shape in your head; everything below is about locating its bottom.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5u75eg0nb1yc9nk7hn8i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5u75eg0nb1yc9nk7hn8i.png" alt="The cost of structure is U-shaped: ambiguity cost falls as you add structure, surplus-structure cost rises, and total cost bottoms out at a sweet spot in between." width="800" height="494"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The picture is the whole argument. Ambiguity cost falls fast as you add the first bits of structure, then flattens. Surplus-structure cost starts near zero and climbs as ceremony piles up. Total cost is their sum, and it bottoms out well before "maximum structure." Everything past that minimum is you paying to make the agent dumber.&lt;/p&gt;
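&lt;p&gt;If you want to play with the shape, here is a toy model. It is a sketch under loud assumptions: I've picked an exponential decay for ambiguity cost and a linear climb for surplus-structure cost, and every constant and function name below is invented for illustration rather than measured.&lt;/p&gt;

```python
import math

# Illustrative assumptions only: the functional forms and constants below are
# made up to reproduce the U-shape, not fitted to any data.
AMBIGUITY_AT_ZERO = 100.0   # cost of running the agent with no structure
DECAY_RATE = 0.8            # how quickly each unit of structure removes ambiguity
SURPLUS_PER_UNIT = 6.0      # cost of carrying one more unit of structure


def ambiguity_cost(structure: float) -> float:
    """Falls fast at first, then flattens."""
    return AMBIGUITY_AT_ZERO * math.exp(-DECAY_RATE * structure)


def surplus_cost(structure: float) -> float:
    """Starts at zero and climbs as ceremony piles up."""
    return SURPLUS_PER_UNIT * structure


def total_cost(structure: float) -> float:
    """Sum of the two curves; this is the U-shaped line."""
    return ambiguity_cost(structure) + surplus_cost(structure)


def sweet_spot(max_structure: float = 10.0, steps: int = 1000) -> float:
    """Grid-search the structure level with the lowest total cost."""
    candidates = [max_structure * i / steps for i in range(steps + 1)]
    return min(candidates, key=total_cost)


if __name__ == "__main__":
    best = sweet_spot()
    print(f"sweet spot at structure={best:.2f}, total cost={total_cost(best):.1f}")
```

&lt;p&gt;With these made-up constants the minimum lands well inside the range, not at either end. Change the constants and the sweet spot moves, which is the point: the bottom of the curve belongs to the change in front of you, not to the methodology.&lt;/p&gt;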




&lt;h2&gt;
  
  
  5. The Taxonomy: Three Levels of SDD
&lt;/h2&gt;

&lt;p&gt;Birgitta Böckeler's framing is the cleanest I've found: SDD isn't one thing, it's three levels of commitment.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Level&lt;/th&gt;
&lt;th&gt;What persists&lt;/th&gt;
&lt;th&gt;Who edits what&lt;/th&gt;
&lt;th&gt;The spec is…&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Spec-first&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Code. Spec is scaffolding.&lt;/td&gt;
&lt;td&gt;You edit code after generation.&lt;/td&gt;
&lt;td&gt;A starting prompt you discard.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Spec-anchored&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Spec &lt;strong&gt;and&lt;/strong&gt; code, kept in sync.&lt;/td&gt;
&lt;td&gt;You edit both; spec is reviewed.&lt;/td&gt;
&lt;td&gt;A durable contract.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Spec-as-source&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Spec only. Code is a build output.&lt;/td&gt;
&lt;td&gt;You edit &lt;em&gt;only&lt;/em&gt; the spec.&lt;/td&gt;
&lt;td&gt;The source of truth; code is compiled from it.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Most teams think they're doing spec-anchored. Most are actually doing spec-first with extra steps: they write a spec, generate from it, then never touch it again. That's fine, as long as you're honest that the spec was a prompt, not a contract.&lt;/p&gt;




&lt;h2&gt;
  
  
  6. The Canonical Lifecycle Loop
&lt;/h2&gt;

&lt;p&gt;Strip away the tool branding and nearly every SDD workflow is the same six-stage loop.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Stage&lt;/th&gt;
&lt;th&gt;Question it answers&lt;/th&gt;
&lt;th&gt;Output&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Explore&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;What exists? What's the terrain?&lt;/td&gt;
&lt;td&gt;Shared understanding&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Specify&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;What should be true when we're done?&lt;/td&gt;
&lt;td&gt;The spec&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Plan&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;How will we get there?&lt;/td&gt;
&lt;td&gt;Technical approach&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Tasks&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;What are the discrete steps?&lt;/td&gt;
&lt;td&gt;Ordered work items&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Implement&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Build it.&lt;/td&gt;
&lt;td&gt;Code&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Verify&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Does it match the spec?&lt;/td&gt;
&lt;td&gt;Pass/fail + evidence&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Tools differ mostly in which stages they automate, which they force you to do explicitly, and how much each artifact weighs.&lt;/p&gt;




&lt;h2&gt;
  
  
  7. The Ecosystem, Reframed by Architecture
&lt;/h2&gt;

&lt;p&gt;Most SDD tool round-ups list features. It's more useful to sort tools by the architectural layer they operate on, because that's what determines whether two tools compete or compose.&lt;/p&gt;

&lt;h3&gt;
  
  
  7.1 Intent Layer: "What should be true?"
&lt;/h3&gt;

&lt;p&gt;These tools turn fuzzy requirements into reviewable artifacts.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Maintainer&lt;/th&gt;
&lt;th&gt;Shape&lt;/th&gt;
&lt;th&gt;Best for&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/github/spec-kit" rel="noopener noreferrer"&gt;&lt;strong&gt;Spec Kit&lt;/strong&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;GitHub&lt;/td&gt;
&lt;td&gt;Comprehensive, multi-file (spec/plan/tasks/contracts/constitution)&lt;/td&gt;
&lt;td&gt;Greenfield, large teams, strict specs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/Fission-AI/OpenSpec" rel="noopener noreferrer"&gt;&lt;strong&gt;OpenSpec&lt;/strong&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Fission AI&lt;/td&gt;
&lt;td&gt;Lightweight, change-centric (~4 artifacts)&lt;/td&gt;
&lt;td&gt;Brownfield, fast iteration&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://kiro.dev" rel="noopener noreferrer"&gt;&lt;strong&gt;Kiro&lt;/strong&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;AWS&lt;/td&gt;
&lt;td&gt;Agentic IDE, multimodal input&lt;/td&gt;
&lt;td&gt;AWS/Claude users&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;a href="https://github.com/bmad-code-org/BMAD-METHOD" rel="noopener noreferrer"&gt;&lt;strong&gt;BMAD-METHOD&lt;/strong&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Community&lt;/td&gt;
&lt;td&gt;Multi-agent, role-simulating&lt;/td&gt;
&lt;td&gt;Enterprise-scale complexity&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The headline contrast: Spec Kit optimizes for completeness, OpenSpec optimizes for review cost. Spec Kit generates roughly 800 lines where OpenSpec generates roughly 250 for the same change. Whether that completeness is an asset or a tax depends entirely on your codebase, which is the whole point of this post.&lt;/p&gt;

&lt;h3&gt;
  
  
  7.2 Execution Layer: "Build it, and check yourself."
&lt;/h3&gt;

&lt;p&gt;These don't replace the spec; they govern how the agent acts on it. &lt;a href="https://github.com/obra/superpowers" rel="noopener noreferrer"&gt;&lt;strong&gt;Superpowers&lt;/strong&gt;&lt;/a&gt; uses guided Q&amp;amp;A to clarify intent, then runs sub-agents behind a verification-before-completion gate. &lt;a href="https://github.com/gsd-build/get-shit-done" rel="noopener noreferrer"&gt;&lt;strong&gt;GSD&lt;/strong&gt;&lt;/a&gt; manages context in waves for solo developers. &lt;a href="https://microsoft.github.io/hve-core/" rel="noopener noreferrer"&gt;&lt;strong&gt;HVE Core&lt;/strong&gt;&lt;/a&gt; runs an RPI loop: Research, Plan, Implement, Review.&lt;/p&gt;

&lt;h3&gt;
  
  
  7.3 Orchestration Layer: "Coordinate many agents."
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://github.com/bradygaster/squad" rel="noopener noreferrer"&gt;&lt;strong&gt;Squad&lt;/strong&gt;&lt;/a&gt; coordinates parallel agents. &lt;a href="https://github.com/bmad-code-org/BMAD-METHOD" rel="noopener noreferrer"&gt;&lt;strong&gt;BMAD-METHOD&lt;/strong&gt;&lt;/a&gt; simulates a full agile team of specialized agents.&lt;/p&gt;

&lt;p&gt;The takeaway: Intent, Execution, and Orchestration tools compose. You can pair OpenSpec (intent) with Superpowers (execution). Picking "the best SDD tool" is the wrong question; picking one tool per layer is the right one.&lt;/p&gt;




&lt;h2&gt;
  
  
  8. The Decision Filter
&lt;/h2&gt;

&lt;p&gt;Here's the part the methodology evangelists skip: you should not always write a spec. The signal isn't team size or "best practice," it's the cost of ambiguity for this specific change.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Signal&lt;/th&gt;
&lt;th&gt;Spec earns its keep&lt;/th&gt;
&lt;th&gt;Spec is just ceremony&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Blast radius&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Touches many modules / public APIs&lt;/td&gt;
&lt;td&gt;One file, contained&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Reversibility&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Hard to undo (migrations, schemas)&lt;/td&gt;
&lt;td&gt;Trivial to revert&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Ambiguity&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Requirements genuinely unclear&lt;/td&gt;
&lt;td&gt;You already know the exact diff&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Audience&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Others must review/maintain&lt;/td&gt;
&lt;td&gt;Throwaway or solo-spike&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Repetition&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Pattern you'll repeat 10×&lt;/td&gt;
&lt;td&gt;One-off&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;If most of your signals sit in the right column, the spec &lt;em&gt;is&lt;/em&gt; the tax. Write the code.&lt;/p&gt;
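&lt;p&gt;The filter is simple enough to write down as code. This is a minimal sketch, assuming equal weight for each signal and a plain majority rule; the class name, field names, and threshold are mine, and a real team would tune them.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeSignals:
    """One flag per row of the table; True means the left column applies."""
    wide_blast_radius: bool      # touches many modules or public APIs
    hard_to_reverse: bool        # migrations, schemas
    unclear_requirements: bool   # requirements genuinely unclear
    others_must_maintain: bool   # others must review or maintain it
    will_repeat: bool            # a pattern you expect to repeat


def should_write_spec(signals: ChangeSignals) -> bool:
    """Return True when most signals sit in the 'spec earns its keep' column.

    Equal weighting and a simple majority are assumptions of this sketch.
    """
    flags = [
        signals.wide_blast_radius,
        signals.hard_to_reverse,
        signals.unclear_requirements,
        signals.others_must_maintain,
        signals.will_repeat,
    ]
    return sum(flags) > len(flags) / 2


if __name__ == "__main__":
    schema_migration = ChangeSignals(True, True, True, True, False)
    one_line_fix = ChangeSignals(False, False, False, False, False)
    print(should_write_spec(schema_migration))  # True
    print(should_write_spec(one_line_fix))      # False
```

&lt;p&gt;The value isn't in the function; it's in being forced to answer five yes/no questions before reaching for a template.&lt;/p&gt;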

&lt;p&gt;A composite from the kind of work this filter is built for (details anonymized; treat it as illustrative, not a case study): a payments service had a settlement module nobody wanted to touch, the original authors long gone, behavior documented only by the tests that happened to pass. The task was to add a new payout currency. Every signal sat in the left column: blast radius across a dozen call sites, an irreversible ledger migration, requirements that turned out to mean three different things depending on who you asked, and a change the on-call team would own for years. The first instinct was to let the agent loose on it. The right move was the opposite. An hour spent writing down what "settled" actually meant, in EARS form, surfaced two contradictions between the rounding rules and the reconciliation job before a single line changed. The spec didn't slow the work down; it caught the bug that would have shipped. That is the left column earning its keep. The same agent, pointed at a one-line config flag the week before, would have produced nothing but a longer paper trail.&lt;/p&gt;




&lt;h2&gt;
  
  
  9. The Law of Surplus Structure
&lt;/h2&gt;

&lt;p&gt;The claim, stated plainly: every artifact you add to an agent's context consumes reasoning budget, and if it doesn't reduce uncertainty, it's not governance, it's tax. This isn't a vibe; it's measurable from two independent directions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Direction one, token cost.&lt;/strong&gt; Jamie Telin ran OpenSpec against Spec Kit on the same task (streaming + session support for a chat app), twice, using GPT-5.2. The leaner framework won both times, and the gap was not small.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Measurement&lt;/th&gt;
&lt;th&gt;OpenSpec&lt;/th&gt;
&lt;th&gt;Spec Kit&lt;/th&gt;
&lt;th&gt;Delta&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Test 1, total tokens&lt;/td&gt;
&lt;td&gt;~57,740&lt;/td&gt;
&lt;td&gt;~120,947&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;+109%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Test 2, planning&lt;/td&gt;
&lt;td&gt;38,117&lt;/td&gt;
&lt;td&gt;96,298&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;+153%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Test 2, implementation&lt;/td&gt;
&lt;td&gt;53,612&lt;/td&gt;
&lt;td&gt;84,742&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;+58%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Test 2, total&lt;/td&gt;
&lt;td&gt;91,729&lt;/td&gt;
&lt;td&gt;181,040&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;+97%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;In these two runs, more upfront structure nearly doubled total token usage without improving outcomes. OpenSpec also hit a higher success rate with roughly 20% fewer assistant turns and 25% fewer tool calls. Two runs on one task is a small sample, so treat the numbers as directional rather than as a benchmark. &lt;em&gt;(Source: Jamie Telin, "Spec Driven Development Is Wasting Tokens," Mar 2026.)&lt;/em&gt;&lt;/p&gt;
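&lt;p&gt;The delta column is plain arithmetic on the token counts, so it's easy to re-derive. A short sketch, using only the numbers from the table (the Test 1 figures are approximate in the source); the names &lt;code&gt;RUNS&lt;/code&gt;, &lt;code&gt;TEST2_TOTAL&lt;/code&gt;, and &lt;code&gt;percent_increase&lt;/code&gt; are my own:&lt;/p&gt;

```python
# Token counts copied from the table above: (OpenSpec, Spec Kit).
RUNS = {
    "Test 1, total tokens": (57_740, 120_947),
    "Test 2, planning": (38_117, 96_298),
    "Test 2, implementation": (53_612, 84_742),
}

# The Test 2 total is the sum of its planning and implementation rows.
TEST2_TOTAL = (
    RUNS["Test 2, planning"][0] + RUNS["Test 2, implementation"][0],
    RUNS["Test 2, planning"][1] + RUNS["Test 2, implementation"][1],
)


def percent_increase(baseline: int, other: int) -> float:
    """Relative increase of `other` over `baseline`, in percent."""
    return (other - baseline) / baseline * 100


if __name__ == "__main__":
    rows = dict(RUNS)
    rows["Test 2, total"] = TEST2_TOTAL
    for label, (openspec, spec_kit) in rows.items():
        print(f"{label}: +{percent_increase(openspec, spec_kit):.1f}%")
```

&lt;p&gt;Printing one decimal place makes the rounding visible instead of hiding it in the table.&lt;/p&gt;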

&lt;p&gt;&lt;strong&gt;Direction two, a controlled study.&lt;/strong&gt; A 2026 paper from ETH Zurich, &lt;em&gt;Evaluating AGENTS.md: Are Repository-Level Context Files Helpful for Coding Agents?&lt;/em&gt; (Gloaguen, Mündler, Müller, Raychev, Vechev; arXiv, Feb 2026), tested the intuitive belief that handing an agent a structured repository overview helps it. They evaluated two settings: established SWE-bench tasks paired with LLM-generated context files written to the agent vendors' own recommendations, and a fresh collection of real-world issues drawn from repositories that already ship developer-written context files. The result cut against the intuition. Across multiple agents and models, context files &lt;em&gt;reduced&lt;/em&gt; task success rates compared with giving the agent no repository context at all, while raising inference cost by over 20%.&lt;/p&gt;

&lt;p&gt;Read that twice. Both the machine-written and the human-written files made outcomes worse on balance, not better, and they did it while costing more. The agents didn't ignore the files; they obeyed them, explored more broadly, ran more tests, traversed more files, and "thought" harder without producing better final patches. I call this failure mode the &lt;strong&gt;compliance loop trap&lt;/strong&gt;: the agent spends its cognitive budget satisfying the structural guardrails instead of solving the problem, and the diligence is real but misdirected. The authors' own conclusion is the thesis of this entire post: unnecessary requirements from context files make tasks harder, and human-written context should describe only minimal requirements. Everything beyond that is surplus. This is the second tax I promised in Section 1: ambiguity is expensive, and so is its overcorrection.&lt;/p&gt;




&lt;h2&gt;
  
  
  10. Token Economics Is Architecture
&lt;/h2&gt;

&lt;p&gt;If structure has a token price, then context budget is an architectural resource to be allocated, not spent reflexively. Treat it like memory in an embedded system.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Cost driver&lt;/th&gt;
&lt;th&gt;Mitigation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Verbose, always-loaded specs&lt;/td&gt;
&lt;td&gt;Load specs lazily, scoped to the task&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Redundant restatement across artifacts&lt;/td&gt;
&lt;td&gt;Single source of truth per fact; reference, don't repeat&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sub-agents rebuilding context&lt;/td&gt;
&lt;td&gt;Pass distilled state, not full history&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multi-file divergence&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;State checkpoints&lt;/strong&gt;: snapshot agreed truth before fan-out&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The discipline: spend tokens where they reduce uncertainty, starve everything else.&lt;/p&gt;
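&lt;p&gt;To make "load specs lazily, scoped to the task" concrete, here is one possible shape for it. This is a sketch, not any tool's actual behavior: the &lt;code&gt;SpecSection&lt;/code&gt; type, the tag scheme, the token estimates, and the smallest-first packing rule are all assumptions I'm making for illustration.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpecSection:
    """Hypothetical unit of spec content that can be loaded on its own."""
    name: str
    tags: frozenset   # areas of the codebase this section constrains
    tokens: int       # rough size estimate, however you choose to measure it


def select_sections(sections, task_tags, budget):
    """Pick sections relevant to the task, smallest first, within the budget.

    Sections that share no tag with the task are never loaded at all.
    """
    relevant = [s for s in sections if s.tags.intersection(task_tags)]
    chosen, spent = [], 0
    for section in sorted(relevant, key=lambda s: s.tokens):
        if spent + section.tokens > budget:
            break  # everything after this is at least as large
        chosen.append(section)
        spent += section.tokens
    return chosen


if __name__ == "__main__":
    sections = [
        SpecSection("auth-session", frozenset({"auth"}), 400),
        SpecSection("billing-ledger", frozenset({"billing"}), 900),
        SpecSection("auth-audit-log", frozenset({"auth", "telemetry"}), 700),
    ]
    picked = select_sections(sections, {"auth"}, budget=1000)
    print([s.name for s in picked])
```

&lt;p&gt;Smallest-first is the crudest possible policy; ranking by how much uncertainty a section removes would be better, if you can estimate it. Either way, the billing section never reaches the context of an auth task.&lt;/p&gt;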




&lt;h2&gt;
  
  
  11. EARS: Making Natural Language Less Ambiguous
&lt;/h2&gt;

&lt;p&gt;If you're going to write requirements, write them in a form that resists misreading. &lt;strong&gt;EARS&lt;/strong&gt; (Easy Approach to Requirements Syntax), developed by Mavin et al. at Rolls-Royce and presented at the IEEE Requirements Engineering conference (RE'09), constrains prose into a small set of patterns, and it's been adopted at Airbus, Bosch, Dyson, Honeywell, Intel, NASA, Rolls-Royce, and Siemens. The template:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;While&lt;/strong&gt; &lt;code&gt;&amp;lt;optional pre-condition&amp;gt;&lt;/code&gt;, &lt;strong&gt;when&lt;/strong&gt; &lt;code&gt;&amp;lt;optional trigger&amp;gt;&lt;/code&gt;, the &lt;code&gt;&amp;lt;system name&amp;gt;&lt;/code&gt; &lt;strong&gt;shall&lt;/strong&gt; &lt;code&gt;&amp;lt;system response&amp;gt;&lt;/code&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Before&lt;/strong&gt;, the kind of requirement an agent will happily misinterpret:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;The system should handle expired tokens gracefully and clean up sessions,
making sure not to leak any sensitive data.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What's "gracefully"? Clean up when? Leak to where? Each gap is a guess waiting to happen.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;After&lt;/strong&gt;, EARS-structured and unambiguous:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;WHEN an identity token expires,
THE SYSTEM SHALL invalidate the active session cache within 500ms.

IF cache eviction fails,
THEN THE SYSTEM SHALL retry up to 3 times,
log a structured JSON error with a correlation ID,
and SHALL NOT persist plain-text PII in telemetry.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same intent, zero room for creative interpretation. Note that EARS &lt;em&gt;adds&lt;/em&gt; words but &lt;em&gt;removes&lt;/em&gt; uncertainty, which is exactly the trade the Law of Surplus Structure says is worth making. Structure that reduces ambiguity isn't tax; structure that merely decorates is.&lt;/p&gt;
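&lt;p&gt;You can catch the worst offenders mechanically before a human or an agent ever reads the requirement. The sketch below is a deliberately naive lint, not an EARS parser: the vague-word list is my own, and it labels a requirement by its first keyword only, so a combined "While ..., when ..." requirement gets tagged by whichever clause comes first.&lt;/p&gt;

```python
import string

# Hypothetical starter list; extend it with the words that bite your team.
VAGUE_WORDS = {"should", "gracefully", "appropriately", "properly", "robust"}

# EARS pattern names keyed by the keyword that opens the requirement.
PATTERNS = {
    "when": "event-driven",
    "while": "state-driven",
    "if": "unwanted behaviour",
    "where": "optional feature",
}


def _words(text: str) -> list[str]:
    """Lower-case words with surrounding punctuation stripped."""
    return [w.strip(string.punctuation).lower() for w in text.split()]


def classify_pattern(text: str) -> str:
    """Name the EARS pattern from the first keyword, else 'ubiquitous'."""
    words = _words(text)
    if not words:
        return "ubiquitous"
    return PATTERNS.get(words[0], "ubiquitous")


def lint_requirement(text: str) -> list[str]:
    """Return a list of problems; an empty list means this lint found none."""
    words = _words(text)
    problems = []
    if "shall" not in words:
        problems.append("no SHALL clause")
    for word in words:
        if word in VAGUE_WORDS:
            problems.append(f"vague word: {word}")
    return problems


if __name__ == "__main__":
    before = (
        "The system should handle expired tokens gracefully and clean up "
        "sessions, making sure not to leak any sensitive data."
    )
    after = (
        "WHEN an identity token expires, THE SYSTEM SHALL invalidate the "
        "active session cache within 500ms."
    )
    print(lint_requirement(before))
    print(classify_pattern(after), lint_requirement(after))
```

&lt;p&gt;A clean lint result means the sentence has the right shape, not that the requirement is right. That distinction comes back in the next section.&lt;/p&gt;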




&lt;h2&gt;
  
  
  12. The Reality Check
&lt;/h2&gt;

&lt;p&gt;Six failure modes I've watched SDD run into. None is a reason to abandon it; each is a reason to apply the decision filter.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Review overload.&lt;/strong&gt; A spec that generates 800 lines of artifacts moves the bottleneck from writing code to reviewing specs. You haven't removed work, you've relocated it. If spec review is slower than the code review it replaced, the spec is tax.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;False control.&lt;/strong&gt; A detailed spec &lt;em&gt;feels&lt;/em&gt; like control, but the agent can satisfy every line and still produce something wrong, because the spec encoded your misunderstanding faithfully. Precision is not correctness.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Spec/code drift.&lt;/strong&gt; In spec-anchored workflows, the spec and code diverge the moment someone edits code directly and skips the spec. Now you have two sources of truth and no way to know which is right. Drift turns a contract back into a stale comment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The multi-file divergence trap.&lt;/strong&gt; When an agent fans out across many files, each can drift toward a different interpretation. State checkpoints, snapshotting agreed truth before parallel work, are the only reliable defense.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Natural language bottoms out.&lt;/strong&gt; Even EARS can't make "intuitive UX" machine-precise. Some intent is irreducibly fuzzy, and pretending otherwise just produces confident wrong answers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Spec-as-source repeats old risks.&lt;/strong&gt; "Edit only the spec, regenerate the code" is the dream, but it reinvents the problems of code generation: opaque output, debugging a thing you didn't write, and trusting a compiler you can't fully inspect.&lt;/p&gt;




&lt;h2&gt;
  
  
  13. Adoption Strategy
&lt;/h2&gt;

&lt;p&gt;Don't roll out SDD as a mandate. Roll it out where the ambiguity tax is highest, prove it, then expand.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Phase&lt;/th&gt;
&lt;th&gt;Focus&lt;/th&gt;
&lt;th&gt;Goal&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Weeks 1 to 2&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Pick one high-blast-radius, high-ambiguity workstream&lt;/td&gt;
&lt;td&gt;Feel where specs &lt;em&gt;earn&lt;/em&gt; their keep&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Weeks 3 to 4&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Add EARS for the requirements that bite&lt;/td&gt;
&lt;td&gt;Reduce misinterpretation, measure review time&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Month 2&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Introduce one execution-layer tool (e.g., a verification gate)&lt;/td&gt;
&lt;td&gt;Catch spec/code drift automatically&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Month 3&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Codify your &lt;em&gt;own&lt;/em&gt; decision filter&lt;/td&gt;
&lt;td&gt;Make "spec or skip?" a team reflex, not a ritual&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The goal isn't "we do SDD now." It's "we know exactly when SDD pays, and we skip it when it doesn't."&lt;/p&gt;




&lt;h2&gt;
  
  
  Closing
&lt;/h2&gt;

&lt;p&gt;Spec-driven development is not a methodology you adopt wholesale. It's a cost-management strategy for the two taxes that bracket every AI-assisted change: the ambiguity tax on the left, the surplus-structure tax on the right. Good engineering is finding the bottom of that curve, per change, not per team. So the rule is simple, and it's the whole post in one line:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Spec it when ambiguity is expensive. Skip it when the code is cheaper than the ceremony.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Further Reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Böckeler, B.&lt;/strong&gt;, &lt;em&gt;Exploring Generative AI&lt;/em&gt; (spec-driven development levels): &lt;a href="https://martinfowler.com/articles/exploring-gen-ai/sdd-3-tools.html" rel="noopener noreferrer"&gt;https://martinfowler.com/articles/exploring-gen-ai/sdd-3-tools.html&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gloaguen, T. et al.&lt;/strong&gt;, &lt;em&gt;Evaluating AGENTS.md: Are Repository-Level Context Files Helpful for Coding Agents?&lt;/em&gt; (arXiv 2602.11988): &lt;a href="https://arxiv.org/abs/2602.11988" rel="noopener noreferrer"&gt;https://arxiv.org/abs/2602.11988&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mavin, A. et al.&lt;/strong&gt;, &lt;em&gt;Easy Approach to Requirements Syntax (EARS)&lt;/em&gt;, IEEE RE'09: &lt;a href="https://ieeexplore.ieee.org/document/5328509" rel="noopener noreferrer"&gt;https://ieeexplore.ieee.org/document/5328509&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Telin, J.&lt;/strong&gt;, &lt;em&gt;Spec Driven Development Is Wasting Tokens&lt;/em&gt;: &lt;a href="https://medium.com/it-chronicles/is-your-safe-choice-burning-your-budget-1cfddf8782e4" rel="noopener noreferrer"&gt;https://medium.com/it-chronicles/is-your-safe-choice-burning-your-budget-1cfddf8782e4&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Negrisolo, V.&lt;/strong&gt;, &lt;em&gt;OpenSpec vs Spec Kit: Choosing the Right AI-Driven Development Workflow&lt;/em&gt; (the 800-vs-250-line comparison): &lt;a href="https://hashrocket.com/blog/posts/openspec-vs-spec-kit-choosing-the-right-ai-driven-development-workflow-for-your-team" rel="noopener noreferrer"&gt;https://hashrocket.com/blog/posts/openspec-vs-spec-kit-choosing-the-right-ai-driven-development-workflow-for-your-team&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GitHub Spec Kit&lt;/strong&gt;: &lt;a href="https://github.com/github/spec-kit" rel="noopener noreferrer"&gt;https://github.com/github/spec-kit&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OpenSpec&lt;/strong&gt;: &lt;a href="https://github.com/Fission-AI/OpenSpec" rel="noopener noreferrer"&gt;https://github.com/Fission-AI/OpenSpec&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>softwaredevelopment</category>
      <category>architecture</category>
      <category>productivity</category>
    </item>
  </channel>
</rss>
