<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: guo king</title>
    <description>The latest articles on DEV Community by guo king (@speccoding).</description>
    <link>https://dev.to/speccoding</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3821717%2F3113cbab-97d2-49b7-abc0-1a8df46e1040.png</url>
      <title>DEV Community: guo king</title>
      <link>https://dev.to/speccoding</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/speccoding"/>
    <language>en</language>
    <item>
      <title>Vibe Coding vs Spec Coding: Same Refund Feature, Built Twice</title>
      <dc:creator>guo king</dc:creator>
      <pubDate>Tue, 16 Jun 2026 01:57:41 +0000</pubDate>
      <link>https://dev.to/speccoding/vibe-coding-vs-spec-coding-same-refund-feature-built-twice-62o</link>
      <guid>https://dev.to/speccoding/vibe-coding-vs-spec-coding-same-refund-feature-built-twice-62o</guid>
      <description>&lt;p&gt;Vibe coding is intoxicating. You describe what you want in plain language, the AI writes the code, and ten minutes later you have a working endpoint.&lt;/p&gt;

&lt;p&gt;I was sold — until I shipped a refund feature that way and spent the next two weeks patching bugs that a 90-minute spec would have prevented entirely.&lt;/p&gt;

&lt;p&gt;This is the side-by-side, using the exact same requirement, so you can see where the gap opens up.&lt;/p&gt;

&lt;h2&gt;
  
  
  The requirement
&lt;/h2&gt;

&lt;p&gt;An e-commerce platform needs an order refund feature. The PM's brief:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Support full and partial refunds&lt;/li&gt;
&lt;li&gt;Call the payment gateway (Stripe-style) to reverse the charge&lt;/li&gt;
&lt;li&gt;Track refund status: pending, processing, succeeded, failed&lt;/li&gt;
&lt;li&gt;Support agents trigger refunds through an internal tool&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Simple enough. Both paths start here.&lt;/p&gt;

&lt;h2&gt;
  
  
  Path A: vibe coding
&lt;/h2&gt;

&lt;p&gt;The prompt:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Build me an order refund API in Node.js. Support full and partial refunds. Call a payment gateway to reverse the charge. Track refund status. Use Express and Postgres."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Sixty seconds later: a clean &lt;code&gt;RefundController&lt;/code&gt; with &lt;code&gt;createRefund&lt;/code&gt; and &lt;code&gt;getRefundStatus&lt;/code&gt;. It validates the order exists, checks the amount against the order total, calls &lt;code&gt;paymentGateway.refund()&lt;/code&gt;, saves the result. The code looks professional. The happy path works.&lt;/p&gt;

&lt;p&gt;Ship it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Bug #1: the double refund
&lt;/h3&gt;

&lt;p&gt;A support agent clicks refund, the page hangs for a second, they click again. &lt;strong&gt;Two refunds go through.&lt;/strong&gt; No idempotency check.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Fix prompt: "Add a check to prevent duplicate refunds for the same order."&lt;/em&gt; The AI adds a query: if a refund exists for this order, reject. Works — until it doesn't.&lt;/p&gt;

&lt;h3&gt;
  
  
  Bug #2: partial refund overflow
&lt;/h3&gt;

&lt;p&gt;A $200 order. Support issues $50, then $80, then $100. &lt;strong&gt;Total refunded: $230.&lt;/strong&gt; The duplicate check only catches &lt;em&gt;exact&lt;/em&gt; duplicates, not cumulative amounts.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Fix prompt: "Track cumulative refund amounts and reject refunds that would exceed the order total."&lt;/em&gt; The AI adds a &lt;code&gt;SUM(amount)&lt;/code&gt; query — but it's not in a transaction with the insert, so two concurrent partials can both pass the check.&lt;/p&gt;

&lt;h3&gt;
  
  
  Bug #3: gateway timeout
&lt;/h3&gt;

&lt;p&gt;The gateway times out. The refund row sits at &lt;code&gt;processing&lt;/code&gt; forever. Support can't retry — the duplicate check blocks them. &lt;strong&gt;Did the money actually leave? Nobody knows.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Fix prompt: "Add retry logic for gateway timeouts."&lt;/em&gt; The AI adds a retry loop: no exponential backoff, no idempotency key on the gateway call, no cap. The retry can now create a duplicate charge &lt;em&gt;on the gateway side&lt;/em&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Bug #4: the race condition
&lt;/h3&gt;

&lt;p&gt;Two agents process refunds for the same order simultaneously. Both pass the cumulative check (neither refund is committed yet), both hit the gateway, both succeed. &lt;strong&gt;The customer is refunded twice.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Fix prompt: "Add locking…"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Four patches in, each reasonable in isolation, and the architecture is a patchwork: no state machine, no documented invariants, no tests for how the patches interact.&lt;/p&gt;
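
&lt;p&gt;The race behind Bug #4 is easy to reproduce without a database. The sketch below is a hypothetical in-memory stand-in, not the generated controller: the &lt;code&gt;refunds&lt;/code&gt; array plays the role of the refunds table, and the two "agents" are just two reads that happen before either write.&lt;/p&gt;

```javascript
// Hypothetical in-memory stand-in for the refunds table; amounts are integer cents.
const ORDER_TOTAL_CENTS = 20000;
const refunds = [5000, 8000]; // $50 and $80 already refunded

// Step 1 of the unsafe flow: read the running total (the SUM(amount) query).
function readRefundedCents() {
  return refunds.reduce((sum, cents) => sum + cents, 0);
}

// Step 2: decide using a total that may already be stale.
function passesCheck(refundedSnapshot, requestCents) {
  return ORDER_TOTAL_CENTS - refundedSnapshot >= requestCents;
}

// Two agents each request $60. Both read before either writes.
const snapshotA = readRefundedCents();
const snapshotB = readRefundedCents();
if (passesCheck(snapshotA, 6000)) refunds.push(6000);
if (passesCheck(snapshotB, 6000)) refunds.push(6000);

const totalRefundedCents = readRefundedCents();
console.log(totalRefundedCents); // 25000, which is $50 over the order total
```

&lt;p&gt;Each check is correct against the data it saw. The bug is that the read and the write are not one atomic step.&lt;/p&gt;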

&lt;h3&gt;
  
  
  The real cost
&lt;/h3&gt;

&lt;p&gt;The first version took 10 minutes. The four patches took &lt;strong&gt;two weeks&lt;/strong&gt; — investigation, testing, support escalations, and one manual reconciliation against gateway records.&lt;/p&gt;

&lt;p&gt;The "fast" approach wasn't fast. It front-loaded the dopamine and back-loaded the pain.&lt;/p&gt;

&lt;h2&gt;
  
  
  Path B: spec coding
&lt;/h2&gt;

&lt;p&gt;Same requirement. Same AI. Different starting point — 90 minutes writing this before any code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Feature: Order Refund Processing&lt;/span&gt;

&lt;span class="gu"&gt;## Goal&lt;/span&gt;
Process refunds safely: no over-refund, no duplicate
processing, correct gateway reconciliation.

&lt;span class="gu"&gt;## Non-Goals&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Customer self-service refund portal (future phase)
&lt;span class="p"&gt;-&lt;/span&gt; Refund reason analytics
&lt;span class="p"&gt;-&lt;/span&gt; Automated approval rules

&lt;span class="gu"&gt;## State Machine&lt;/span&gt;
pending → processing → succeeded
pending → processing → failed → pending (retry)

Only ONE refund may be "processing" per order at any time.

&lt;span class="gu"&gt;## Acceptance Criteria&lt;/span&gt;

Given an order with total $200 and $0 previously refunded
When a support agent requests a $50 refund
Then a refund record is created with status "pending"
  And the gateway is called with an idempotency key
  And on gateway success, status moves to "succeeded"
  And the refundable balance is now $150.

Given an order with total $200 and $150 already refunded
When a support agent requests a $75 refund
Then the request is rejected with "exceeds refundable balance"
  And no gateway call is made.

Given a refund in "processing" state
When another refund request arrives for the same order
Then the request is rejected with "refund already in progress"
  And no gateway call is made.

Given a refund in "processing" state
When the gateway times out
Then the status remains "processing"
  And a background job retries with exponential backoff
  And the retry uses the SAME idempotency key
  And after 3 failures, status moves to "failed"
  And an alert goes to the payments team.

&lt;span class="gu"&gt;## Edge Cases&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Concurrency: SELECT FOR UPDATE on the order row before
  checking refundable balance
&lt;span class="p"&gt;-&lt;/span&gt; Idempotency: each refund attempt gets a UUID, passed to
  the gateway as the idempotency key
&lt;span class="p"&gt;-&lt;/span&gt; Precision: all amounts in cents (integer), no floats
&lt;span class="p"&gt;-&lt;/span&gt; Reconciliation: nightly job compares local records
  against the gateway settlement report

&lt;span class="gu"&gt;## Rollback Plan&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Feature flag: refund_processing_v2
&lt;span class="p"&gt;-&lt;/span&gt; Rollback disables new refunds; in-flight ones continue
  via the background job
&lt;span class="p"&gt;-&lt;/span&gt; Additive schema only — no migration rollback needed
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then the prompt:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Implement the refund feature described in this spec. Follow the state machine exactly. Use SELECT FOR UPDATE for concurrency control. Include the idempotency key in all gateway calls. All amounts in cents." &lt;em&gt;[paste spec]&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The output is structurally different
&lt;/h2&gt;

&lt;p&gt;The AI generates, in the &lt;strong&gt;first version&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;processRefund&lt;/code&gt; wrapped in a transaction with &lt;code&gt;SELECT FOR UPDATE&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Cumulative balance check &lt;em&gt;inside&lt;/em&gt; the transaction — no race window&lt;/li&gt;
&lt;li&gt;Idempotency UUID minted at creation, passed to every gateway call&lt;/li&gt;
&lt;li&gt;Background retry with exponential backoff, capped at 3 attempts&lt;/li&gt;
&lt;li&gt;State transitions that match the spec's machine exactly&lt;/li&gt;
&lt;/ul&gt;
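
&lt;p&gt;As a rough illustration of what "match the spec's machine exactly" can look like, here is a minimal transition table. The names &lt;code&gt;TRANSITIONS&lt;/code&gt;, &lt;code&gt;canTransition&lt;/code&gt; and &lt;code&gt;transition&lt;/code&gt; are assumptions made for this sketch, not the generated code.&lt;/p&gt;

```javascript
// Allowed transitions, copied from the spec's state machine.
const TRANSITIONS = new Map([
  ["pending", ["processing"]],
  ["processing", ["succeeded", "failed"]],
  ["failed", ["pending"]], // retry path
  ["succeeded", []],       // terminal
]);

function canTransition(from, to) {
  return (TRANSITIONS.get(from) ?? []).includes(to);
}

// Returns a copy of the refund with the new status,
// or throws if the spec does not allow the move.
function transition(refund, to) {
  if (!canTransition(refund.status, to)) {
    throw new Error(`illegal transition: ${refund.status} -> ${to}`);
  }
  return { ...refund, status: to };
}

console.log(canTransition("pending", "processing")); // true
console.log(canTransition("pending", "succeeded"));  // false
```

&lt;p&gt;Keeping the table as data means a review can compare it line by line against the spec.&lt;/p&gt;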

&lt;p&gt;Every bug from Path A is pre-handled. Double refund? The lock plus the idempotency key. Overflow? Balance check in the same transaction as the insert. Timeout? Same-key retry that the gateway treats as safe.&lt;/p&gt;
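
&lt;p&gt;To see why a same-key retry is safe, consider a sketch with a fake gateway. Everything here is hypothetical: &lt;code&gt;createFakeGateway&lt;/code&gt; imitates a gateway that deduplicates by idempotency key, the key is a placeholder string rather than a UUID, and the backoff wait is left out so the example stays synchronous.&lt;/p&gt;

```javascript
// Hypothetical fake gateway: it records the refund, but its first response
// is "lost" to simulate a timeout after the money has already moved.
function createFakeGateway() {
  const processed = new Map(); // idempotency key -> refund result
  let timeoutsLeft = 1;
  return {
    refund(idempotencyKey, amountCents) {
      if (!processed.has(idempotencyKey)) {
        processed.set(idempotencyKey, { id: `gw_${processed.size + 1}`, amountCents });
      }
      if (timeoutsLeft > 0) {
        timeoutsLeft -= 1;
        throw new Error("gateway timeout");
      }
      return processed.get(idempotencyKey);
    },
    refundCount: () => processed.size,
  };
}

// Retry with the SAME key, capped at maxAttempts (3 in the spec).
function refundWithRetry(gateway, idempotencyKey, amountCents, maxAttempts = 3) {
  for (let attempt = 1; maxAttempts >= attempt; attempt += 1) {
    try {
      const result = gateway.refund(idempotencyKey, amountCents);
      return { status: "succeeded", attempt, result };
    } catch {
      // A real job would wait with exponential backoff before the next attempt.
    }
  }
  return { status: "failed" };
}

const gateway = createFakeGateway();
const outcome = refundWithRetry(gateway, "refund-key-1", 5000);
console.log(outcome.status, outcome.attempt, gateway.refundCount()); // succeeded 2 1
```

&lt;p&gt;The second attempt succeeds, yet the gateway holds exactly one refund. With a fresh key per attempt, the same timeout would have produced two.&lt;/p&gt;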

&lt;p&gt;&lt;strong&gt;Same AI. Same capability. Dramatically different output — because the input was dramatically different. The AI didn't get smarter; it got better constraints.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The honest math
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Vibe coding&lt;/th&gt;
&lt;th&gt;Spec coding&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Time to first version&lt;/td&gt;
&lt;td&gt;10 min&lt;/td&gt;
&lt;td&gt;~2.5 hours&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Production bugs&lt;/td&gt;
&lt;td&gt;4 (one involving real money)&lt;/td&gt;
&lt;td&gt;0 in this scenario&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Total time to stable&lt;/td&gt;
&lt;td&gt;~2 weeks&lt;/td&gt;
&lt;td&gt;~half a day&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Vibe coding is great for prototypes, internal tools, and anything where a bug costs you a shrug. The moment money, state machines, or concurrency enter the picture, the 90 minutes you "save" by skipping the spec gets repaid at loan-shark interest.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Adapted from the full case study on &lt;a href="https://spec-coding.dev/blog/same-refund-feature-vibe-coding-vs-spec-coding" rel="noopener noreferrer"&gt;Spec Coding&lt;/a&gt;. The site maintains free spec templates and a browser-based &lt;a href="https://spec-coding.dev/tools/ai-coding-spec-packet" rel="noopener noreferrer"&gt;spec packet generator&lt;/a&gt; for exactly this workflow.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>productivity</category>
      <category>softwareengineering</category>
    </item>
    <item>
      <title>Acceptance Criteria Your QA Can Run Without Asking You Anything (6 Copyable Examples)</title>
      <dc:creator>guo king</dc:creator>
      <pubDate>Tue, 16 Jun 2026 01:57:13 +0000</pubDate>
      <link>https://dev.to/speccoding/acceptance-criteria-your-qa-can-run-without-asking-you-anything-6-copyable-examples-3mif</link>
      <guid>https://dev.to/speccoding/acceptance-criteria-your-qa-can-run-without-asking-you-anything-6-copyable-examples-3mif</guid>
      <description>&lt;p&gt;The acceptance criterion I reach for most is not the happy path. It's the &lt;strong&gt;duplicate action&lt;/strong&gt;: double-clicked submit buttons, repeated webhook deliveries, retried payment calls, imported rows seen twice.&lt;/p&gt;

&lt;p&gt;That one scenario exposes more vague thinking than a dozen success cases:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight gherkin"&gt;&lt;code&gt;&lt;span class="nf"&gt;Given &lt;/span&gt;a user submits the same request twice within 2 seconds
&lt;span class="nf"&gt;When &lt;/span&gt;both requests reach the server
&lt;span class="nf"&gt;Then &lt;/span&gt;exactly one state change is recorded
  &lt;span class="nf"&gt;And &lt;/span&gt;the second response returns the first operation id
  &lt;span class="nf"&gt;And &lt;/span&gt;the audit log marks it as a replay
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If your team can't answer what happens in that scenario, the spec isn't done. Here's the format that forces the answer, plus six examples you can copy.&lt;/p&gt;
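
&lt;p&gt;One way the duplicate-action criterion could be satisfied is sketched below. It assumes the client sends a request key with each submission, which the criterion itself does not say, and it ignores the 2-second window for brevity. &lt;code&gt;createHandler&lt;/code&gt; and its fields are hypothetical names.&lt;/p&gt;

```javascript
function createHandler() {
  const operations = new Map(); // request key -> operation id
  const auditLog = [];
  let stateChanges = 0;

  function handle(requestKey) {
    const existing = operations.get(requestKey);
    if (existing !== undefined) {
      // Second delivery: no state change, same operation id, marked as a replay.
      auditLog.push({ operationId: existing, replay: true });
      return { operationId: existing, replay: true };
    }
    stateChanges += 1;
    const operationId = `op_${stateChanges}`;
    operations.set(requestKey, operationId);
    auditLog.push({ operationId, replay: false });
    return { operationId, replay: false };
  }

  return { handle, auditLog, stateChangeCount: () => stateChanges };
}

const handler = createHandler();
const first = handler.handle("submit-42");
const second = handler.handle("submit-42");
console.log(first.operationId === second.operationId); // true
console.log(handler.stateChangeCount());               // 1
```

&lt;p&gt;Each Then clause in the criterion maps to one assertion a tester can write against this shape.&lt;/p&gt;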

&lt;h2&gt;
  
  
  The anatomy of a criterion that works
&lt;/h2&gt;

&lt;p&gt;Every Given/When/Then has three parts with specific jobs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Given&lt;/strong&gt; — the precondition. &lt;em&gt;What must be true before the action occurs?&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;When&lt;/strong&gt; — the trigger. &lt;em&gt;What specific event causes the behavior?&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Then&lt;/strong&gt; — the observable outcome. &lt;em&gt;How does someone testing this know it worked?&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The quality bar: &lt;strong&gt;QA can turn it into a test without asking the author what they meant.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;"The system responds quickly" fails the bar — quickly could be 200ms or 5 seconds depending on who's reading. "The API responds within 500ms at p95" passes.&lt;/p&gt;

&lt;p&gt;Bad:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"The login should work correctly and handle errors gracefully."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Good:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight gherkin"&gt;&lt;code&gt;&lt;span class="nf"&gt;Given &lt;/span&gt;a registered user is on the login page
&lt;span class="nf"&gt;When &lt;/span&gt;the user enters a valid email and correct password and clicks &lt;span class="s"&gt;"Sign in"&lt;/span&gt;
&lt;span class="nf"&gt;Then &lt;/span&gt;the user is redirected to the dashboard within 2 seconds;
  &lt;span class="err"&gt;the&lt;/span&gt; &lt;span class="err"&gt;session&lt;/span&gt; &lt;span class="err"&gt;cookie&lt;/span&gt; &lt;span class="err"&gt;is&lt;/span&gt; &lt;span class="err"&gt;set&lt;/span&gt; &lt;span class="err"&gt;with&lt;/span&gt; &lt;span class="err"&gt;HttpOnly&lt;/span&gt; &lt;span class="err"&gt;and&lt;/span&gt; &lt;span class="err"&gt;Secure&lt;/span&gt; &lt;span class="err"&gt;flags;&lt;/span&gt;
  &lt;span class="err"&gt;the&lt;/span&gt; &lt;span class="err"&gt;last_login_at&lt;/span&gt; &lt;span class="err"&gt;timestamp&lt;/span&gt; &lt;span class="err"&gt;is&lt;/span&gt; &lt;span class="err"&gt;updated&lt;/span&gt; &lt;span class="err"&gt;in&lt;/span&gt; &lt;span class="err"&gt;the&lt;/span&gt; &lt;span class="err"&gt;users&lt;/span&gt; &lt;span class="err"&gt;table.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three observable outcomes. Zero interpretation needed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Example 1: Account lockout
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight gherkin"&gt;&lt;code&gt;&lt;span class="nf"&gt;Given &lt;/span&gt;a registered user with email &lt;span class="s"&gt;"jane@example.com"&lt;/span&gt;
  &lt;span class="err"&gt;and&lt;/span&gt; &lt;span class="err"&gt;the&lt;/span&gt; &lt;span class="err"&gt;account&lt;/span&gt; &lt;span class="err"&gt;is&lt;/span&gt; &lt;span class="err"&gt;not&lt;/span&gt; &lt;span class="err"&gt;currently&lt;/span&gt; &lt;span class="err"&gt;locked&lt;/span&gt;
&lt;span class="nf"&gt;When &lt;/span&gt;the user submits an incorrect password 3 times consecutively
  &lt;span class="err"&gt;within&lt;/span&gt; &lt;span class="nf"&gt;a &lt;/span&gt;10-minute window
&lt;span class="nf"&gt;Then &lt;/span&gt;the account is locked for 15 minutes;
  &lt;span class="err"&gt;subsequent&lt;/span&gt; &lt;span class="err"&gt;login&lt;/span&gt; &lt;span class="err"&gt;attempts&lt;/span&gt; &lt;span class="err"&gt;return&lt;/span&gt; &lt;span class="err"&gt;HTTP&lt;/span&gt; &lt;span class="err"&gt;429&lt;/span&gt; &lt;span class="err"&gt;with&lt;/span&gt; &lt;span class="err"&gt;body&lt;/span&gt;
  &lt;span class="err"&gt;{"error"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="err"&gt;"account_locked", "retry_after_seconds"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="err"&gt;900};&lt;/span&gt;
  &lt;span class="err"&gt;the&lt;/span&gt; &lt;span class="err"&gt;form&lt;/span&gt; &lt;span class="err"&gt;displays&lt;/span&gt; &lt;span class="err"&gt;"Account&lt;/span&gt; &lt;span class="err"&gt;locked.&lt;/span&gt; &lt;span class="err"&gt;Try&lt;/span&gt; &lt;span class="err"&gt;again&lt;/span&gt; &lt;span class="err"&gt;in&lt;/span&gt; &lt;span class="err"&gt;15&lt;/span&gt; &lt;span class="err"&gt;minutes.";&lt;/span&gt;
  &lt;span class="nf"&gt;a &lt;/span&gt;login_lockout event is written to the audit_log table;
  &lt;span class="err"&gt;after&lt;/span&gt; &lt;span class="err"&gt;15&lt;/span&gt; &lt;span class="err"&gt;minutes&lt;/span&gt; &lt;span class="err"&gt;the&lt;/span&gt; &lt;span class="err"&gt;account&lt;/span&gt; &lt;span class="err"&gt;unlocks&lt;/span&gt; &lt;span class="err"&gt;automatically.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice what's pinned down: the window (10 min), the count (3), the duration (15 min), the status code, the error body, the UI copy, &lt;em&gt;and&lt;/em&gt; the audit trail.&lt;/p&gt;
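
&lt;p&gt;Because every number is pinned down, the rule translates almost directly into code. This is only a sketch: &lt;code&gt;lockedUntil&lt;/code&gt; and &lt;code&gt;isLocked&lt;/code&gt; are hypothetical helpers, and timestamps are plain milliseconds.&lt;/p&gt;

```javascript
const MAX_FAILURES = 3;
const WINDOW_MS = 10 * 60 * 1000; // 10-minute window
const LOCK_MS = 15 * 60 * 1000;   // 15-minute lock

// failureTimes: timestamps (ms) of consecutive failed logins, oldest first.
// Returns the lock expiry timestamp, or null if the account stays unlocked.
function lockedUntil(failureTimes) {
  if (MAX_FAILURES > failureTimes.length) return null;
  const recent = failureTimes.slice(-MAX_FAILURES);
  const last = recent[recent.length - 1];
  const span = last - recent[0];
  return span > WINDOW_MS ? null : last + LOCK_MS;
}

// The lock lifts automatically once the expiry time is reached.
function isLocked(lockExpiry, now) {
  return lockExpiry === null ? false : lockExpiry > now;
}

const expiry = lockedUntil([0, 60000, 120000]); // three failures in two minutes
console.log(expiry);                    // 1020000
console.log(isLocked(expiry, 500000));  // true
console.log(isLocked(expiry, 1020000)); // false
```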

&lt;h2&gt;
  
  
  Example 2: Session expiry
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight gherkin"&gt;&lt;code&gt;&lt;span class="nf"&gt;Given &lt;/span&gt;a user is logged in
  &lt;span class="err"&gt;and&lt;/span&gt; &lt;span class="err"&gt;the&lt;/span&gt; &lt;span class="err"&gt;session&lt;/span&gt; &lt;span class="err"&gt;timeout&lt;/span&gt; &lt;span class="err"&gt;is&lt;/span&gt; &lt;span class="err"&gt;configured&lt;/span&gt; &lt;span class="err"&gt;to&lt;/span&gt; &lt;span class="err"&gt;30&lt;/span&gt; &lt;span class="err"&gt;minutes&lt;/span&gt; &lt;span class="err"&gt;of&lt;/span&gt; &lt;span class="err"&gt;inactivity&lt;/span&gt;
&lt;span class="nf"&gt;When &lt;/span&gt;the user performs no actions for 30 consecutive minutes
&lt;span class="nf"&gt;Then &lt;/span&gt;the next request returns HTTP 401;
  &lt;span class="err"&gt;the&lt;/span&gt; &lt;span class="err"&gt;session&lt;/span&gt; &lt;span class="err"&gt;cookie&lt;/span&gt; &lt;span class="err"&gt;is&lt;/span&gt; &lt;span class="err"&gt;cleared&lt;/span&gt; &lt;span class="err"&gt;server-side;&lt;/span&gt;
  &lt;span class="err"&gt;the&lt;/span&gt; &lt;span class="err"&gt;user&lt;/span&gt; &lt;span class="err"&gt;is&lt;/span&gt; &lt;span class="err"&gt;redirected&lt;/span&gt; &lt;span class="err"&gt;to&lt;/span&gt; &lt;span class="err"&gt;/login?reason=session_expired;&lt;/span&gt;
  &lt;span class="err"&gt;the&lt;/span&gt; &lt;span class="err"&gt;page&lt;/span&gt; &lt;span class="err"&gt;displays&lt;/span&gt; &lt;span class="err"&gt;"Your&lt;/span&gt; &lt;span class="err"&gt;session&lt;/span&gt; &lt;span class="err"&gt;expired&lt;/span&gt; &lt;span class="err"&gt;due&lt;/span&gt; &lt;span class="err"&gt;to&lt;/span&gt; &lt;span class="err"&gt;inactivity.";&lt;/span&gt;
  &lt;span class="err"&gt;unsaved&lt;/span&gt; &lt;span class="err"&gt;client&lt;/span&gt; &lt;span class="err"&gt;form&lt;/span&gt; &lt;span class="err"&gt;dat&lt;/span&gt;&lt;span class="nf"&gt;a &lt;/span&gt;is NOT recoverable
  &lt;span class="err"&gt;(known&lt;/span&gt; &lt;span class="err"&gt;limitation,&lt;/span&gt; &lt;span class="err"&gt;documented&lt;/span&gt; &lt;span class="err"&gt;in&lt;/span&gt; &lt;span class="err"&gt;the&lt;/span&gt; &lt;span class="err"&gt;UI).&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That last line is the underrated move: &lt;strong&gt;writing down what the system deliberately does NOT do.&lt;/strong&gt; It converts a future bug report into a documented decision.&lt;/p&gt;

&lt;h2&gt;
  
  
  Example 3: Checkout with insufficient stock
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight gherkin"&gt;&lt;code&gt;&lt;span class="nf"&gt;Given &lt;/span&gt;a customer has 3 units of SKU-2087 in their cart
  &lt;span class="err"&gt;and&lt;/span&gt; &lt;span class="err"&gt;available_quantity&lt;/span&gt; &lt;span class="err"&gt;for&lt;/span&gt; &lt;span class="err"&gt;SKU-208&lt;/span&gt;&lt;span class="nf"&gt;7 &lt;/span&gt;is now 1
&lt;span class="nf"&gt;When &lt;/span&gt;the customer clicks &lt;span class="s"&gt;"Place Order"&lt;/span&gt;
&lt;span class="nf"&gt;Then &lt;/span&gt;the order is NOT placed;
  &lt;span class="err"&gt;the&lt;/span&gt; &lt;span class="err"&gt;page&lt;/span&gt; &lt;span class="err"&gt;shows&lt;/span&gt; &lt;span class="err"&gt;"Some&lt;/span&gt; &lt;span class="err"&gt;items&lt;/span&gt; &lt;span class="err"&gt;are&lt;/span&gt; &lt;span class="err"&gt;no&lt;/span&gt; &lt;span class="err"&gt;longer&lt;/span&gt; &lt;span class="err"&gt;available&lt;/span&gt;
  &lt;span class="err"&gt;in&lt;/span&gt; &lt;span class="err"&gt;the&lt;/span&gt; &lt;span class="err"&gt;requested&lt;/span&gt; &lt;span class="err"&gt;quantity.";&lt;/span&gt;
  &lt;span class="err"&gt;the&lt;/span&gt; &lt;span class="err"&gt;line&lt;/span&gt; &lt;span class="err"&gt;item&lt;/span&gt; &lt;span class="err"&gt;shows&lt;/span&gt; &lt;span class="err"&gt;"Only&lt;/span&gt; &lt;span class="err"&gt;1&lt;/span&gt; &lt;span class="err"&gt;available&lt;/span&gt; &lt;span class="err"&gt;—&lt;/span&gt; &lt;span class="err"&gt;please&lt;/span&gt; &lt;span class="err"&gt;update&lt;/span&gt; &lt;span class="err"&gt;quantity.";&lt;/span&gt;
  &lt;span class="err"&gt;the&lt;/span&gt; &lt;span class="err"&gt;customer&lt;/span&gt; &lt;span class="err"&gt;c&lt;/span&gt;&lt;span class="nf"&gt;an &lt;/span&gt;update to 1 and retry;
  &lt;span class="err"&gt;if&lt;/span&gt; &lt;span class="err"&gt;available_quantity&lt;/span&gt; &lt;span class="err"&gt;drops&lt;/span&gt; &lt;span class="err"&gt;to&lt;/span&gt; &lt;span class="err"&gt;0&lt;/span&gt; &lt;span class="err"&gt;between&lt;/span&gt; &lt;span class="err"&gt;page&lt;/span&gt; &lt;span class="err"&gt;load&lt;/span&gt; &lt;span class="err"&gt;and&lt;/span&gt; &lt;span class="err"&gt;submit,&lt;/span&gt;
  &lt;span class="err"&gt;the&lt;/span&gt; &lt;span class="err"&gt;message&lt;/span&gt; &lt;span class="err"&gt;reads&lt;/span&gt; &lt;span class="err"&gt;"SKU-208&lt;/span&gt;&lt;span class="nf"&gt;7 &lt;/span&gt;is out of stock"
  &lt;span class="err"&gt;and&lt;/span&gt; &lt;span class="err"&gt;the&lt;/span&gt; &lt;span class="err"&gt;line&lt;/span&gt; &lt;span class="err"&gt;item&lt;/span&gt; &lt;span class="err"&gt;shows&lt;/span&gt; &lt;span class="nf"&gt;a &lt;/span&gt;&lt;span class="s"&gt;"Remove"&lt;/span&gt; button only;
  &lt;span class="err"&gt;no&lt;/span&gt; &lt;span class="err"&gt;payment&lt;/span&gt; &lt;span class="err"&gt;is&lt;/span&gt; &lt;span class="err"&gt;captured&lt;/span&gt; &lt;span class="err"&gt;in&lt;/span&gt; &lt;span class="err"&gt;AN&lt;/span&gt;&lt;span class="nf"&gt;Y &lt;/span&gt;insufficient-stock scenario.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The two-tier degradation (low stock vs zero stock) and the final invariant are what make this executable instead of decorative.&lt;/p&gt;

&lt;h2&gt;
  
  
  Example 4: Rate limiting
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight gherkin"&gt;&lt;code&gt;&lt;span class="nf"&gt;Given &lt;/span&gt;the rate limit for the &lt;span class="s"&gt;"standard"&lt;/span&gt; plan is 100 requests/minute
  &lt;span class="err"&gt;and&lt;/span&gt; &lt;span class="err"&gt;key&lt;/span&gt; &lt;span class="err"&gt;"key_abc123"&lt;/span&gt; &lt;span class="err"&gt;has&lt;/span&gt; &lt;span class="err"&gt;made&lt;/span&gt; &lt;span class="err"&gt;100&lt;/span&gt; &lt;span class="err"&gt;requests&lt;/span&gt; &lt;span class="err"&gt;in&lt;/span&gt; &lt;span class="err"&gt;the&lt;/span&gt; &lt;span class="err"&gt;current&lt;/span&gt; &lt;span class="err"&gt;window&lt;/span&gt;
&lt;span class="nf"&gt;When &lt;/span&gt;the consumer sends request &lt;span class="c"&gt;#101 in the same window&lt;/span&gt;
&lt;span class="nf"&gt;Then &lt;/span&gt;the response is 429 with body
  &lt;span class="err"&gt;{"error"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="err"&gt;"rate_limit_exceeded", "retry_after_seconds"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="err"&gt;&amp;lt;remaining&amp;gt;};&lt;/span&gt;
  &lt;span class="err"&gt;headers&lt;/span&gt; &lt;span class="err"&gt;include&lt;/span&gt; &lt;span class="err"&gt;X-RateLimit-Limit,&lt;/span&gt; &lt;span class="err"&gt;X-RateLimit-Remaining,&lt;/span&gt;
  &lt;span class="err"&gt;X-RateLimit-Reset,&lt;/span&gt; &lt;span class="err"&gt;and&lt;/span&gt; &lt;span class="err"&gt;Retry-After;&lt;/span&gt;
  &lt;span class="err"&gt;successful&lt;/span&gt; &lt;span class="err"&gt;responses&lt;/span&gt; &lt;span class="err"&gt;ALSO&lt;/span&gt; &lt;span class="err"&gt;carry&lt;/span&gt; &lt;span class="err"&gt;the&lt;/span&gt; &lt;span class="err"&gt;X-RateLimit-&lt;/span&gt;&lt;span class="nf"&gt;* &lt;/span&gt;headers;
  &lt;span class="err"&gt;limits&lt;/span&gt; &lt;span class="err"&gt;are&lt;/span&gt; &lt;span class="err"&gt;scoped&lt;/span&gt; &lt;span class="err"&gt;per&lt;/span&gt; &lt;span class="err"&gt;AP&lt;/span&gt;&lt;span class="nf"&gt;I &lt;/span&gt;key, not per IP;
  &lt;span class="err"&gt;429&lt;/span&gt; &lt;span class="err"&gt;responses&lt;/span&gt; &lt;span class="err"&gt;are&lt;/span&gt; &lt;span class="err"&gt;NOT&lt;/span&gt; &lt;span class="err"&gt;counted&lt;/span&gt; &lt;span class="err"&gt;against&lt;/span&gt; &lt;span class="err"&gt;the&lt;/span&gt; &lt;span class="err"&gt;next&lt;/span&gt; &lt;span class="err"&gt;window.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The last two lines settle the arguments your team would otherwise have in the PR thread.&lt;/p&gt;
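
&lt;p&gt;A small fixed-window limiter shows how those decisions land in code. This is a sketch under assumptions the criterion leaves open: the window is fixed rather than sliding, &lt;code&gt;X-RateLimit-Reset&lt;/code&gt; carries seconds until reset, and &lt;code&gt;createRateLimiter&lt;/code&gt; is a hypothetical name.&lt;/p&gt;

```javascript
function createRateLimiter(limit, windowMs) {
  const windows = new Map(); // API key -> { start, count }, so limits are per key, not per IP

  return function check(apiKey, now) {
    let w = windows.get(apiKey);
    if (w === undefined || now - w.start >= windowMs) {
      w = { start: now, count: 0 };
      windows.set(apiKey, w);
    }
    const resetInSeconds = Math.ceil((w.start + windowMs - now) / 1000);
    const allowed = limit > w.count;
    if (allowed) w.count += 1; // rejected requests are never counted

    // Every response carries the X-RateLimit-* headers.
    const headers = {
      "X-RateLimit-Limit": String(limit),
      "X-RateLimit-Remaining": String(limit - w.count),
      "X-RateLimit-Reset": String(resetInSeconds),
    };
    if (allowed) return { status: 200, headers };
    return {
      status: 429,
      headers: { ...headers, "Retry-After": String(resetInSeconds) },
      body: { error: "rate_limit_exceeded", retry_after_seconds: resetInSeconds },
    };
  };
}

const check = createRateLimiter(100, 60000);
let last;
for (let i = 0; i !== 100; i += 1) last = check("key_abc123", 0);
const rejected = check("key_abc123", 30000); // request #101 in the same window
console.log(last.status, rejected.status, rejected.body.retry_after_seconds); // 200 429 30
```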

&lt;h2&gt;
  
  
  Example 5: CSV import with duplicates
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight gherkin"&gt;&lt;code&gt;&lt;span class="nf"&gt;Given &lt;/span&gt;an admin uploads a 10,000-row CSV to /admin/contacts/import
  &lt;span class="err"&gt;and&lt;/span&gt; &lt;span class="err"&gt;200&lt;/span&gt; &lt;span class="err"&gt;rows&lt;/span&gt; &lt;span class="err"&gt;have&lt;/span&gt; &lt;span class="err"&gt;emails&lt;/span&gt; &lt;span class="err"&gt;that&lt;/span&gt; &lt;span class="err"&gt;already&lt;/span&gt; &lt;span class="err"&gt;exist&lt;/span&gt;
&lt;span class="nf"&gt;When &lt;/span&gt;the import job processes the file
&lt;span class="nf"&gt;Then &lt;/span&gt;9,800 contacts are created and 200 are skipped (not updated);
  &lt;span class="err"&gt;the&lt;/span&gt; &lt;span class="err"&gt;result&lt;/span&gt; &lt;span class="err"&gt;page&lt;/span&gt; &lt;span class="err"&gt;shows&lt;/span&gt; &lt;span class="err"&gt;Total/Created/Skipped/Errors&lt;/span&gt; &lt;span class="err"&gt;counts;&lt;/span&gt;
  &lt;span class="nf"&gt;a &lt;/span&gt;CSV of skipped rows is downloadable with row_number, email, reason;
  &lt;span class="err"&gt;email&lt;/span&gt; &lt;span class="err"&gt;comparison&lt;/span&gt; &lt;span class="err"&gt;is&lt;/span&gt; &lt;span class="err"&gt;case-insensitive;&lt;/span&gt;
  &lt;span class="nf"&gt;a &lt;/span&gt;malformed row counts under &lt;span class="s"&gt;"Errors"&lt;/span&gt; and the rest continue;
  &lt;span class="err"&gt;the&lt;/span&gt; &lt;span class="err"&gt;import&lt;/span&gt; &lt;span class="err"&gt;is&lt;/span&gt; &lt;span class="err"&gt;atomic&lt;/span&gt; &lt;span class="err"&gt;per-row,&lt;/span&gt; &lt;span class="err"&gt;not&lt;/span&gt; &lt;span class="err"&gt;per-file&lt;/span&gt; &lt;span class="err"&gt;—&lt;/span&gt; &lt;span class="nf"&gt;an &lt;/span&gt;interruption
  &lt;span class="err"&gt;at&lt;/span&gt; &lt;span class="err"&gt;row&lt;/span&gt; &lt;span class="err"&gt;5,000&lt;/span&gt; &lt;span class="err"&gt;leaves&lt;/span&gt; &lt;span class="err"&gt;the&lt;/span&gt; &lt;span class="err"&gt;first&lt;/span&gt; &lt;span class="err"&gt;5,000&lt;/span&gt; &lt;span class="err"&gt;committed.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
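
&lt;p&gt;The skip, error, and case-insensitivity rules fit in one short function. This is a sketch over already-parsed rows: &lt;code&gt;importContacts&lt;/code&gt; is a hypothetical name, the "contains @" check is a placeholder for real validation, and treating a repeat inside the same file as a duplicate is an assumption the criterion does not spell out.&lt;/p&gt;

```javascript
// existingEmails: emails already stored; rows: parsed CSV rows shaped like { email }.
function importContacts(existingEmails, rows) {
  const seen = new Set(existingEmails.map((e) => e.toLowerCase()));
  const result = { total: rows.length, created: 0, skipped: [], errors: [] };

  rows.forEach((row, index) => {
    const rowNumber = index + 1;
    const email = typeof row.email === "string" ? row.email.trim() : "";
    if (!email.includes("@")) {
      result.errors.push({ row_number: rowNumber, reason: "malformed email" });
      return; // a bad row does not stop the rest
    }
    const key = email.toLowerCase(); // case-insensitive comparison
    if (seen.has(key)) {
      result.skipped.push({ row_number: rowNumber, email, reason: "duplicate email" });
      return; // skipped, not updated
    }
    seen.add(key);
    result.created += 1;
  });

  return result;
}

const report = importContacts(["jane@example.com"], [
  { email: "JANE@example.com" },
  { email: "bob@example.com" },
  { email: "not-an-email" },
  { email: "Bob@Example.com" },
]);
console.log(report.total, report.created, report.skipped.length, report.errors.length); // 4 1 2 1
```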



&lt;h2&gt;
  
  
  Example 6: Notification retry
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight gherkin"&gt;&lt;code&gt;&lt;span class="nf"&gt;Given &lt;/span&gt;the service sends an email via the SMTP provider
&lt;span class="nf"&gt;When &lt;/span&gt;the provider returns a transient error (timeout, 5xx, DNS)
&lt;span class="err"&gt;Then the notification enters a retry queue with exponential backoff&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
  &lt;span class="err"&gt;1&lt;/span&gt; &lt;span class="err"&gt;min&lt;/span&gt; &lt;span class="err"&gt;→&lt;/span&gt; &lt;span class="err"&gt;5&lt;/span&gt; &lt;span class="err"&gt;min&lt;/span&gt; &lt;span class="err"&gt;→&lt;/span&gt; &lt;span class="err"&gt;15&lt;/span&gt; &lt;span class="err"&gt;min&lt;/span&gt; &lt;span class="err"&gt;→&lt;/span&gt; &lt;span class="err"&gt;60&lt;/span&gt; &lt;span class="err"&gt;min;&lt;/span&gt;
  &lt;span class="err"&gt;each&lt;/span&gt; &lt;span class="err"&gt;attempt&lt;/span&gt; &lt;span class="err"&gt;is&lt;/span&gt; &lt;span class="err"&gt;logged&lt;/span&gt; &lt;span class="err"&gt;with&lt;/span&gt; &lt;span class="err"&gt;attempt_number&lt;/span&gt; &lt;span class="err"&gt;and&lt;/span&gt; &lt;span class="err"&gt;next_retry_at;&lt;/span&gt;
  &lt;span class="err"&gt;after&lt;/span&gt; &lt;span class="err"&gt;the&lt;/span&gt; &lt;span class="err"&gt;4th&lt;/span&gt; &lt;span class="err"&gt;failure,&lt;/span&gt; &lt;span class="err"&gt;status&lt;/span&gt; &lt;span class="err"&gt;=&lt;/span&gt; &lt;span class="err"&gt;"permanently_failed"&lt;/span&gt;
  &lt;span class="err"&gt;and&lt;/span&gt; &lt;span class="nf"&gt;an &lt;/span&gt;alert fires with the full error history;
  &lt;span class="nf"&gt;a &lt;/span&gt;PERMANENT error (550 mailbox not found) gets NO retries —
  &lt;span class="err"&gt;status&lt;/span&gt; &lt;span class="err"&gt;fails&lt;/span&gt; &lt;span class="err"&gt;immediately&lt;/span&gt; &lt;span class="err"&gt;and&lt;/span&gt; &lt;span class="err"&gt;the&lt;/span&gt; &lt;span class="err"&gt;user's&lt;/span&gt; &lt;span class="err"&gt;email_verified&lt;/span&gt; &lt;span class="err"&gt;flag&lt;/span&gt;
  &lt;span class="err"&gt;is&lt;/span&gt; &lt;span class="err"&gt;set&lt;/span&gt; &lt;span class="err"&gt;to&lt;/span&gt; &lt;span class="err"&gt;false.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The transient/permanent split is the difference between a retry policy and a retry &lt;em&gt;loop&lt;/em&gt;.&lt;/p&gt;
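
&lt;p&gt;That split can be sketched as a single decision function. Two assumptions are baked in: "the 4th failure" is read as the fourth failed retry (the criterion does not say whether the initial send counts), and the &lt;code&gt;retrying&lt;/code&gt; status plus the &lt;code&gt;nextStep&lt;/code&gt; name and return shape are invented for the example.&lt;/p&gt;

```javascript
const RETRY_DELAYS_MIN = [1, 5, 15, 60]; // schedule from the criterion

// failedRetries: how many retries have already failed (0 right after the first send fails).
// permanent: true for errors such as "550 mailbox not found".
function nextStep(failedRetries, permanent) {
  if (permanent) {
    // No retries at all; the address itself is bad.
    return { status: "permanently_failed", retryInMin: null, alert: false, markEmailUnverified: true };
  }
  if (failedRetries >= RETRY_DELAYS_MIN.length) {
    // Schedule exhausted: stop and alert instead of looping forever.
    return { status: "permanently_failed", retryInMin: null, alert: true, markEmailUnverified: false };
  }
  return {
    status: "retrying",
    retryInMin: RETRY_DELAYS_MIN[failedRetries],
    alert: false,
    markEmailUnverified: false,
  };
}

console.log(nextStep(0, false).retryInMin); // 1
console.log(nextStep(3, false).retryInMin); // 60
console.log(nextStep(4, false).status);     // permanently_failed
console.log(nextStep(0, true).retryInMin);  // null
```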

&lt;h2&gt;
  
  
  The 5 mistakes that make criteria useless
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Too vague&lt;/strong&gt; — "the system works correctly." Untestable.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Implementation instead of behavior&lt;/strong&gt; — "uses Redis cache." The criterion should survive a tech-stack change.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multiple behaviors in one criterion&lt;/strong&gt; — if it has three Whens, it's three criteria.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Happy path only&lt;/strong&gt; — the error paths are where the incidents live.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Untestable performance claims&lt;/strong&gt; — "fast" is not a number. "p95 &amp;lt; 500ms" is.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Steal this blank template
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Acceptance Criteria&lt;/span&gt;

&lt;span class="gu"&gt;### Happy path&lt;/span&gt;
Given &lt;span class="nt"&gt;&amp;lt;precondition&amp;gt;&lt;/span&gt;
When &lt;span class="nt"&gt;&amp;lt;action&amp;gt;&lt;/span&gt;
Then &lt;span class="nt"&gt;&amp;lt;observable&lt;/span&gt; &lt;span class="na"&gt;outcome&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;; &lt;span class="nt"&gt;&amp;lt;outcome&amp;gt;&lt;/span&gt;; &lt;span class="nt"&gt;&amp;lt;outcome&amp;gt;&lt;/span&gt;.

&lt;span class="gu"&gt;### Error handling&lt;/span&gt;
Given &lt;span class="nt"&gt;&amp;lt;precondition&amp;gt;&lt;/span&gt;
When &lt;span class="nt"&gt;&amp;lt;failure&lt;/span&gt; &lt;span class="na"&gt;trigger&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
Then &lt;span class="nt"&gt;&amp;lt;error&lt;/span&gt; &lt;span class="na"&gt;response&lt;/span&gt; &lt;span class="na"&gt;with&lt;/span&gt; &lt;span class="na"&gt;exact&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="na"&gt;message&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;;
  &lt;span class="nt"&gt;&amp;lt;state&lt;/span&gt; &lt;span class="na"&gt;that&lt;/span&gt; &lt;span class="na"&gt;must&lt;/span&gt; &lt;span class="na"&gt;NOT&lt;/span&gt; &lt;span class="na"&gt;change&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;;
  &lt;span class="nt"&gt;&amp;lt;what&lt;/span&gt; &lt;span class="na"&gt;gets&lt;/span&gt; &lt;span class="na"&gt;logged&lt;/span&gt;&lt;span class="err"&gt;/&lt;/span&gt;&lt;span class="na"&gt;alerted&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;.

&lt;span class="gu"&gt;### Edge cases&lt;/span&gt;
Given &lt;span class="nt"&gt;&amp;lt;boundary&lt;/span&gt; &lt;span class="na"&gt;condition:&lt;/span&gt; &lt;span class="na"&gt;duplicate&lt;/span&gt;&lt;span class="err"&gt;,&lt;/span&gt; &lt;span class="na"&gt;concurrent&lt;/span&gt;&lt;span class="err"&gt;,&lt;/span&gt; &lt;span class="na"&gt;empty&lt;/span&gt;&lt;span class="err"&gt;,&lt;/span&gt; &lt;span class="na"&gt;max&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
When &lt;span class="nt"&gt;&amp;lt;action&amp;gt;&lt;/span&gt;
Then &lt;span class="nt"&gt;&amp;lt;deterministic&lt;/span&gt; &lt;span class="na"&gt;outcome&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;.

&lt;span class="gu"&gt;### Performance&lt;/span&gt;
Given &lt;span class="nt"&gt;&amp;lt;load&lt;/span&gt; &lt;span class="na"&gt;condition&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
When &lt;span class="nt"&gt;&amp;lt;action&amp;gt;&lt;/span&gt;
Then &lt;span class="nt"&gt;&amp;lt;metric&lt;/span&gt; &lt;span class="na"&gt;with&lt;/span&gt; &lt;span class="na"&gt;number&lt;/span&gt; &lt;span class="na"&gt;and&lt;/span&gt; &lt;span class="na"&gt;percentile&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;p&gt;&lt;em&gt;This is a condensed cut of the full guide — all 20 examples across auth, e-commerce, APIs, data processing, and notifications — on &lt;a href="https://spec-coding.dev/blog/acceptance-criteria-examples-guide" rel="noopener noreferrer"&gt;Spec Coding&lt;/a&gt;. There's also a free &lt;a href="https://spec-coding.dev/tools/gherkin-generator" rel="noopener noreferrer"&gt;Gherkin generator&lt;/a&gt; if you want the format scaffolded for you.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>testing</category>
      <category>qa</category>
      <category>agile</category>
      <category>productivity</category>
    </item>
    <item>
      <title>'Retry on Error' Is Not a Payment Spec — Write the Failure Matrix Instead</title>
      <dc:creator>guo king</dc:creator>
      <pubDate>Tue, 16 Jun 2026 01:56:54 +0000</pubDate>
      <link>https://dev.to/speccoding/retry-on-error-is-not-a-payment-spec-write-the-failure-matrix-instead-2246</link>
      <guid>https://dev.to/speccoding/retry-on-error-is-not-a-payment-spec-write-the-failure-matrix-instead-2246</guid>
      <description>&lt;p&gt;Most payment specs I've reviewed describe the happy path in three pages and the failure behavior in one sentence: &lt;em&gt;"retry on error."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;That sentence is where 80% of the production incidents come from.&lt;/p&gt;

&lt;p&gt;A payment workflow spec earns its keep by naming — category by category — exactly what the system does when the card network, the issuer, or the customer refuses to cooperate. Here's the structure I use.&lt;/p&gt;

&lt;h2&gt;
  
  
  Start with a failure taxonomy, not a flowchart
&lt;/h2&gt;

&lt;p&gt;Before drawing a single box, force the spec to answer one question: &lt;strong&gt;what are the categories of failure this workflow can produce?&lt;/strong&gt; I insist on five, because collapsing them into "error" is what creates the mess:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Network timeout.&lt;/strong&gt; The processor never answered. The charge may or may not exist. This is the &lt;em&gt;only&lt;/em&gt; category where retrying with the same idempotency key is mandatory.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Soft decline.&lt;/strong&gt; The issuer said no for a recoverable reason — insufficient funds, do-not-honor, expired card. Retry is allowed, but only with customer action or a later attempt window.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hard decline.&lt;/strong&gt; Stolen card, pickup card, fraud. Retrying is never correct, and on some networks it increases your risk score. The spec must forbid it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fraud review.&lt;/strong&gt; The processor accepted the auth but punted on a decision. The response is async. The spec must describe the waiting state and the webhook that ends it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;3DS challenge.&lt;/strong&gt; The issuer demanded Strong Customer Authentication. This is &lt;em&gt;not a failure&lt;/em&gt; — it's a branch. The customer sees a redirect or iframe, and the workflow pauses until they finish.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Every downstream decision — retry policy, user messaging, observability — hangs off this five-row matrix. If the taxonomy is wrong, nothing below it will save you.&lt;/p&gt;

&lt;h2&gt;
  
  
  Retry rules that match the category, not the HTTP status
&lt;/h2&gt;

&lt;p&gt;The rule I write into every payment spec, word for word: &lt;strong&gt;retry policy is a function of failure category, not of HTTP status.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A 402 from Stripe can be either a soft decline you should let the customer fix or a hard decline you should never touch again. The spec has to branch on the processor's &lt;em&gt;decline code&lt;/em&gt;, not the transport code.&lt;/p&gt;

&lt;p&gt;Concretely:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;card_declined&lt;/code&gt; + &lt;code&gt;decline_code: insufficient_funds&lt;/code&gt; → up to three retries spaced by the dunning schedule, each gated on a customer action or a scheduled job.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;card_declined&lt;/code&gt; + &lt;code&gt;decline_code: stolen_card&lt;/code&gt; → permanent flag on the payment method; any subsequent attempt fails closed &lt;em&gt;before hitting the network&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;Connection error or 5xx with no response body → immediate retry &lt;strong&gt;with the same &lt;code&gt;Idempotency-Key&lt;/code&gt;&lt;/strong&gt;, because the processor may have already charged the card and a fresh key would double-charge.&lt;/li&gt;
&lt;/ul&gt;
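&lt;p&gt;As a rough sketch of the rule above, here is how the branch might look in Python. The &lt;code&gt;ProcessorResult&lt;/code&gt; shape, the &lt;code&gt;Action&lt;/code&gt; names, and the two code sets are assumptions made for illustration, not any processor's SDK; a real spec would enumerate every decline code your processor documents.&lt;/p&gt;

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    RETRY_SAME_KEY = "retry_same_key"      # retransmit with the same Idempotency-Key
    RETRY_LATER = "retry_later"            # dunning schedule or customer action
    BLOCK_METHOD = "block_payment_method"  # never retry; fail closed locally
    SURFACE_ERROR = "surface_error"        # unknown code: do not guess


# Illustrative subsets only (assumed); list every documented code in a real spec.
SOFT_DECLINES = {"insufficient_funds", "do_not_honor", "expired_card"}
HARD_DECLINES = {"stolen_card", "pickup_card", "fraudulent"}


@dataclass
class ProcessorResult:
    http_status: Optional[int]          # None means no response arrived at all
    error_code: Optional[str] = None
    decline_code: Optional[str] = None
    has_body: bool = True


def decide(result: ProcessorResult) -> Action:
    """Branch on the failure category, never on the HTTP status alone."""
    no_answer = result.http_status is None
    empty_5xx = (not no_answer) and result.http_status >= 500 and not result.has_body
    if no_answer or empty_5xx:
        return Action.RETRY_SAME_KEY
    if result.error_code == "card_declined":
        if result.decline_code in HARD_DECLINES:
            return Action.BLOCK_METHOD
        if result.decline_code in SOFT_DECLINES:
            return Action.RETRY_LATER
    return Action.SURFACE_ERROR
```

&lt;p&gt;Two results with the same 402 status land in different branches here, which is the whole point of keying on the decline code.&lt;/p&gt;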

&lt;h2&gt;
  
  
  Idempotency keys belong to the attempt, not the request
&lt;/h2&gt;

&lt;p&gt;The single most common mistake I see: idempotency keys scoped to the HTTP request instead of the logical attempt.&lt;/p&gt;

&lt;p&gt;If a timeout triggers a retry and the retry generates a &lt;em&gt;new&lt;/em&gt; key, the processor treats it as a &lt;em&gt;new charge&lt;/em&gt;. The spec must say, in one sentence: &lt;strong&gt;the key is minted when the attempt begins and survives every retransmission inside that attempt.&lt;/strong&gt; A new attempt — a new customer action, a new dunning cycle, a new order — gets a new key. Nothing in between does.&lt;/p&gt;

&lt;p&gt;Also spec the key's lifetime. Processors typically honor keys for 24 hours. If a retry crosses that boundary, reconcile against the processor's ledger (list charges by metadata) instead of assuming the retry is safe.&lt;/p&gt;
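&lt;p&gt;One way to make "the key belongs to the attempt" hard to get wrong is to store the key on the attempt object itself. The sketch below is a minimal illustration; &lt;code&gt;PaymentAttempt&lt;/code&gt;, its fields, and the 24-hour constant are assumptions, so check the window your processor actually documents.&lt;/p&gt;

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

KEY_LIFETIME = timedelta(hours=24)  # assumed; confirm against your processor


def _new_key() -> str:
    return str(uuid.uuid4())


def _now() -> datetime:
    return datetime.now(timezone.utc)


@dataclass
class PaymentAttempt:
    """One logical attempt: the key is minted once, when the attempt begins."""
    order_id: str
    idempotency_key: str = field(default_factory=_new_key)
    started_at: datetime = field(default_factory=_now)
    transmissions: int = 0

    def headers(self) -> dict:
        """Every retransmission inside this attempt sends the same key."""
        self.transmissions += 1
        return {"Idempotency-Key": self.idempotency_key}

    def key_still_honored(self, now: datetime) -> bool:
        """False means: reconcile against the processor's ledger before retrying."""
        return self.started_at + KEY_LIFETIME >= now
```

&lt;p&gt;A new customer action or dunning cycle constructs a new &lt;code&gt;PaymentAttempt&lt;/code&gt; and therefore gets a new key; a timeout retry calls &lt;code&gt;headers()&lt;/code&gt; again on the existing one.&lt;/p&gt;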

&lt;h2&gt;
  
  
  3DS is a first-class state, not an error
&lt;/h2&gt;

&lt;p&gt;If the spec treats SCA as an error, the frontend will do something stupid like show a red banner while the issuer is mid-challenge.&lt;/p&gt;

&lt;p&gt;The spec needs a state called &lt;code&gt;requires_action&lt;/code&gt; (or whatever your processor calls it) with explicit transitions: entered when the auth returns a challenge URL, exited when the webhook confirms success or failure.&lt;/p&gt;

&lt;p&gt;Spec the two flavors separately:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;In-flow challenge:&lt;/strong&gt; the client SDK mounts the iframe, blocks interaction, resolves.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Redirect challenge:&lt;/strong&gt; the browser navigates to the issuer's ACS URL and comes back to a return URL you control.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Nail down the return URL, the query params you expect, and &lt;strong&gt;what happens if the customer closes the tab mid-challenge&lt;/strong&gt;. That last case always gets forgotten, and it's the one that produces stuck subscriptions in production.&lt;/p&gt;

&lt;h2&gt;
  
  
  Auth, capture, and the seven-day cliff
&lt;/h2&gt;

&lt;p&gt;If your workflow does auth-now / capture-later, the spec must call out auth expiry. Most processors auto-void an uncaptured auth at roughly seven days (Stripe is 7, Adyen varies by scheme, some schemes are shorter for debit).&lt;/p&gt;

&lt;p&gt;The spec needs to answer: &lt;em&gt;what happens if the fulfillment job runs on day 8?&lt;/em&gt; My answer is always the same — require a fresh auth before capture attempts beyond day 5, and treat any capture against an expired auth as a hard failure that opens a &lt;em&gt;new&lt;/em&gt; authorization, not a retry.&lt;/p&gt;
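&lt;p&gt;The day-5 and day-7 rule can be written as a small decision function. This is only a sketch: the thresholds come from the paragraph above, the seven-day lifetime is an assumption that varies by processor and scheme, and the returned labels are hypothetical.&lt;/p&gt;

```python
from datetime import datetime, timedelta

AUTH_LIFETIME = timedelta(days=7)  # assumed; varies by processor and scheme
REAUTH_AFTER = timedelta(days=5)   # threshold from the rule described above


def capture_plan(authorized_at: datetime, now: datetime) -> str:
    """Decide what a fulfillment job may do with an existing authorization."""
    age = now - authorized_at
    if age > AUTH_LIFETIME:
        # Expired: a hard failure that opens a new authorization, not a retry.
        return "new_authorization"
    if age > REAUTH_AFTER:
        # Still valid, but close to the cliff: re-authorize before capturing.
        return "fresh_auth_before_capture"
    return "capture"
```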

&lt;p&gt;Multi-capture makes this worse. Spec the partial-capture order, whether over-capture is permitted (usually not), and how refund-before-final-capture interacts with the remaining authorized amount. I've watched a team discover at 2am that their "simple" refund reduced the capturable balance to zero and killed the next shipment.&lt;/p&gt;

&lt;h2&gt;
  
  
  Dunning is a state machine — write it down
&lt;/h2&gt;

&lt;p&gt;For subscription failures, the spec should contain the dunning schedule &lt;em&gt;verbatim&lt;/em&gt;, not a vague "we will retry":&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Attempt&lt;/th&gt;
&lt;th&gt;When&lt;/th&gt;
&lt;th&gt;Side effect&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Immediate, at renewal&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;+3 days&lt;/td&gt;
&lt;td&gt;Silent retry&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;+7 days&lt;/td&gt;
&lt;td&gt;Reminder email sent 24h before the attempt&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;+14 days&lt;/td&gt;
&lt;td&gt;Final-notice email&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cancel&lt;/td&gt;
&lt;td&gt;+21 days&lt;/td&gt;
&lt;td&gt;Access revoked at next cycle boundary&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Each transition is a row in a state table: previous state, trigger, new state, side effects. Without this table the team re-argues the schedule every quarter.&lt;/p&gt;
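&lt;p&gt;To show what "verbatim" can mean in practice, the schedule above can live as data rather than prose. The sketch below assumes the offsets are measured from the renewal date (the table does not say) and uses hypothetical step and side-effect names.&lt;/p&gt;

```python
from datetime import datetime, timedelta
from typing import NamedTuple, Optional


class Step(NamedTuple):
    name: str
    offset_days: int            # assumed to be measured from the renewal date
    side_effect: Optional[str]  # hypothetical labels for the table's side effects


SCHEDULE = [
    Step("attempt_1", 0, None),
    Step("attempt_2", 3, "silent_retry"),
    Step("attempt_3", 7, "email_24h_before"),
    Step("attempt_4", 14, "final_notice_email"),
    Step("cancel", 21, "revoke_access_at_next_cycle_boundary"),
]


def next_step(failed: str) -> Optional[Step]:
    """Return the step that follows a failed one, or None after cancel."""
    names = [s.name for s in SCHEDULE]
    i = names.index(failed)     # raises ValueError for an unknown step
    return SCHEDULE[i + 1] if i + 1 != len(SCHEDULE) else None


def due_at(renewal: datetime, step: Step) -> datetime:
    """When a step is due, given the renewal date."""
    return renewal + timedelta(days=step.offset_days)
```

&lt;p&gt;Changing the schedule then means changing one list in one reviewed place, which is easier to audit than retry delays scattered across jobs.&lt;/p&gt;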

&lt;h2&gt;
  
  
  The webhook is the source of truth
&lt;/h2&gt;

&lt;p&gt;Non-negotiable clause: &lt;strong&gt;the synchronous response from the processor is advisory. The webhook is the ledger.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Forbid any state transition derived only from the API response — capture confirmed, refund settled, dispute opened, 3DS completed all wait for the corresponding event.&lt;/p&gt;

&lt;p&gt;Concrete consequence: the spec needs an outbox or a reconciliation job. If the webhook is delayed, the UI shows "processing" longer than the customer expects. The spec owns that tradeoff and picks a timeout after which a job polls the processor directly. I pick 30 seconds for interactive flows, 15 minutes for background ones.&lt;/p&gt;
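&lt;p&gt;The polling fallback reduces to one comparison. A minimal sketch, using the two timeouts from the paragraph above; the flow labels and the function name are made up for illustration.&lt;/p&gt;

```python
from datetime import datetime, timedelta

# Timeouts taken from the text; the flow names are hypothetical labels.
POLL_AFTER = {
    "interactive": timedelta(seconds=30),
    "background": timedelta(minutes=15),
}


def should_poll(flow: str, submitted_at: datetime, now: datetime, webhook_seen: bool) -> bool:
    """True once the webhook is overdue and a job should query the processor."""
    if webhook_seen:
        return False
    return now - submitted_at >= POLL_AFTER[flow]
```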

&lt;h2&gt;
  
  
  Acceptance criteria with a real retry scenario
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight gherkin"&gt;&lt;code&gt;&lt;span class="nf"&gt;Given &lt;/span&gt;a customer with a Visa ending 4242 and a recurring $29 subscription
&lt;span class="nf"&gt;When &lt;/span&gt;the renewal charge returns card_declined / insufficient_funds
&lt;span class="nf"&gt;Then &lt;/span&gt;the payment is marked past_due
  &lt;span class="nf"&gt;And &lt;/span&gt;attempt 2 is scheduled for +3 days with the same payment method
  &lt;span class="nf"&gt;And &lt;/span&gt;no email is sent on this attempt
  &lt;span class="nf"&gt;And &lt;/span&gt;the customer retains access until the grace period expires

&lt;span class="nf"&gt;Given &lt;/span&gt;the client receives a connection timeout on charge creation
&lt;span class="nf"&gt;When &lt;/span&gt;the client retries within 24 hours
&lt;span class="nf"&gt;Then &lt;/span&gt;it reuses the original Idempotency-Key
  &lt;span class="nf"&gt;And &lt;/span&gt;the processor returns the original charge, not a duplicate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Three metrics the spec has to name
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Authorization rate&lt;/strong&gt; broken down by BIN range and card scheme.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Decline-reason distribution&lt;/strong&gt; with the processor's raw &lt;code&gt;decline_code&lt;/code&gt; preserved — not bucketed into "declined".&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;3DS drop-off&lt;/strong&gt;: challenges initiated vs challenges completed.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Miss any of these and you're flying blind the first time a single issuer changes its risk model and tanks your approval rate overnight. I also require a dashboard for &lt;strong&gt;webhook lag&lt;/strong&gt; — the gap between the processor's event timestamp and your ingestion timestamp. A growing lag is usually the earliest signal that the payment pipeline is about to page someone.&lt;/p&gt;

&lt;h2&gt;
  
  
  The takeaway
&lt;/h2&gt;

&lt;p&gt;A payment spec is not "describe the charge endpoint." It's a &lt;strong&gt;failure-handling document with a small happy path attached.&lt;/strong&gt; Get the taxonomy right, attach a retry rule to each row, treat 3DS as a branch instead of an error, and let the webhook be the source of truth.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This is adapted from the full guide on &lt;a href="https://spec-coding.dev/blog/payment-workflow-spec-failure-and-retry-matrix" rel="noopener noreferrer"&gt;Spec Coding&lt;/a&gt;, where we maintain spec templates and browser-based generators for API contracts, database migrations, and payment workflows.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>payments</category>
      <category>api</category>
      <category>architecture</category>
      <category>backend</category>
    </item>
    <item>
      <title>Why Your API Contract Breaks in Production (And How to Fix It in the Spec)</title>
      <dc:creator>guo king</dc:creator>
      <pubDate>Wed, 22 Apr 2026 08:11:48 +0000</pubDate>
      <link>https://dev.to/speccoding/why-your-api-contract-breaks-in-production-and-how-to-fix-it-in-the-spec-1kjh</link>
      <guid>https://dev.to/speccoding/why-your-api-contract-breaks-in-production-and-how-to-fix-it-in-the-spec-1kjh</guid>
      <description>&lt;p&gt;The most expensive API bugs I've seen weren't implementation bugs. They were contract bugs — cases where the producer and consumer had different beliefs about what the API promised, and nobody wrote it down before the code was shipped.&lt;/p&gt;

&lt;p&gt;Here's the pattern: a team builds an endpoint, the frontend and backend agree on the behavior in a Slack thread or a meeting, and the implementation proceeds. Six months later, a field gets renamed as part of a "minor cleanup," a consumer breaks silently, and the incident postmortem includes the sentence "we thought this was backward compatible."&lt;/p&gt;

&lt;p&gt;The root cause is almost never malice or carelessness. It's that the contract was implicit — shared understanding that lived in people's heads rather than in a document that both sides could point to.&lt;/p&gt;




&lt;h2&gt;
  
  
  What an API contract actually is
&lt;/h2&gt;

&lt;p&gt;An API contract is the explicit agreement between a producer and its consumers about:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What fields exist, what they're named, and what types they carry&lt;/li&gt;
&lt;li&gt;What the valid inputs are (required vs. optional, validation rules)&lt;/li&gt;
&lt;li&gt;What error codes mean and when each one is returned&lt;/li&gt;
&lt;li&gt;What "success" looks like at the HTTP level and the payload level&lt;/li&gt;
&lt;li&gt;What the retry behavior is — is this operation idempotent?&lt;/li&gt;
&lt;li&gt;What the version policy is — when does a change become breaking, and how much notice will consumers get?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Most teams document some of these things. The ones that cause incidents are usually the last three.&lt;/p&gt;




&lt;h2&gt;
  
  
  The three contract decisions teams skip
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Error taxonomy
&lt;/h3&gt;

&lt;p&gt;When an operation fails, what does the consumer do next?&lt;/p&gt;

&lt;p&gt;If your API returns &lt;code&gt;400&lt;/code&gt; for validation errors and &lt;code&gt;400&lt;/code&gt; for business-rule rejections, the consumer can't distinguish between "fix your input" and "this action is blocked for business reasons." They'll implement retry logic that retries unretryable errors, or they'll show the wrong message to the user.&lt;/p&gt;

&lt;p&gt;Before implementation, the spec should define:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;400 Bad Request    — invalid input; client should fix and retry
409 Conflict       — valid input, business rule blocks; client should not retry
422 Unprocessable  — valid input, data state prevents action; may resolve over time
503 Unavailable    — transient; client should retry with backoff
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is a design decision. Make it in the spec, not in the error handler.&lt;/p&gt;
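&lt;p&gt;Once the taxonomy is written down, the consumer side becomes a lookup instead of a guess. The following is a sketch of that lookup in Python; the &lt;code&gt;Next&lt;/code&gt; names and the backoff parameters are assumptions, not part of any particular client library.&lt;/p&gt;

```python
from enum import Enum
from typing import List


class Next(Enum):
    FIX_INPUT = "fix_input_and_resend"
    DO_NOT_RETRY = "do_not_retry"
    WAIT_AND_RECHECK = "wait_for_state_change"
    RETRY_WITH_BACKOFF = "retry_with_backoff"
    ESCALATE = "not_in_contract"


# Mirrors the four rows of the taxonomy above.
CONTRACT = {
    400: Next.FIX_INPUT,
    409: Next.DO_NOT_RETRY,
    422: Next.WAIT_AND_RECHECK,
    503: Next.RETRY_WITH_BACKOFF,
}


def next_action(status: int) -> Next:
    """Anything the contract does not name is escalated, never silently retried."""
    return CONTRACT.get(status, Next.ESCALATE)


def backoff_delays(base_seconds: float = 0.5, attempts: int = 4) -> List[float]:
    """Exponential backoff delays for the 503 branch (parameters are assumed)."""
    return [base_seconds * 2 ** n for n in range(attempts)]
```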

&lt;h3&gt;
  
  
  2. Idempotency
&lt;/h3&gt;

&lt;p&gt;Can the consumer safely retry this request if they don't get a response?&lt;/p&gt;

&lt;p&gt;Payment endpoints, state-mutation endpoints, and any operation with side effects need an explicit answer. "The backend should handle it" is not an answer — it's a deferred argument.&lt;/p&gt;

&lt;p&gt;Spec-first means writing this down before implementation:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;POST /orders
Idempotency: Yes
Key: X-Idempotency-Key header (UUID, client-generated)
Window: 24 hours — duplicate requests within 24h return the original response
After window: treated as new request
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now the consumer knows what to send, the backend knows what to store, and QA knows what to test.&lt;/p&gt;
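&lt;p&gt;For the backend, "what to store" could look like the in-memory sketch below. It is an illustration under assumptions: &lt;code&gt;IdempotencyStore&lt;/code&gt; and its &lt;code&gt;handle&lt;/code&gt; method are hypothetical, and a production version would need a shared datastore and protection against concurrent duplicates.&lt;/p&gt;

```python
from datetime import datetime, timedelta
from typing import Callable, Dict, Tuple

WINDOW = timedelta(hours=24)  # from the contract above


class IdempotencyStore:
    """In-memory stand-in; a real service would use a shared datastore."""

    def __init__(self) -> None:
        self._seen: Dict[str, Tuple[datetime, dict]] = {}

    def handle(self, key: str, now: datetime, create: Callable[[], dict]) -> dict:
        entry = self._seen.get(key)
        if entry is not None:
            stored_at, response = entry
            if stored_at + WINDOW > now:
                return response        # duplicate inside the window: replay
        response = create()            # first request, or the window has passed
        self._seen[key] = (now, response)
        return response
```

&lt;p&gt;QA can test exactly three cases against it: first request, duplicate inside the window, and duplicate after the window.&lt;/p&gt;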

&lt;h3&gt;
  
  
  3. Breaking change definition
&lt;/h3&gt;

&lt;p&gt;Before you ship your first external-facing version, agree on what counts as breaking:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Breaking (requires version bump, advance notice):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Removing a field&lt;/li&gt;
&lt;li&gt;Renaming a field&lt;/li&gt;
&lt;li&gt;Changing a field's type&lt;/li&gt;
&lt;li&gt;Changing the meaning of an existing status code&lt;/li&gt;
&lt;li&gt;Making an optional field required&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Non-breaking (can ship without version bump):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Adding a new optional field&lt;/li&gt;
&lt;li&gt;Adding a new endpoint&lt;/li&gt;
&lt;li&gt;Adding a new enum value (if consumers are built to handle unknown values)&lt;/li&gt;
&lt;li&gt;Bug fixes that bring behavior in line with documented behavior&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Document this. When "minor cleanup" happens six months from now, the engineer making the change has a checklist, not a judgment call.&lt;/p&gt;
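&lt;p&gt;Part of that checklist is mechanical enough to automate. The sketch below compares two hand-written schema dictionaries and reports the field-level breaking changes from the list above. The schema shape is an assumption made for illustration, and it cannot detect a changed status-code meaning, which still needs a human reviewer.&lt;/p&gt;

```python
from typing import Dict, List

# Assumed shape: {"field_name": {"type": "string", "required": True}}
Schema = Dict[str, Dict[str, object]]


def breaking_changes(old: Schema, new: Schema) -> List[str]:
    """Report the field-level breaking changes from the shared definition."""
    problems: List[str] = []
    for name, spec in old.items():
        if name not in new:
            # A rename looks identical to a removal from the consumer's side.
            problems.append(f"removed or renamed field: {name}")
            continue
        if new[name]["type"] != spec["type"]:
            problems.append(f"type changed: {name}")
        if new[name]["required"] and not spec["required"]:
            problems.append(f"optional field made required: {name}")
    return problems
```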




&lt;h2&gt;
  
  
  The OpenAPI to CI pipeline
&lt;/h2&gt;

&lt;p&gt;Once your contract is written, the best way to protect it is to make breaking changes fail in CI before they reach a reviewer.&lt;/p&gt;

&lt;p&gt;The minimal setup:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Generate an OpenAPI spec from your implementation&lt;/strong&gt; (or write it first and validate against it)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Run &lt;code&gt;openapi-diff&lt;/code&gt; or &lt;code&gt;oasdiff&lt;/code&gt; in CI&lt;/strong&gt; — fails the build if a breaking change is detected&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Run contract tests against a real consumer&lt;/strong&gt; — Pact or equivalent; the consumer's expectations are versioned alongside the producer's spec&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This doesn't prevent all contract breaks. It prevents the ones that nobody caught because they were in a field rename that "obviously" wouldn't affect anything.&lt;/p&gt;




&lt;h2&gt;
  
  
  The spec-first version of this workflow
&lt;/h2&gt;

&lt;p&gt;The difference between spec-first API development and regular API development isn't the tools. It's the order:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Regular:&lt;/strong&gt; Build → Document → Ship → Break consumer → Fix&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Spec-first:&lt;/strong&gt; Define contract → Review contract → Build to contract → Validate in CI → Ship&lt;/p&gt;

&lt;p&gt;The contract review is where most of the value lives. It's the moment when the consumer team can say "we rely on that field being optional" before the producer team has already built it as required.&lt;/p&gt;

&lt;p&gt;That conversation is almost free before implementation. It's expensive after deployment.&lt;/p&gt;




&lt;h2&gt;
  
  
  A minimal API contract template
&lt;/h2&gt;

&lt;p&gt;If you don't have a spec process yet, start with this for every new endpoint:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Endpoint: POST /v1/[resource]
Purpose: [one sentence]

Request:
  Required fields: [list]
  Optional fields: [list]
  Validation rules: [list]

Response (success):
  Status: 201
  Fields returned: [list]

Response (errors):
  400: [when, what to tell the consumer]
  409: [when, what to tell the consumer]
  503: [when, consumer should retry with backoff]

Idempotency: [yes/no] — if yes, key mechanism and window
Breaking change policy: [link to shared definition]
Version: v1 — changes that break the above require a new version
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's the minimum. It takes 15 minutes to write. The alternative is a 2am incident.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;I write about spec-first delivery and API contracts at &lt;a href="https://spec-coding.dev" rel="noopener noreferrer"&gt;spec-coding.dev&lt;/a&gt;. The &lt;a href="https://spec-coding.dev/guides/api-contract-checklist" rel="noopener noreferrer"&gt;API Contract Checklist&lt;/a&gt; covers the full pre-release review process.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>api</category>
      <category>programming</category>
      <category>webdev</category>
      <category>architecture</category>
    </item>
    <item>
      <title>I Shipped Fewer Bugs After I Started Writing Specs. Here's the Framework</title>
      <dc:creator>guo king</dc:creator>
      <pubDate>Wed, 22 Apr 2026 08:11:29 +0000</pubDate>
      <link>https://dev.to/speccoding/i-shipped-fewer-bugs-after-i-started-writing-specs-heres-the-framework-2p4f</link>
      <guid>https://dev.to/speccoding/i-shipped-fewer-bugs-after-i-started-writing-specs-heres-the-framework-2p4f</guid>
      <description>

&lt;p&gt;Last March, we had a billing incident that cost us three days. A rounding error in our invoice calculation was silently overcharging customers by fractions of a cent. Not enough for anyone to notice on a single invoice. But multiplied across thousands of transactions over two months, it added up. We had to issue refunds, write a post-mortem, and rebuild trust with a handful of enterprise customers who did the math.&lt;/p&gt;

&lt;p&gt;The fix took 20 minutes. The actual bug was a &lt;code&gt;Math.floor()&lt;/code&gt; where we needed banker's rounding (round half to even), which &lt;code&gt;Math.round()&lt;/code&gt; alone does not provide: it rounds halves toward positive infinity.&lt;/p&gt;

&lt;p&gt;Here's what stung: if anyone had written down "use banker's rounding for currency calculations" before writing the code, this never would have shipped. The developer who wrote it didn't know about our rounding convention. The reviewer didn't think to check. Nobody was wrong. The requirement just lived in one person's head and never made it to the keyboard.&lt;/p&gt;
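&lt;p&gt;For readers who have not met the convention, here is the difference in a few lines. The example uses Python's &lt;code&gt;decimal&lt;/code&gt; module purely for illustration (our incident was in JavaScript), and the helper names are made up.&lt;/p&gt;

```python
from decimal import Decimal, ROUND_FLOOR, ROUND_HALF_EVEN

CENT = Decimal("0.01")


def to_cents_floor(amount: Decimal) -> Decimal:
    """Always rounds down: the behaviour of the buggy version."""
    return amount.quantize(CENT, rounding=ROUND_FLOOR)


def to_cents_bankers(amount: Decimal) -> Decimal:
    """Round half to even: exact ties go to the nearest even cent."""
    return amount.quantize(CENT, rounding=ROUND_HALF_EVEN)
```

&lt;p&gt;With these helpers, &lt;code&gt;Decimal("2.675")&lt;/code&gt; becomes 2.68 and &lt;code&gt;Decimal("2.665")&lt;/code&gt; becomes 2.66 under banker's rounding, while flooring turns both 2.675 and 2.679 into 2.67.&lt;/p&gt;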

&lt;p&gt;That incident changed how our team builds software.&lt;/p&gt;

&lt;h2&gt;
  
  
  What spec-first actually means
&lt;/h2&gt;

&lt;p&gt;Spec-first development is simple: before you write code, you write a short document describing what you're building, what you're not building, and how you'll know it works.&lt;/p&gt;

&lt;p&gt;It's not waterfall. It's not Big Design Up Front. It's 10 minutes of structured thinking in a text file before you open your editor. Think of it as a conversation with your future self and your reviewers, except you have it before you've sunk 8 hours into an implementation.&lt;/p&gt;

&lt;p&gt;The spec travels with the PR. It's a living artifact, not a bureaucratic gate.&lt;/p&gt;

&lt;h2&gt;
  
  
  A real example: CRM contact deduplication
&lt;/h2&gt;

&lt;p&gt;One of our team members picked up a ticket to build contact deduplication for our internal CRM. On the surface, it seemed straightforward: find duplicate contacts, merge them. Here's the abbreviated spec she wrote before starting:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Goal&lt;/span&gt;
Detect and merge duplicate contacts in the CRM based on email
and phone number matching.

&lt;span class="gu"&gt;## Non-goals&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Fuzzy name matching (phase 2)
&lt;span class="p"&gt;-&lt;/span&gt; Automated merging without user confirmation
&lt;span class="p"&gt;-&lt;/span&gt; Deduplication of company records

&lt;span class="gu"&gt;## Acceptance Criteria&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Two contacts with the same email (case-insensitive) are flagged as duplicates
&lt;span class="p"&gt;-&lt;/span&gt; Two contacts with the same phone number (normalized to E.164) are flagged
&lt;span class="p"&gt;-&lt;/span&gt; User sees a side-by-side comparison and chooses which fields to keep
&lt;span class="p"&gt;-&lt;/span&gt; Merge preserves all activity history from both records
&lt;span class="p"&gt;-&lt;/span&gt; Merge is reversible for 30 days

&lt;span class="gu"&gt;## Edge Cases&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Contact A has email X, Contact B has email X and email Y.
  What happens to email Y after merge?
&lt;span class="p"&gt;-&lt;/span&gt; Three-way duplicates: A matches B, B matches C, but A doesn't match C
&lt;span class="p"&gt;-&lt;/span&gt; Contact with 500+ activities: does the merge UI choke?
&lt;span class="p"&gt;-&lt;/span&gt; Merged contact is referenced in active automation workflows
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That spec took about 12 minutes to write. But look at what it caught before a single line of code was written:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The three-way duplicate problem.&lt;/strong&gt; Without the spec, the developer would have built a pairwise merge and discovered the transitive matching issue during QA. Or worse, in production when a user tried to merge a chain of duplicates and got inconsistent results.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The automation workflow reference.&lt;/strong&gt; Contacts linked to active automations needed special handling. That requirement wasn't in the ticket. It came out of the developer thinking through edge cases while writing the spec. Without it, merging a contact mid-automation would have broken running workflows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The reversibility requirement.&lt;/strong&gt; The original ticket said nothing about undo. But when you write "acceptance criteria," you start thinking about what happens when things go wrong. Thirty days of reversibility turned out to be a key requirement from the customer success team, and the developer discovered this during a quick spec review before starting.&lt;/p&gt;
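&lt;p&gt;To make the three-way duplicate problem concrete: if A matches B on email and B matches C on phone, all three belong in one merge group even though A and C share nothing. A union-find pass handles that. The sketch below is not our production code; the tuple layout is an assumption, and phone numbers are assumed to be normalized to E.164 already.&lt;/p&gt;

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def duplicate_groups(contacts: List[Tuple[str, str, str]]) -> List[List[str]]:
    """contacts: (contact_id, email, phone_e164). Returns groups of 2+ ids."""
    parent: Dict[str, str] = {cid: cid for cid, _, _ in contacts}

    def find(x: str) -> str:
        # Follow parent links to the group's root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen: Dict[Tuple[str, str], str] = {}
    for cid, email, phone in contacts:
        # Emails match case-insensitively; empty values never match anything.
        for key in (("email", email.strip().lower()), ("phone", phone)):
            if not key[1]:
                continue
            if key in first_seen:
                parent[find(cid)] = find(first_seen[key])  # union the two groups
            else:
                first_seen[key] = cid

    groups = defaultdict(list)
    for cid in parent:
        groups[find(cid)].append(cid)
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)
```

&lt;p&gt;A pairwise comparison would have reported A-B and B-C as two separate merges; grouping first means the user confirms one merge of three records.&lt;/p&gt;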

&lt;h2&gt;
  
  
  The template
&lt;/h2&gt;

&lt;p&gt;After six months of iteration, our team settled on a lightweight template. Here's the version we use for any feature that takes more than a couple of hours:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Feature: [Name]&lt;/span&gt;
&lt;span class="gs"&gt;**Author:**&lt;/span&gt; [Your name]
&lt;span class="gs"&gt;**Date:**&lt;/span&gt; [Today]
&lt;span class="gs"&gt;**Status:**&lt;/span&gt; Draft / In Review / Approved

&lt;span class="gu"&gt;## Goal&lt;/span&gt;
One paragraph. What are we building and why?

&lt;span class="gu"&gt;## Non-goals&lt;/span&gt;
What are we explicitly NOT doing? (This prevents scope creep.)

&lt;span class="gu"&gt;## Acceptance Criteria&lt;/span&gt;
Numbered list. How do we know this is done?

&lt;span class="gu"&gt;## Edge Cases&lt;/span&gt;
Bullet list. What weird inputs, states, or timing issues could break this?

&lt;span class="gu"&gt;## Rollout&lt;/span&gt;
How does this get to production? Feature flag? Percentage rollout?
Who monitors it?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. Five sections. One page. The non-goals section alone has saved us more rework than any other practice we've adopted. When someone suggests "hey, we should also handle X" during implementation, we check the non-goals list. If X is there, the answer is "not this PR." If it's not there and it should be, we update the spec and have that conversation before writing more code.&lt;/p&gt;

&lt;p&gt;I've published &lt;a href="https://spec-coding.dev/guides/spec-template-examples" rel="noopener noreferrer"&gt;the full template with examples&lt;/a&gt; if you want to grab it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Six months of results
&lt;/h2&gt;

&lt;p&gt;We started using specs consistently in October 2025 across a team of eight engineers. Here's what changed by April 2026:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;P1 incidents dropped from ~3/month to ~1/month.&lt;/strong&gt; Not all of that is attributable to specs. We also improved our CI pipeline. But the incidents we did have were operational (infrastructure, scaling) rather than "we built the wrong thing" or "we missed an edge case."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Code review cycles got shorter.&lt;/strong&gt; Average time from PR open to merge went from 2.1 days to 1.4 days. Reviewers stopped asking "what is this supposed to do?" because the spec was right there. Reviews focused on implementation quality instead of requirements discovery.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Junior engineers ramped faster.&lt;/strong&gt; Two new hires in January were writing production specs within their second week. The specs gave them a structured way to demonstrate understanding before writing code, and gave senior engineers a concrete artifact to review instead of trying to evaluate whether someone "gets it" from a Slack conversation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Scope creep became visible.&lt;/strong&gt; When a feature grew beyond its original spec, we could point to the document and say "this is new scope. Do we want to update the spec or save it for a follow-up?" Before specs, scope crept silently until the PR was twice the expected size.&lt;/p&gt;

&lt;p&gt;The biggest surprise: specs made estimation better. When you force yourself to list acceptance criteria and edge cases upfront, you have a much clearer picture of the actual work. Our sprint velocity predictions got noticeably more accurate, not because we got better at estimating, but because we got better at understanding what we were estimating.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting started
&lt;/h2&gt;

&lt;p&gt;If you've never written a spec before, don't try to change your whole team overnight. Start with yourself, on one feature. Pick something you're about to build that has at least a couple of moving parts. Spend 10 minutes filling in the template above. Share it with whoever would review your PR and ask: "Does this match your understanding?"&lt;/p&gt;

&lt;p&gt;That one conversation will tell you whether spec-first development is worth adopting more broadly. In my experience, it always is.&lt;/p&gt;

&lt;p&gt;I've written a &lt;a href="https://spec-coding.dev/blog/what-is-spec-first-development-complete-guide" rel="noopener noreferrer"&gt;complete guide to spec-first development&lt;/a&gt; that covers the philosophy, the process, and the common objections. And the &lt;a href="https://spec-coding.dev/blog/how-to-write-technical-spec-template-guide" rel="noopener noreferrer"&gt;template is free to download&lt;/a&gt;. Take it, modify it, make it yours.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Daniel Marsh is a software engineer and the author of &lt;a href="https://spec-coding.dev" rel="noopener noreferrer"&gt;spec-coding.dev&lt;/a&gt;, a resource for teams adopting spec-first development.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>productivity</category>
      <category>agile</category>
      <category>engineering</category>
      <category>career</category>
    </item>
    <item>
      <title>10 Software Spec Mistakes That Cause Production Incidents (With Fixes)</title>
      <dc:creator>guo king</dc:creator>
      <pubDate>Wed, 22 Apr 2026 08:10:58 +0000</pubDate>
      <link>https://dev.to/speccoding/10-software-spec-mistakes-that-cause-production-incidents-with-fixes-34lk</link>
      <guid>https://dev.to/speccoding/10-software-spec-mistakes-that-cause-production-incidents-with-fixes-34lk</guid>
      <description>&lt;p&gt;After 12 years in B2B SaaS — and too many postmortems — I've noticed that most production incidents trace back to a decision that wasn't made in the spec. Not a coding error. A specification gap.&lt;/p&gt;

&lt;p&gt;Here are the 10 mistakes I see most often, what the symptom looks like, and how to fix each one before implementation starts.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. Acceptance criteria that can't be tested
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Symptom:&lt;/strong&gt; QA closes the ticket as "passed" but the feature behaves differently than product expected.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The mistake:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The user should see a confirmation message after submitting.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Given a valid form submission, when the user clicks Submit, then a green banner appears at the top of the page with the text "Your changes have been saved" and remains visible until the user navigates away or dismisses it.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Acceptance criteria are a contract between product and QA. If QA has to ask the author what "confirmation" means, the spec didn't do its job.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Non-goals that don't name anything
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Symptom:&lt;/strong&gt; Engineering builds something adjacent to the spec because the boundary wasn't explicit.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The mistake:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Out of scope: internationalization and advanced settings.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Out of scope for this release: (1) translated UI strings — English only; (2) per-user notification preferences — all users receive the same defaults; (3) bulk operations — only single-record edits are supported. A reviewer can reject this change if any of these appear in the implementation.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The non-goal should be specific enough that a reviewer can use it to push back.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. No named decision owner
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Symptom:&lt;/strong&gt; Scope creep during implementation because nobody was authorized to say "that's out."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt; Every spec should have one named person who can approve scope changes. Not a committee. One person. If that person is unavailable, a named backup.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. "Handled elsewhere" for failure paths
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Symptom:&lt;/strong&gt; An edge case hits production and ops discovers there's no fallback, no log, and no rollback path.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The mistake:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Error handling will be managed by the existing error middleware.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt; Name the specific failure modes and what happens in each:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If the payment processor times out after 10 seconds: return HTTP 503, log &lt;code&gt;payment.timeout&lt;/code&gt; with order ID, do NOT charge the card, surface "Payment unavailable — please try again" to the user. Do not retry automatically.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  5. Missing rollback definition
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Symptom:&lt;/strong&gt; A deployment goes wrong and nobody agrees on what "rollback" means, so the incident runs 3x longer than it should.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt; Before implementation, answer: if this change is reverted, what exactly happens?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Code revert only?&lt;/li&gt;
&lt;li&gt;Config flag flip?&lt;/li&gt;
&lt;li&gt;Database migration that needs to be reversed?&lt;/li&gt;
&lt;li&gt;Data that needs to be repaired?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If rollback requires data repair, that repair script should be written before the feature ships.&lt;/p&gt;




&lt;h2&gt;
  
  
  6. Acceptance criteria that only cover the happy path
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Symptom:&lt;/strong&gt; QA testing passes 100%, but three edge cases surface during the first week in production.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt; For every main-flow scenario, write at least one failure scenario and one boundary scenario.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight gherkin"&gt;&lt;code&gt;&lt;span class="err"&gt;Happy path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;Given &lt;/span&gt;a valid user, when they submit, then success.
&lt;span class="err"&gt;Failure path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;Given &lt;/span&gt;an expired session, when they submit, then redirect to login with the form state preserved in session.
&lt;span class="err"&gt;Boundary&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;Given &lt;/span&gt;a user with exactly 0 remaining credits, when they try to submit, then show the upgrade prompt before the form is processed.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
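
&lt;p&gt;As a rough illustration, the three scenarios above map onto three branches that a test can check one by one. The types and the &lt;code&gt;handleSubmit&lt;/code&gt; function are hypothetical names for this sketch, and the order of the checks (session before credits) is an assumption that a real spec would need to state.&lt;/p&gt;

```typescript
// Hypothetical request and result shapes; none of these names come from a real API.
type FormState = { [field: string]: string };
type SubmitRequest = { sessionExpired: boolean; remainingCredits: number; form: FormState };
type SubmitResult =
  | { kind: "success" }
  | { kind: "redirect_to_login"; preservedForm: FormState }
  | { kind: "upgrade_prompt" };

function handleSubmit(req: SubmitRequest): SubmitResult {
  // Failure path: expired session, keep the form so the user does not retype it.
  if (req.sessionExpired) {
    return { kind: "redirect_to_login", preservedForm: req.form };
  }
  // Boundary: zero remaining credits shows the upgrade prompt before processing.
  if (0 >= req.remainingCredits) {
    return { kind: "upgrade_prompt" };
  }
  // Happy path.
  return { kind: "success" };
}
```

&lt;p&gt;Each branch corresponds to one line of the acceptance criteria, so a missing scenario shows up as a missing branch or a missing test.&lt;/p&gt;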






&lt;h2&gt;
  
  
  7. Ambiguous authorization rules
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Symptom:&lt;/strong&gt; A user can access or modify something they shouldn't be able to. Or they can't do something they should be able to.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The mistake:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Only authorized users can edit this record.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Edit access requires: (1) the user is the record owner, OR (2) the user has the &lt;code&gt;admin&lt;/code&gt; role in the same organization. Users with &lt;code&gt;viewer&lt;/code&gt; role can read but not edit. Requests from users outside the record's organization are rejected with 403, regardless of their role.&lt;/p&gt;
&lt;/blockquote&gt;
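
&lt;p&gt;Written as code, the rule above becomes a small predicate. This is a sketch under assumptions: the &lt;code&gt;User&lt;/code&gt; and &lt;code&gt;RecordMeta&lt;/code&gt; shapes are hypothetical, and it reads the last sentence of the spec as taking precedence, so an owner who now belongs to a different organization is rejected too. If that reading is wrong, the spec should say so.&lt;/p&gt;

```typescript
// Hypothetical shapes for illustration only.
type User = { id: string; orgId: string; role: string };
type RecordMeta = { ownerId: string; orgId: string };

function canEdit(user: User, record: RecordMeta): boolean {
  // Outside the record's organization: always rejected (403), whatever the role.
  if (user.orgId !== record.orgId) return false;
  // Inside the organization: the owner or an admin may edit; viewers may not.
  return user.id === record.ownerId || user.role === "admin";
}
```

&lt;p&gt;A caller would translate &lt;code&gt;false&lt;/code&gt; into the 403 response named in the spec.&lt;/p&gt;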




&lt;h2&gt;
  
  
  8. No stop-loss threshold for the release
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Symptom:&lt;/strong&gt; A feature causes elevated error rates after deployment. Nobody knows whether to roll back or wait, so the on-call engineer makes a judgment call at 1am.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt; Name the threshold before the release:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Roll back automatically if: error rate on &lt;code&gt;/checkout&lt;/code&gt; exceeds 2% over a 5-minute window, OR payment processor timeout rate exceeds 5%, OR any P0 alert fires within 2 hours of deployment. The on-call engineer does not need approval to roll back if these thresholds are hit.&lt;/p&gt;
&lt;/blockquote&gt;
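
&lt;p&gt;One way to keep a threshold like this honest is to encode it where release tooling can evaluate it. The sketch below assumes a hypothetical &lt;code&gt;ReleaseMetrics&lt;/code&gt; snapshot supplied by your monitoring system; the field names are illustrative, and the numbers are the ones from the example spec.&lt;/p&gt;

```typescript
// Hypothetical metrics snapshot; rates are fractions (0.02 means 2%).
type ReleaseMetrics = {
  checkoutErrorRate5m: number; // error rate on /checkout over the last 5 minutes
  paymentTimeoutRate: number;  // payment processor timeout rate
  p0AlertFired: boolean;       // whether any P0 alert has fired
  minutesSinceDeploy: number;  // time elapsed since the deployment
};

// Returns every threshold that has been crossed; an empty list means "keep going".
function rollbackReasons(m: ReleaseMetrics): string[] {
  const reasons: string[] = [];
  if (m.checkoutErrorRate5m > 0.02) {
    reasons.push("checkout error rate above 2%");
  }
  if (m.paymentTimeoutRate > 0.05) {
    reasons.push("payment timeout rate above 5%");
  }
  if (m.p0AlertFired) {
    // A P0 alert only counts inside the 2-hour window after deployment.
    if (120 >= m.minutesSinceDeploy) {
      reasons.push("P0 alert within 2 hours of deployment");
    }
  }
  return reasons;
}
```

&lt;p&gt;Returning the reasons rather than a bare boolean gives the on-call engineer something concrete to paste into the incident channel.&lt;/p&gt;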




&lt;h2&gt;
  
  
  9. "Friendly" or "reasonable" as a spec term
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Symptom:&lt;/strong&gt; Engineering and design interpret "friendly error message" differently. Or "reasonable performance" means 200ms to one person and 2 seconds to another.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt; Replace every vague term with a testable one.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"friendly" -&amp;gt; specific message text or message category&lt;/li&gt;
&lt;li&gt;"reasonable" -&amp;gt; explicit metric with threshold&lt;/li&gt;
&lt;li&gt;"fast" -&amp;gt; p95 latency under X ms at Y concurrent users&lt;/li&gt;
&lt;li&gt;"handled" -&amp;gt; specific behavior enumerated&lt;/li&gt;
&lt;/ul&gt;
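
&lt;p&gt;A threshold such as p95 latency only removes ambiguity if everyone computes it the same way. Here is a minimal sketch using the nearest-rank method, which is one of several percentile definitions. The function names are hypothetical, and the samples are assumed to have been collected at the agreed load.&lt;/p&gt;

```typescript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new RangeError("no samples");
  if (0 >= p || p > 100) throw new RangeError("p must be in (0, 100]");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  return sorted[rank - 1];
}

// "Fast" restated as a pass/fail check: p95 strictly under the budget.
function meetsLatencyBudget(samplesMs: number[], budgetMs: number): boolean {
  return budgetMs > percentile(samplesMs, 95);
}
```

&lt;p&gt;Whichever percentile definition the team picks, naming it in the spec is what makes the criterion testable.&lt;/p&gt;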




&lt;h2&gt;
  
  
  10. Spec written after the code
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Symptom:&lt;/strong&gt; The spec reads like documentation of what was built, not a description of what should be built. Nobody reviewed it before implementation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt; This one is structural, not a wording fix. The spec needs to exist — and be reviewed — before implementation starts. The review is where the expensive conversations happen cheaply.&lt;/p&gt;

&lt;p&gt;The spec's job is not to describe decisions after they're made. It's to force decisions before they're expensive.&lt;/p&gt;




&lt;h2&gt;
  
  
  The common thread
&lt;/h2&gt;

&lt;p&gt;Every mistake on this list has the same root cause: a decision that should have been made explicitly in the spec was left implicit, and the team discovered the gap somewhere more expensive — in review, in testing, or in production.&lt;/p&gt;

&lt;p&gt;Spec-first development doesn't mean longer documents. It means making the right decisions earlier, in writing, where they can be challenged before they're coded.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;I publish practical spec-first guides and free templates at &lt;a href="https://spec-coding.dev" rel="noopener noreferrer"&gt;spec-coding.dev&lt;/a&gt;. The &lt;a href="https://spec-coding.dev/guides/spec-review-checklist" rel="noopener noreferrer"&gt;Spec Review Checklist&lt;/a&gt; covers the pre-implementation review in detail.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>programming</category>
      <category>productivity</category>
      <category>devops</category>
      <category>career</category>
    </item>
    <item>
      <title>I Shipped Fewer Bugs After I Started Writing Specs — Here's the Framework</title>
      <dc:creator>guo king</dc:creator>
      <pubDate>Wed, 01 Apr 2026 09:48:53 +0000</pubDate>
      <link>https://dev.to/speccoding/i-shipped-fewer-bugs-after-i-started-writing-specs-heres-the-framework-4ll4</link>
      <guid>https://dev.to/speccoding/i-shipped-fewer-bugs-after-i-started-writing-specs-heres-the-framework-4ll4</guid>
      <description>&lt;h1&gt;
  
  
  I Shipped Fewer Bugs After I Started Writing Specs — Here's the Framework
&lt;/h1&gt;

&lt;p&gt;Last March, we had a billing incident that cost us three days. A rounding error in our invoice calculation was silently overcharging customers by fractions of a cent. Not enough for anyone to notice on a single invoice — but multiplied across thousands of transactions over two months, it added up. We had to issue refunds, write a post-mortem, and rebuild trust with a handful of enterprise customers who did the math.&lt;/p&gt;

&lt;p&gt;The fix took 20 minutes. The actual bug was a &lt;code&gt;Math.floor()&lt;/code&gt; where we needed banker's rounding (round half to even), which JavaScript's &lt;code&gt;Math.round()&lt;/code&gt; does not provide on its own.&lt;/p&gt;
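
&lt;p&gt;To make the convention concrete, here is a minimal sketch of banker's rounding (round half to even). The helper name &lt;code&gt;divideHalfEven&lt;/code&gt; and its integer-based signature are assumptions for illustration, not the billing code from the incident. It is written out by hand because the built-in &lt;code&gt;Math.round()&lt;/code&gt; rounds ties toward positive infinity rather than to the nearest even value.&lt;/p&gt;

```typescript
// Hypothetical helper: divide an integer amount by a positive integer divisor
// and round the result half-to-even (banker's rounding).
// Working in integers sidesteps binary floating-point representation issues.
function divideHalfEven(numerator: number, divisor: number): number {
  if (!Number.isInteger(numerator) || !Number.isInteger(divisor) || 0 >= divisor) {
    throw new RangeError("expected an integer numerator and a positive integer divisor");
  }
  const quotient = Math.floor(numerator / divisor);
  const remainder = numerator - quotient * divisor; // always in [0, divisor)
  const twice = remainder * 2;
  if (twice > divisor) return quotient + 1; // above the midpoint: round up
  if (divisor > twice) return quotient;     // below the midpoint: round down
  // Exact tie: pick whichever neighbour is even.
  return quotient % 2 === 0 ? quotient : quotient + 1;
}
```

&lt;p&gt;With amounts kept in tenths of a cent, &lt;code&gt;divideHalfEven(1005, 10)&lt;/code&gt; gives 100 cents and &lt;code&gt;divideHalfEven(1015, 10)&lt;/code&gt; gives 102, so ties do not drift consistently in one direction.&lt;/p&gt;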

&lt;p&gt;Here's what stung: if anyone had written down "use banker's rounding for currency calculations" before writing the code, this never would have shipped. The developer who wrote it didn't know about our rounding convention. The reviewer didn't think to check. Nobody was wrong — the requirement just lived in one person's head and never made it to the keyboard.&lt;/p&gt;

&lt;p&gt;That incident changed how our team builds software.&lt;/p&gt;

&lt;h2&gt;
  
  
  What spec-first actually means
&lt;/h2&gt;

&lt;p&gt;Spec-first development is simple: before you write code, you write a short document describing what you're building, what you're not building, and how you'll know it works.&lt;/p&gt;

&lt;p&gt;It's not waterfall. It's not Big Design Up Front. It's 10 minutes of structured thinking in a text file before you open your editor. Think of it as a conversation with your future self and your reviewers — except you have it before you've sunk 8 hours into an implementation.&lt;/p&gt;

&lt;p&gt;The spec travels with the PR. It's a living artifact, not a bureaucratic gate.&lt;/p&gt;

&lt;h2&gt;
  
  
  A real example: CRM contact deduplication
&lt;/h2&gt;

&lt;p&gt;One of our team members picked up a ticket to build contact deduplication for our internal CRM. On the surface, it seemed straightforward: find duplicate contacts, merge them. Here's the abbreviated spec she wrote before starting:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Goal&lt;/span&gt;
Detect and merge duplicate contacts in the CRM based on email
and phone number matching.

&lt;span class="gu"&gt;## Non-goals&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Fuzzy name matching (phase 2)
&lt;span class="p"&gt;-&lt;/span&gt; Automated merging without user confirmation
&lt;span class="p"&gt;-&lt;/span&gt; Deduplication of company records

&lt;span class="gu"&gt;## Acceptance Criteria&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Two contacts with the same email (case-insensitive) are flagged as duplicates
&lt;span class="p"&gt;-&lt;/span&gt; Two contacts with the same phone number (normalized to E.164) are flagged
&lt;span class="p"&gt;-&lt;/span&gt; User sees a side-by-side comparison and chooses which fields to keep
&lt;span class="p"&gt;-&lt;/span&gt; Merge preserves all activity history from both records
&lt;span class="p"&gt;-&lt;/span&gt; Merge is reversible for 30 days

&lt;span class="gu"&gt;## Edge Cases&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Contact A has email X, Contact B has email X and email Y —
  what happens to email Y after merge?
&lt;span class="p"&gt;-&lt;/span&gt; Three-way duplicates: A matches B, B matches C, but A doesn't match C
&lt;span class="p"&gt;-&lt;/span&gt; Contact with 500+ activities — does the merge UI choke?
&lt;span class="p"&gt;-&lt;/span&gt; Merged contact is referenced in active automation workflows
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That spec took about 12 minutes to write. But look at what it caught before a single line of code was written:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The three-way duplicate problem.&lt;/strong&gt; Without the spec, the developer would have built a pairwise merge and discovered the transitive matching issue during QA — or worse, in production when a user tried to merge a chain of duplicates and got inconsistent results.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The automation workflow reference.&lt;/strong&gt; Contacts linked to active automations needed special handling. That requirement wasn't in the ticket. It came out of the developer thinking through edge cases while writing the spec. Without it, merging a contact mid-automation would have broken running workflows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The reversibility requirement.&lt;/strong&gt; The original ticket said nothing about undo. But when you write "acceptance criteria," you start thinking about what happens when things go wrong. Thirty days of reversibility turned out to be a key requirement from the customer success team — the developer discovered this during a quick spec review before starting.&lt;/p&gt;
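
&lt;p&gt;The three-way case can be sketched with a union-find structure, which treats matching as transitive so that A, B, and C end up in one group. This is an illustration under assumptions, not the implementation that shipped: the &lt;code&gt;Contact&lt;/code&gt; shape is hypothetical, contact IDs are assumed to be unique, and phone numbers are assumed to be normalized to E.164 before they reach this function.&lt;/p&gt;

```typescript
// Hypothetical contact shape; field names are assumptions for this sketch.
type Contact = { id: string; emails: string[]; phones: string[] };
type StringMap = { [key: string]: string };

// Group contacts into duplicate clusters. Matching is transitive:
// if A matches B and B matches C, all three land in the same cluster.
function clusterDuplicates(contacts: Contact[]): string[][] {
  const parent: StringMap = Object.create(null);
  for (const c of contacts) parent[c.id] = c.id;

  // Follow parent links until reaching the cluster representative.
  const find = (id: string): string => {
    let root = id;
    while (parent[root] !== root) root = parent[root];
    return root;
  };

  // First contact seen for each normalized key (email or phone).
  const owner: StringMap = Object.create(null);
  for (const c of contacts) {
    const keys = c.emails
      .map((e) => "email:" + e.trim().toLowerCase())
      .concat(c.phones.map((p) => "phone:" + p));
    for (const key of keys) {
      if (owner[key] === undefined) owner[key] = c.id;
      else parent[find(c.id)] = find(owner[key]); // merge the two clusters
    }
  }

  const clusters: { [root: string]: string[] } = Object.create(null);
  for (const c of contacts) {
    const root = find(c.id);
    if (clusters[root] === undefined) clusters[root] = [];
    clusters[root].push(c.id);
  }
  // Only clusters with more than one member are duplicates.
  return Object.keys(clusters)
    .map((root) => clusters[root])
    .filter((ids) => ids.length > 1);
}
```

&lt;p&gt;A pairwise comparison would report A/B and B/C as two separate merges; grouping first lets the merge UI present the whole chain at once.&lt;/p&gt;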

&lt;h2&gt;
  
  
  The template
&lt;/h2&gt;

&lt;p&gt;After six months of iteration, our team settled on a lightweight template. Here's the version we use for any feature that takes more than a couple of hours:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Feature: [Name]&lt;/span&gt;
&lt;span class="gs"&gt;**Author:**&lt;/span&gt; [Your name]
&lt;span class="gs"&gt;**Date:**&lt;/span&gt; [Today]
&lt;span class="gs"&gt;**Status:**&lt;/span&gt; Draft / In Review / Approved

&lt;span class="gu"&gt;## Goal&lt;/span&gt;
One paragraph. What are we building and why?

&lt;span class="gu"&gt;## Non-goals&lt;/span&gt;
What are we explicitly NOT doing? (This prevents scope creep.)

&lt;span class="gu"&gt;## Acceptance Criteria&lt;/span&gt;
Numbered list. How do we know this is done?

&lt;span class="gu"&gt;## Edge Cases&lt;/span&gt;
Bullet list. What weird inputs, states, or timing issues could break this?

&lt;span class="gu"&gt;## Rollout&lt;/span&gt;
How does this get to production? Feature flag? Percentage rollout?
Who monitors it?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. Five sections. One page. The non-goals section alone has saved us more rework than any other practice we've adopted. When someone suggests "hey, we should also handle X" during implementation, we check the non-goals list. If X is there, the answer is "not this PR." If it's not there and it should be, we update the spec and have that conversation before writing more code.&lt;/p&gt;

&lt;p&gt;I've published &lt;a href="https://spec-coding.dev/guides/spec-template-examples" rel="noopener noreferrer"&gt;the full template with examples&lt;/a&gt; if you want to grab it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Six months of results
&lt;/h2&gt;

&lt;p&gt;We started using specs consistently in October 2025 across a team of eight engineers. Here's what changed by April 2026:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;P1 incidents dropped from ~3/month to ~1/month.&lt;/strong&gt; Not all of that is attributable to specs — we also improved our CI pipeline. But the incidents we did have were operational (infrastructure, scaling) rather than "we built the wrong thing" or "we missed an edge case."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Code review cycles got shorter.&lt;/strong&gt; Average time from PR open to merge went from 2.1 days to 1.4 days. Reviewers stopped asking "what is this supposed to do?" because the spec was right there. Reviews focused on implementation quality instead of requirements discovery.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Junior engineers ramped faster.&lt;/strong&gt; Two new hires in January were writing production specs within their second week. The specs gave them a structured way to demonstrate understanding before writing code — and gave senior engineers a concrete artifact to review instead of trying to evaluate whether someone "gets it" from a Slack conversation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Scope creep became visible.&lt;/strong&gt; When a feature grew beyond its original spec, we could point to the document and say "this is new scope — do we want to update the spec or save it for a follow-up?" Before specs, scope crept silently until the PR was twice the expected size.&lt;/p&gt;

&lt;p&gt;The biggest surprise: specs made estimation better. When you force yourself to list acceptance criteria and edge cases upfront, you have a much clearer picture of the actual work. Our sprint velocity predictions got noticeably more accurate — not because we got better at estimating, but because we got better at understanding what we were estimating.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting started
&lt;/h2&gt;

&lt;p&gt;If you've never written a spec before, don't try to change your whole team overnight. Start with yourself, on one feature. Pick something you're about to build that has at least a couple of moving parts. Spend 10 minutes filling in the template above. Share it with whoever would review your PR and ask: "Does this match your understanding?"&lt;/p&gt;

&lt;p&gt;That one conversation will tell you whether spec-first development is worth adopting more broadly. In my experience, it always is.&lt;/p&gt;

&lt;p&gt;I've written a &lt;a href="https://spec-coding.dev/blog/what-is-spec-first-development-complete-guide" rel="noopener noreferrer"&gt;complete guide to spec-first development&lt;/a&gt; that covers the philosophy, the process, and the common objections. And the &lt;a href="https://spec-coding.dev/blog/software-spec-template-free-download" rel="noopener noreferrer"&gt;template is free to download&lt;/a&gt; — take it, modify it, make it yours.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Daniel Marsh is a software engineer and the author of &lt;a href="https://spec-coding.dev" rel="noopener noreferrer"&gt;spec-coding.dev&lt;/a&gt;, a resource for teams adopting spec-first development.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>codequality</category>
      <category>documentation</category>
      <category>productivity</category>
      <category>softwareengineering</category>
    </item>
    <item>
      <title>What Is Spec-First Development? (Complete Guide)</title>
      <dc:creator>guo king</dc:creator>
      <pubDate>Sat, 14 Mar 2026 02:43:23 +0000</pubDate>
      <link>https://dev.to/speccoding/what-is-spec-first-development-complete-guide-41nd</link>
      <guid>https://dev.to/speccoding/what-is-spec-first-development-complete-guide-41nd</guid>
      <description>&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://spec-coding.dev/blog/what-is-spec-first-development-complete-guide" rel="noopener noreferrer"&gt;spec-coding.dev&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Spec-first development becomes clearer when the team makes the &lt;strong&gt;hidden decisions visible before coding starts&lt;/strong&gt;. This article focuses on those decisions and why they matter in delivery.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Why Teams Usually Get This Wrong
&lt;/h2&gt;

&lt;p&gt;The problem appears the moment more than one role depends on the answer. Product wants speed, engineering wants boundaries, QA wants testability, and operations wants something that can be rolled back without improvisation. If the spec never resolves those tensions, the work moves downstream as rework.&lt;/p&gt;

&lt;p&gt;The real decision behind spec-first is &lt;strong&gt;ownership&lt;/strong&gt;. Who approves the boundary? Who validates acceptance criteria? Who can stop a rollout? Without those answers in writing, teams ship with blurred accountability.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. A Concrete Delivery Situation
&lt;/h2&gt;

&lt;p&gt;Test your spec against one realistic delivery path. Ask:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What can the reviewer &lt;strong&gt;reject&lt;/strong&gt;?&lt;/li&gt;
&lt;li&gt;What can the tester &lt;strong&gt;verify&lt;/strong&gt;?&lt;/li&gt;
&lt;li&gt;What can the operator &lt;strong&gt;roll back&lt;/strong&gt;?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If the document cannot answer those three angles, it is not ready. Stop asking whether the team "understands the idea" and start asking whether the written spec would survive handoff.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. What the Spec Needs to Say Out Loud
&lt;/h2&gt;

&lt;p&gt;A stronger spec is not longer by accident; it is more explicit exactly where projects tend to drift. At minimum:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The intended outcome, explicit &lt;strong&gt;non-goals&lt;/strong&gt;, and the decision owner for
scope changes&lt;/li&gt;
&lt;li&gt;Acceptance criteria that can be judged &lt;strong&gt;pass or fail&lt;/strong&gt; without tribal
knowledge&lt;/li&gt;
&lt;li&gt;Failure behavior, fallback paths, and which logs or metrics will surface
them after release&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  4. Acceptance Criteria That Remove Guesswork
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Given the approved scope and dependencies&lt;br&gt;
When the team executes the primary flow&lt;br&gt;
Then success can be verified with a concrete result and observable evidence&lt;/li&gt;
&lt;li&gt;Given an exception, retry, or permission boundary is hit&lt;br&gt;
When the system takes the fallback path&lt;br&gt;
Then the user-facing behavior and operational response remain explicit&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The goal is not to force every team into the same wording. The goal is to force decisions about &lt;strong&gt;input, trigger, and expected behavior&lt;/strong&gt; while there is still time to change course cheaply.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. Review Questions Before Implementation Starts
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Which decision would still be &lt;strong&gt;ambiguous&lt;/strong&gt; to a reviewer seeing the change
for the first time?&lt;/li&gt;
&lt;li&gt;Can QA derive the main-flow, failure-path, and regression cases &lt;strong&gt;without
interviewing the author&lt;/strong&gt;?&lt;/li&gt;
&lt;li&gt;If the release fails, does the document tell operations what to watch and
when to stop?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you cannot answer these without a side conversation, the draft is still carrying unpriced uncertainty.&lt;/p&gt;

&lt;h2&gt;
  
  
  6. Rollout, Monitoring, and Rollback
&lt;/h2&gt;

&lt;p&gt;Good specs do not stop at merge readiness. They tell the team how to release safely:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Stage to the &lt;strong&gt;smallest useful audience&lt;/strong&gt; first — don't treat review
approval as a substitute for runtime evidence&lt;/li&gt;
&lt;li&gt;Name the &lt;strong&gt;stop-loss threshold&lt;/strong&gt; before release: error rate, latency, data
mismatch, or override volume&lt;/li&gt;
&lt;li&gt;Record what rollback actually means: code revert, config switch, job pause,
or data repair&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  7. Common Mistakes
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Treating scope as obvious and leaving the real policy decision for
implementation review&lt;/li&gt;
&lt;li&gt;Hiding risky edge behavior behind vague phrases like &lt;em&gt;reasonable&lt;/em&gt;,
&lt;em&gt;friendly&lt;/em&gt;, or &lt;em&gt;handled elsewhere&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;Publishing a document that sounds complete but never states how success or
rollback will be judged&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  8. When to Go Deep vs. Stay Lightweight
&lt;/h2&gt;

&lt;p&gt;The right amount of detail is the amount that &lt;strong&gt;prevents expensive invention later&lt;/strong&gt;. If a missing sentence would force engineering, QA, or operations to guess, that sentence belongs in the spec.&lt;/p&gt;

&lt;p&gt;What matters is not document length. What matters is whether the document prevents the team from rediscovering the same decision in implementation, testing, and release review.&lt;/p&gt;

&lt;h2&gt;
  
  
  9. Final Takeaway
&lt;/h2&gt;

&lt;p&gt;Spec-first development becomes easier once the spec names the risky decisions early enough for someone to challenge them. That is the practical value: &lt;strong&gt;less improvisation, fewer surprises in review, and cleaner evidence when it is time to ship&lt;/strong&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Daniel Marsh is a senior software engineer with 12 years of experience, specializing in spec-first development. More articles at &lt;a href="https://spec-coding.dev" rel="noopener noreferrer"&gt;spec-coding.dev&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>architecture</category>
      <category>programming</category>
      <category>productivity</category>
    </item>
  </channel>
</rss>
