guo king

Posted on Jun 16 • Originally published at spec-coding.dev

Acceptance Criteria Your QA Can Run Without Asking You Anything (6 Copyable Examples)


The acceptance criterion I reach for most is not the happy path. It's the duplicate action: double-clicked submit buttons, repeated webhook deliveries, retried payment calls, imported rows seen twice.

That one scenario exposes more vague thinking than a dozen success cases:

Given a user submits the same request twice within 2 seconds
When both requests reach the server
Then exactly one state change is recorded
  And the second response returns the first operation id
  And the audit log marks it as a replay
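Here is one way that scenario might turn into a runnable check. It is a minimal sketch, not a prescribed design: `SubmissionService`, the client-supplied `idempotency_key`, and the in-memory storage are all hypothetical, and the 2-second window is not modeled (the key alone decides what counts as "the same request").

```python
from dataclasses import dataclass, field


@dataclass
class SubmissionService:
    """Hypothetical in-memory handler that deduplicates on a client-supplied key."""
    operations: dict = field(default_factory=dict)    # idempotency_key -> operation id
    state_changes: list = field(default_factory=list)
    audit_log: list = field(default_factory=list)

    def submit(self, idempotency_key: str, payload: dict) -> dict:
        if idempotency_key in self.operations:
            # Replay: no new state change, hand back the original operation id.
            op_id = self.operations[idempotency_key]
            self.audit_log.append({"operation_id": op_id, "replay": True})
            return {"operation_id": op_id, "replayed": True}
        op_id = f"op_{len(self.operations) + 1}"
        self.operations[idempotency_key] = op_id
        self.state_changes.append(payload)
        self.audit_log.append({"operation_id": op_id, "replay": False})
        return {"operation_id": op_id, "replayed": False}


def test_duplicate_submit_records_one_state_change():
    service = SubmissionService()                        # Given
    first = service.submit("key-1", {"amount": 10})      # When (first delivery)
    second = service.submit("key-1", {"amount": 10})     # When (duplicate)
    assert len(service.state_changes) == 1               # Then
    assert second["operation_id"] == first["operation_id"]
    assert service.audit_log[-1]["replay"] is True
```

Each Then line maps to exactly one assertion, which is a decent smell test for whether the criterion was observable in the first place.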

If your team can't answer what happens in that scenario, the spec isn't done. Here's the format that forces the answer, plus six examples you can copy.

The anatomy of a criterion that works

Every Given/When/Then has three parts with specific jobs:

Given — the precondition. What must be true before the action occurs?
When — the trigger. What specific event causes the behavior?
Then — the observable outcome. How does someone testing this know it worked?

The quality bar: QA can turn it into a test without asking the author what they meant.

"The system responds quickly" fails the bar — quickly could be 200ms or 5 seconds depending on who's reading. "The API responds within 500ms at p95" passes.

Bad:

"The login should work correctly and handle errors gracefully."

Good:

Given a registered user is on the login page
When the user enters a valid email and correct password and clicks "Sign in"
Then the user is redirected to the dashboard within 2 seconds;
  the session cookie is set with HttpOnly and Secure flags;
  the last_login_at timestamp is updated in the users table.

Three observable outcomes. Zero interpretation needed.

Example 1: Account lockout

Given a registered user with email "jane@example.com"
  and the account is not currently locked
When the user submits an incorrect password 3 times consecutively
  within a 10-minute window
Then the account is locked for 15 minutes;
  subsequent login attempts return HTTP 429 with body
  {"error": "account_locked", "retry_after_seconds": 900};
  the form displays "Account locked. Try again in 15 minutes.";
  a login_lockout event is written to the audit_log table;
  after 15 minutes the account unlocks automatically.

Notice what's pinned down: the window (10 min), the count (3), the duration (15 min), the status code, the error body, the UI copy, and the audit trail.
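Because every number is pinned, the rule can be sketched as a small state machine with an injected clock. This is an illustration under assumptions: `LockoutPolicy`, the 401 `invalid_credentials` response for an ordinary failed attempt, and a `retry_after_seconds` that counts down during the lock are my choices, not something the criterion states.

```python
from dataclasses import dataclass, field

MAX_FAILURES = 3
FAILURE_WINDOW_S = 10 * 60
LOCK_DURATION_S = 15 * 60


@dataclass
class LockoutPolicy:
    """Hypothetical per-account lockout state; `now` is seconds from any clock."""
    failures: list = field(default_factory=list)   # timestamps of recent consecutive failures
    locked_until: float = 0.0
    audit_log: list = field(default_factory=list)

    def attempt(self, password_ok: bool, now: float) -> dict:
        if now < self.locked_until:
            remaining = int(self.locked_until - now)
            return {"status": 429, "error": "account_locked",
                    "retry_after_seconds": remaining}
        if password_ok:
            self.failures.clear()   # a success breaks the consecutive streak
            return {"status": 200}
        # Keep only failures inside the 10-minute window, then record this one.
        self.failures = [t for t in self.failures if now - t < FAILURE_WINDOW_S]
        self.failures.append(now)
        if len(self.failures) >= MAX_FAILURES:
            self.locked_until = now + LOCK_DURATION_S
            self.failures.clear()
            self.audit_log.append({"event": "login_lockout", "at": now})
            return {"status": 429, "error": "account_locked",
                    "retry_after_seconds": LOCK_DURATION_S}
        return {"status": 401, "error": "invalid_credentials"}
```

Passing `now` in, instead of reading the wall clock, is what lets QA test "after 15 minutes the account unlocks automatically" without waiting 15 minutes.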

Example 2: Session expiry

Given a user is logged in
  and the session timeout is configured to 30 minutes of inactivity
When the user performs no actions for 30 consecutive minutes
Then the next request returns HTTP 401;
  the session is invalidated server-side and the session cookie is cleared;
  the user is redirected to /login?reason=session_expired;
  the page displays "Your session expired due to inactivity.";
  unsaved client form data is NOT recoverable
  (known limitation, documented in the UI).

That last line is the underrated move: writing down what the system deliberately does NOT do. It converts a future bug report into a documented decision.
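The expiry rule itself is small enough to sketch. Treat this as an assumption-laden illustration: `handle_request`, the shape of the `session` dict, and the choice that exactly 30 minutes of idle time already counts as expired are hypothetical, and the criterion would need to say which side of the boundary wins.

```python
SESSION_TIMEOUT_S = 30 * 60


def handle_request(session: dict, now: float) -> dict:
    """Describe the response to a request arriving at `now` (seconds)."""
    idle = now - session["last_activity_at"]
    if idle >= SESSION_TIMEOUT_S:
        session["active"] = False   # invalidate on the server
        return {
            "status": 401,
            "clear_cookie": True,
            "redirect": "/login?reason=session_expired",
        }
    session["last_activity_at"] = now   # any request resets the inactivity clock
    return {"status": 200}
```

Writing it out surfaces the boundary question (`>=` or `>`) that the prose version lets you skip.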

Example 3: Checkout with insufficient stock

Given a customer has 3 units of SKU-2087 in their cart
  and available_quantity for SKU-2087 is now 1
When the customer clicks "Place Order"
Then the order is NOT placed;
  the page shows "Some items are no longer available
  in the requested quantity.";
  the line item shows "Only 1 available — please update quantity.";
  the customer can update to 1 and retry;
  if available_quantity drops to 0 between page load and submit,
  the message reads "SKU-2087 is out of stock"
  and the line item shows a "Remove" button only;
  no payment is captured in ANY insufficient-stock scenario.

The two-tier degradation (low stock vs zero stock) and the final invariant are what make this executable instead of decorative.
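A rough sketch of that decision, with the invariant made explicit. The function name, the `reason` codes, and the `line_actions` list are hypothetical stand-ins for the real messages and buttons; only the branching comes from the criterion.

```python
def place_order(sku: str, requested: int, available: int) -> dict:
    """Decide the outcome for one line item at submit time."""
    if requested <= available:
        return {"placed": True, "payment_captured": True}
    # Invariant from the criterion: no payment in ANY insufficient-stock branch.
    outcome = {"placed": False, "payment_captured": False, "sku": sku}
    if available == 0:
        outcome.update(reason="out_of_stock", line_actions=["remove"])
    else:
        outcome.update(reason="low_stock", max_quantity=available,
                       line_actions=["update_quantity"])
    return outcome
```

Setting `payment_captured` once, before the branches split, means a third tier added later inherits the invariant by default.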

Example 4: Rate limiting

Given the rate limit for the "standard" plan is 100 requests/minute
  and key "key_abc123" has made 100 requests in the current window
When the consumer sends request #101 in the same window
Then the response is 429 with body
  {"error": "rate_limit_exceeded", "retry_after_seconds": <remaining>};
  headers include X-RateLimit-Limit, X-RateLimit-Remaining,
  X-RateLimit-Reset, and Retry-After;
  successful responses ALSO carry the X-RateLimit-* headers;
  limits are scoped per API key, not per IP;
  429 responses are NOT counted against the next window.

The last two lines settle the arguments your team would otherwise have in the PR thread.
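For illustration, here is a fixed-window limiter that satisfies those lines. The criterion does not say whether windows are fixed or sliding, so the fixed window is an assumption, as are the `FixedWindowLimiter` name, the in-memory counter, and using the window's end (in seconds) as the `X-RateLimit-Reset` value.

```python
import math
from dataclasses import dataclass, field

LIMIT = 100
WINDOW_S = 60


@dataclass
class FixedWindowLimiter:
    """Hypothetical limiter; counts are keyed by API key, never by IP."""
    counts: dict = field(default_factory=dict)   # (api_key, window index) -> accepted requests

    def check(self, api_key: str, now: float) -> dict:
        window = int(now // WINDOW_S)
        reset_at = (window + 1) * WINDOW_S
        used = self.counts.get((api_key, window), 0)
        headers = {"X-RateLimit-Limit": LIMIT, "X-RateLimit-Reset": reset_at}
        if used >= LIMIT:
            # Rejected requests never touch a counter, so they cannot
            # eat into this window or the next one.
            retry_after = math.ceil(reset_at - now)
            headers.update({"X-RateLimit-Remaining": 0, "Retry-After": retry_after})
            return {"status": 429, "headers": headers,
                    "body": {"error": "rate_limit_exceeded",
                             "retry_after_seconds": retry_after}}
        self.counts[(api_key, window)] = used + 1
        headers["X-RateLimit-Remaining"] = LIMIT - used - 1
        return {"status": 200, "headers": headers, "body": {}}
```

Both the 200 and the 429 paths build on the same `headers` dict, which is how "successful responses ALSO carry the X-RateLimit-* headers" stays true without a separate code path.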

Example 5: CSV import with duplicates

Given an admin uploads a 10,000-row CSV to /admin/contacts/import
  and 200 rows have emails that already exist
  and the remaining 9,800 rows are well-formed
When the import job processes the file
Then 9,800 contacts are created and 200 are skipped (not updated);
  the result page shows Total/Created/Skipped/Errors counts;
  a CSV of skipped rows is downloadable with row_number, email, reason;
  email comparison is case-insensitive;
  a malformed row counts under "Errors" and the rest continue;
  the import is atomic per-row, not per-file — an interruption
  at row 5,000 leaves the first 5,000 committed.
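A small sketch of the per-row loop these lines describe. Assumptions to flag: `import_contacts` and its return shape are hypothetical, "malformed" is reduced to "no @ in the email", row numbers count data rows after the header, and a duplicate inside the same file is skipped the same way as one that already exists (the criterion does not say).

```python
import csv
import io


def import_contacts(csv_text: str, existing_emails: set[str]) -> dict:
    """Process rows one at a time; each row's outcome is recorded independently."""
    seen = {e.lower() for e in existing_emails}   # case-insensitive comparison
    result = {"total": 0, "created": [], "skipped": [], "errors": []}
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        result["total"] += 1
        email = (row.get("email") or "").strip()
        if "@" not in email:
            # Malformed row: count it under errors and keep going.
            result["errors"].append({"row_number": row_number,
                                     "reason": "invalid_email"})
            continue
        if email.lower() in seen:
            result["skipped"].append({"row_number": row_number, "email": email,
                                      "reason": "duplicate_email"})
            continue
        seen.add(email.lower())
        result["created"].append(email)   # stands in for a per-row commit
    return result
```

The `skipped` entries already carry `row_number`, `email`, and `reason`, so the downloadable CSV is a straight dump of that list.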

Example 6: Notification retry

Given the service sends an email via the SMTP provider
When the provider returns a transient error (timeout, 5xx, DNS)
Then the notification enters a retry queue with exponential backoff:
  1 min → 5 min → 15 min → 60 min;
  each attempt is logged with attempt_number and next_retry_at;
  after the 4th retry also fails, status = "permanently_failed"
  and an alert fires with the full error history;
  a PERMANENT error (550 mailbox not found) gets NO retries —
  status fails immediately and the user's email_verified flag
  is set to false.

The transient/permanent split is the difference between a retry policy and a retry loop.
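That split can be sketched as one decision function. Names like `next_step` and the error-kind labels are hypothetical. The sketch also takes a reading the criterion leaves open: it uses all four delays, so the notification is marked permanently failed only once the fourth retry has failed.

```python
BACKOFF_MINUTES = [1, 5, 15, 60]
TRANSIENT = {"timeout", "5xx", "dns"}


def next_step(error_kind: str, retries_done: int) -> dict:
    """Decide what happens after a failed send.

    retries_done is the number of retries already attempted
    (0 right after the initial send fails).
    """
    if error_kind not in TRANSIENT:
        # Permanent error (for example 550 mailbox not found): no retries at all.
        return {"status": "permanently_failed", "retry_in_minutes": None,
                "alert": False, "email_verified": False}
    if retries_done >= len(BACKOFF_MINUTES):
        # Schedule exhausted: give up and alert with the error history.
        return {"status": "permanently_failed", "retry_in_minutes": None,
                "alert": True, "email_verified": None}
    return {"status": "retrying",
            "retry_in_minutes": BACKOFF_MINUTES[retries_done],
            "alert": False, "email_verified": None}
```

Here `email_verified: None` means "leave the flag alone"; only the permanent-error branch flips it to false.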

The 5 mistakes that make criteria useless

Too vague — "the system works correctly." Untestable.
Implementation instead of behavior — "uses Redis cache." The criterion should survive a tech-stack change.
Multiple behaviors in one criterion — if it has three Whens, it's three criteria.
Happy path only — the error paths are where the incidents live.
Untestable performance claims — "fast" is not a number. "p95 < 500ms" is.
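On that last point: even "p95 < 500ms" only becomes testable once everyone agrees how p95 is computed. A minimal sketch, assuming the nearest-rank definition (other definitions interpolate and can give slightly different numbers); the function names are hypothetical.

```python
def p95(samples_ms: list[float]) -> float:
    """Nearest-rank 95th percentile: the smallest sample such that
    at least 95% of all samples are at or below it."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    rank = (95 * len(ordered) + 99) // 100   # ceil(0.95 * n) in integer math, 1-based
    return ordered[rank - 1]


def meets_latency_criterion(samples_ms: list[float], budget_ms: float = 500.0) -> bool:
    return p95(samples_ms) < budget_ms
```

With 100 samples, five slow outliers still pass and a sixth fails, which is exactly the tolerance the percentile is meant to express.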

Steal this blank template

## Acceptance Criteria

### Happy path
Given <precondition>
When <action>
Then <observable outcome>; <outcome>; <outcome>.

### Error handling
Given <precondition>
When <failure trigger>
Then <error response with exact status/message>;
  <state that must NOT change>;
  <what gets logged/alerted>.

### Edge cases
Given <boundary condition: duplicate, concurrent, empty, max>
When <action>
Then <deterministic outcome>.

### Performance
Given <load condition>
When <action>
Then <metric with number and percentile>.

This is a condensed cut of the full guide — all 20 examples across auth, e-commerce, APIs, data processing, and notifications — on Spec Coding. There's also a free Gherkin generator if you want the format scaffolded for you.
