Acceptance Criteria for AI Tasks: A Simple Template That Cuts Rework
A surprisingly large amount of AI frustration comes from the same sentence:
“This is not quite what I meant.”
That sentence usually means the model failed.
But sometimes it means you never defined success clearly enough to review the output quickly.
That is what acceptance criteria fix.
Instead of asking the model to “write a summary,” “make a plan,” or “review this code,” you define what a good result must include before generation starts.
That sounds obvious.
It is also one of the fastest ways to reduce rework.
What acceptance criteria do for AI work
Acceptance criteria turn taste into checks.
They answer questions like:
- what must be present?
- what must not be present?
- how will we know this is usable?
- what level of detail is expected?
- what counts as incomplete?
Without them, review becomes vague.
With them, review becomes faster because you are checking a result against a known target.
The problem with loose prompts
Consider this prompt:
Write an implementation plan for this feature.
That can produce all kinds of plausible outputs:
- a short summary
- a project plan with no risks
- a detailed architecture proposal
- a task list without testing
- a wall of prose nobody wants to execute
Now compare it to this:
Write an implementation plan for this feature.
Acceptance criteria:
- include scope summary in 3 bullets or fewer
- identify dependencies and open questions
- include numbered implementation steps
- include a test plan with unit, integration, and manual checks
- highlight risks that could delay delivery
- keep total length under 700 words
That second version is much easier to evaluate.
A practical template
This is the template I use most often:
Acceptance criteria:
- must include:
- must avoid:
- output format:
- verification checks:
- done when:
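If you build prompts in code, the template maps cleanly to a small helper that renders criteria into a prompt suffix. This is a sketch, not a standard API; the function name and field names are illustrative:

```javascript
// Hypothetical helper: renders the five-part template into prompt text.
function renderCriteria({ mustInclude = [], mustAvoid = [], format, checks = [], doneWhen }) {
  const lines = ["Acceptance criteria:"];
  for (const item of mustInclude) lines.push(`- must include: ${item}`);
  for (const item of mustAvoid) lines.push(`- must avoid: ${item}`);
  if (format) lines.push(`- output format: ${format}`);
  for (const check of checks) lines.push(`- verification check: ${check}`);
  if (doneWhen) lines.push(`- done when: ${doneWhen}`);
  return lines.join("\n");
}

console.log(renderCriteria({
  mustInclude: ["explicit assumptions", "test cases"],
  mustAvoid: ["generic filler"],
  format: "markdown with H2 headings",
  checks: ["every claim must be supported by provided context"],
  doneWhen: "ready to paste into GitHub",
}));
```

Appending the rendered block to the task description keeps the criteria consistent across repeated runs.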
Let’s unpack each one.
Must include
List the essential ingredients.
Examples:
- at least 3 concrete examples
- explicit assumptions
- rollback considerations
- citations or source links
- test cases
- next actions
Must avoid
This is just as important.
Examples:
- no generic filler
- no invented facts
- no mention of internal chain-of-thought
- no large rewrites outside the requested scope
- no markdown tables if the target platform hates them
Output format
Formatting errors create more friction than people admit.
Be explicit:
- markdown with H2 headings
- JSON matching this schema
- bullet list with severity labels
- email draft with subject and body
- 5-item numbered plan
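When the format is machine-checkable, like “JSON matching this schema,” a quick structural check is cheap. A minimal sketch with no schema library, where the field names are placeholders for your real schema:

```javascript
// Illustrative: verify a model's JSON output parses and has required top-level fields.
function matchesShape(jsonText, requiredFields) {
  let parsed;
  try {
    parsed = JSON.parse(jsonText);
  } catch {
    return false; // not valid JSON at all
  }
  if (typeof parsed !== "object" || parsed === null) return false;
  return requiredFields.every((field) => field in parsed);
}

console.log(matchesShape('{"title": "x", "steps": []}', ["title", "steps"])); // true
console.log(matchesShape("not json", ["title"])); // false
```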
Verification checks
These are cheap checks you can do in seconds.
Examples:
- every issue must have evidence
- every recommendation must map to a stated risk
- every code change must include a test note
- every claim must be supported by provided context
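Some of these manual checks have cheap automated cousins. A sketch, assuming outputs are plain text; the regexes are placeholders for whatever conventions your criteria actually require:

```javascript
// Illustrative check runner: each check is a name plus a cheap predicate on the output text.
const checks = [
  { name: "has evidence markers", pass: (text) => /evidence:/i.test(text) },
  { name: "mentions a test note", pass: (text) => /test/i.test(text) },
  { name: "under 700 words", pass: (text) => text.split(/\s+/).filter(Boolean).length <= 700 },
];

function runChecks(text) {
  return checks.map(({ name, pass }) => ({ name, ok: pass(text) }));
}

const results = runChecks("Evidence: null check missing. Test: add a unit test.");
for (const r of results) console.log(`${r.ok ? "PASS" : "FAIL"} ${r.name}`);
```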
Done when
This is the final threshold.
Examples:
- ready to paste into GitHub
- safe to send after one human review
- executable by another engineer without follow-up questions
- publish-ready for Dev.to
Example: content drafting
Weak prompt:
Write a Dev.to post about debugging prompts.
Stronger version:
Write a Dev.to article about debugging prompts.
Acceptance criteria:
- must include a concrete debugging checklist
- must explain at least 3 common failure modes
- must include one code or prompt example
- must avoid hype and generic "AI will change everything" filler
- format as publish-ready markdown with frontmatter
- done when a developer could publish it with only light copy edits
Notice what happened.
The model now knows both the topic and the bar.
Example: AI-assisted coding
Weak prompt:
Fix this bug.
Better:
Fix this bug.
Acceptance criteria:
- identify the likely root cause before proposing the patch
- keep the change minimal and local
- include a failing-test-first strategy if practical
- explain any assumptions about edge cases
- do not modify unrelated files
- done when the patch and test plan are both clear enough for review
That turns a vague repair request into a safer engineering task.
Why this matters more than prompt cleverness
A lot of prompting advice focuses on phrasing tricks.
Some of those help.
But in practical workflows, acceptance criteria usually matter more because they shape the review loop.
A fancy prompt can still produce a hard-to-judge answer.
A plain prompt with clear criteria often produces something much easier to accept or reject.
That is a better operating model.
Turn criteria into a checklist
If the task repeats, make the criteria reusable.
For example:
```javascript
const articleCriteria = [
  "strong intro with a clear problem",
  "at least one concrete example",
  "actionable steps, not just theory",
  "no AI self-reference",
  "publish-ready markdown",
];
```
Or as a markdown snippet:
```markdown
## Standard article checks
- problem is clear in the first 5 lines
- at least one example appears before the midpoint
- conclusion gives an immediate next step
- title is concrete, not abstract
```
If you repeat the same work often, these little checklists compound fast.
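A criteria array can also be rendered into a paste-ready checklist at review time. A minimal sketch, using an illustrative `standardChecks` list:

```javascript
// Sketch: turn a reusable criteria array into a markdown checklist for review.
const standardChecks = [
  "strong intro with a clear problem",
  "at least one concrete example",
  "publish-ready markdown",
];

function toChecklist(criteria) {
  return criteria.map((item) => `- [ ] ${item}`).join("\n");
}

console.log(toChecklist(standardChecks));
// - [ ] strong intro with a clear problem
// - [ ] at least one concrete example
// - [ ] publish-ready markdown
```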
Common mistakes
Mistake 1: criteria that are still subjective
“Make it great” is not a criterion.
“Include 3 concrete examples and avoid generic claims” is.
Mistake 2: too many criteria
If you hand the model 25 rules, you may create a different failure mode: compliance overload.
Use the smallest set that protects quality.
Mistake 3: criteria that conflict
Example:
- be highly detailed
- keep it under 250 words
If one rule matters more, say so.
Mistake 4: no negative criteria
Teams often specify what they want but forget to say what to avoid.
That is how filler sneaks back in.
A short checklist you can steal
For many AI tasks, this is enough:
Acceptance criteria:
- must include the requested deliverable in a clearly reviewable format
- must state assumptions where context is incomplete
- must avoid invented facts and unrelated expansion
- must include at least one concrete example or test
- done when a reviewer can approve or reject it in under 2 minutes
That last line is my favorite.
If the result takes forever to evaluate, the task was not framed sharply enough.
The big shift
Acceptance criteria do not make AI perfect.
They do something more useful: they make success legible.
That means less vague disappointment, less back-and-forth, and faster review.
If you want more reliable AI outputs, do not just spend time on the prompt body.
Spend time defining what “good enough to ship” actually means.