There's a pattern on Hacker News that keeps coming up: "LLMs work best when the user defines their acceptance criteria first."
It's a great insight for chat prompts. But for autonomous agents, it's 10x more critical.
The Problem
When a human uses an LLM, they course-correct in real time. When an AI agent runs on a cron schedule with no supervision, there's no course-correction. Without defined acceptance criteria, agents either stop too early, run too long, or do the "right" thing in the wrong context.
The Fix: explicit done_when criteria
Every agent in our system has this in its config:
```json
{
  "agent": "content-agent",
  "task": "draft_tweet",
  "done_when": [
    "tweet is under 280 characters",
    "includes a link",
    "hook uses a specific number or metric"
  ],
  "fail_when": [
    "tweet makes promises we can't keep",
    "sentiment is hype rather than educational"
  ],
  "timeout_after": "3 loops"
}
```
The agent reads this before acting and checks against it before outputting.
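For the criteria that are mechanically verifiable, the pre-output check can be plain code rather than another LLM call. Here's a minimal sketch of that idea; the function name and regexes are illustrative, not our actual system:

```python
import re

def check_tweet(text: str) -> dict:
    """Return a pass/fail verdict for each verifiable done_when criterion."""
    results = {
        "under_280_chars": len(text) <= 280,
        "includes_link": bool(re.search(r"https?://\S+", text)),
        # "hook" here means the first line of the tweet
        "hook_has_number": bool(re.search(r"\d", text.splitlines()[0])),
    }
    results["done"] = all(results.values())
    return results

verdict = check_tweet(
    "We cut agent failures 40% with done_when checks: https://example.com"
)
```

Criteria like "sentiment is hype rather than educational" can't be checked this way; those go to a judge step. But pushing every mechanically checkable criterion into code keeps the expensive checks to a minimum.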
Why This Matters
Without acceptance criteria, agents default to plausible completion — they do what looks right. With criteria, they aim at verifiable completion — they do what's measurably right.
The pattern shows up everywhere:
- Consistency: same quality on loop #1 and loop #1,000
- Debuggability: trace failures to specific criteria that weren't met
- Multi-agent handoffs: Agent A knows exactly what Agent B expects
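The handoff case is worth spelling out. One way to enforce it (a hypothetical sketch, not our production code) is a gate that validates Agent A's output against Agent B's done_when list before B ever runs:

```python
def handoff(output, downstream_done_when, check):
    """Accept a handoff only if every downstream criterion passes.

    Returns (ok, failed_criteria) so the caller can retry or escalate.
    """
    failed = [c for c in downstream_done_when if not check(output, c)]
    return (len(failed) == 0, failed)

# Illustrative criteria for a downstream agent that expects a markdown draft
checks = {
    "has_title": lambda o: o.strip().startswith("#"),
    "nonempty_body": lambda o: len(o.splitlines()) > 1,
}

ok, failed = handoff(
    "# Draft\nBody text",
    list(checks),
    lambda output, criterion: checks[criterion](output),
)
```

The point is that B's acceptance criteria double as A's exit criteria, so a bad artifact never crosses the boundary silently.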
The Three Fields Every Agent Task Needs
- done_when — specific, verifiable conditions
- fail_when — hard stops that override completion
- timeout_after — max loops before escalation
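The three fields compose into one control loop. A sketch, assuming hypothetical generate/done/fail callables:

```python
def run_task(generate, done, fail, max_loops=3):
    """Retry until done_when passes, abort on fail_when, escalate on timeout."""
    for attempt in range(1, max_loops + 1):
        draft = generate(attempt)
        if fail(draft):          # fail_when overrides completion
            return ("aborted", draft)
        if done(draft):          # done_when met: ship it
            return ("done", draft)
    return ("escalated", None)   # timeout_after hit: hand off to a human

status, result = run_task(
    generate=lambda n: "ok" if n == 2 else "bad",  # succeeds on loop 2
    done=lambda d: d == "ok",
    fail=lambda d: False,
)
```

Note the ordering: fail_when is checked before done_when, so a draft that technically satisfies the criteria but trips a hard stop never ships.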
We run 5 agents 24/7 on this pattern. It's what separates "the agent kind of worked" from "the agent reliably works."
Full config templates across 5 agent types → askpatrick.co