The Problem: Agents Guess When They Should Pause
AI agents fail on ambiguity. Not because they're dumb — because they're decisive. When a task could mean two things, the agent picks one and runs. It doesn't stop to ask. It doesn't flag the ambiguity. It just acts.
Sometimes it guesses right. Often it doesn't.
This is one of the most common failure patterns I see in production agent systems: a perfectly capable agent that takes confident action on an ambiguous instruction — and causes more work than if it had done nothing.
Why This Happens
Most agent configs tell the agent what to do. Few tell it when to stop and ask.
The implicit assumption is that the agent will recognize ambiguity and handle it gracefully. But agents are completion machines. Their default is to complete — to take the most plausible interpretation and act on it.
Without an explicit instruction to pause, the agent moves.
The One-Line Fix
Add this to your SOUL.md:
```
clarification_protocol: If task scope is ambiguous and the action is irreversible, write the ambiguity to outbox.json and stop.
```
That's it. One line. It does three things:
- Defines when to trigger: Ambiguity + irreversibility = pause
- Defines what to do: Write to outbox.json (not interrupt, not crash — document)
- Defines the outcome: Stop. Don't guess.
What Goes in outbox.json
```json
{
  "type": "clarification_needed",
  "task": "Archive old project files",
  "ambiguity": "'Old' could mean >30 days or >90 days. Both are plausible.",
  "proposed_interpretations": [
    "Archive files not modified in >30 days",
    "Archive files not modified in >90 days"
  ],
  "recommended": "Archive files not modified in >90 days (safer)",
  "timestamp": "2026-03-08T08:30:00Z"
}
```
The agent documents what it knows, what it's uncertain about, and what it would do if forced to choose — then stops.
This is better than:
- Crashing with an error
- Guessing silently
- Sending an interrupt at 3 AM asking for clarification
The Irreversibility Test
Not all ambiguity requires a pause. The trigger is the combination of ambiguity + irreversibility.
Ask two questions:
- Could this instruction mean more than one thing?
- Is the action reversible?
If yes + no → pause and write to outbox.json.
If yes + yes → pick the safest interpretation, log it, proceed.
If no → proceed normally.
This keeps the agent moving on low-stakes decisions while protecting against the cases where a wrong guess causes real damage.
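The two-question test maps cleanly to a small dispatch function. This is a sketch of the decision logic described above; the function and return values are illustrative, not from any particular framework:

```python
def decide(ambiguous: bool, reversible: bool) -> str:
    """Apply the irreversibility test: ambiguity + irreversibility = pause."""
    if ambiguous and not reversible:
        return "pause"         # write to outbox.json and stop
    if ambiguous and reversible:
        return "proceed_safe"  # pick the safest interpretation, log it
    return "proceed"           # unambiguous: act normally

decide(True, False)   # -> "pause"
decide(True, True)    # -> "proceed_safe"
decide(False, True)   # -> "proceed"
```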
Common Irreversible Actions to Flag
- Sending external communications (emails, messages, posts)
- Deleting or archiving files without explicit scope
- Making financial transactions of any kind
- Publishing content to public channels
- Modifying configuration files that affect other systems
- Adding or removing user permissions
For these, ambiguity should always trigger a pause.
The Compounding Benefit: An Audit Trail
Every time your agent writes a clarification to outbox.json, you get a free audit trail of:
- Which task descriptions were unclear
- Which ambiguities came up repeatedly
- Which interpretations the agent would have chosen
After two weeks, you'll know exactly which parts of your workflow need clearer instructions. It's a diagnostic tool as much as a safety tool.
Review your outbox.json entries weekly. The recurring ambiguities are where your agent config needs work.
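The weekly review can be partly automated. A minimal sketch that counts recurring ambiguities across outbox.json entries, assuming the entry shape shown earlier:

```python
import json
from collections import Counter

def summarize_outbox(path="outbox.json"):
    """Count how often each ambiguity recurs, to find unclear instructions."""
    try:
        with open(path) as f:
            entries = json.load(f)
    except FileNotFoundError:
        return Counter()  # no flagged ambiguities yet
    return Counter(
        e["ambiguity"]
        for e in entries
        if e.get("type") == "clarification_needed"
    )
```

Anything with a count above one is a candidate for a clearer instruction in your agent config.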
Why "Stop" Is Better Than "Ask"
You might think: why not have the agent ask for clarification in real time?
Two reasons:
Timing: Most agents run on cron schedules or in the background. Real-time clarification means blocking the task until someone responds — often hours later.
Interruption cost: Getting interrupted to resolve an ambiguity breaks flow. Reading an outbox.json entry at your next review cycle doesn't.
The outbox pattern lets you batch your clarifications. One review window, multiple resolved ambiguities, agent resumes.
Full Implementation Pattern
In SOUL.md:
```markdown
## Clarification Protocol

Before acting on any ambiguous instruction:

1. Check: could this mean more than one thing?
2. Check: is the action irreversible?
3. If both yes: write to outbox.json and stop.
4. If ambiguous but reversible: pick safest interpretation, log decision, proceed.
5. Never guess on irreversible actions. Never.
```
In your task runner:
```python
import json
from datetime import datetime, timezone

def flag_ambiguity(task, ambiguity, interpretations, recommended):
    """Record an ambiguity in outbox.json and signal the runner to stop."""
    entry = {
        "type": "clarification_needed",
        "task": task,
        "ambiguity": ambiguity,
        "proposed_interpretations": interpretations,
        "recommended": recommended,
        # Timezone-aware UTC timestamp (datetime.utcnow() is deprecated)
        "timestamp": datetime.now(timezone.utc).isoformat().replace("+00:00", "Z"),
    }
    try:
        with open("outbox.json", "r") as f:
            outbox = json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        outbox = []  # start fresh if the outbox is missing or corrupted
    outbox.append(entry)
    with open("outbox.json", "w") as f:
        json.dump(outbox, f, indent=2)
    return False  # Signal: stop processing
```
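In the runner loop, the return value gates execution. A self-contained sketch of how that wiring might look; the task dictionary shape and the stand-in `flag_ambiguity` are illustrative:

```python
# Minimal stand-in so this example runs on its own; in practice this is
# the flag_ambiguity function from the task runner above.
def flag_ambiguity(task, ambiguity, interpretations, recommended):
    print(f"PAUSED: {task} -- {ambiguity}")
    return False

def run_task(task, execute):
    """Run a task, pausing on ambiguous + irreversible combinations."""
    if task.get("ambiguous") and not task.get("reversible"):
        return flag_ambiguity(
            task["description"],
            task["ambiguity"],
            task["options"],
            task["options"][-1],  # assumption: last option listed is safest
        )
    execute(task)
    return True  # task ran; nothing was flagged
```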
The Bottom Line
Ambiguity is not a model failure. It's a workflow failure. Your instructions weren't clear enough, or the task was genuinely underdetermined.
The clarification protocol doesn't fix bad instructions — but it stops your agent from making that problem worse by guessing on irreversible actions.
One line in SOUL.md. Pause instead of guess. That's the whole thing.
If you're building AI agent systems and want battle-tested configs for patterns like this, askpatrick.co has a library of real-world agent configurations updated nightly.