Most AI agent failures don't announce themselves. There's no stack trace, no 500 error, no crash log. The agent just returns confident, plausible, wrong output — and keeps running.
This is the structured output problem. And it's fixable.
The Pattern
Whenever your agent produces structured data (decisions, task results, state updates), define a schema and validate every response against it.
If validation fails, treat it like an exception — not a warning.
// expected_schema.json
{
  "task_id": "string",
  "status": "complete | failed | escalate",
  "result": "string",
  "confidence": "number (0-1)",
  "reasoning": "string"
}
Any response that omits status, or returns a status value outside the three listed, triggers your escalation rule rather than silently continuing.
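Here is a minimal sketch of what "treat it like an exception" can look like, in Python with only the standard library. The names `SchemaViolation` and `parse_agent_response` are hypothetical; the field name and the three status values come from the schema above.

```python
import json

ALLOWED_STATUS = {"complete", "failed", "escalate"}


class SchemaViolation(Exception):
    """Raised when an agent response does not match the expected schema."""


def parse_agent_response(raw: str) -> dict:
    """Parse raw agent output; raise instead of continuing on bad data."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise SchemaViolation(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise SchemaViolation("top-level value must be a JSON object")
    if "status" not in data:
        raise SchemaViolation("missing required field: status")
    status = data["status"]
    if not isinstance(status, str) or status not in ALLOWED_STATUS:
        raise SchemaViolation(f"unrecognized status: {status!r}")
    return data
```

Because the failure is raised rather than logged, the calling code has to decide what happens next. It cannot carry on by accident.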
Why Free Text Fails
Free text responses from agents are like functions with no return type. The agent might say:
- "I completed the task successfully" (no structured confirmation)
- "The analysis shows positive results" (no confidence, no reasoning)
- "Done" (and then nothing)
None of these can be caught in code: there is no field to check and no value to compare. All of them look fine until they're not.
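To make this concrete, here is a small illustration of the only check free text allows: keyword matching. The function `naive_is_success` and the second reply (a failure phrased politely) are made up for this example and are not from any real agent log.

```python
def naive_is_success(reply: str) -> bool:
    """Keyword matching: roughly the best you can do with free-text output."""
    text = reply.lower()
    return "success" in text or "done" in text or "completed" in text


replies = [
    "I completed the task successfully",
    "I could not complete the task successfully",  # hypothetical failure reply
    "Done",
]

# Every reply is classified as a success, including the one that failed.
results = [naive_is_success(r) for r in replies]
print(results)
```

The failed reply passes because it contains the word "successfully". A structured status field removes that ambiguity.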
Three Validation Rules
- Schema validation — does the response match the expected structure?
- Value validation — are enum fields within expected values? Is confidence between 0 and 1?
- Completeness check — are required fields present and non-null?
If a response fails any of these checks: write the failure to outbox.json, halt the task, and escalate.
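The three rules and the failure path could be sketched like this, again in plain Python. The function names, the shape of the record written to outbox.json, and the choice to overwrite the file are all assumptions made for illustration; adapt them to however your agent actually uses its outbox.

```python
import json
from pathlib import Path

REQUIRED = ("task_id", "status", "result", "confidence", "reasoning")
STRING_FIELDS = ("task_id", "status", "result", "reasoning")
ALLOWED_STATUS = {"complete", "failed", "escalate"}


def check_response(data: dict) -> list[str]:
    """Return a list of rule violations; an empty list means the response passed."""
    errors = []
    # Completeness check: required fields are present and non-null.
    for field in REQUIRED:
        if data.get(field) is None:
            errors.append(f"completeness: {field} is missing or null")
    # Schema validation: each present field has the expected type.
    for field in STRING_FIELDS:
        if data.get(field) is not None and not isinstance(data[field], str):
            errors.append(f"schema: {field} must be a string")
    conf = data.get("confidence")
    conf_is_number = isinstance(conf, (int, float)) and not isinstance(conf, bool)
    if conf is not None and not conf_is_number:
        errors.append("schema: confidence must be a number")
    # Value validation: enum membership and numeric range.
    status = data.get("status")
    if isinstance(status, str) and status not in ALLOWED_STATUS:
        errors.append(f"value: unrecognized status {status!r}")
    if conf_is_number and not 0 <= conf <= 1:
        errors.append("value: confidence must be between 0 and 1")
    return errors


def handle_response(data: dict, outbox: Path = Path("outbox.json")) -> bool:
    """Return True if the task may continue; otherwise record the failure and halt."""
    errors = check_response(data)
    if errors:
        # Assumed record shape: enough for a human or supervisor to triage.
        record = {"task_id": data.get("task_id"), "status": "escalate", "errors": errors}
        outbox.write_text(json.dumps(record, indent=2))
        return False
    return True
```

Collecting every violation before halting, instead of stopping at the first, gives whoever picks up the escalation the full picture in one pass.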
Add It to Your SOUL.md
Output format: All task completions must return structured JSON matching the task schema. If you cannot produce valid structured output, write {"status": "escalate", "reason": "<why>"} and stop.
This one rule means your validation layer always gets a structured response — either valid output or an explicit failure signal.
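One practical detail: the escalate message in that rule carries only status and reason, so it would fail a check against the full task schema. A minimal sketch of a routing step that recognizes the failure signal first is below. The function name `route` and its three return labels are hypothetical, and the "malformed" branch is there because a prompt rule reduces unstructured replies but cannot guarantee the model always follows it.

```python
import json


def route(raw: str) -> str:
    """Classify raw agent output as 'escalate', 'validate', or 'malformed'."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return "malformed"  # no structured output at all
    if not isinstance(data, dict):
        return "malformed"
    # Explicit failure signal: only status and reason are expected here.
    if data.get("status") == "escalate" and isinstance(data.get("reason"), str):
        return "escalate"
    return "validate"  # hand off to full schema validation


print(route('{"status": "escalate", "reason": "missing input file"}'))
print(route("Done"))
```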
The Silent Wrong Problem
The most expensive agent failures are the ones that look successful. A confident wrong answer that passes downstream to a customer or a financial system is worse than a visible crash.
Structured output validation doesn't eliminate wrong answers. But it eliminates the ones that are structurally wrong — which is a large percentage of real production failures.
More patterns like this in the Ask Patrick Library — battle-tested agent configs updated nightly.