I had a clean multi-agent flow: Router -> Planner -> Tool Worker -> Finalizer.
It looked rock-solid in dev.
Then production hit.
One reply came back as: Sure! { "route": "PLAN", } (yep, trailing comma).
My parser crashed, the planner got junk, and the finalizer confidently shipped nonsense.
The worst part? The model thought it succeeded.
That day I learned: clean architecture doesn’t matter if your outputs aren’t enforceable.
If you can’t trust the shape of data, your agents are basically improvising.
Problem framing: why this fails in production
LLMs are helpful by default, not compliant by default.
In a multi-agent system, output drift usually shows up as:
- Extra text wrapped around JSON (“Here you go: …”)
- New keys you didn’t ask for (“confidence”, “notes”, “explanation”)
- Type drift (string instead of array, number instead of string)
- Half-JSON (missing braces, trailing commas)
- Silent failure (agent returns plausible prose when it should return "status": "unknown")
Once one agent returns malformed structure, the rest of the pipeline turns into a hallucination relay.
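To make that concrete, here's a minimal Python repro of the drifted router reply from the intro. Nothing here is specific to any agent framework; the string is just the kind of "almost JSON" a helpful model produces.

import json

raw = 'Sure! { "status": "ok", "route": "PLAN", }'  # extra prose + trailing comma

try:
    json.loads(raw)
except json.JSONDecodeError as exc:
    # json.loads rejects the leading prose outright, and even with the prose
    # stripped, the trailing comma is still invalid JSON.
    print(f"parse failed: {exc.msg} at position {exc.pos}")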
Definitions: output format enforcement in 4 crisp parts
1) Schema
A single source of truth for keys + types + allowed values.
2) Contract
“Return ONLY valid JSON matching schema. No extra keys. No prose.”
3) Validation gate
A deterministic check that runs before downstream logic consumes the output.
4) Recovery policy
A bounded retry/repair flow with explicit escalation.
If you do only one thing: treat every agent output like an API response.
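A validation gate can be a few lines of plain code. Here's a minimal sketch in Python, assuming the jsonschema package; the helper name validate_output and its return shape are my own choices, not a standard API.

import json
from jsonschema import Draft7Validator

def validate_output(raw_text, schema):
    """Deterministic gate: parse, then validate against the schema.
    Returns (payload, []) on success or (None, [error messages]) on failure."""
    try:
        payload = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        return None, [f"not valid JSON: {exc.msg} (position {exc.pos})"]

    validator = Draft7Validator(schema)
    errors = [
        f"{'/'.join(str(p) for p in err.path) or '<root>'}: {err.message}"
        for err in validator.iter_errors(payload)
    ]
    return (payload, []) if not errors else (None, errors)

Run this between every agent handoff; the recovery policy (bounded retries, then escalation) sits on top of it.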
Drop-in standard: the Strict JSON Contract (copy/paste)
1) System instruction template (for any agent)
ROLE:
You are <AGENT_NAME>. Produce output that downstream code can parse deterministically.
NON-NEGOTIABLE OUTPUT CONTRACT:
- Return ONLY valid JSON.
- Output MUST match the schema exactly (keys, types, allowed values).
- Do NOT include markdown, backticks, comments, or extra text.
- Do NOT add new keys.
- If you lack sufficient info, return status="unknown" with a reason.
VALIDATION AWARENESS:
Your output will be validated strictly.
If invalid, you will be asked to correct it using the validation errors.
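One way to reuse this contract is to render it per agent with the schema inlined, so the model sees exactly what the validator will check. A sketch; the template text mirrors the contract above, and the builder function is illustrative.

import json

CONTRACT_TEMPLATE = """ROLE:
You are {agent_name}. Produce output that downstream code can parse deterministically.

NON-NEGOTIABLE OUTPUT CONTRACT:
- Return ONLY valid JSON.
- Output MUST match the schema exactly (keys, types, allowed values).
- Do NOT include markdown, backticks, comments, or extra text.
- Do NOT add new keys.
- If you lack sufficient info, return status="unknown" with a reason.

SCHEMA:
{schema}

VALIDATION AWARENESS:
Your output will be validated strictly.
If invalid, you will be asked to correct it using the validation errors."""

def build_system_prompt(agent_name, schema):
    # json.dumps keeps the schema readable and unambiguous inside the prompt.
    return CONTRACT_TEMPLATE.format(agent_name=agent_name, schema=json.dumps(schema, indent=2))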
2) Minimal wrapper schema (recommended across agents)
This wrapper makes your whole system easier to reason about:
- status: "ok" | "unknown" | "error"
- data: object (only when status="ok")
- error: object (only when status="unknown" or "error")
{
  "type": "object",
  "required": ["status"],
  "additionalProperties": false,
  "properties": {
    "status": { "type": "string", "enum": ["ok", "unknown", "error"] },
    "data": { "type": "object", "additionalProperties": true },
    "error": {
      "type": "object",
      "required": ["code", "message"],
      "additionalProperties": false,
      "properties": {
        "code": { "type": "string" },
        "message": { "type": "string" }
      }
    }
  }
}
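To see the strictness in action, here's the same wrapper as a Python dict, plus two quick checks with jsonschema. The payloads are illustrative.

from jsonschema import Draft7Validator

# The wrapper schema above, as a Python dict.
WRAPPER_SCHEMA = {
    "type": "object",
    "required": ["status"],
    "additionalProperties": False,
    "properties": {
        "status": {"type": "string", "enum": ["ok", "unknown", "error"]},
        "data": {"type": "object", "additionalProperties": True},
        "error": {
            "type": "object",
            "required": ["code", "message"],
            "additionalProperties": False,
            "properties": {
                "code": {"type": "string"},
                "message": {"type": "string"},
            },
        },
    },
}

validator = Draft7Validator(WRAPPER_SCHEMA)

good = {"status": "ok", "data": {"tasks": []}}
bad = {"status": "ok", "confidence": 0.92}  # extra key the contract forbids

print([e.message for e in validator.iter_errors(good)])  # []
print([e.message for e in validator.iter_errors(bad)])   # flags the unexpected "confidence" key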
3) Router output schema (the one that stabilizes your graph)
{
  "type": "object",
  "required": ["status", "route", "reason", "missing_inputs"],
  "additionalProperties": false,
  "properties": {
    "status": { "type": "string", "enum": ["ok", "unknown"] },
    "route": {
      "type": "string",
      "enum": ["ANSWER", "RESEARCH", "TOOL_CALL", "PLAN", "ESCALATE"]
    },
    "reason": { "type": "string" },
    "missing_inputs": { "type": "array", "items": { "type": "string" } }
  }
}
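With this schema in place, routing becomes a deterministic lookup instead of string-sniffing prose. A sketch, assuming the validate_output gate from earlier; the fallback to ESCALATE is a policy choice, not a requirement.

# The router schema above, as a Python dict.
ROUTER_SCHEMA = {
    "type": "object",
    "required": ["status", "route", "reason", "missing_inputs"],
    "additionalProperties": False,
    "properties": {
        "status": {"type": "string", "enum": ["ok", "unknown"]},
        "route": {
            "type": "string",
            "enum": ["ANSWER", "RESEARCH", "TOOL_CALL", "PLAN", "ESCALATE"],
        },
        "reason": {"type": "string"},
        "missing_inputs": {"type": "array", "items": {"type": "string"}},
    },
}

def pick_route(raw_reply):
    payload, errors = validate_output(raw_reply, ROUTER_SCHEMA)
    if payload is None or payload["status"] != "ok":
        return "ESCALATE"      # never guess a route from malformed or unknown output
    return payload["route"]    # the graph branches on this value and nothing else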
Example 1: “Almost JSON” (classic drift)
Bad output
Sure! { "status": "ok", "route": "PLAN", }
Why it fails
- Extra text before JSON
- Trailing comma
Good output
{
  "status": "ok",
  "route": "PLAN",
  "reason": "User requested a multi-step build; needs task breakdown before execution.",
  "missing_inputs": []
}
Example 2: Type drift that breaks downstream code
Your planner expects tasks: array, model returns a string.
Bad output
{
  "status": "ok",
  "data": {
    "tasks": "1) define schema 2) add validator 3) write tests"
  }
}
Good output
{
  "status": "ok",
  "data": {
    "tasks": [
      "Define JSON schema for planner output",
      "Add strict validator gate before execution",
      "Add retry policy with bounded attempts",
      "Create eval set for malformed outputs"
    ]
  }
}
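The fix is mechanical once data.tasks has a schema: the validator rejects the string before the planner's output reaches execution. A sketch, with the schema and payload mirroring the example above.

from jsonschema import Draft7Validator

PLANNER_DATA_SCHEMA = {
    "type": "object",
    "required": ["tasks"],
    "additionalProperties": False,
    "properties": {
        "tasks": {"type": "array", "items": {"type": "string"}, "minItems": 1},
    },
}

drifted = {"tasks": "1) define schema 2) add validator 3) write tests"}
print([e.message for e in Draft7Validator(PLANNER_DATA_SCHEMA).iter_errors(drifted)])
# Reports that the tasks value is not of type "array", so it never reaches downstream code.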
Automation opportunities: what can be templated safely
Once you commit to strict schemas, a lot of “clean architecture work” becomes repeatable.
That’s not a bad thing. That’s a gift.
Here’s what you can safely template/automate:
- Schema generators for common agent roles (router/planner/validator)
- A reusable SchemaGate component (validate-before-handoff; see the sketch after this list)
- Standard repair prompts (“Fix JSON using these validation errors”)
- A bounded retry policy (2 attempts → escalate)
- Automatic logging + eval capture (invalid output + validator errors + retry outcome)
- Schema versioning (schema_version) to keep the system evolving cleanly
This is the stuff that doesn’t require “deep skills”… but it eats hours if you do it manually.
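Here's a minimal sketch of that reusable SchemaGate: validate-before-handoff, a standard repair prompt, a bounded retry, and logging for evals. call_agent(messages) is a placeholder for whatever client you use to call the model, not a real API.

import json
from jsonschema import Draft7Validator

REPAIR_PROMPT = (
    "Your previous output failed validation.\n"
    "Validation errors:\n{errors}\n"
    "Return ONLY corrected JSON matching the schema. No prose, no markdown, no new keys."
)

def schema_gate(call_agent, messages, schema, max_repairs=2):
    """Return a schema-valid payload, or raise after bounded repair attempts."""
    validator = Draft7Validator(schema)
    for attempt in range(max_repairs + 1):
        raw = call_agent(messages)
        try:
            payload = json.loads(raw)
            errors = [e.message for e in validator.iter_errors(payload)]
        except json.JSONDecodeError as exc:
            payload, errors = None, [f"not valid JSON: {exc.msg} (position {exc.pos})"]
        if not errors:
            return payload
        # Log the invalid output + validator errors for evals and prompt fixes.
        print(f"[schema_gate] attempt {attempt + 1} invalid: {errors}")
        messages = messages + [
            {"role": "assistant", "content": raw},
            {"role": "user", "content": REPAIR_PROMPT.format(errors="\n".join(errors))},
        ]
    raise RuntimeError("SchemaGate: still invalid after bounded repairs; escalate to a human or fallback path.")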
HuTouch for Work2.0:
HuTouch bakes in schemas, validation gates, retries, and clean logs by default, so your agents stop “kinda following the format” and start behaving like real systems.
Less babysitting JSON drift. More shipping production-grade multi-agent flows.
Automate the boring work and get back 40% of your time to create value; that's Work2.0.
If you’re building agents and you’re done losing hours to “why did the model return this”…
Join early access for HuTouch
Quick checklist (printable)
- Every agent output has a schema (required keys + types + enums)
- Use additionalProperties: false for strictness
- Instructions say ONLY JSON (no prose, no markdown, no extras)
- Validation happens between agents, not “at the end”
- Recovery is bounded (e.g., 2 retries, then escalate)
- “Unknown” is a first-class status, not an afterthought
- Tool arguments are schema-validated too
- Invalid outputs are logged for evals + prompt fixes
- Schemas are versioned as the system grows