LLMs are powerful… until you ask them to return JSON.
Suddenly, you're staring at this:
{ "name": "John", "age": 30, "email": "john@example.com"
Missing bracket. Invalid JSON. Pipeline crashes.
Sound familiar? Let's break down why this happens and how you can actually fix it.
1. Why "JSON Mode" Isn't Enough
Some LLMs (like OpenAI's) let you request a JSON response format (e.g. response_format={"type": "json_object"}). Sounds perfect, right?
Except:
- Models still hallucinate comments or stray text.
- Outputs truncate mid-string when token limits hit.
- Complex schemas (nested dicts, lists) still break.
"Guaranteed JSON" isn't guaranteed.
2. The Hidden Costs
Most devs end up writing custom validators:
- Regex hacks to strip trailing commas
- Manual try/except loops around json.loads
- Silent None returns that mask real errors
You spend hours debugging glue code instead of shipping features.
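For illustration, a hypothetical version of that glue code might look like this (the regex cleanup and silent fallback are exactly the kind of hacks you end up maintaining):

```python
import json
import re

def parse_agent_json(raw: str):
    """Fragile hand-rolled parsing: regex cleanup + silent fallback."""
    # Strip trailing commas before a closing brace/bracket.
    cleaned = re.sub(r",\s*([}\]])", r"\1", raw)
    try:
        return json.loads(cleaned)
    except json.JSONDecodeError:
        # Returning None silently masks whether the model or the cleanup failed.
        return None
```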
3. The Real Fix: Schema Validation + Retries
Instead of hoping the model behaves, enforce a schema and automatically retry malformed outputs.
Here's how with Agent Validator:
```python
from agent_validator import validate, Schema, ValidationMode, ValidationError

# agent_output is the raw output from your LLM call.
schema = Schema({"name": str, "age": int, "email": str})

try:
    result = validate(
        agent_output,
        schema,
        retries=2,
        mode=ValidationMode.COERCE,
    )
    print("✅", result)
except ValidationError as e:
    print("❌ Validation failed:", e)
```
- STRICT mode → exact type/shape match
- COERCE mode → safe conversions ("42" → 42, "true" → True)
- Automatic retries with exponential backoff
- Local logs for debugging
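As a quick sketch of what COERCE mode means in practice (assuming validate accepts the raw model string, as in the example above), string values get converted to the schema's types where that's safe:

```python
from agent_validator import validate, Schema, ValidationMode

schema = Schema({"name": str, "age": int, "email": str})

# The model returned the age as a string; COERCE should fix the type.
raw = '{"name": "John", "age": "42", "email": "john@example.com"}'

result = validate(raw, schema, mode=ValidationMode.COERCE)
# Expected: {"name": "John", "age": 42, "email": "john@example.com"}
```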
4. Observability Included
Every attempt is logged to ~/.agent_validator/logs/ with correlation IDs.
No more "why did this break?" at 2 AM.
Need monitoring? Turn on cloud logging:

```bash
export AGENT_VALIDATOR_LOG_TO_CLOUD=1
```

and you get a dashboard of validation attempts.
5. TL;DR
- JSON mode ≠ reliable JSON
- Stop hacking validators by hand
- Use a schema + retries → your pipeline won't break on malformed outputs
👉 Try it:

```bash
pip install agent-validator
```