You ask an LLM for JSON. It wraps the response in markdown code-fence markers. Or it adds "Here's the JSON you requested:" before the actual data. Or it silently drops a required field.
If you're parsing LLM output with `json.loads()` inside a `try/except` — stop. There's a better way.
## The Code
```python
from openai import OpenAI
from pydantic import BaseModel

client = OpenAI()

class ExtractedContact(BaseModel):
    name: str
    email: str
    company: str
    role: str
    is_decision_maker: bool

response = client.beta.chat.completions.parse(
    model="gpt-4o",
    messages=[
        {
            "role": "system",
            "content": "Extract contact information from the text."
        },
        {
            "role": "user",
            "content": """Hey! I'm Jamie Chen, lead engineer at Acme Corp.
            You can reach me at jamie.chen@acmecorp.io.
            I'm evaluating tools for our platform team."""
        }
    ],
    response_format=ExtractedContact,
)

contact = response.choices[0].message.parsed
print(contact.name)               # Jamie Chen
print(contact.email)              # jamie.chen@acmecorp.io
print(contact.company)            # Acme Corp
print(contact.role)               # Lead Engineer
print(contact.is_decision_maker)  # True
print(type(contact))              # <class 'ExtractedContact'>
```
That's it. No regex. No `json.loads()`. No `try/except`. You get a typed Python object back, guaranteed to match your schema.
## How It Works
**Step 1: Define your schema as a Pydantic model.** Each field has a name and a type. Pydantic validates the data at runtime: if a field is missing or the wrong type, it raises an error before your code ever sees bad data.

**Step 2: Pass the model as `response_format`.** When you use `client.beta.chat.completions.parse()` instead of the regular `create()`, OpenAI constrains the model's output to match your schema exactly. The API won't return a response that violates the structure.

**Step 3: Access `.parsed` instead of `.content`.** The response object gives you a fully hydrated Pydantic instance. You get autocomplete in your IDE, type checking, and direct attribute access.
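Step 1's runtime guarantee is worth seeing in isolation. Here's a minimal sketch, with no API call, of Pydantic rejecting bad data before any downstream code runs (this uses Pydantic v2's `model_validate`):

```python
from pydantic import BaseModel, ValidationError

class ExtractedContact(BaseModel):
    name: str
    email: str
    company: str
    role: str
    is_decision_maker: bool

# Well-formed data parses into a typed object.
contact = ExtractedContact.model_validate({
    "name": "Jamie Chen",
    "email": "jamie.chen@acmecorp.io",
    "company": "Acme Corp",
    "role": "Lead Engineer",
    "is_decision_maker": True,
})
print(contact.name)  # Jamie Chen

# A payload missing required fields raises immediately.
try:
    ExtractedContact.model_validate({"name": "Jamie Chen"})
except ValidationError as e:
    print(f"rejected: {e.error_count()} errors")  # rejected: 4 errors
```

This is the same check the `parse()` helper runs on the model's response before handing it to you.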
## Why This Matters for Agents
If you're building AI agents that call tools, structured output eliminates an entire class of bugs. Instead of hoping the LLM returns valid tool arguments, you guarantee it.
Here's a more practical example, a tool-calling pattern:
```python
from pydantic import BaseModel, Field
from typing import Literal

class ToolCall(BaseModel):
    tool_name: Literal["search", "calculate", "send_email"]
    reason: str = Field(description="Why this tool was selected")
    parameters: dict

response = client.beta.chat.completions.parse(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "Decide which tool to call."},
        {"role": "user", "content": "What's the weather in Tokyo?"}
    ],
    response_format=ToolCall,
)

tool = response.choices[0].message.parsed
print(tool.tool_name)   # search
print(tool.reason)      # Need real-time weather data
print(tool.parameters)  # {'query': 'weather Tokyo'}
```
The `Literal` type constraint means the LLM can only pick from your approved tool list. No hallucinated tool names. No typos. The `Field(description=...)` gives the model context about what each field means, improving accuracy.
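The constraint is enforced on your side too: if anything hands the `ToolCall` model a name outside the approved list, validation refuses it. A quick local sketch (no API call; `web_browse` is a made-up unapproved name for illustration):

```python
from typing import Literal
from pydantic import BaseModel, Field, ValidationError

class ToolCall(BaseModel):
    tool_name: Literal["search", "calculate", "send_email"]
    reason: str = Field(description="Why this tool was selected")
    parameters: dict

# An approved tool name validates cleanly.
ok = ToolCall(
    tool_name="search",
    reason="Need real-time weather data",
    parameters={"query": "weather Tokyo"},
)
print(ok.tool_name)  # search

# A hallucinated tool name is rejected outright.
try:
    ToolCall(tool_name="web_browse", reason="...", parameters={})
except ValidationError:
    print("rejected: 'web_browse' is not an approved tool")
```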
## Quick Tips
- **Add descriptions to fields** using `Field(description="...")`. The LLM reads these to understand what you want.
- **Use `Literal` for enums.** Constrain values to a fixed set instead of hoping the model picks from your list.
- **Use `Optional` sparingly.** If a field might not exist in the source data, mark it `Optional[str] = None`. But prefer required fields; they force the model to extract or infer a value.
- **Nest models for complex schemas.** Pydantic models can contain other Pydantic models: `contacts: list[ExtractedContact]`.
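The nesting tip can be sketched with a wrapper model (`ContactList` is a hypothetical name, reusing the `ExtractedContact` model from earlier):

```python
from pydantic import BaseModel

class ExtractedContact(BaseModel):
    name: str
    email: str
    company: str
    role: str
    is_decision_maker: bool

class ContactList(BaseModel):
    contacts: list[ExtractedContact]

# Validation recurses: every item in the list must be a valid contact.
batch = ContactList.model_validate({
    "contacts": [
        {"name": "Jamie Chen", "email": "jamie.chen@acmecorp.io",
         "company": "Acme Corp", "role": "Lead Engineer",
         "is_decision_maker": True},
    ]
})
print(batch.contacts[0].company)  # Acme Corp
```

Pass `ContactList` as `response_format` and the model extracts every contact in the text, each one individually validated.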
## What About Other LLMs?
Anthropic's Claude supports a similar pattern via `tool_use` with JSON schemas. Google's Gemini has `response_schema` in its API. The Pydantic model approach works across providers — you define the schema once and adapt the API call.
The pattern is the same everywhere: define your structure, tell the model to conform to it, get typed data back.
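One way to share a single model across providers (a sketch; the exact request shape for Claude or Gemini is up to their SDKs) is to export the Pydantic model as JSON Schema with Pydantic v2's `model_json_schema()`:

```python
from pydantic import BaseModel

class ExtractedContact(BaseModel):
    name: str
    email: str
    company: str
    role: str
    is_decision_maker: bool

# One schema definition, serializable for any provider's structured-output API.
schema = ExtractedContact.model_json_schema()
print(schema["required"])
# ['name', 'email', 'company', 'role', 'is_decision_maker']
print(schema["properties"]["is_decision_maker"]["type"])
# boolean
```

The resulting dict is standard JSON Schema, so it can be dropped into a Claude tool definition or a Gemini `response_schema` field, while your own code keeps validating with the same Pydantic model.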
## Next Steps
If you're building agents that chain multiple tool calls, structured output is the foundation. Pair it with proper tool design — if you missed it, check out How to Build a Custom MCP Tool in Under 10 Min for the other half of the equation.