OpenAI’s GPT-5 livestream ran a full hour, packed with demos and model comparisons. But there’s an 80-second stretch (44:10–45:30) where Michelle Pokrass drops three new features so casually you could blink and miss them.
If you’ve been in the Agent development trenches for a while, you know these aren’t “nice-to-have” tweaks. They’re quiet depth charges planted right under the Agent ecosystem, and those 80 seconds of airtime might be the most important part of GPT-5 for agent developers.
They’re not built for showy demos.
They go straight for the pain points we wrestle with every day:
- Tool calls that feel like a black box
- Escape-sequence chaos that explodes at the worst possible moment
- Output formats that change on a whim
TL;DR
Tool Preambles
Prepend a short, user-friendly explanation before the model calls a tool, improving transparency without extra hacks. Works even with tool_choice="required". (Full example in section 1 below.)
system_prompt = "When providing weather, use the proper unit. But don't ask user for it, make a best guess. \
Always add tool call preambles to explain your actions in advance."
oai_response = oai_client.chat.completions.create(
model="gpt-5",
messages=[
{"role": "system", "content": system_prompt},
{"role": "user", "content": "What's the weather like in New York?"},
],
tool_choice="required", # <--- Note here
tools=[oai_get_weather],
)
print_response(oai_response)
# Output:
# [Preamble] I’m going to check the current weather for New York using our weather service, using Fahrenheit since New York is in the United States.
# get_weather({"location":"New York","units":"fahrenheit"})
Custom Tools
Define a custom tool that takes free-form text input. No more JSON escape-sequence nightmares. (Full example in section 2 below.)
CFG (Context-Free Grammar)
Lock model output to a strict format with a context-free grammar; constrained decoding guarantees the output conforms. (Full example in section 3 below.)
1. Tool Preambles — Transparency Without Hacks
If you’ve integrated tool calls before, you’ve probably noticed:
Claude models often preface a tool call with a human-readable explanation (a “preamble”), while GPT models, historically, just fire off the call silently.
Claude example:
ant_response = ant_client.messages.create(
model="claude-3-5-sonnet-latest",
system="When providing weather, use the proper unit. But don't ask user for it, make a best guess.",
max_tokens=500,
tools=[ant_get_weather],
messages=[{"role": "user", "content": "What's the weather like in New York?"}],
)
print_response(ant_response)
# Output:
# [Preamble] I'll check the current weather in New York. Since this appears to be a US-based request, I'll use Fahrenheit units as that's most common in the United States.
# get_weather({"location": "New York", "units": "fahrenheit"})
GPT without proper prompting:
oai_response = oai_client.chat.completions.create(
model="gpt-5",
messages=[
{"role": "system", "content": "When providing weather, use the proper unit. But don't ask user for it, make a best guess."},
{"role": "user", "content": "What's the weather like in New York?"},
],
tools=[oai_get_weather],
)
print_response(oai_response)
# Output:
# [Preamble] None
# get_weather({"location":"New York","units":"fahrenheit"})
Now you can prompt the model to explain its actions first. Pre-GPT-5 models could technically do this too, but in my hands-on experience GPT-5 is far more consistent about it. GPT-5 even explains itself when forced to use tools (tool_choice="required"); previous models (including Claude) would skip the preamble in that scenario.
system_prompt = "When providing weather, use the proper unit. But don't ask user for it, make a best guess. \
Always add tool call preambles to explain your actions in advance."
oai_response = oai_client.chat.completions.create(
model="gpt-5",
messages=[
{"role": "system", "content": system_prompt},
{"role": "user", "content": "What's the weather like in New York?"},
],
tool_choice="required", # <--- Note here
tools=[oai_get_weather],
)
print_response(oai_response)
# Output:
# [Preamble] I’m going to check the current weather for New York using our weather service, using Fahrenheit since New York is in the United States.
# get_weather({"location":"New York","units":"fahrenheit"})
2. Custom Tools — Finally, No More Escape Sequence Nightmares
Anyone who’s built agents knows:
JSON function calling is great… until you have \\\\ in your payload. Then it’s a coin flip whether the model adds or drops a backslash.
We’ve all resorted to YAML outputs, fenced code blocks, or hand-rolled parsers. Now? You can simply declare a tool with "format": {"type": "text"} and let GPT-5 pass free-form input directly, no escaping required.
json_string = json.dumps({"data": json.dumps({"profile": json.dumps({"name": "\"Joe\\Doe\""})})})
# {"data": "{\"profile\": \"{\\\"name\\\": \\\"\\\\\\\"Joe\\\\\\\\Doe\\\\\\\"\\\"}\"}"}
response = oai_client.responses.create(
model="gpt-5",
input=f"Deserialize this json string: {json_string}",
tools=[
{
"type": "custom", # <-- Here
"format": {"type": "text"}, # <-- Here
"name": "magic_deserializer",
"description": "A magic deserializer for heavily nested JSON",
}
],
reasoning={"effort": "minimal"},
)
print_response(response)
# Output
# Tool: magic_deserializer
# Input:
# {"data": "{\"profile\": \"{\\\"name\\\": \\\"\\\\\\\"Joe\\\\\\\\Doe\\\\\\\"\\\"}\"}"}
Now you can ditch custom parsers entirely and enjoy a unified tool-calling interface. But combine this with the next feature, and we're entering next-level territory!
3. CFG — Turning “Format by Prompting” into “Format by Law”
Before GPT-4, many of us loved open-source models for one big reason: you could hard-enforce output formats during decoding with tools like Guidance, lm-format-enforcer, or Outlines. These tools require access to the model’s output logits and dynamically adjust logit biases, which made them unusable with API-only models like GPT.
Now, GPT-5 bakes that enforcement in with LLGuidance under the hood to constrain model sampling!
You can attach a Lark or regex grammar to a custom tool, and the model will only emit valid outputs according to that grammar. No retries. No post-processing clean-ups.
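Conceptually it’s the same trick those open-source libraries use: at every decoding step, mask out any token the grammar can’t accept next. A toy sketch of the idea (illustrative only, nothing like LLGuidance’s real API):
import numpy as np

def constrained_step(logits: np.ndarray, allowed_token_ids: list[int]) -> int:
    # Push grammar-illegal tokens to -inf so they can never be picked.
    mask = np.full_like(logits, -np.inf)
    mask[allowed_token_ids] = 0.0
    return int(np.argmax(logits + mask))  # greedy pick among grammar-legal tokens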
Take the official example: it defines an MS SQL grammar where the key start rule guarantees every query will:
- Always open with SELECT TOP
- Require FROM, WHERE and ORDER BY clauses
# ----------------- grammars for MS SQL dialect -----------------
mssql_grammar = r"""
// ---------- Punctuation & operators ----------
SP: " "
COMMA: ","
GT: ">"
EQ: "="
SEMI: ";"
// ---------- Start ----------
start: "SELECT" SP "TOP" SP NUMBER SP select_list SP "FROM" SP table SP "WHERE" SP amount_filter SP "AND" SP date_filter SP "ORDER" SP "BY" SP sort_cols SEMI
// ---------- Projections ----------
select_list: column (COMMA SP column)*
column: IDENTIFIER
// ---------- Tables ----------
table: IDENTIFIER
// ---------- Filters ----------
amount_filter: "total_amount" SP GT SP NUMBER
date_filter: "order_date" SP GT SP DATE
// ---------- Sorting ----------
sort_cols: "order_date" SP "DESC"
// ---------- Terminals ----------
IDENTIFIER: /[A-Za-z_][A-Za-z0-9_]*/
NUMBER: /[0-9]+/
DATE: /'[0-9]{4}-[0-9]{2}-[0-9]{2}'/
"""
sql_prompt_mssql = (
"Call the mssql_grammar to generate a query for Microsoft SQL Server that retrieve the "
"five most recent orders per customer, showing customer_id, order_id, order_date, and total_amount, "
"where total_amount > 500 and order_date is after '2025-01-01'. "
)
response_mssql = oai_client.responses.create(
model="gpt-5",
input=sql_prompt_mssql,
text={"format": {"type": "text"}},
tools=[
{
"type": "custom",
"name": "mssql_grammar",
"description": (
"Executes read-only Microsoft SQL Server queries limited to SELECT statements with TOP and basic WHERE/ORDER BY. "
"YOU MUST REASON HEAVILY ABOUT THE QUERY AND MAKE SURE IT OBEYS THE GRAMMAR."
),
"format": {
# "type": "text",
"type": "grammar", # <--- Note Here
"syntax": "lark",
"definition": mssql_grammar,
},
},
],
parallel_tool_calls=False,
)
print_response(response_mssql)
# Output
# Tool: mssql_grammar
# Input:
# SELECT TOP 5 customer_id, order_id, order_date, total_amount FROM orders WHERE total_amount > 500 AND order_date > '2025-01-01' ORDER BY order_date DESC;
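And because the grammar is plain Lark, you can verify the guarantee yourself by parsing the emitted query locally. A quick sanity check (assumes the lark package and the custom_tool_call item shape from section 2):
from lark import Lark

parser = Lark(mssql_grammar)  # uses the "start" rule by default
emitted_sql = next(
    item.input for item in response_mssql.output if item.type == "custom_tool_call"
)
parser.parse(emitted_sql)  # raises on any grammar violation; with CFG enforcement it never does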
Why This Matters
Individually, they might look like small, unrelated updates.
Together, they signal a deeper shift in OpenAI’s Agent design philosophy:
- From “usable” to “trustable” — Preambles bring explainability, CFG delivers reliability
- From “developer adapts to the model” to “model adapts to the developer” — Custom Tools handle complex inputs without contortions
- From “prompt-engineering folklore” to “engineering discipline” — CFG turns soft guidelines into hard guarantees
The Real Takeaway
GPT-5’s benchmark charts will grab all the headlines.
But for those of us building Agents?
These blink-and-you-miss-them updates might be the biggest leap forward in years.
And when you line them up alongside OpenAI Harmony, the potential compounds — especially for anyone working on LLM post-training or structured generation. That’s a topic big enough to deserve its own deep dive — and that’s exactly what I’ll cover in my next article.