GPT-5's Quiet Upgrades That Supercharge Agent Development

OpenAI’s GPT-5 livestream ran a full hour, packed with demos and model comparisons. But there’s an 80-second stretch (44:10–45:30) where Michelle Pokrass drops three new features so casually you could blink and miss them.

If you’ve been in the Agent development trenches for a while, you know these aren’t “nice-to-have” tweaks — they’re quiet depth charges planted right under the Agent ecosystem. They’re not built for showy demos. They go straight for the pain points that slow us down every day:

  • Tool calls that feel like a black box
  • Escape-sequence chaos that explodes at the worst possible moment
  • Output formats that change on a whim

Here’s why these “80 seconds of airtime” might be the most important thing in GPT-5 for agent developers.

TL;DR

Tool Preambles

Docs

Prepend a short, user-friendly explanation before the model calls a tool — improving transparency without extra hacks. Works even with tool_choice="required".

system_prompt = "When providing weather, use the proper unit. But don't ask user for it, make a best guess. \
Always add tool call preambles to explain your actions in advance."

oai_response = oai_client.chat.completions.create(
    model="gpt-5",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "What's the weather like in New York?"},
    ],
    tool_choice="required",  # <--- Note here
    tools=[oai_get_weather],
)
print_response(oai_response)

# Output:
# [Preamble] I’m going to check the current weather for New York using our weather service, using Fahrenheit since New York is in the United States.
# get_weather({"location":"New York","units":"fahrenheit"})

Custom Tools

Docs

Define a custom_tool that takes free-form text input — no more JSON escape-sequence nightmares.

json_string = json.dumps({"data": json.dumps({"profile": json.dumps({"name": "\"Joe\\Doe\""})})})
# {"data": "{\"profile\": \"{\\\"name\\\": \\\"\\\\\\\"Joe\\\\\\\\Doe\\\\\\\"\\\"}\"}"}
response = oai_client.responses.create(
    model="gpt-5",
    input=f"Deserialize this json string: {json_string}",
    tools=[
        {
            "type": "custom",
            "name": "magic_deserializer",
            "description": "A magic deserializer for heavily nested JSON",
            "format": {"type": "text"},
        }
    ],
    reasoning={"effort": "minimal"},
)
print_response(response)
# Output
# Tool: magic_deserializer
# Input:
# {"data": "{\"profile\": \"{\\\"name\\\": \\\"\\\\\\\"Joe\\\\\\\\Doe\\\\\\\"\\\"}\"}"}

CFG (Context-Free Grammar)

Docs

Lock model output to a strict format with CFG — the tool input the model emits is guaranteed to conform to the grammar you define.

sql_prompt_mssql = (
    "Call the mssql_grammar to generate a query for Microsoft SQL Server that retrieve the "
    "five most recent orders per customer, showing customer_id, order_id, order_date, and total_amount, "
    "where total_amount > 500 and order_date is after '2025-01-01'. "
)

response_mssql = oai_client.responses.create(
    model="gpt-5",
    input=sql_prompt_mssql,
    text={"format": {"type": "text"}},
    tools=[
        {
            "type": "custom",
            "name": "mssql_grammar",
            "description": (
                "Executes read-only Microsoft SQL Server queries limited to SELECT statements with TOP and basic WHERE/ORDER BY. "
                "YOU MUST REASON HEAVILY ABOUT THE QUERY AND MAKE SURE IT OBEYS THE GRAMMAR."
            ),
            "format": {
                "type": "grammar",  # <--- Here
                "syntax": "lark",
                "definition": mssql_grammar,
            },
        },
    ],
    parallel_tool_calls=False,
)

print_response(response_mssql)

# Output
# Tool: mssql_grammar
# Input:
# SELECT TOP 5 customer_id, order_id, order_date, total_amount FROM orders WHERE total_amount > 500 AND order_date > '2025-01-01' ORDER BY order_date DESC;

1. Tool Preambles — Transparency Without Hacks

If you’ve integrated tool calls before, you’ve probably noticed:
Claude models often preface a tool call with a human-readable explanation (a “preamble”), while GPT models, historically, just fire off the call silently.
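
Quick setup note before the side-by-side: these snippets rely on a get_weather tool definition for each SDK plus the small print_response helper used throughout. Here is a minimal sketch of that boilerplate, assuming the standard OpenAI and Anthropic Python SDKs; swap in your own versions as needed.

# Minimal sketch of the setup these snippets assume (clients, tool schemas,
# and the print_response helper); adapt to your own project as needed.
import json
from openai import OpenAI
from anthropic import Anthropic

oai_client = OpenAI()
ant_client = Anthropic()

weather_schema = {
    "type": "object",
    "properties": {
        "location": {"type": "string"},
        "units": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["location"],
}

# OpenAI Chat Completions tool definition
oai_get_weather = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a location.",
        "parameters": weather_schema,
    },
}

# Anthropic Messages tool definition
ant_get_weather = {
    "name": "get_weather",
    "description": "Get the current weather for a location.",
    "input_schema": weather_schema,
}

def print_response(response):
    """Print the preamble text (if any) and tool calls for the three response shapes used here."""
    if hasattr(response, "choices"):  # OpenAI Chat Completions
        message = response.choices[0].message
        print(f"[Preamble] {message.content}")
        for call in message.tool_calls or []:
            print(f"{call.function.name}({call.function.arguments})")
    elif hasattr(response, "output"):  # OpenAI Responses API
        for item in response.output:
            if item.type == "message":
                print(item.content[0].text)
            elif item.type == "custom_tool_call":
                print(f"Tool: {item.name}\nInput:\n{item.input}")
    else:  # Anthropic Messages
        for block in response.content:
            if block.type == "text":
                print(f"[Preamble] {block.text}")
            elif block.type == "tool_use":
                print(f"{block.name}({json.dumps(block.input)})")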

Claude example:

ant_response = ant_client.messages.create(
    model="claude-3-5-sonnet-latest",
    system="When providing weather, use the proper unit. But don't ask user for it, make a best guess.",
    max_tokens=500,
    tools=[ant_get_weather],
    messages=[{"role": "user", "content": "What's the weather like in New York?"}],
)
print_response(ant_response)

# Output:
# [Preamble] I'll check the current weather in New York. Since this appears to be a US-based request, I'll use Fahrenheit units as that's most common in the United States.
# get_weather({"location": "New York", "units": "fahrenheit"})

GPT without proper prompting:

oai_response = oai_client.chat.completions.create(
    model="gpt-5",
    messages=[
        {"role": "system", "content": "When providing weather, use the proper unit. But don't ask user for it, make a best guess."},
        {"role": "user", "content": "What's the weather like in New York?"},
    ],
    tools=[oai_get_weather],
)
print_response(oai_response)

# Output:
# [Preamble] None
# get_weather({"location":"New York","units":"fahrenheit"})

Now you can add prompts to make the model explain its actions first. While pre-GPT-5 models could technically do this, GPT-5 delivers far more stable performance (based on my hands-on experience). GPT-5 explains itself even when forced to use tools (tool_choice="required"). Previous models (including Claude) would skip the preamble in this scenario.

system_prompt = "When providing weather, use the proper unit. But don't ask user for it, make a best guess. \
Always add tool call preambles to explain your actions in advance."

oai_response = oai_client.chat.completions.create(
    model="gpt-5",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "What's the weather like in New York?"},
    ],
    tool_choice="required",  # <--- Note here
    tools=[oai_get_weather],
)
print_response(oai_response)

# Output:
# [Preamble] I’m going to check the current weather for New York using our weather service, using Fahrenheit since New York is in the United States.
# get_weather({"location":"New York","units":"fahrenheit"})

2. Custom Tools — Finally, No More Escape Sequence Nightmares

Anyone who’s built agents knows:
JSON function calling is great… until you have \\\\ in your payload. Then it’s a coin flip whether the model adds or drops a backslash.

We’ve all resorted to YAML outputs, fenced code blocks, or hand-rolled parsers. Now? You can simply declare a tool with "format": {"type": "text"} and let GPT-5 pass free-form input directly — no escaping required.

json_string = json.dumps({"data": json.dumps({"profile": json.dumps({"name": "\"Joe\\Doe\""})})})
# {"data": "{\"profile\": \"{\\\"name\\\": \\\"\\\\\\\"Joe\\\\\\\\Doe\\\\\\\"\\\"}\"}"}
response = oai_client.responses.create(
    model="gpt-5",
    input=f"Deserialize this json string: {json_string}",
    tools=[
        {
            "type": "custom",            # <-- Here
            "format": {"type": "text"},  # <-- Here
            "name": "magic_deserializer",
            "description": "A magic deserializer for heavily nested JSON",
        }
    ],
    reasoning={"effort": "minimal"},
)
print_response(response)

# Output
# Tool: magic_deserializer
# Input:
# {"data": "{\"profile\": \"{\\\"name\\\": \\\"\\\\\\\"Joe\\\\\\\\Doe\\\\\\\"\\\"}\"}"}
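
Consuming the call on your side takes only a few lines. Here is a minimal sketch, assuming the custom_tool_call item shape from the Responses API; the json.loads call is just a stand-in for whatever parsing you actually want to do.

# Sketch: read the free-form text the model sent to our custom tool (no un-escaping needed)
for item in response.output:
    if item.type == "custom_tool_call" and item.name == "magic_deserializer":
        raw_text = item.input           # exact text as the model produced it
        parsed = json.loads(raw_text)   # stand-in for your real deserialization logic
        print(parsed["data"])
        # The result then goes back to the model as this call's tool output
        # (keyed by item.call_id) in the follow-up responses.create(...) request.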

Now you can ditch custom parsers entirely and enjoy a unified tool-calling interface. But combine this with the next feature, and we're entering next-level territory!


3. CFG — Turning “Format by Prompting” into “Format by Law”

Before GPT-4, many of us loved open-source models for one big reason: you could hard-enforce output formats during decoding using tools like Guidance, lm-format-enforcer, or Outlines. These tools require access to the model's output logits and dynamically adjust logit biases, which made them unusable with OpenAI's hosted GPT models.

Now, GPT-5 bakes that enforcement in with LLGuidance under the hood to constrain model sampling!
You can attach a Lark or regex grammar to a custom tool, and the model will only emit valid outputs according to that grammar. No retries. No post-processing clean-ups.

Take the official example - it defines an MS SQL grammar where the key start rule guarantees every query will:

  • Always open with SELECT TOP
  • Require FROM, WHERE and ORDER BY clauses
# ----------------- grammars for MS SQL dialect -----------------
mssql_grammar = r"""
// ---------- Punctuation & operators ----------
SP: " "
COMMA: ","
GT: ">"
EQ: "="
SEMI: ";"

// ---------- Start ----------
start: "SELECT" SP "TOP" SP NUMBER SP select_list SP "FROM" SP table SP "WHERE" SP amount_filter SP "AND" SP date_filter SP "ORDER" SP "BY" SP sort_cols SEMI

// ---------- Projections ----------
select_list: column (COMMA SP column)*
column: IDENTIFIER

// ---------- Tables ----------
table: IDENTIFIER

// ---------- Filters ----------
amount_filter: "total_amount" SP GT SP NUMBER
date_filter: "order_date" SP GT SP DATE

// ---------- Sorting ----------
sort_cols: "order_date" SP "DESC"

// ---------- Terminals ----------
IDENTIFIER: /[A-Za-z_][A-Za-z0-9_]*/
NUMBER: /[0-9]+/
DATE: /'[0-9]{4}-[0-9]{2}-[0-9]{2}'/
"""
sql_prompt_mssql = (
    "Call the mssql_grammar to generate a query for Microsoft SQL Server that retrieve the "
    "five most recent orders per customer, showing customer_id, order_id, order_date, and total_amount, "
    "where total_amount > 500 and order_date is after '2025-01-01'. "
)

response_mssql = oai_client.responses.create(
    model="gpt-5",
    input=sql_prompt_mssql,
    text={"format": {"type": "text"}},
    tools=[
        {
            "type": "custom",
            "name": "mssql_grammar",
            "description": (
                "Executes read-only Microsoft SQL Server queries limited to SELECT statements with TOP and basic WHERE/ORDER BY. "
                "YOU MUST REASON HEAVILY ABOUT THE QUERY AND MAKE SURE IT OBEYS THE GRAMMAR."
            ),
            "format": {
                # "type": "text",
                "type": "grammar",  # <--- Note Here
                "syntax": "lark",
                "definition": mssql_grammar,
            },
        },
    ],
    parallel_tool_calls=False,
)

print_response(response_mssql)

# Output
# Tool: mssql_grammar
# Input:
# SELECT TOP 5 customer_id, order_id, order_date, total_amount FROM orders WHERE total_amount > 500 AND order_date > '2025-01-01' ORDER BY order_date DESC;
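
Lark isn't the only option either; as noted above, the same format block accepts a regex syntax for simpler constraints. A hypothetical sketch (the tool name, prompt, and pattern here are my own, purely for illustration):

# Hypothetical sketch: the same grammar format with regex syntax instead of Lark.
# Tool name, description, prompt, and pattern are invented for illustration.
response_date = oai_client.responses.create(
    model="gpt-5",
    input="Record 14 March 2025 as the launch date by calling set_launch_date.",
    tools=[
        {
            "type": "custom",
            "name": "set_launch_date",
            "description": "Records a launch date, formatted as YYYY-MM-DD.",
            "format": {
                "type": "grammar",
                "syntax": "regex",              # <--- regex instead of "lark"
                "definition": r"\d{4}-\d{2}-\d{2}",
            },
        }
    ],
)
print_response(response_date)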

Why This Matters

Individually, they might look like small, unrelated updates.
Together, they signal a deeper shift in OpenAI’s Agent design philosophy:

  • From “usable” to “trustable” — Preambles bring explainability, CFG delivers reliability
  • From “developer adapts to the model” to “model adapts to the developer” — Custom Tools handle complex inputs without contortions
  • From “prompt-engineering folklore” to “engineering discipline” — CFG turns soft guidelines into hard guarantees

The Real Takeaway

GPT-5’s benchmark charts will grab all the headlines.
But for those of us building Agents?

These blink-and-you-miss-them updates might be the biggest leap forward in years.

And when you line them up alongside OpenAI Harmony, the potential compounds — especially for anyone working on LLM post-training or structured generation. That’s a topic big enough to deserve its own deep dive — and that’s exactly what I’ll cover in my next article.
