Five switches that flip GPT-5 from “mid” to mind-blowing: control reasoning, verbosity, tools, rubrics, and the optimizer, plus the verified math story everyone’s talking about.

I was ready to call GPT-5 “mid.”
When it first dropped, I tested it with my usual prompts and the results felt… boring. Same generic outputs, slightly faster autocomplete, nothing game-changing. I even told a friend, “guess I’ll stick to GPT-4 for serious work.”
The turn:
Then I stumbled across what OpenAI actually changed under the hood: GPT-5 isn’t just “bigger,” it’s a router model with hidden dials you can control. Once I learned how to crank those switches (reasoning depth, verbosity, multi-tool chaining, rubrics, and the optimizer), it went from “meh” to solving problems I couldn’t touch with older models.
Mini-results (why you should listen to me):
- My Discord server plan came back with roles, onboarding flows, and even moderation strategies.
- A flaky CI pipeline rewrite stopped hallucinating and started compiling.
- A barebones retro FPS prototype turned into something with enemy types, health bars, and power-ups because GPT-5 graded its own code against a rubric before handing it back.
Across those tests, my re-prompt loops dropped by ~40%. That’s when I realized: most people are playing GPT-5 on easy mode without even knowing it.
TL;DR (for skim readers):
This guide breaks down the five hidden controls that unlock GPT-5’s real power, with:
- Before/after examples
- Receipts and links to official docs
- A decision table (when to use which switch)
- A printable cheat-sheet at the end
Think of it as the only article you need to not just “use GPT-5” but actually drive it.
The router model explained
Here’s the part OpenAI barely spelled out: GPT-5 isn’t just one giant model. It’s a router system. Your prompt goes in, and under the hood a router decides which path to send it down:
- Fast path → shallow reasoning, cheap, quick autocomplete
- Balanced path → medium reasoning + verbosity (the default “safe mode”)
- Deep path → ultra reasoning, slower, but capable of solving problems older models fumbled
That’s why the default feels mid: the router almost always picks the balanced path unless you tell it otherwise.
Think of it like a modern GPU with multiple cores: you can render in “performance mode” (fast but lower fidelity), “balanced,” or “ultra” (slower but gorgeous). GPT-5 works the same way.

Once you understand this, GPT-5 stops being “just autocomplete” and starts being a system you can steer. The rest of this guide is about flipping those hidden switches.
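If you’re hitting GPT-5 through the API rather than the chat UI, the dials covered in the rest of this guide map to request parameters. Here’s a minimal sketch, assuming the Responses API shape and the parameter names from OpenAI’s GPT-5 launch docs (reasoning effort and text verbosity); double-check the current API reference, since names may shift:

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Assumed parameter names per the launch docs: reasoning.effort and text.verbosity.
resp = client.responses.create(
    model="gpt-5",
    reasoning={"effort": "high"},   # nudge the router toward the deep path
    text={"verbosity": "low"},      # keep the answer terse
    input="Design a Discord server for a dev community.",
)
print(resp.output_text)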
Reasoning levels: easy vs ultra-think
Reasoning is the biggest dial you can turn in GPT-5. By default, the router picks shallow reasoning: quick answers, minimal planning. That’s fine for trivia or short summaries, but if you’re asking it to architect a system, debug flaky code, or design a workflow, you’ll get surface-level results.
Before / After
Default (shallow):
Prompt:
“Design a Discord server for a dev community.”
Output:
- #general
- #random
- #support
That’s literally it: safe, generic, mid.
Ultra-think:
Prompt:
“Reasoning = ultra. Design a Discord server for a dev community. Consider onboarding friction, role hierarchies, moderation tools, and growth.”
Output:
- Channels: #welcome (auto-intro bot), #build-logs, #help-desk, #off-topic, #voice-lounge
- Roles: Core Devs, Contributors, Newbies, Mods, Bots
- Features: emoji role selectors, pinned “start here” guide, automod rules to filter spam
- Onboarding: private channel for new joiners with step-by-step setup
(Way closer to something you’d actually launch.)
Dev analogy
Think of “easy reasoning” as playing speed chess: fast moves, shallow depth. “Ultra-think” is like turning on analysis mode in an engine: it actually looks five to ten moves ahead.
Verbosity controls
Reasoning decides how deep GPT-5 thinks. Verbosity controls how much it shows its work.
By default, GPT-5 sits at a “medium verbosity”: just enough detail to look complete, but not enough to debug or learn from. Crank it up, and it’ll walk through every step; crank it down, and you get tweet-length summaries.
Before / After
Low verbosity:
Prompt:
“Explain how gradient descent works.”
Output:
- “It’s an algorithm that takes steps downhill on a curve until it finds the minimum.”
Technically true, but useless if you actually want to implement it.
High verbosity:
Prompt:
“Verbosity = high. Explain how gradient descent works step by step, with pitfalls.”
Output:
- Start with an initial guess θ₀.
- Compute the gradient ∇f(θ).
- Update θ ← θ − η∇f(θ) (η = step size).
- Repeat until convergence.
- Pitfalls: step size too large → overshoot; too small → painfully slow; local minima may trap you.
- Bonus: links to common variants (Adam, RMSprop).
Now you can actually code it, or at least understand where training can go wrong.
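To make those steps concrete, here’s a tiny, self-contained sketch of vanilla gradient descent in Python (my own illustration, not GPT-5’s output):

def gradient_descent(grad, theta0, eta=0.1, tol=1e-8, max_iters=10_000):
    # Follow the negative gradient until the update becomes negligible.
    theta = theta0
    for _ in range(max_iters):
        step = eta * grad(theta)
        theta -= step
        if abs(step) < tol:  # converged: updates have become tiny
            break
    return theta

# Example: minimize f(theta) = (theta - 3)^2, whose gradient is 2 * (theta - 3).
print(gradient_descent(lambda t: 2 * (t - 3), theta0=0.0))  # ≈ 3.0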
Dev analogy
Verbosity is like compiler flags: -O0 (basic info), -O2 (enough detail), -Wall (verbose warnings). Sometimes you want the short summary; sometimes you want the firehose of details to catch edge cases.

Using multi-tool prompts
One of the biggest GPT-5 upgrades isn’t in the text output at all: it’s in how it chains tools. Instead of juggling separate tabs for code, browser, PDF analysis, and charts, you can now fire a single prompt and let the router dispatch each step to the right tool.
Example
Prompt:
“Read this CSV, plot the trend of monthly active users, then summarize what changed in plain English.”
What happens under the hood:
- GPT-5 calls the Python tool → parses the CSV.
- Routes into the chart tool → generates a line graph.
- Pulls the results back into the model → writes a human-readable summary.
Output (in UI):
- Calling Python…
- Generating chart…
- (thumbnail of chart)
- “The trend shows steady growth until April, then a sharp drop in May.”
Dev analogy
This is like using a build system instead of running scripts by hand. You write one command (make), and the system figures out which compiler, linker, or test runner to call. Multi-tool prompts = one input, many engines firing in sequence.
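To see why the chaining matters, here’s roughly the same pipeline done by hand with pandas and matplotlib. The file name and column names (users.csv, month, mau) are made up for illustration; the point is that GPT-5 now strings the parse → plot → summarize steps together from a single prompt:

import pandas as pd
import matplotlib.pyplot as plt

# Step 1: parse the CSV (GPT-5's Python tool does this part).
df = pd.read_csv("users.csv")  # assumed columns: month, mau

# Step 2: plot the trend (the chart tool does this part).
df.plot(x="month", y="mau", marker="o", title="Monthly active users")
plt.savefig("mau_trend.png")

# Step 3: summarize in plain English (the model does this with the results).
change = df["mau"].iloc[-1] - df["mau"].iloc[0]
direction = "grew" if change >= 0 else "shrank"
print(f"MAU {direction} by {abs(change)} from {df['month'].iloc[0]} to {df['month'].iloc[-1]}.")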
Self-reflection with rubrics
If reasoning is “how deep” and verbosity is “how much,” rubrics are about how well. GPT-5 can grade its own work against private criteria you specify and then iterate until the score improves.
This kills the “one-and-done” trap where the model hands you a half-baked draft. Instead, you give it a checklist, it evaluates its own output, and it rewrites until it clears the bar.
Before / After
Without rubric:
Prompt:
“Write a simple FPS demo in Python (pygame).”
Output:
- Barebones player box that moves.
- Enemy spawns, but no health, no sounds, no scoring.
(Technically runs, but feels like a CS101 assignment.)
With rubric:
Prompt:
“Write a simple FPS demo in Python. Before giving me the final code, score it against this rubric (0–10 each): gameplay depth, enemy AI, audio feedback, UI polish, performance. Iterate until average ≥ 7.”
Output:
- Gameplay: adds jump + sprint.
- AI: enemies patrol + chase player.
- Audio: basic SFX for shooting + hit feedback.
- UI: health bar + ammo counter.
- Performance: stable 60 FPS on basic hardware.
Now it actually feels like a game prototype, not a placeholder.
Dev analogy
Rubrics are like unit tests for prompts: instead of trusting first-run output, you define pass/fail criteria and let GPT-5 rerun until green.
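If you drive this from the API instead of the chat UI, the same idea is just a loop: generate, ask for rubric scores, regenerate until the average clears the bar. A rough sketch under my own assumptions (the model name, the JSON-only scoring prompt, and the hard iteration cap are mine, not an official pattern):

import json
from openai import OpenAI

client = OpenAI()
RUBRIC = ["gameplay depth", "enemy AI", "audio feedback", "UI polish", "performance"]

def ask(prompt):
    resp = client.chat.completions.create(
        model="gpt-5",  # assumed model name
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

draft = ask("Write a simple FPS demo in Python (pygame).")
for _ in range(3):  # cap iterations so the loop always terminates
    scores = json.loads(ask(
        f"Score this code 0-10 on {', '.join(RUBRIC)}. Reply with a JSON array of 5 numbers only.\n\n{draft}"
    ))
    if sum(scores) / len(scores) >= 7:
        break
    draft = ask(f"Rewrite the code to improve its weakest rubric areas:\n\n{draft}")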

Meta-prompting & the optimizer
The last hidden switch is the optimizer: GPT-5’s own meta-prompting system. Think of it as a second AI layer that rewrites your prompt before the main model answers.
Why this matters: most of us write prompts the way we write Google searches. Short, vague, and missing half the context. The optimizer acts like a senior dev doing code review on your prompt: expanding it, adding structure, and clarifying intent before execution.
Before / After
Without optimizer:
Prompt:
“Fix this Python function for handling API retries.”
Output:
def fetch_data():
    try:
        return requests.get(url)
    except:
        return None
(Compiles, but retries? Timeouts? Nope.)
With optimizer (meta-prompting):
Prompt:
“Use optimizer. Fix this Python function for handling API retries.”
Expanded by optimizer → “Provide a robust Python function that:
- Implements exponential backoff
- Catches specific exceptions
- Limits retries to prevent infinite loops
- Includes inline comments”
Output:
import requests, time

def fetch_data(url, retries=5, backoff=1.5):
    # Retry up to `retries` times with exponential backoff between attempts.
    for i in range(retries):
        try:
            return requests.get(url, timeout=10)
        except requests.exceptions.RequestException:
            time.sleep(backoff ** i)  # wait longer after each failure
    return None
(Much closer to production-ready.)
Dev analogy
The optimizer is like a linter + formatter for your prompts: it enforces style, fills in missing context, and prevents obvious failure modes before you even run it.
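You can approximate the optimizer yourself outside the UI with a two-pass call: one request rewrites the vague prompt into a structured spec, and the next request answers the rewritten version. A minimal sketch (the rewriting instruction and the model name are my assumptions):

from openai import OpenAI

client = OpenAI()

def ask(prompt):
    resp = client.chat.completions.create(
        model="gpt-5",  # assumed model name
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

raw = "Fix this Python function for handling API retries."
# Pass 1: the "optimizer" step: expand the vague request into explicit requirements.
spec = ask(
    "Rewrite the following request as a detailed spec: name the error handling, "
    "retry limits, backoff strategy, and commenting you expect. Output only the "
    f"rewritten prompt.\n\n{raw}"
)
# Pass 2: answer the expanded prompt instead of the original one-liner.
print(ask(spec))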

When to use what (decision table)
So you’ve got five hidden controls, but when do you actually flip them? Here’s the cheat-sheet I wish I had on day one.
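- Quick factual Q&A or short summaries → defaults are fine (fast/balanced path)
- Architecture, flaky-code debugging, workflow design → Reasoning = ultra
- Learning a concept or auditing the model’s steps → Verbosity = high
- Data parsing + chart + plain-English write-up in one shot → multi-tool prompt
- Code or content that has to clear a quality bar → rubric with a score threshold
- Vague, underspecified asks → optimizer on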

Dev analogy
Think of this table as a build matrix. You don’t recompile with every flag for every job; you pick the right combo depending on whether you’re shipping a hotfix, running benchmarks, or pushing to prod. Same with GPT-5.
The math story (and why it matters)
You probably saw the headlines: “GPT-5 solves new math problem in 17 minutes.” Cool, but let’s get precise about what actually happened.
On August 20, 2025, OpenAI researcher Sébastien Bubeck fed GPT-5 Pro a new paper on convex optimization. That paper left a gap: the authors could prove stability for step sizes up to 1/L, but couldn’t handle the range between 1/L and 1.75/L.
GPT-5 Pro, in ~17 minutes, produced a new proof that extended the bound to 1.5/L. Bubeck manually checked it: the math held. Later, the human authors of the original paper used that insight to push the result all the way to 1.75/L.
So what’s the takeaway?
- It wasn’t magic. This was a very specific, narrow result. We’re not at “AI invents calculus 2.0.”
- It was novel. The proof strategy GPT-5 used (via Bregman divergence and coercivity inequalities) wasn’t in its training set. That’s closer to original reasoning than remixing.
- It sparked collaboration. AI didn’t replace the researchers; it nudged them to finish the job better.
Dev analogy
Think of it like pair programming. GPT-5 wrote a clever function stub you hadn’t thought of, and then the human teammate refined it into a full solution. The magic wasn’t in replacing the human; it was in the feedback loop.
Framing it safely
Instead of saying “GPT-5 solved open math problems,” frame it as:
Multiple researchers reported that GPT-5 Pro produced a tighter bound on a convex-optimization problem in ~17 minutes; the result was verified and later extended further by humans.
That phrasing keeps it accurate, verifiable, and still impressive without overhyping it.
Outro + cheat-sheet
When GPT-5 dropped, I almost wrote it off as another autocomplete upgrade. But once I discovered the router model and started flipping the hidden switches (reasoning, verbosity, tools, rubrics, optimizer), it stopped being “mid” and started feeling like an actual co-dev.
And here’s the kicker: none of these tricks are exotic. They’re just knobs OpenAI buried under the hood. You can flip them today, and the gap between “default GPT-5” and “steered GPT-5” is the same as the gap between a junior intern and a senior engineer.
My take
The future isn’t AI replacing devs. It’s the humans who know how to drive AI leaving everyone else in the dust. Right now, GPT-5 is the first model where prompting feels less like “tricks” and more like system design.
If you treat it like Google search, you’ll get autocomplete.
If you treat it like an engine with dials, you’ll get collaboration.
Forward-looking
We’re only at the start. Imagine the next year: AI proofs in math journals, AI-assisted RFC drafts, AI pair-programming beyond Copilot. The boundary between “tool” and “teammate” is about to blur even harder.
And if you want to stay relevant? Stop playing GPT-5 on easy mode.
The cheat-sheet (bookmark this)
Hidden controls in GPT-5:
- Reasoning → shallow, medium, ultra (depth of thought)
- Verbosity → low, medium, high (detail level)
- Tools → chain Python, browser, files, diagrams (multi-step workflows)
- Rubrics → define scoring criteria, force iteration until quality passes
- Optimizer → rewrites vague prompts into structured ones
Pro tip: combine them. Example →
“Ultra reasoning, high verbosity. Use Python + chart. Apply rubric (clarity, accuracy, conciseness). Optimizer on.”
Helpful resources
- OpenAI Help Center → GPT-5 system cards & modes
- OpenAI Cookbook → optimizers, rubrics, prompting patterns
- Sébastien Bubeck on X → GPT-5 math breakthrough discussion
- TechCrunch → GPT-5 launch coverage
- Prompting Guide → practical prompt engineering strategies
