Łukasz Blania

ChatGPT-5 Is Amazing. You Just Don’t Know How to Use It Properly. Let Me Show You Why.

Let’s talk like humans for a second. You typed a vague thought into a text box, hit Enter, and expected cinematic genius to roll out of the machine like a red carpet. Instead, you got something… meh. You sighed. “ChatGPT-5 is worse than 4o!” you posted, shaking your fist at the cloud.

I’ve got news that will both empower you and mildly annoy you: GPT-5 is better than GPT-4o for most real work — agents, tool use, coding, long context, instruction following — but it punishes sloppy prompts more than its predecessors. The model is more steerable, more systematic, and far more literal. If you tell it where to go, it’ll drive. If you say “take me somewhere nice,” don’t be surprised when you end up in a parking lot with decent lighting.

This article is a friendly deep-dive into how to get consistently great results with GPT-5. We’ll quickly cover what changed from the GPT-4o era, why your results might feel bland, how the new modes (Instant, Auto, Thinking, and Pro) actually work, and a handful of practical snippets — straight from the playbook of people who use GPT-5 all day — to help you write prompts that don’t waste tokens or time.

Grab a coffee. I’ll keep it conversational, a little dry-humored, and very practical.


What Changed: From a Zoo of Models to One Agentic System

In the GPT-4o days, you often had to choose your fighter — a model for speed, another for multimodal, another for reasoning, and so on. GPT-5 simplifies that mental overhead. Think of it less like “a model” and more like an agentic engine with multiple gears:

  • A unified core (one flagship model) that routes internally.
  • Modes that change how hard it thinks and how verbose it speaks.
  • Agent-first behavior: better at calling tools, following rules, and persisting plans across steps.
  • Stronger instruction adherence: if your instructions conflict or are vague, GPT-5 will spend effort reconciling that mess instead of producing gold from chaos.

The practical effect: less menu anxiety, more steering power. But you have to actually steer.


“My Results Are Terrible” — Why That Happens with GPT-5

Let’s get the uncomfortable bit out of the way: GPT-5 is allergic to ambiguity. GPT-4o was often forgiving. You could toss a fuzzy prompt at it and it would do a decent job guessing your intent. GPT-5 can still guess — but it’s trained to respect your constraints and follow your process when you give it one. When you don’t, it can over-invest in clarification or under-deliver because the target was never pinned down.

Here are the three most common causes of “bad answers” with GPT-5:

  1. Vague goals
    • “Write a plan” is not a goal.
    • “Draft a 7-step launch plan for a two-person SaaS, each step ≤120 words, with a simple checklist and a 14-day timeline” is a goal.
  2. Contradictory rules
    • “Never ask the user to confirm anything” vs. “Always obtain explicit consent before booking” is the kind of conflict that sends GPT-5 into polite paralysis. It tries to satisfy both. Resolve conflicts yourself.
  3. Missing stop conditions
    • Agents need to know when to stop. If you don’t define finish lines, they’ll keep circling the track — or stop too early. You decide the lap count.

The fix: be explicit about what you want, how you want it, how far the model should explore, and when it’s allowed to proceed despite uncertainty.


The Modes: Instant, Auto, Thinking, Pro — What They Actually Do

GPT-5 exposes four working styles. These aren’t just “speed settings”; they’re behavior contracts. Use the right one for the job, and your life gets easier.

1) Instant — “Answer fast, don’t overthink it”

Instant mode is the one that answers as quickly as possible, often sacrificing depth and nuance to give you something fast and usable. It shines when the task is simple — like rewriting a sentence, summarizing a short passage, or giving a factual answer — but it won’t bother with complex reasoning or edge cases. If you ask it to solve a real problem, it will simply hand you the quickest version of an answer and move on.

When to use:

  • Quick summaries, short emails, simple rewrites, straightforward Q&A.
  • Latency-sensitive interfaces (chat widgets, quick drafts).
  • You already know what you want; you just need it formatted or rephrased.

Strengths:

  • Low latency, low cost.
  • Great when the task is clear and bounded.

Pitfalls:

  • Doesn’t wander far. If the task is fuzzy, Instant will give you a tidy version of your fuzziness.
  • Minimal context gathering; it won’t go spelunking unless told.

How to prompt for Instant:


Search depth: very low
Absolute max tool calls: 2
Proceed with the best available answer even if not fully certain.
Stop when output meets the format and constraints below.

  • Audience: startup founders, non-technical
  • Length: ≤ 180 words
  • Format: 3 bullets + 1-sentence CTA
  • Tone: concise, friendly, no fluff

Why this works: you’ve set a tiny budget, a clear finish line, and a crisp format. Instant thrives on constraints.
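
The named modes live in the ChatGPT app. If you’re building on the API instead, the closest lever is the reasoning-effort setting. Here’s a minimal sketch using the OpenAI Python SDK’s Responses API; treat the exact parameter values as assumptions to verify against the current docs:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Instant-style call: minimal reasoning effort, terse final answer.
response = client.responses.create(
    model="gpt-5",
    reasoning={"effort": "minimal"},  # "answer fast, don't overthink it"
    text={"verbosity": "low"},        # keep the final answer tight
    input=(
        "Rewrite the pitch below for startup founders (non-technical). "
        "Length <= 180 words. Format: 3 bullets + 1-sentence CTA. "
        "Proceed with the best available answer even if not fully certain.\n\n"
        "[PASTE PITCH]"
    ),
)
print(response.output_text)
```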

2) Auto — “Pick the right depth for me”

Auto mode is the balanced choice. It adapts to the difficulty of the task, behaving like Instant when things are straightforward, but taking more time and reasoning when the problem requires it. This makes it the most practical mode for everyday use, because you don’t have to decide how much thinking the model should do — it figures it out on its own. The catch is that if your instructions are vague, Auto might overanalyze and spend too much time gathering context before giving you what you want.

When to use:

  • Most day-to-day tasks where you don’t want to micromanage.
  • The model can decide whether to think shallow or deep.
  • You’re okay with it calling tools and making a plan if needed.

Strengths:

  • Balanced. Often the best default.
  • Can switch gears mid-task: skim where trivial, dig where tricky.

Pitfalls:

  • If your instructions are vague, Auto might “gather context” longer than you expected.
  • If your rules conflict, Auto burns cycles reconciling them.

How to prompt for Auto (calibrated):


Goal: get just enough context to act.
Method:

  • Start broad, then one focused batch of subqueries.
  • Deduplicate; cache; avoid repeated searches.

Early stop:

  • You can name exact items to change or produce.

Escalate once:

  • If signals conflict, do one refined batch, then act.

Keep going until all subtasks in the plan are done. Do not hand back on uncertainty; proceed with the most reasonable assumption, and document it in the summary.

3) Thinking — “Take your time, reason deeply, chain steps”

Thinking mode is where GPT-5 goes deep. Instead of rushing, it breaks the problem apart, makes a plan, and follows through step by step. This mode is built for accuracy, analysis, and complex reasoning. It’s slower and more expensive, but if you want a model that can handle multi-step research, write thorough strategies, or work inside large codebases, this is the gear you want. Without clear instructions, however, it can end up overthinking and generating more than you bargained for.

When to use:

  • Complex multi-step problems, long-horizon tasks, architecture, tricky analysis, non-trivial refactors, research synthesis.
  • Anywhere accuracy, edge cases, and planning matter more than speed.

Strengths:

  • The highest quality reasoning and decomposition.
  • Excellent at following multi-stage processes and double-checking work.

Pitfalls:

  • Slower and more expensive.
  • If you don’t set stop conditions, it may over-invest in “thoroughness.”

How to prompt for Thinking (with guardrails):

  • Decompose the request into a numbered plan (max 7 steps).
  • Before executing, validate the plan against constraints and risks.
  • After each step, self-check: does this meet the acceptance criteria?

Acceptance criteria:

  • Output passes all format checks.
  • No TODOs or placeholders remain.
  • Edge cases addressed: [list your edge cases].

Budget:

  • Total tool calls: ≤ 6 unless a blocker is detected.
  • If a blocker is detected, explain briefly, then request explicit permission to exceed.

Pro move: pair Thinking with the Responses API and reuse previous_response_id across turns. This preserves the model’s plan/context without re-paying for it each time.
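
On the API side, the rough Thinking-mode analogue (my assumption, not an official mapping) is high reasoning effort plus your guardrails passed as instructions. A sketch:

```python
from openai import OpenAI

client = OpenAI()

# Thinking-style call: high reasoning effort plus explicit guardrails.
response = client.responses.create(
    model="gpt-5",
    reasoning={"effort": "high"},  # take time, plan, self-check
    instructions=(
        "Decompose the request into a numbered plan (max 7 steps). "
        "Validate the plan against constraints before executing. "
        "Total tool calls <= 6 unless a blocker is detected."
    ),
    input="Compare these three vendor APIs and recommend one: [DETAILS]",
)
print(response.output_text)
```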

4) Pro — “All-in performance with polish”

Pro mode is the perfectionist. It doesn’t just think deeply like Thinking mode; it also packages the result neatly, with polish, structure, and presentation. This makes it the mode for high-stakes tasks: executive memos, legal drafts, detailed proposals, or anything where you want the output to look like it came from a professional rather than an assistant. It is the slowest and most resource-intensive option, and sometimes it’s simply overkill, but when the final product matters, this is the mode that delivers. (Only available on the Pro plan, $200/month.)

When to use:

  • High-stakes deliverables where both the thinking and the final writing must be excellent: investor memos, legal policy drafts, system designs pitched to executives, production-grade code proposals.
  • Scenarios where tone, formatting, and completeness are part of the spec.

Strengths:

  • Combines depth with editorial quality and strong instruction adherence.
  • Great at keeping the big picture while nailing the details.

Pitfalls:

  • Overkill for quick tasks.
  • If your prompt is vague, you’ll pay more to get a beautifully formatted shrug.

How to prompt for Pro (quality spec):

  • Structure: executive summary → details → risks → next steps.
  • Evidence: cite sources or assumptions inline; mark assumptions clearly.
  • Style: plain English, active voice, short paragraphs.
  • Review pass: do a final contradiction scan; align with acceptance criteria.
  • Provide a clean deliverable plus a 5-line TL;DR.
  • Include a checklist the reader can act on immediately.

Prompt Patterns That Work in GPT-5 (And Why)

1) Contracts beat vibes

Define output contracts, budgets, stop conditions, and escape hatches (“Proceed even if not fully certain”). GPT-5 respects contracts.

2) Remove contradictions at the source

If your process has a real exception (“In emergencies, skip patient lookup and give 911 guidance”), write that exception explicitly. The model is obedient; don’t make it choose between rule parents.

3) Use “tool preambles” to keep humans oriented

If your agent will run multiple steps or edit files, have it front-load the plan and narrate progress succinctly.

  • Rephrase the user goal briefly.
  • Outline a step-by-step plan.
  • Announce each tool call and why.
  • Summarize: planned vs. completed.

4) Control eagerness, don’t just complain about it

Too much “research”? Tighten <context_gathering>. Too many clarifying questions? Add <persistence> and tell it to assume and proceed.

5) Verbosity is a dial

There’s a verbosity parameter (final answer length) and your prompt can override it locally. For instance: global verbosity: low, but inside code tools ask for “high verbosity” diffs and comments. You’ll get tight narration, but richly explained code edits.
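
If you’re on the API, here’s a minimal sketch of that split: set the global verbosity parameter low, then override locally in the prompt. Parameter names assume the Responses API; verify against the current SDK docs:

```python
from openai import OpenAI

client = OpenAI()

# Global dial: terse narration. Local override: verbose code output.
response = client.responses.create(
    model="gpt-5",
    text={"verbosity": "low"},  # short final answers by default
    input=(
        "Refactor the function below for readability.\n"
        "Narration: keep explanations to 3 bullets.\n"
        "Code: be verbose inside code blocks, with full diffs and comments.\n\n"
        "[PASTE FUNCTION]"
    ),
)
print(response.output_text)
```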


Practical Snippets You Can Paste Today

A. The “Atomic Task” Skeleton (great for Instant/Auto)
Task: Rewrite the following draft into a 120–150 word LinkedIn post for non-technical founders.
Include: a hook (1 sentence), 3 value bullets, 1 CTA sentence.
Constraints: avoid buzzwords; no hashtags; plain English.
Stop when: length and structure are satisfied.
If uncertain: proceed with best-effort and leave a one-line note of assumptions at the end.
Draft:
[PASTE]

B. The “One-Batch Research” Guardrail (Auto)
<context_gathering>

  • Run exactly one parallel batch of searches.
  • Read top 3 hits per query; deduplicate.
  • Early stop when ~70% of sources converge on the same answer.
  • No second batch unless a contradiction blocks action.
</context_gathering>

C. The “Thinking Pass” for Complex Work
<planning>
1) Draft a mini-rubric (hidden) with 5–7 criteria for excellence.
2) Produce a numbered plan (≤7 steps) to satisfy the rubric.
3) Execute step-by-step; after each, self-check against the rubric.
4) At the end, summarize residual risks or uncertainties.
</planning>

D. The “Code Edit Rules” (Frontend example)
<code_editing_rules>

  • Clarity first: descriptive names, small components, minimal props.
  • Consistency: Tailwind spacing multiples of 4; 1 neutral + ≤2 accents.
  • Stack defaults: Next.js (TS), Tailwind, shadcn/ui, Lucide, Zustand.
  • Directory: /src/app, /components, /hooks, /stores, /lib, /types.
  • Deliverables: a focused diff + short rationale + test notes.
</code_editing_rules>

E. The “Finish Line” Contract (for any mode)
<acceptance_criteria>

  • Format exactly as specified below.
  • No placeholders or TODOs.
  • Edge cases addressed: [list].
  • Final: a 5-line TL;DR the reader can act on.
</acceptance_criteria>

Cookbook-Style Moves (Lifted from Real Usage)

1) Persist reasoning between steps (Responses API)

When your agent must call tools across turns, reuse previous_response_id. You’ll avoid “re-planning tax” and keep latency sane on long tasks.
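
A minimal sketch of that chaining with the OpenAI Python SDK’s Responses API (the task text is illustrative):

```python
from openai import OpenAI

client = OpenAI()

# Turn 1: the model plans and (possibly) calls tools.
first = client.responses.create(
    model="gpt-5",
    reasoning={"effort": "medium"},
    input="Plan a migration of our cron jobs to a queue. Step 1 only.",
)

# Turn 2: reuse the response ID so the model keeps its prior plan
# and reasoning context instead of re-deriving them from scratch.
second = client.responses.create(
    model="gpt-5",
    previous_response_id=first.id,
    input="Proceed to step 2.",
)
print(second.output_text)
```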

2) Minimal Reasoning ≠ Minimal Guidance

If you choose minimal/low reasoning for speed, compensate with more explicit planning in the prompt. Example: ask the model to output a 4-bullet “what I’m about to do” before it does it. Your bullets become its scaffolding.
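
For instance, a scaffold you can bolt onto any low-effort prompt (wording is illustrative; adapt it to your task):

Before doing the task, output a 4-bullet “Plan” describing what you will do, in order.
Then execute the plan without deviating.
If a step turns out to be impossible, note it briefly and continue with the rest.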

3) Calibrate questions vs. assumptions

If you hate clarifying questions mid-flow, say so:
“Do not hand back to the user for confirmation; make the most reasonable assumption and proceed. Document assumptions at the end.”

4) Prefer local verbosity overrides

Short final answers, verbose code diffs. Or vice versa. Tell it where to spend words.

5) Use metaprompting to improve your prompt

Stuck? Ask GPT-5: “Given this prompt and this undesired behavior, what minimal edits would you make to elicit [desired behavior]?” You’ll get direct, actionable nips and tucks instead of a full rewrite.
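
If it helps, here’s a fill-in template for that metaprompt (the bracketed fields are placeholders):

Here is my current prompt:
[PASTE PROMPT]
Observed behavior: [WHAT IT DID]
Desired behavior: [WHAT I WANT]
Propose the minimal edits to this prompt that would elicit the desired behavior.
Return the edited prompt plus a 3-bullet rationale.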


A Guided Tour: Matching Tasks to Modes

| Task | Mode | Why | Prompt Tip |
| --- | --- | --- | --- |
| Rephrase email, fix tone | Instant | Clear target, low risk | Define length, audience, tone, and stop condition |
| Draft landing page copy | Auto | Some thinking, some speed | Provide section outline + word budgets per section |
| Compare 3 vendor APIs, decide | Thinking | Research + synthesis + decision | One-batch research + acceptance criteria + TL;DR |
| Large refactor (multi-file) | Thinking/Pro | Planning, code quality, diffs | Code rules + plan + test notes + finish line |
| Exec memo with risks & next steps | Pro | Quality + polish + structure | Quality bar + handoff checklist |
| Bug triage in a repo | Auto → Thinking if complex | Start lean, deepen if needed | Plan first; escalate only once |

Fixing the Five Classic Anti-Patterns

  1. “Write me a strategy”
    • Better: “Write a 7-step GTM strategy for a 2-person SaaS selling a $29/mo analytics add-on to Shopify stores. Each step ≤120 words; include risks and a day-by-day 14-day plan.”
  2. “Be creative but concise but also very detailed”
    • Pick a lane. If you must blend: “Use crisp, concrete language; short paragraphs; examples over adjectives. Max 500 words. Include 2 concrete examples.”
  3. “Summarize this PDF” (no audience)
    • Better: “Summarize for CFO who needs to decide by Friday. Extract 3 numbers, 3 risks, 3 upside factors. Max 200 words + 5-line TL;DR.”
  4. “Improve the code” (no definition of “improve”)
    • Better: “Refactor for readability and testability: extract components, remove dead code, add 3 unit tests. Keep behavior identical. Explain diff in 5 bullets.”
  5. “Research everything about X”
    • Better: “Run one batch of searches; read top 3 results; stop when sources converge. Produce a 10-bullet executive summary + links + open questions.”

Tiny Prompts That Save Hours

  • “Assume don’t ask” switch:
    • “Do not ask for clarifications mid-task. When uncertain, choose the most reasonable assumption, proceed, and log assumptions in the final summary.”
  • “Don’t wander” leash:
    • “Limit context gathering to one batch; no second batch unless contradictions block progress.”
  • “No fluff” pressure:
    • “Plain English. No metaphors. No analogies unless requested. Replace adjectives with numbers or examples.”
  • “Rubric-first build” for greenfield:
    • “Before building, silently create a 6-point rubric for a world-class [deliverable]. Use it to self-check each step; do not show the rubric.”
  • “Finish line” clarity:
    • “Stop only when the deliverable meets the acceptance criteria; otherwise continue iterating.”

A Word on Tone, Humor, and Human-ness

You’ll notice GPT-5 can feel a little… serious. If your audience expects warmth or punch, ask for it:

  • “Friendly, lightly humorous tone.”
  • “Speak directly to the reader (‘you’), like a helpful colleague.”
  • “Short paragraphs. Occasional one-liner for levity.”

And if you want zero fluff: say “No jokes, no rhetorical questions.” The model will comply. It’s obedient, not psychic.


Putting It All Together: A Mini Playbook

  1. Choose the mode:
    • Instant: quick, bounded tasks.
    • Auto: default for most.
    • Thinking: complex, multi-step, accuracy-critical.
    • Pro: high-stakes, polished deliverables.
  2. Define the deliverable:
    • Audience, length, structure, constraints, finish line.
  3. Steer eagerness:
    • <context_gathering> for budgets and early stops.
    • <persistence> for autonomy and fewer clarifying questions.
  4. Specify quality:
    • <quality_bar> and <acceptance_criteria>.
    • Verbosity rules (where to be terse vs. detailed).
  5. Use tool preambles:
    • Plan → narrate → summarize. Humans love to see the map.
  6. Persist reasoning across turns:
    • Responses API + previous_response_id. Pay once, reuse the plan.
  7. Keep prompts contradiction-free:
    • Resolve rule conflicts up front. Explicit exceptions beat implicit hope.
  8. Iterate:
    • When results disappoint, ask GPT-5 to propose minimal prompt edits to reach your desired behavior. Ship the improved version.

The Punchline Without the Punch

If GPT-4o was your friendly generalist who tried to read your mind, GPT-5 is your meticulous colleague who will do exactly what you asked — and look at you expectantly if what you asked was unclear. It’s smarter, more persistent, and wildly capable. But it is not a magician. Vague in, vague out.

Write contracts, not vibes. Choose the right mode. Set budgets and finish lines. Give it a quality bar to clear and an escape hatch when uncertainty isn’t worth the wait. Do that, and GPT-5 stops feeling “worse than before” and starts feeling like the teammate you brag about.

If you still want “somewhere nice” without directions, I hear the parking lot has great lighting.
