You may have seen the headline. Sam Altman said that saying "please" and "thank you" to ChatGPT costs OpenAI tens of millions of dollars, and that it is money well spent. Everyone had a take. The discourse ran its course.
Most of it missed the more interesting optimization.
I Kept Noticing Something
For weeks, I watched a small note appear in the thinking panel whenever I answered a model's question with a short reply: "Clarifying user's ambiguous response." It showed up when I typed "yes." It showed up when I typed "sure," "maybe," and "go ahead." It happened often enough that I stopped treating it as UI decoration and started treating it as a signal.
On reasoning-capable models, ambiguity is not free. OpenAI and Anthropic both document that reasoning or thinking tokens are billed as output tokens. If your reply is underspecified, the model may spend extra compute resolving what you meant before it can do the actual work.
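To make the billing point concrete, here is a minimal sketch of how you might measure it, assuming the usage object shape OpenAI's Chat Completions API documents (reasoning tokens reported inside `completion_tokens_details` and included in the billed `completion_tokens` count; the example values are invented for illustration, so verify the field names against the current API reference):

```python
def reasoning_share(usage: dict) -> float:
    """Fraction of billed output tokens spent on hidden reasoning.

    Assumes the OpenAI Chat Completions usage shape, where
    reasoning_tokens are included in (not added on top of)
    completion_tokens.
    """
    completion = usage.get("completion_tokens", 0)
    reasoning = usage.get("completion_tokens_details", {}).get("reasoning_tokens", 0)
    return reasoning / completion if completion else 0.0

# Illustrative usage object (invented numbers, not real API output):
usage = {
    "prompt_tokens": 12,
    "completion_tokens": 400,
    "completion_tokens_details": {"reasoning_tokens": 320},
}
print(reasoning_share(usage))  # → 0.8
```

If a large share of your output bill looks like this, ambiguous replies upstream are one plausible place to look.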
What Is Actually Happening
When a model asks a question and you answer with "yes," your reply may not carry its own referent. Yes to what? Write the plan? Run the migration? Use option B? Keep the current API?
Sometimes the surrounding context is enough. Sometimes it is not. In either case, you are asking the model to recover the missing structure instead of giving it the instruction directly.
A fuller reply such as "Yes, write the plan" reduces that ambiguity sharply. The model has less inference work to do, the transcript stays legible, and the next turn starts from cleaner state.
This is the part the politeness debate mostly missed. "Please" usually adds a cheap input token or two. An ambiguous confirmation can cost more because it may trigger extra reasoning and it almost always leaves behind a worse conversation record.
More Words Are Not the Answer
The solution is not verbosity. A reply like "Yes, I would absolutely love it if you would go ahead and write the comprehensive plan we discussed earlier" is clear enough, but it adds ballast.
That ballast compounds in long sessions. In Claude Code, Cursor, or any API-connected workflow where you are paying attention to tokens, low-signal filler adds cost and makes the transcript harder to reuse. Different systems trim, summarize, and cache context differently, but the practical rule is the same: extra noise is still noise.
The target is precision. Use the minimum number of tokens that fully removes ambiguity. Not fewer. Not more.
The Cost That Does Not Show Up on Your Invoice
There is a slower consequence that matters even more for long sessions.
A conversation history full of "yes," "ok," "sounds good," and "sure" is a weak record of decisions. It does not tell the model what you approved, what you rejected, what you deferred, or what you approved with conditions. Decisions start to look identical to acknowledgments.
This is how sessions degrade without a dramatic failure. The model re-derives conclusions it already reached. It hedges on things you settled an hour ago. Responses get longer and less useful. The context window is not necessarily full; it is just low signal.
A clean conversation history is one of the few optimizations that improves both cost and quality over time.
The Fix
When the model asks you something, reflect the decision back in your answer.
"Yes, write the plan."
"No, keep the current API."
"Proceed with option B, but skip the migration for now."
Those replies take a few extra seconds to type, but they reduce ambiguity, make the transcript self-explanatory, and give the next turn a better starting point.
This is not a rule about politeness or formality. It is a rule about precision. Engineers already apply this instinct everywhere else: a named variable beats a magic number, an explicit condition beats an implicit one, and a self-documenting function call beats a comment explaining an opaque one. The same discipline improves your prompts.
The Actual Hierarchy
From best to worst for long-running sessions:
- "Yes, write the plan." Clear instruction. Low input cost. Clean transcript.
- "Please write the plan." Also clear. Slightly more input cost. Usually fine.
- "Yes, absolutely, that sounds great, please go ahead." Clear enough, but bloated. Better than ambiguous, worse than precise.
- "Yes." Cheap to type, expensive in ambiguity. The model has to infer what you approved.
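A crude way to see the input-cost side of this hierarchy, using whitespace word count as a rough stand-in for tokenization (real BPE tokenizers count differently, but the relative ordering is the point):

```python
replies = [
    "Yes, write the plan.",
    "Please write the plan.",
    "Yes, absolutely, that sounds great, please go ahead.",
    "Yes.",
]

# Whitespace word count as a rough proxy for input tokens.
for reply in replies:
    print(len(reply.split()), reply)
```

The bare "yes" is the cheapest input by far, which is exactly the trap: the cost it saves on input it spends, with interest, on inference and on a degraded transcript.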
The debate about politeness was a distraction. The habit worth examining has been sitting in your response field the whole time.
Stop typing just "yes." Type the decision.
Scott Bishop is an engineer with 30 years of experience building regulated, high-trust systems across international finance, federal government, and the Fortune 500.