DEV Community

Jyoti Bisht

Stop prompting Codex like ChatGPT

There's a pattern I see from almost every developer who tries Codex for the first time and walks away underwhelmed.
They open it up, type something like:

"Hey, can you help me refactor my authentication module to use the new JWT library and also update the tests and maybe clean up the error handling while you're at it?"

Codex starts. It works for a while. Then it returns something that's fine but incomplete. They tweak it in chat. It gets better, then breaks something else. Twenty minutes later they're in a back-and-forth loop that feels slower than just doing it themselves.

Here's the thing: that's not a Codex problem. That's a ChatGPT habit applied to the wrong tool. (YES, habits compound)

ChatGPT is a conversation. You talk, it responds, you refine. The back-and-forth is the interface. Ambiguity is fine — you'll clarify it next message.
Codex is an agent. It reads your task, opens your repo, writes code, runs commands, checks the output, and commits a result. It's not waiting for your next message. It's working.

The big mindset shift is this:

ChatGPT is great for conversation. Codex is great for work.

You can brainstorm with Codex. You can ask questions. You can explore options. *But Codex becomes much more useful when you stop prompting it like a chatbot and start assigning it clear, bounded tasks.*

I personally like giving Codex atomic tasks. What are atomic tasks? An atomic task has 3 properties:

  • One clear outcome: you know exactly what "done" looks like
  • A bounded scope: it touches one module, one feature, one concern
  • A verifiable result: there's a test, a lint check, or a visible output that confirms success

That's it. If you can't describe the done state in one sentence, the task is too big, and the bigger it is, the more likely you are to end up frustrated down the line.

Atomic does not mean tiny. It means bounded.

The Anatomy Of A Good Codex Prompt

A good Codex prompt usually has four parts:

Context: What is happening now?
Goal: What should be true after the change?
Constraints: What should Codex preserve or avoid?
Verification: How should Codex check the work?

For example:

The export button currently downloads an empty CSV when filters are applied. Fix the export logic so it respects active filters. Keep the existing CSV column order unchanged. Add or update tests for filtered exports, then run the relevant test command.

This prompt is short, but it gives Codex almost everything it needs.

  • The context is the broken export behavior.
  • The goal is filtered CSV export.
  • The constraint is preserving the existing format.
  • The verification is tests plus the relevant command.

That is a real task.
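To show what the verification half of that prompt buys you, here is a minimal sketch in TypeScript. Everything here is hypothetical: `exportCsv`, its column-order parameter, and the filter shape are invented for illustration, not taken from any real codebase. The point is that "respects active filters, keeps column order" is trivially checkable once it exists as a function.

```typescript
type Row = Record<string, string>;

// Hypothetical export helper: applies the active filters, then serializes
// rows to CSV in a fixed, caller-supplied column order.
function exportCsv(rows: Row[], columns: string[], filters: Partial<Row>): string {
  const active = Object.entries(filters);
  // A row survives only if it matches every active filter.
  const filtered = rows.filter(row =>
    active.every(([key, value]) => row[key] === value)
  );
  const header = columns.join(",");
  const lines = filtered.map(row => columns.map(c => row[c] ?? "").join(","));
  return [header, ...lines].join("\n");
}

const rows: Row[] = [
  { id: "1", status: "paid" },
  { id: "2", status: "open" },
];

// With a filter applied, only matching rows appear; column order is fixed.
// → "id,status\n1,paid"
exportCsv(rows, ["id", "status"], { status: "paid" });
```

A test asserting exactly that output is the "verifiable result" property from the atomic-task checklist: Codex can run it and know whether it is done.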

| ChatGPT-style prompt | Codex-style task |
| --- | --- |
| Add rate limiting to the API. We're getting hammered and need to protect the endpoints. | Add per-IP rate limiting to the `/api/search` endpoint using the existing `express-rate-limit` package (already in `package.json`). Limit: 30 requests per minute. On exceed: return 429 with `{ error: 'rate_limit_exceeded' }`. Add a test in `tests/search.test.ts` that verifies the 429 response on the 31st request. |
| We need to migrate from the old OpenAI SDK to the new Responses API. Can you update the codebase? | In `src/services/completion.ts` only, migrate the `getCompletion()` function from `openai.createCompletion()` (legacy) to `openai.responses.create()` (Responses API). Map the parameters according to this table: [table]. Keep the existing function signature so callers don't change. Run the existing unit tests for this file after the change. |
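To make the first Codex-style task concrete, here is a self-contained sketch of the behavior it specifies: a fixed-window per-IP limiter. This is illustrative only; in the real task Codex would configure the existing `express-rate-limit` package rather than hand-roll one, and the class and method names here are invented.

```typescript
// Illustrative fixed-window per-IP rate limiter (hypothetical; the real
// task would use express-rate-limit, not hand-rolled code like this).
class FixedWindowLimiter {
  private counts = new Map<string, { windowStart: number; count: number }>();

  constructor(
    private limit: number,     // max requests per window (30 in the task)
    private windowMs: number,  // window length in ms (60_000 in the task)
  ) {}

  // Returns true if the request is allowed; false means "respond with 429".
  allow(ip: string, now: number = Date.now()): boolean {
    const entry = this.counts.get(ip);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      // First request from this IP, or the previous window expired.
      this.counts.set(ip, { windowStart: now, count: 1 });
      return true;
    }
    entry.count += 1;
    return entry.count <= this.limit;
  }
}

// 30 requests per minute, as the task prompt specifies.
const limiter = new FixedWindowLimiter(30, 60_000);
```

Notice how every number in the code traces back to a number in the prompt. That is what a bounded task gives you: nothing for the agent to guess.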

Not everything needs precision

Not every Codex task needs military precision. For exploratory work, like "build a prototype of X" or "show me what this codebase does", broader prompts are fine, because you're not expecting a polished result you can deploy right away.

The rule of thumb: precision scales with consequence.

BUT, the real unlock with Codex isn't better prompting; it's better decomposition. Before you open Codex, spend two minutes answering:

  • What is the smallest independently verifiable unit of this work?
  • What files does it touch?
  • What does done look like?
  • What's out of scope?

If you're migrating an SDK across 40 files, that's not one task. It's 40 tasks, or at least 8 grouped by module, each with its own checkpoint. Run them in parallel. Review each one. Merge when clean.

This is also why Codex's Skills feature exists. Once you've figured out the right task shape for something you do repeatedly, you can save that pattern as a skill and invoke it by name next time. The decomposition work pays forward.

Want to go deeper? Check out the Codex Prompting Guide and the AGENTS.md docs; both are worth reading before you set up your next project.

As for Codex Skills, I will cover that in my next blog.

Regards,
Joe.
I wrote this. An agent would’ve also fixed the bugs.
