Codex in 2026: A Practical First-Hour Walkthrough
I keep getting asked "what does a first session with Codex actually look like?" — so here's a real, unedited run from this week, with the exact prompts and the gotchas I hit.
What Codex is, in one sentence
It's an agent that lives in your terminal and in your editor. You give it a goal, it does the work, you review the diff.
My first hour, step by step
- Install + auth. Five minutes, including the OAuth dance.
- Pick a small, well-scoped task. I chose "add a --dry-run flag to my deploy script". A bad first task would be "refactor my whole backend."
- Write a one-line goal. Not a paragraph. "Add --dry-run to scripts/deploy.sh that prints the commands without running them. Update the README."
- Watch, but don't interrupt. The first time I used Codex I kept tabbing in to "help." That's a trap — let it finish.
- Review the diff like a human PR. I caught two bugs it introduced. Both real, both subtle.
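For readers who have not written one, here is a minimal sketch of what a --dry-run flag can look like in a shell script. It is not the diff Codex produced and not my real scripts/deploy.sh: the `run` helper, the `DRY_RUN` variable, and the two placeholder steps are all assumptions made up for illustration.

```shell
#!/bin/sh
# Hypothetical sketch of a --dry-run flag. The real scripts/deploy.sh is not
# shown in this post; the names and steps below are illustrative assumptions.
set -eu

DRY_RUN=0

# run: in dry-run mode, print the command instead of executing it.
run() {
  if [ "$DRY_RUN" -eq 1 ]; then
    echo "[dry-run] $*"
  else
    "$@"
  fi
}

# Parse flags; anything unrecognised is treated as an error.
for arg in "$@"; do
  case "$arg" in
    --dry-run) DRY_RUN=1 ;;
    *) echo "unknown option: $arg" >&2; exit 2 ;;
  esac
done

# Placeholder deploy steps (assumptions, not real deploy commands).
run echo "building release"
run echo "uploading artifacts"
```

With a script shaped like this, passing --dry-run prints each step prefixed with [dry-run] instead of executing it. When reviewing a diff of this kind, one thing worth checking is that every command with side effects actually goes through the wrapper. A pipeline or redirect written outside `run` would still execute in dry-run mode.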
The 3 gotchas nobody warns you about
- It will over-engineer if you let it. Cap the scope in your prompt.
- It doesn't know your secret conventions. Tell it up front.
- It will confidently make up API names. Always run the code.
TL;DR for a busy dev
Codex is a real productivity unlock, but only if you treat it like a junior: tight scope, fast review, trust but verify. After a week you'll wonder how you shipped without it.
Top comments (1)
This is hardly a blocker for Codex and similar tools on their way to the kingdom of over-engineering :) They will do it anyway, especially when they run into problems. Just look at the code from time to time to prevent it, or rewrite it when prevention has failed.