Most people use Codex like a smarter autocomplete.
That is usually where the frustration starts.
Codex works much better when you treat it like a teammate with access to your repo, tools, tests, and instructions. Once that clicks, the quality of the output changes fast.
This guide is for both camps:
- the person opening Codex for the first time
- the person who already tried it, got mixed results, and wants a workflow they can trust
I verified this guide against official OpenAI Codex docs and Help Center material on April 2, 2026. Since Codex moves fast, that matters.
What Codex Actually Is
Codex is OpenAI's coding agent.
It can work in your terminal, inside supported IDEs, inside the Codex app, and in cloud-backed workflows. According to OpenAI's Help Center, you can use it to write code, review changes, run commands, execute tests, and delegate work in isolated sandboxes.
That means the right mental model is not:
> Ask for code and hope for the best.
It is:
> Give Codex the task, the context, the rules, and the definition of done.
That one shift fixes a surprising number of bad Codex sessions.
Quick Start in 5 Minutes
As of April 2, 2026, OpenAI's Help Center says Codex is included with ChatGPT Plus, Pro, Business, and Enterprise/Edu plans, and temporarily also included with Free and Go.
If you want the CLI, the official install command is:

```shell
npm i -g @openai/codex
```

Then run:

```shell
codex
```
The first run prompts you to sign in with your ChatGPT account or an API key.
OpenAI's CLI docs also note:
- CLI support is available on macOS and Linux
- Windows support is experimental
- WSL is the recommended path on Windows for the best experience
If you prefer GUI-heavy workflows, the Codex app is worth a look. OpenAI positions it as the place for multiple parallel agents, worktrees, automations, and built-in git flows. If you already live in VS Code, Cursor, or Windsurf, the IDE extension is the most natural entry point.
My recommendation is simple:
- Start with CLI if you want to learn the fundamentals
- Use the IDE extension if most of your work is file-by-file editing
- Use the Codex app when you want multiple concurrent threads, worktrees, or automations
The Mental Model That Makes Codex Better
OpenAI's best practices guide gives a simple default structure for prompts. Use these four parts every time:
| Part | What it means |
|---|---|
| Goal | What exactly should change? |
| Context | Which files, folders, docs, errors, or examples matter? |
| Constraints | What rules, architecture, safety limits, or conventions must be followed? |
| Done when | How do we know the task is complete? |
That is the whole game.
Most bad Codex output comes from missing one of these.
For example, this is weak:

```
Fix auth.
```
This is much better:

```
Goal: Fix the login redirect loop for authenticated users.
Context: The issue is in `src/middleware.ts`, `src/lib/auth.ts`, and the `/login` flow. It started after the session cookie rename.
Constraints: Do not change the database schema. Keep the current JWT strategy. Avoid unrelated refactors.
Done when: Logged-in users can refresh `/dashboard` without being redirected to `/login`, and the auth test suite passes.
```
That prompt is not fancy. It is just clear.
Codex rewards clarity much more than clever prompting.
Your First Good Prompts
If you are new, save these.
1. Fix a Bug
```
Goal: Fix a bug in [feature].
Context: The problem appears in [files]. The current behavior is [bad behavior]. The expected behavior is [expected behavior]. Relevant error/output: [paste it].
Constraints: Keep the change minimal. Do not rename public APIs. No unrelated formatting changes.
Done when: The bug no longer reproduces, the relevant tests pass, and you explain the root cause in plain English.
```
2. Build a Feature
```
Goal: Implement [feature].
Context: Follow the existing patterns in [files/components]. Use [library/tool] if possible. The UI/API should match [reference].
Constraints: Keep this scoped to [files or directories]. Add tests. Do not change unrelated modules.
Done when: The feature works end to end, tests pass, and the final diff is limited to the intended files.
```
3. Refactor Safely
```
Goal: Refactor [module] for maintainability.
Context: Start by understanding the current flow in [files]. Identify risks before editing. There are existing callers in [paths].
Constraints: Preserve behavior. No public API breaks. Update tests if needed.
Done when: The code is simpler, behavior is unchanged, and you summarize what changed and what was intentionally left alone.
```
4. Debug Before Editing
```
Do not edit anything yet.
First, inspect the relevant files, explain the likely root causes, rank them by confidence, and propose the smallest safe fix.
Only after that, ask for confirmation or proceed with the smallest fix if confidence is high.
```
That last one is underrated.
A lot of Codex frustration comes from asking for implementation before understanding.
Beginner Workflow: What To Do on Real Tasks
If you are just starting, use this sequence.
Step 1: Ask for a plan on anything non-trivial
OpenAI explicitly recommends planning first for difficult or ambiguous tasks. In Codex, Plan mode exists for exactly this reason.
If the task is bigger than a quick bug fix, do not start with code generation. Start with:
```
Use plan mode. Inspect the relevant files, ask clarifying questions if needed, then propose the implementation plan before writing code.
```
That reduces wasted edits.
Step 2: Point Codex at the right files
Large repos are where vague prompting gets expensive.
Tell Codex where to look:
- exact files
- exact folder
- exact failing test
- exact error message
- exact screenshot or spec
Do not make it guess the neighborhood.
Step 3: Tell it what not to touch
This is where many people lose control of the diff.
If you do not want a refactor, say so.
If you do not want renamed files, say so.
If you do not want dependency changes, say so.
A lot of Codex quality is really scope control.
Step 4: Tell it how to verify the work
OpenAI's best practices guide is very clear here: do not stop at asking Codex to make a change. Ask it to create tests when needed, run checks, confirm the behavior, and review the result.
A solid line to add is:
```
Run the relevant tests, lint/type checks if applicable, and review the diff for regressions before you consider the task complete.
```
Step 5: Review like a teammate wrote it
Codex is good. It is not exempt from review.
You should still check:
- did it solve the actual problem?
- did it change more than necessary?
- did it preserve existing behavior?
- did it add the right tests?
- did it quietly break something adjacent?
If you accept output without review, the problem is not Codex. The problem is process.
The First Two Files Serious Users Set Up
Once you get one or two good sessions, stop repeating yourself manually.
OpenAI's docs say the next step is reusable guidance through `AGENTS.md` and durable configuration through `config.toml`.
1. AGENTS.md
OpenAI describes AGENTS.md as an open-format README for agents. It is the best place to encode how Codex should work in a repo.
A practical starter file should cover:
- repo layout
- build, test, and lint commands
- engineering conventions
- do-not rules
- what done means
- how to verify work
A minimal starter looks like this:
```markdown
# AGENTS.md

## Repo map
- App code: `src/`
- Tests: `tests/`
- Shared utilities: `src/lib/`

## Commands
- Dev: `pnpm dev`
- Test: `pnpm test`
- Lint: `pnpm lint`
- Typecheck: `pnpm typecheck`

## Rules
- Keep diffs minimal.
- Do not rename public APIs without explicit instruction.
- Follow existing folder conventions.
- Add or update tests for behavior changes.

## Done when
- Relevant tests pass.
- Lint and typecheck pass.
- Final diff is reviewed for regressions.
```
OpenAI also recommends keeping AGENTS.md short and practical.
That is correct.
Do not turn it into a manifesto.
If Codex makes the same mistake twice, update AGENTS.md. That is a much better loop than rewriting the same instruction in every prompt.
The CLI also has `/init` to scaffold a starter `AGENTS.md`.
2. config.toml
A sane starting point for local work is:
```toml
model = "gpt-5.4"
approval_policy = "on-request"
sandbox_mode = "workspace-write"
web_search = "cached"
model_reasoning_effort = "high"

[features]
multi_agent = true
shell_snapshot = true
```
Why these matter:
- `approval_policy = "on-request"` keeps you in control without constant friction
- `sandbox_mode = "workspace-write"` is a good default for normal repo work
- `web_search = "cached"` uses OpenAI's web search cache by default, which the docs position as safer than blindly fetching arbitrary live pages
- `model_reasoning_effort` should go up as tasks get more complex
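Before leaning on a hand-edited config, it is worth confirming the file actually parses. This is a minimal sketch, assuming Python 3.11+ is available for its stdlib `tomllib` parser; `~/.codex/config.toml` is the location the CLI docs use, and the script accepts an override path as its first argument.

```shell
# Sanity-check that a Codex config file parses as TOML before relying on it.
# Assumes Python 3.11+ (stdlib `tomllib`); pass a path to override the default.
cfg="${1:-$HOME/.codex/config.toml}"
if [ ! -f "$cfg" ]; then
  echo "no config at $cfg (Codex will fall back to defaults)"
  exit 0
fi
python3 - "$cfg" <<'PY'
import sys, tomllib

with open(sys.argv[1], "rb") as f:
    data = tomllib.load(f)
print("parsed top-level keys:", sorted(data))
PY
```

A typo like a missing closing quote fails loudly here instead of surfacing as confusing CLI behavior later.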
OpenAI's guidance here is sensible: keep approvals and sandboxing tight by default, then loosen only for trusted repos and specific workflows.
That is the right default attitude.
Intermediate Workflow: How People Move Past One-Off Prompts
This is where Codex starts becoming a real system instead of a novelty.
Use one thread per task
OpenAI explicitly warns against using one giant thread per project.
That leads to bloated context and worse results over time.
Use one thread per coherent task. If the work branches, fork the thread.
Useful session controls from the docs:
- `/fork` to branch the conversation while keeping context
- `/compact` when the thread is getting long
- `/resume` to pick work back up
- `/status` to inspect the current session state
Review inside the workflow
Codex supports review loops, not just code generation.
OpenAI documents `/review` for reviewing a branch, commit, or uncommitted changes. That is worth using, especially after larger diffs.
A strong pattern is:
- Ask Codex to implement
- Ask Codex to run checks
- Ask Codex to review its own diff against your repo rules
- Then do your human review
Keep verification explicit
If your project has specific commands, put them in AGENTS.md and repeat them in the prompt when needed.
For example:
```
After the change, run `pnpm test auth`, then `pnpm lint`, then review the diff for auth regressions.
```
Codex is much more reliable when it can see the finish line.
Advanced Workflow: Worktrees, Subagents, MCP, Skills, Automations
This is where Codex gets genuinely powerful.
1. Worktrees for parallel work
OpenAI's worktree docs are one of the most practical parts of the whole Codex stack.
Worktrees let Codex run multiple independent tasks in the same project without interfering with each other. The docs frame Local as your foreground workspace and Worktree as a background workspace.
That matters because the common mistake is obvious:
- one active task in your local checkout
- another Codex task editing the same branch
- confusion, collisions, messy git state
Worktrees fix that.
OpenAI's documented advantages are straightforward:
- work in parallel without disturbing local setup
- queue background work while you stay focused on the foreground
- hand work back into local later when you are ready to inspect or test
If you want Codex doing one larger task while you keep shipping something else, use a worktree.
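The underlying mechanism is plain `git worktree`. Here is a self-contained sketch of the flow using a throwaway repo in a temp directory; in a real project you would run the `git worktree` commands from your existing checkout, and the branch name is illustrative.

```shell
# Demo of the worktree flow with a disposable repo (paths/branch are illustrative).
set -e
demo=$(mktemp -d)
git init -q "$demo/proj"
git -C "$demo/proj" -c user.email=codex@example.com -c user.name=demo \
    commit -q --allow-empty -m "initial commit"

# Give a background Codex task its own workspace on its own branch:
git -C "$demo/proj" worktree add "$demo/codex-task" -b codex/fix-auth

# Both checkouts now coexist without touching each other:
git -C "$demo/proj" worktree list

# After reviewing and merging the work, clean up the background workspace:
git -C "$demo/proj" worktree remove "$demo/codex-task"
```

Because each worktree has its own checked-out branch, a background Codex edit can never dirty the branch you are actively shipping from.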
2. Subagents for truly parallel tasks
OpenAI says Codex can spawn specialized agents in parallel and then consolidate the results.
This is useful when the task is actually parallel, not just large.
Good examples:
- one agent reviews security
- one agent reviews bugs
- one agent inspects flaky tests
- one agent maps the codebase around a subsystem
Bad example:
- splitting a tightly coupled change across five agents that all want the same files
OpenAI also notes two important constraints:
- subagents only run when you explicitly ask for them
- they cost more tokens than a single-agent run
Use them when parallelism is real, not because it sounds advanced.
3. MCP when the context lives outside the repo
OpenAI's docs are crisp on MCP: use it when the context Codex needs lives outside the repo and changes frequently.
That means things like:
- internal docs
- ticketing systems
- dashboards
- design systems
- external APIs
- runbooks
If you keep pasting the same outside context into prompts, that is usually an MCP smell.
OpenAI's warning here is also important: do not wire in every tool on day one. Start with one or two tools that remove a real manual loop.
That is good advice. Tool sprawl makes agents worse, not better.
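As a concrete sketch, an MCP server is typically wired into `config.toml` with a launch command. The table name, package, and environment variable below are placeholders I made up for illustration, not real tools; check the Codex config docs and your MCP server's README for the actual values.

```toml
# Hypothetical MCP server entry; "internal-docs", the npx package, and
# DOCS_TOKEN are illustrative placeholders, not real names.
[mcp_servers.internal-docs]
command = "npx"
args = ["-y", "@acme/internal-docs-mcp"]
env = { DOCS_TOKEN = "set-me" }
```

One entry like this, removing one real manual copy-paste loop, is the right starting scope.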
4. Skills when you repeat the same workflow
OpenAI's rule of thumb is excellent:
> If you keep reusing the same prompt or correcting the same workflow, it should probably become a skill.
Skills are great for:
- incident summaries
- release note drafting
- repeated debugging flows
- checklist-based PR reviews
- migration planning
Keep each skill narrow. One job. Clear input. Clear output.
5. Automations when the workflow is stable
OpenAI puts this nicely: skills define the method, automations define the schedule.
That is the right order.
Do not automate a workflow that still needs a lot of steering.
Once it is predictable, automations are useful for things like:
- CI failure summaries
- recent commit summaries
- scheduled repo health checks
- standup drafts
- recurring analysis jobs
Why Codex Gets Stuck, And How To Recover Fast
This is the section you will come back to.
Symptom: Codex makes a big messy diff
Likely cause:
- the prompt was too open-ended
- no file scope was given
- no constraints were stated
Fix:
```
Redo this with the smallest safe change.
Only edit the files directly involved.
Do not refactor unrelated code.
Explain the exact files you plan to change before editing.
```
Symptom: Codex edits before understanding the bug
Likely cause:
- you asked for implementation too early
Fix:
```
Stop coding.
Inspect the relevant files and logs first.
List the most likely root causes, rank them by confidence, and propose the smallest fix.
Do not edit anything until that analysis is complete.
```
Symptom: Codex solves the wrong problem
Likely cause:
- the goal was vague
- done criteria were missing
Fix:
```
Reset the task.
Goal: [one sentence]
Context: [files, errors, references]
Constraints: [rules]
Done when: [testable outcomes]
Repeat your understanding of the task before making changes.
```
Symptom: Codex keeps repeating the same mistakes across sessions
Likely cause:
- repo rules live only in your head
Fix:
Move the rule into `AGENTS.md`.
This is exactly what OpenAI recommends. Repeated friction should become reusable guidance.
Symptom: Codex is good on small tasks and weak on bigger ones
Likely cause:
- you skipped planning
- the thread is too broad
- the task should be broken down
Fix:
- use Plan mode
- split the work into smaller tasks
- use one thread per task
- fork when the work branches
Symptom: Two Codex tasks step on each other
Likely cause:
- multiple live threads on the same files or branch
Fix:
Use git worktrees.
OpenAI explicitly warns against running live threads on the same files without worktrees.
Symptom: Codex cannot verify the result well
Likely cause:
- the repo does not expose clear commands
- test and lint steps were not provided
Fix:
Put the actual commands in `AGENTS.md`, then restate them in the task.
Symptom: Codex needs information from outside the repo
Likely cause:
- missing external context
Fix:
Use MCP for repeatable outside context. If it is a one-off research task, use web search carefully. OpenAI's config docs note that cached web search is the default and should still be treated as untrusted.
The Copy-Paste Rescue Prompts
These are the ones I would actually keep in a note.
Rescue Prompt 1: Plan Before Touching Code
```
Use plan mode.
Inspect the relevant files first, ask any clarifying questions, and propose the implementation plan.
Do not write code yet.
```
Rescue Prompt 2: Smallest Safe Fix
```
Fix this with the smallest safe change.
No unrelated refactors.
No dependency changes.
No file moves unless absolutely necessary.
Explain the intended diff before editing.
```
Rescue Prompt 3: Understand Before Editing
```
First explain:
1. what the current code is doing
2. where the bug most likely is
3. what the smallest fix is
4. what could regress if we change it
Then implement only after that analysis.
```
Rescue Prompt 4: Tight Review Loop
```
After making the change:
- add or update tests if needed
- run the relevant test, lint, and typecheck commands
- review the diff for regressions
- summarize what changed, what was verified, and any remaining risk
```
Rescue Prompt 5: Stay Inside the Lane
```
Only work in these files:
- `src/...`
- `tests/...`
If you believe another file must change, stop and justify it first.
```
Rescue Prompt 6: Turn This Into Durable Guidance
```
You made the same mistake twice on this repo.
Write a short retrospective and propose the exact `AGENTS.md` update that would prevent it next time.
```
That last prompt is how you stop paying the same tax repeatedly.
What Advanced Users Usually Figure Out
After enough Codex usage, the lessons are consistent.
1. Better repos get better Codex output
If your project has:
- clear structure
- real tests
- reliable commands
- a useful `AGENTS.md`
- stable conventions

Codex looks much smarter.
That is not magic. The environment is simply legible.
2. Planning is not overhead
People skip planning because they want speed.
Then they burn that time back in rework.
OpenAI's documentation leans hard toward planning for difficult tasks, and I think that is the correct default.
3. Reusability beats prompt gymnastics
A reusable `AGENTS.md`, one or two good skills, and a sane config file will outperform heroic one-off prompts over time.
4. Parallelism only helps when the task is actually parallel
Use worktrees and subagents when the work can truly branch.
Do not force parallelism into tightly coupled edits.
5. Codex is strongest when it can inspect, change, run, and verify
The more artificial the environment, the worse the results.
If Codex cannot see the repo properly, cannot run the right commands, or cannot verify success, you are leaving a big part of the product unused.
Final Thoughts
If you only remember three things from this guide, make it these:
- Use Goal + Context + Constraints + Done when in every serious prompt.
- Put recurring rules into `AGENTS.md` instead of repeating yourself forever.
- Ask Codex to plan, verify, and review, not just generate code.
That is the difference between random AI help and a workflow you can actually depend on.
Codex is not hard to use.
But it is easy to use badly.
Once you stop treating it like a code slot machine and start treating it like an engineer with tools, it becomes much more useful.
Official Sources
- OpenAI Codex best practices
- OpenAI Codex CLI docs
- OpenAI Codex config basics
- OpenAI Codex worktrees docs
- OpenAI Codex subagents docs
- OpenAI Help Center: Using Codex with your ChatGPT plan
- OpenAI: Introducing upgrades to Codex
If OpenAI changes the product behavior after April 2, 2026, treat the links above as the source of truth and update your workflow accordingly.