Most people use Codex like a smarter autocomplete.
That is usually where the frustration starts.
Codex works much better when you treat it like a teammate with access to your repo, tools, tests, and instructions. Once that clicks, the quality of the output changes fast.
This guide is for both camps:
- the person opening Codex for the first time
- the person who already tried it, got mixed results, and wants a workflow they can trust
I verified this guide against official OpenAI Codex docs and Help Center material on April 2, 2026. Since Codex moves fast, that matters.
What Codex Actually Is
Codex is OpenAI's coding agent.
It can work in your terminal, inside supported IDEs, inside the Codex app, and in cloud-backed workflows. According to OpenAI's Help Center, you can use it to write code, review changes, run commands, execute tests, and delegate work in isolated sandboxes.
That means the right mental model is not:
> Ask for code and hope for the best.
It is:
> Give Codex the task, the context, the rules, and the definition of done.
That one shift fixes a surprising number of bad Codex sessions.
Quick Start in 5 Minutes
As of April 2, 2026, OpenAI's Help Center says Codex is included with ChatGPT Plus, Pro, Business, and Enterprise/Edu plans, and temporarily also included with Free and Go.
If you want the CLI, the official install command is:

```shell
npm i -g @openai/codex
```

Then run:

```shell
codex
```
The first run prompts you to sign in with your ChatGPT account or an API key.
OpenAI's CLI docs also note:
- CLI support is available on macOS and Linux
- Windows support is experimental
- WSL is the recommended path on Windows for the best experience
If you prefer GUI-heavy workflows, the Codex app is worth a look. OpenAI positions it as the place for multiple parallel agents, worktrees, automations, and built-in git flows. If you already live in VS Code, Cursor, or Windsurf, the IDE extension is the most natural entry point.
My recommendation is simple:
- Start with CLI if you want to learn the fundamentals
- Use the IDE extension if most of your work is file-by-file editing
- Use the Codex app when you want multiple concurrent threads, worktrees, or automations
The Mental Model That Makes Codex Better
OpenAI's best practices guide gives a simple default structure for prompts. Use these four parts every time:
| Part | What it means |
|---|---|
| Goal | What exactly should change? |
| Context | Which files, folders, docs, errors, or examples matter? |
| Constraints | What rules, architecture, safety limits, or conventions must be followed? |
| Done when | How do we know the task is complete? |
That is the whole game.
Most bad Codex output comes from missing one of these.
For example, this is weak:

```
Fix auth.
```
This is much better:

```
Goal: Fix the login redirect loop for authenticated users.
Context: The issue is in `src/middleware.ts`, `src/lib/auth.ts`, and the `/login` flow. It started after the session cookie rename.
Constraints: Do not change the database schema. Keep the current JWT strategy. Avoid unrelated refactors.
Done when: Logged-in users can refresh `/dashboard` without being redirected to `/login`, and the auth test suite passes.
```
That prompt is not fancy. It is just clear.
Codex rewards clarity much more than clever prompting.
Your First Good Prompts
If you are new, save these.
1. Fix a Bug
```
Goal: Fix a bug in [feature].
Context: The problem appears in [files]. The current behavior is [bad behavior]. The expected behavior is [expected behavior]. Relevant error/output: [paste it].
Constraints: Keep the change minimal. Do not rename public APIs. No unrelated formatting changes.
Done when: The bug no longer reproduces, the relevant tests pass, and you explain the root cause in plain English.
```
2. Build a Feature
```
Goal: Implement [feature].
Context: Follow the existing patterns in [files/components]. Use [library/tool] if possible. The UI/API should match [reference].
Constraints: Keep this scoped to [files or directories]. Add tests. Do not change unrelated modules.
Done when: The feature works end to end, tests pass, and the final diff is limited to the intended files.
```
3. Refactor Safely
```
Goal: Refactor [module] for maintainability.
Context: Start by understanding the current flow in [files]. Identify risks before editing. There are existing callers in [paths].
Constraints: Preserve behavior. No public API breaks. Update tests if needed.
Done when: The code is simpler, behavior is unchanged, and you summarize what changed and what was intentionally left alone.
```
4. Debug Before Editing
```
Do not edit anything yet.
First, inspect the relevant files, explain the likely root causes, rank them by confidence, and propose the smallest safe fix.
Only after that, ask for confirmation or proceed with the smallest fix if confidence is high.
```
That last one is underrated.
A lot of Codex frustration comes from asking for implementation before understanding.
Beginner Workflow: What To Do on Real Tasks
If you are just starting, use this sequence.
Step 1: Ask for a plan on anything non-trivial
OpenAI explicitly recommends planning first for difficult or ambiguous tasks. In Codex, Plan mode exists for exactly this reason.
If the task is bigger than a quick bug fix, do not start with code generation. Start with:
```
Use plan mode. Inspect the relevant files, ask clarifying questions if needed, then propose the implementation plan before writing code.
```
That reduces wasted edits.
Step 2: Point Codex at the right files
Large repos are where vague prompting gets expensive.
Tell Codex where to look:
- exact files
- exact folder
- exact failing test
- exact error message
- exact screenshot or spec
Do not make it guess the neighborhood.
Step 3: Tell it what not to touch
This is where many people lose control of the diff.
If you do not want a refactor, say so.
If you do not want renamed files, say so.
If you do not want dependency changes, say so.
A lot of Codex quality is really scope control.
Step 4: Tell it how to verify the work
OpenAI's best practices guide is very clear here: do not stop at asking Codex to make a change. Ask it to create tests when needed, run checks, confirm the behavior, and review the result.
A solid line to add is:
```
Run the relevant tests, lint/type checks if applicable, and review the diff for regressions before you consider the task complete.
```
Step 5: Review like a teammate wrote it
Codex is good. It is not exempt from review.
You should still check:
- did it solve the actual problem?
- did it change more than necessary?
- did it preserve existing behavior?
- did it add the right tests?
- did it quietly break something adjacent?
If you accept output without review, the problem is not Codex. The problem is process.
The First Two Files Serious Users Set Up
Once you get one or two good sessions, stop repeating yourself manually.
OpenAI's docs say the next step is reusable guidance through `AGENTS.md` and durable configuration through `config.toml`.
1. AGENTS.md
OpenAI describes AGENTS.md as an open-format README for agents. It is the best place to encode how Codex should work in a repo.
A practical starter file should cover:
- repo layout
- build, test, and lint commands
- engineering conventions
- do-not rules
- what done means
- how to verify work
A minimal starter looks like this:
```markdown
# AGENTS.md

## Repo map
- App code: `src/`
- Tests: `tests/`
- Shared utilities: `src/lib/`

## Commands
- Dev: `pnpm dev`
- Test: `pnpm test`
- Lint: `pnpm lint`
- Typecheck: `pnpm typecheck`

## Rules
- Keep diffs minimal.
- Do not rename public APIs without explicit instruction.
- Follow existing folder conventions.
- Add or update tests for behavior changes.

## Done when
- Relevant tests pass.
- Lint and typecheck pass.
- Final diff is reviewed for regressions.
```
OpenAI also recommends keeping AGENTS.md short and practical.
That is correct.
Do not turn it into a manifesto.
If Codex makes the same mistake twice, update AGENTS.md. That is a much better loop than rewriting the same instruction in every prompt.
The CLI also has `/init` to scaffold a starter `AGENTS.md`.
2. config.toml
A sane starting point for local work is:
```toml
model = "gpt-5.4"
approval_policy = "on-request"
sandbox_mode = "workspace-write"
web_search = "cached"
model_reasoning_effort = "high"

[features]
multi_agent = true
shell_snapshot = true
```
Why these matter:
- `approval_policy = "on-request"` keeps you in control without constant friction
- `sandbox_mode = "workspace-write"` is a good default for normal repo work
- `web_search = "cached"` uses OpenAI's web search cache by default, which the docs position as safer than blindly fetching arbitrary live pages
- `model_reasoning_effort` should go up as tasks get more complex
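Before leaning on a hand-edited config, it is worth confirming the file actually parses. This is a minimal sketch, assuming Python 3.11+ is available for its stdlib `tomllib` parser; `~/.codex/config.toml` is the location the CLI docs use, and the script accepts an override path as its first argument.

```shell
# Sanity-check that a Codex config file parses as TOML before relying on it.
# Assumes Python 3.11+ (stdlib `tomllib`); pass a path to override the default.
cfg="${1:-$HOME/.codex/config.toml}"
if [ ! -f "$cfg" ]; then
  echo "no config at $cfg (Codex will fall back to defaults)"
  exit 0
fi
python3 - "$cfg" <<'PY'
import sys, tomllib

with open(sys.argv[1], "rb") as f:
    data = tomllib.load(f)
print("parsed top-level keys:", sorted(data))
PY
```

A typo like a missing closing quote fails loudly here instead of surfacing as confusing CLI behavior later.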
OpenAI's guidance here is sensible: keep approvals and sandboxing tight by default, then loosen only for trusted repos and specific workflows.
That is the right default attitude.
Intermediate Workflow: How People Move Past One-Off Prompts
This is where Codex starts becoming a real system instead of a novelty.
Use one thread per task
OpenAI explicitly warns against using one giant thread per project.
That leads to bloated context and worse results over time.
Use one thread per coherent task. If the work branches, fork the thread.
Useful session controls from the docs:
- `/fork` to branch the conversation while keeping context
- `/compact` when the thread is getting long
- `/resume` to pick work back up
- `/status` to inspect the current session state
Review inside the workflow
Codex supports review loops, not just code generation.
OpenAI documents `/review` for reviewing a branch, commit, or uncommitted changes. That is worth using, especially after larger diffs.
A strong pattern is:
- Ask Codex to implement
- Ask Codex to run checks
- Ask Codex to review its own diff against your repo rules
- Then do your human review
Keep verification explicit
If your project has specific commands, put them in AGENTS.md and repeat them in the prompt when needed.
For example:
```
After the change, run `pnpm test auth`, then `pnpm lint`, then review the diff for auth regressions.
```
Codex is much more reliable when it can see the finish line.
Advanced Workflow: Worktrees, Subagents, MCP, Skills, Automations
This is where Codex gets genuinely powerful.
1. Worktrees for parallel work
OpenAI's worktree docs are one of the most practical parts of the whole Codex stack.
Worktrees let Codex run multiple independent tasks in the same project without interfering with each other. The docs frame Local as your foreground workspace and Worktree as a background workspace.
That matters because the common mistake is obvious:
- one active task in your local checkout
- another Codex task editing the same branch
- confusion, collisions, messy git state
Worktrees fix that.
OpenAI's documented advantages are straightforward:
- work in parallel without disturbing local setup
- queue background work while you stay focused on the foreground
- hand work back into local later when you are ready to inspect or test
If you want Codex doing one larger task while you keep shipping something else, use a worktree.
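The underlying mechanism is plain `git worktree`. Here is a self-contained sketch of the flow using a throwaway repo in a temp directory; in a real project you would run the `git worktree` commands from your existing checkout, and the branch name is illustrative.

```shell
# Demo of the worktree flow with a disposable repo (paths/branch are illustrative).
set -e
demo=$(mktemp -d)
git init -q "$demo/proj"
git -C "$demo/proj" -c user.email=codex@example.com -c user.name=demo \
    commit -q --allow-empty -m "initial commit"

# Give a background Codex task its own workspace on its own branch:
git -C "$demo/proj" worktree add "$demo/codex-task" -b codex/fix-auth

# Both checkouts now coexist without touching each other:
git -C "$demo/proj" worktree list

# After reviewing and merging the work, clean up the background workspace:
git -C "$demo/proj" worktree remove "$demo/codex-task"
```

Because each worktree has its own checked-out branch, a background Codex edit can never dirty the branch you are actively shipping from.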
2. Subagents for truly parallel tasks
OpenAI says Codex can spawn specialized agents in parallel and then consolidate the results.
This is useful when the task is actually parallel, not just large.
Good examples:
- one agent reviews security
- one agent reviews bugs
- one agent inspects flaky tests
- one agent maps the codebase around a subsystem
Bad example:
- splitting a tightly coupled change across five agents that all want the same files
OpenAI also notes two important constraints:
- subagents only run when you explicitly ask for them
- they cost more tokens than a single-agent run
Use them when parallelism is real, not because it sounds advanced.
3. MCP when the context lives outside the repo
OpenAI's docs are crisp on MCP: use it when the context Codex needs lives outside the repo and changes frequently.
That means things like:
- internal docs
- ticketing systems
- dashboards
- design systems
- external APIs
- runbooks
If you keep pasting the same outside context into prompts, that is usually an MCP smell.
OpenAI's warning here is also important: do not wire in every tool on day one. Start with one or two tools that remove a real manual loop.
That is good advice. Tool sprawl makes agents worse, not better.
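As a concrete sketch, an MCP server is typically wired into `config.toml` with a launch command. The table name, package, and environment variable below are placeholders I made up for illustration, not real tools; check the Codex config docs and your MCP server's README for the actual values.

```toml
# Hypothetical MCP server entry; "internal-docs", the npx package, and
# DOCS_TOKEN are illustrative placeholders, not real names.
[mcp_servers.internal-docs]
command = "npx"
args = ["-y", "@acme/internal-docs-mcp"]
env = { DOCS_TOKEN = "set-me" }
```

One entry like this, removing one real manual copy-paste loop, is the right starting scope.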
4. Skills when you repeat the same workflow
OpenAI's rule of thumb is excellent:
> If you keep reusing the same prompt or correcting the same workflow, it should probably become a skill.
Skills are great for:
- incident summaries
- release note drafting
- repeated debugging flows
- checklist-based PR reviews
- migration planning
Keep each skill narrow. One job. Clear input. Clear output.
5. Automations when the workflow is stable
OpenAI puts this nicely: skills define the method, automations define the schedule.
That is the right order.
Do not automate a workflow that still needs a lot of steering.
Once it is predictable, automations are useful for things like:
- CI failure summaries
- recent commit summaries
- scheduled repo health checks
- standup drafts
- recurring analysis jobs
Why Codex Gets Stuck, And How To Recover Fast
This is the section you will come back to.
Symptom: Codex makes a big messy diff
Likely cause:
- the prompt was too open-ended
- no file scope was given
- no constraints were stated
Fix:
```
Redo this with the smallest safe change.
Only edit the files directly involved.
Do not refactor unrelated code.
Explain the exact files you plan to change before editing.
```
Symptom: Codex edits before understanding the bug
Likely cause:
- you asked for implementation too early
Fix:
```
Stop coding.
Inspect the relevant files and logs first.
List the most likely root causes, rank them by confidence, and propose the smallest fix.
Do not edit anything until that analysis is complete.
```
Symptom: Codex solves the wrong problem
Likely cause:
- the goal was vague
- done criteria were missing
Fix:
```
Reset the task.
Goal: [one sentence]
Context: [files, errors, references]
Constraints: [rules]
Done when: [testable outcomes]
Repeat your understanding of the task before making changes.
```
Symptom: Codex keeps repeating the same mistakes across sessions
Likely cause:
- repo rules live only in your head
Fix:
Move the rule into `AGENTS.md`.
This is exactly what OpenAI recommends. Repeated friction should become reusable guidance.
Symptom: Codex is good on small tasks and weak on bigger ones
Likely cause:
- you skipped planning
- the thread is too broad
- the task should be broken down
Fix:
- use Plan mode
- split the work into smaller tasks
- use one thread per task
- fork when the work branches
Symptom: Two Codex tasks step on each other
Likely cause:
- multiple live threads on the same files or branch
Fix:
Use git worktrees.
OpenAI explicitly warns against running live threads on the same files without worktrees.
Symptom: Codex cannot verify the result well
Likely cause:
- the repo does not expose clear commands
- test and lint steps were not provided
Fix:
Put the actual commands in `AGENTS.md`, then restate them in the task.
Symptom: Codex needs information from outside the repo
Likely cause:
- missing external context
Fix:
Use MCP for repeatable outside context. If it is a one-off research task, use web search carefully. OpenAI's config docs note that cached web search is the default and should still be treated as untrusted.
The Copy-Paste Rescue Prompts
These are the ones I would actually keep in a note.
Rescue Prompt 1: Plan Before Touching Code
```
Use plan mode.
Inspect the relevant files first, ask any clarifying questions, and propose the implementation plan.
Do not write code yet.
```
Rescue Prompt 2: Smallest Safe Fix
```
Fix this with the smallest safe change.
No unrelated refactors.
No dependency changes.
No file moves unless absolutely necessary.
Explain the intended diff before editing.
```
Rescue Prompt 3: Understand Before Editing
```
First explain:
1. what the current code is doing
2. where the bug most likely is
3. what the smallest fix is
4. what could regress if we change it
Then implement only after that analysis.
```
Rescue Prompt 4: Tight Review Loop
```
After making the change:
- add or update tests if needed
- run the relevant test, lint, and typecheck commands
- review the diff for regressions
- summarize what changed, what was verified, and any remaining risk
```
Rescue Prompt 5: Stay Inside the Lane
```
Only work in these files:
- `src/...`
- `tests/...`
If you believe another file must change, stop and justify it first.
```
Rescue Prompt 6: Turn This Into Durable Guidance
```
You made the same mistake twice on this repo.
Write a short retrospective and propose the exact `AGENTS.md` update that would prevent it next time.
```
That last prompt is how you stop paying the same tax repeatedly.
What Advanced Users Usually Figure Out
After enough Codex usage, the lessons are consistent.
1. Better repos get better Codex output
If your project has:
- clear structure
- real tests
- reliable commands
- a useful `AGENTS.md`
- stable conventions

Codex looks much smarter.
That is not magic. The environment is simply legible.
2. Planning is not overhead
People skip planning because they want speed.
Then they burn that time back in rework.
OpenAI's documentation leans hard toward planning for difficult tasks, and I think that is the correct default.
3. Reusability beats prompt gymnastics
A reusable `AGENTS.md`, one or two good skills, and a sane config file will outperform heroic one-off prompts over time.
4. Parallelism only helps when the task is actually parallel
Use worktrees and subagents when the work can truly branch.
Do not force parallelism into tightly coupled edits.
5. Codex is strongest when it can inspect, change, run, and verify
The more artificial the environment, the worse the results.
If Codex cannot see the repo properly, cannot run the right commands, or cannot verify success, you are leaving a big part of the product unused.
Final Thoughts
If you only remember three things from this guide, make it these:
- Use Goal + Context + Constraints + Done when in every serious prompt.
- Put recurring rules into `AGENTS.md` instead of repeating yourself forever.
- Ask Codex to plan, verify, and review, not just generate code.
That is the difference between random AI help and a workflow you can actually depend on.
Codex is not hard to use.
But it is easy to use badly.
Once you stop treating it like a code slot machine and start treating it like an engineer with tools, it becomes much more useful.
Official Sources
- OpenAI Codex best practices
- OpenAI Codex CLI docs
- OpenAI Codex config basics
- OpenAI Codex worktrees docs
- OpenAI Codex subagents docs
- OpenAI Help Center: Using Codex with your ChatGPT plan
- OpenAI: Introducing upgrades to Codex
If OpenAI changes the product behavior after April 2, 2026, treat the links above as the source of truth and update your workflow accordingly.