10-min read · For developers who already use Codex but feel it could do more
Focus: AGENTS.md • Skills • Automations • Session Management • Cost Optimization
Most people use Codex wrong.
Not because they're doing something obviously incorrect — but because the way Codex works is counterintuitive. You think it's a smarter ChatGPT. It's not. It's a new colleague who walks into your project directory every time with amnesia: knows how to code, but has no idea what your project looks like, what testing framework you use, or which directories are off-limits.
The key to unlocking Codex's full potential isn't better prompts. It's giving it persistent memory of your repo, your conventions, and your workflow.
OpenAI's official Codex documentation distills this into a six-pillar framework and eight common mistakes. I unpacked every single one with real examples. This is the guide I wish I had three months ago.
Before We Start: Understand the Mental Model
Codex doesn't need more instructions. It needs more stable context.
Every time Codex starts working on your project, it reads your files, understands the structure, and gets to work. But without persistent configuration, it forgets everything between sessions. The six pillars are a progressive system to fix that:
Context → AGENTS.md → Config → MCP → Skills → Automations
Each builds on the previous one. You don't need to implement all six on day one.
Pillar 1: Give Context for Every Task
Before every task, make sure Codex knows:
- Project structure: Where are the source code, tests, and config?
- Tech stack: Language, framework, build tools
- Conventions: Code style, naming, past lessons
- State: What's changed, what errors occurred
You can put this in every prompt. But a much better approach is to encode it into Pillar 2.
Pillar 2: AGENTS.md — Your Project's Manual for AI
AGENTS.md is a README written for AI agents. Codex reads it at the start of every session. Writing one is the single highest-impact thing you can do.
What Goes In It
| Category | Example |
|---|---|
| Directory structure | `src/` is source, `db/schema/` is off-limits |
| Start commands | `pnpm dev`, `docker compose up` |
| Test/build/lint | `pnpm test`, `pnpm build`, `pnpm lint` |
| Engineering rules | No `any` in TypeScript |
| Constraints | Don't modify `db/schema/` |
| Acceptance criteria | "Pass `pnpm test` to be done" |
Three Priority Levels
~/.codex/AGENTS.md → Personal global defaults
repo-root/AGENTS.md → Team shared standards
subdirectory/AGENTS.md → Local rules (highest priority)
Files closer to the working directory take priority. You can have different rules for frontend and backend.
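To make the layering concrete, here is a minimal TypeScript sketch of how such a lookup could be ordered. It illustrates the precedence described above and is not Codex's actual implementation. The function name `agentsFileCandidates` and the example paths are made up, and checking every directory between the repo root and the working directory is an assumption.

```typescript
// Hypothetical helper: lists candidate AGENTS.md paths from lowest to highest priority.
// Assumption: every directory between the repo root and the working directory is checked.
function agentsFileCandidates(homeDir: string, repoRoot: string, workingDir: string): string[] {
  if (workingDir !== repoRoot && !workingDir.startsWith(`${repoRoot}/`)) {
    throw new Error("workingDir must be inside repoRoot");
  }
  // Personal global defaults come first, then the team file at the repo root.
  const candidates: string[] = [`${homeDir}/.codex/AGENTS.md`, `${repoRoot}/AGENTS.md`];
  // Walk down from the repo root toward the working directory.
  const parts = workingDir.slice(repoRoot.length).split("/").filter((part) => part.length > 0);
  let current = repoRoot;
  for (const part of parts) {
    current = `${current}/${part}`;
    candidates.push(`${current}/AGENTS.md`);
  }
  return candidates;
}

// Example paths are invented; the last entry is the closest file and wins on conflict.
const exampleOrder = agentsFileCandidates("/home/me", "/work/app", "/work/app/packages/web");
```

Reading the result from first to last gives the order in which rules would be layered, so a frontend-specific file can override a team-wide default.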
Getting Started
Run /init in Codex CLI — it auto-generates an initial version. Then customize it for your project.
How to Evolve It
A short, precise AGENTS.md is more useful than a long, vague one.
Start with the basics. Only add rules when you notice Codex repeating the same mistake.
Real example: Codex uses the `any` type twice in your TypeScript project. Add this to AGENTS.md:
Fix: Don't use `any` type. Use `unknown` with type guards instead.
Rule of thumb: If you've corrected Codex on the same thing twice, it belongs in AGENTS.md.
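A rule like this is easier to follow with a reference pattern next to it. The sketch below shows one way `unknown` plus a type guard can stand in for `any`. The `User` shape and the function names are hypothetical, chosen only for illustration.

```typescript
// Hypothetical data shape used only for this example.
interface User {
  id: number;
  name: string;
}

// Type guard: narrows an unknown value to User after checking its fields at runtime.
function isUser(value: unknown): value is User {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const record = value as Record<string, unknown>;
  return typeof record["id"] === "number" && typeof record["name"] === "string";
}

// JSON.parse returns any; assigning it to unknown forces a check before use.
function parseUser(json: string): User {
  const data: unknown = JSON.parse(json);
  if (!isUser(data)) {
    throw new Error("Payload is not a User");
  }
  return data;
}
```

With `any`, a malformed payload would flow through silently and fail somewhere far from the parse call. With `unknown`, the compiler refuses to touch the value until the guard has run.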
Pillar 3: Configuration for Consistency
AGENTS.md tells Codex how your project works. Config tells Codex how it should behave.
Three-layer system:
~/.codex/config.toml → Personal defaults
.codex/config.toml → Repo-specific
CLI flags → One-time overrides
Configure: default model, reasoning level, sandbox mode, approval strategy, MCP servers.
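The three layers can be pictured as a simple merge where later layers win. The sketch below assumes that precedence (personal, then repo, then CLI flags), which the labels "defaults" and "overrides" suggest. The field names and values are hypothetical placeholders and are not the real `config.toml` keys.

```typescript
// Hypothetical field names for illustration; they are not the real config.toml keys.
interface Settings {
  model?: string;
  reasoning?: string;
  sandbox?: string;
  approval?: string;
}

// Assumed precedence: personal defaults < repo config < one-time CLI flags.
function resolveSettings(personal: Settings, repo: Settings, cliFlags: Settings): Settings {
  return { ...personal, ...repo, ...cliFlags };
}

// Placeholder values: the repo pins the model, the CLI flag raises reasoning for one run.
const effective = resolveSettings(
  { model: "GPT-5.4 Mini", reasoning: "low", approval: "ask" },
  { model: "GPT-5.4", sandbox: "restricted" },
  { reasoning: "high" },
);
```

Here `effective` ends up with the repo's model and sandbox, the personal approval setting, and the one-off reasoning level, which is the behavior you would expect from a defaults-plus-overrides design.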
Pillar 4: MCP for External Connections
MCP (Model Context Protocol) connects Codex to databases, APIs, Jira, Notion — anything external.
Golden rule: Start with 1-2 high-value MCP servers. Don't connect a dozen at once.
Pillar 5: Skills — Package Repeatable Workflows
Once a workflow becomes repetitive, stop relying on long prompts. Wrap it in a Skill — a SKILL.md file that works in CLI, IDE, and Codex App.
| Scenario | Skill Name |
|---|---|
| Standard debugging | debug-standard |
| PR review checklist | pr-review |
| Release notes | generate-release-notes |
| Log analysis | analyze-logs |
| Migration planning | migration-plan |
Storage:
- Personal: `$HOME/.agents/skills`
- Team: `.agents/skills` in repo (committable, clone-and-use)
Pillar 6: Automations — Run on Autopilot
Once a workflow is stable, let Codex run it automatically.
Codex App → Automations tab. Configure: project, prompt (can call Skills), frequency, worktree or local.
| Task | Frequency |
|---|---|
| Summarize recent commits | Daily |
| Scan for potential bugs | Weekly |
| Draft release notes | Per release |
| Check CI failures | Per CI run |
| Generate standup summary | Daily |
Skills define the method. Automations define the rhythm.
The 8 Common Mistakes
Mistake 1: Putting Persistent Rules in Prompts
Rewriting conventions in every prompt. Wasteful — forget once, and Codex goes rogue.
✅ Rules go in AGENTS.md. Write once, permanent effect.
Mistake 2: Not Telling Codex How to Build and Test
No verification commands → Codex works blind.
✅ Write build/test commands in AGENTS.md. Include verification in every "Done when."
Mistake 3: Skipping Planning on Complex Tasks
Codex implements first, asks questions never → wrong direction → full revert.
✅ Use Plan mode (`/plan`) or ask Codex to explain before implementing.
Mistake 4: Full Access From Day One
Defaulting to complete system access.
✅ Start restricted. Loosen deliberately as you understand the workflow.
Mistake 5: No Worktree Isolation
Multiple sessions modifying the same files → conflicts, confusion.
✅ Each parallel thread in its own git worktree.
Mistake 6: Automating Before Validation
Turning an unstable workflow into an Automation → scaled chaos.
✅ Stabilize as a Skill first. Validate manually. Then automate.
Mistake 7: Micromanaging Every Step
Watching every move defeats Codex's asynchronous advantage.
✅ Launch tasks and walk away. Treat Codex as a background task, not a real-time conversation.
Mistake 8: One Thread Per Project
Accumulating everything in one thread → context bloat → quality decay.
✅ One thread per task. Use `/fork` to branch while preserving context.
Cost Optimization: Stretch Your Plus Subscription 6x
For ChatGPT Plus/Pro Users: Credits Matter More Than Dollars
Most people aren't paying API token costs. They're using ChatGPT Plus ($20/mo) or Pro ($200/mo) subscriptions. What matters isn't dollar figures — it's how fast you burn through your credit quota.
Each model consumes credits at a different rate:
| Model | Input Credits (per 1K tokens) | Output Credits | Relative Cost |
|---|---|---|---|
| GPT-5.5 🏆 | 125 | 750 | Baseline (100%) |
| GPT-5.4 ⭐ | 62.5 | 375 | Half of 5.5 |
| GPT-5.4 Mini 🏃 | 18.75 | 113 | 1/3 of 5.4, ~1/6 of 5.5 |
Here's the math that matters: using GPT-5.4 Mini instead of GPT-5.5 means your monthly quota goes roughly 6x further (18.75 vs. 125 input credits works out to about 6.7x). The same Plus subscription buys you about 6 times more work.
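To check the ratio yourself, here is a small sketch that applies the rates from the table to one task. The task size (20K tokens read, 2K tokens written) is a made-up example, and `creditsFor` is a hypothetical helper, not part of any Codex tooling.

```typescript
// Credit rates per 1K tokens, copied from the table above.
const CREDIT_RATES = {
  "GPT-5.5": { input: 125, output: 750 },
  "GPT-5.4": { input: 62.5, output: 375 },
  "GPT-5.4 Mini": { input: 18.75, output: 113 },
} as const;

type ModelName = keyof typeof CREDIT_RATES;

// Credits for one task = input tokens and output tokens, each priced per 1K.
function creditsFor(model: ModelName, inputTokens: number, outputTokens: number): number {
  const rate = CREDIT_RATES[model];
  return (inputTokens / 1000) * rate.input + (outputTokens / 1000) * rate.output;
}

// Hypothetical task size: 20K tokens read, 2K tokens written.
const flagshipCredits = creditsFor("GPT-5.5", 20000, 2000); // 4000
const midCredits = creditsFor("GPT-5.4", 20000, 2000); // 2000
const miniCredits = creditsFor("GPT-5.4 Mini", 20000, 2000); // 601
const stretchFactor = flagshipCredits / miniCredits; // about 6.7
```

For this task shape the same quota covers between six and seven Mini runs for every GPT-5.5 run. The exact factor shifts slightly with the input/output mix, because the table rounds the Mini output rate.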
The smart strategy:
- Daily simple tasks → GPT-5.4 Mini (reading code, small bugs, tests, questions)
- Medium complexity → GPT-5.4 (refactoring, debugging, features)
- Only high-stakes → GPT-5.5 (architecture review, security audit, complex cross-module work)
For API Users: Dollar Pricing
| Model | Input (per 1M tokens) | Output | Best For |
|---|---|---|---|
| GPT-5.5 | $5.00 | $30.00 | Architecture review, security |
| GPT-5.4 | $2.50 | $15.00 | Complex refactoring, debugging |
| GPT-5.4 Mini | $0.75 | $4.50 | Reading code, simple bugs, tests |
Reasoning Levels
| Level | When to Use |
|---|---|
| Low 🏃 | Quick, well-defined tasks |
| Medium ⭐ | Daily default — best value |
| High 🧠 | Complex changes, debugging |
| Extra High 🚀 | Long-running agentic sessions |
Cache Trick: Almost Free
Reusing the same system prompt and project context triggers the cache, which can cut costs by up to 90%.
Real numbers (10M tokens processed):
- GPT-5.5 no cache: ~$22.00
- GPT-5.5 high cache: ~$1.25 ← order of magnitude cheaper
- GPT-5.4 Mini high cache: ~$0.24 ← practically free
Stay in the same thread for continuous conversations. Don't create new sessions unnecessarily.
Session Management: One Task, One Thread
Core principle: one thread per task, not one thread per project.
Context bloat is the #1 silent quality killer. The more you accumulate in one thread, the harder it is for Codex to find the relevant information.
Key Commands
| Command | Use Case |
|---|---|
| `/fork` | Branch from current thread when work splits |
| `/compact` | Compress context when it gets too long |
| `/plan` | Enter planning mode |
| `/status` | View session state |
Parallel Work: Good vs Bad Splits
| ✅ Good | ❌ Bad |
|---|---|
| Backend changes + docs update | Multiple agents modifying same files |
| One writes tests, one investigates | Unconfirmed requirements, multiple implementations |
| One implements, one reviews | Schema + callers changed simultaneously |
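One quick way to sanity-check a split before launching parallel threads is to list the files each thread plans to touch and look for overlap. The sketch below is a rough illustration: `ThreadPlan` and `findOverlaps` are hypothetical names, and the file lists are invented.

```typescript
// Hypothetical shape: each parallel thread declares the files it plans to touch.
interface ThreadPlan {
  name: string;
  files: string[];
}

// Returns every file claimed by more than one thread, with the names of those threads.
function findOverlaps(plans: ThreadPlan[]): Map<string, string[]> {
  const owners = new Map<string, string[]>();
  for (const plan of plans) {
    // Ignore duplicate entries within a single plan.
    const uniqueFiles = plan.files.filter((file, index, all) => all.indexOf(file) === index);
    for (const file of uniqueFiles) {
      const names = owners.get(file) ?? [];
      names.push(plan.name);
      owners.set(file, names);
    }
  }
  const overlaps = new Map<string, string[]>();
  owners.forEach((names, file) => {
    if (names.length > 1) {
      overlaps.set(file, names);
    }
  });
  return overlaps;
}

// Invented example: "backend" and "callers" both touch the same source file.
const exampleOverlaps = findOverlaps([
  { name: "backend", files: ["src/api/users.ts", "db/migrations/001.sql"] },
  { name: "docs", files: ["docs/api.md"] },
  { name: "callers", files: ["src/api/users.ts"] },
]);
```

An empty result suggests the split is safe to run in separate worktrees. Any entry in the map is a hint to serialize those two threads or merge them into one task.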
The Complete Mature Workflow
Setup (One-Time)
① /init → auto-generate AGENTS.md
② Edit AGENTS.md → build commands, constraints, patterns
③ Configure ~/.codex/config.toml → model, reasoning, approval
④ Add 1-2 high-value MCP servers
Daily Tasks
① New thread per task
② 4-element prompt (Goal → Context → Constraints → Done when)
③ Complex task → Plan mode
④ Launch and go work on something else
⑤ Come back, review diff
⑥ Stable pattern → package as Skill
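The 4-element prompt from step ② can be kept consistent with a small template. The sketch below is one possible shape, not an official format: `TaskPrompt` and `buildPrompt` are hypothetical names, and the example goal is invented, while the constraints and checks reuse the commands mentioned earlier in this guide.

```typescript
// Hypothetical structure mirroring Goal → Context → Constraints → Done when.
interface TaskPrompt {
  goal: string;
  context: string;
  constraints: string[];
  doneWhen: string[];
}

// Renders the four elements as labeled sections separated by blank lines.
function buildPrompt(task: TaskPrompt): string {
  const bullets = (items: string[]): string => items.map((item) => `- ${item}`).join("\n");
  return [
    `Goal: ${task.goal}`,
    `Context: ${task.context}`,
    `Constraints:\n${bullets(task.constraints)}`,
    `Done when:\n${bullets(task.doneWhen)}`,
  ].join("\n\n");
}

// Invented task; constraints and checks echo the AGENTS.md examples above.
const examplePrompt = buildPrompt({
  goal: "Add pagination to the user list endpoint",
  context: "Handler lives under src/; tests run with pnpm",
  constraints: ["Don't modify db/schema/", "No any in TypeScript"],
  doneWhen: ["pnpm test passes", "pnpm lint passes"],
});
```

Keeping "Done when" as a list of runnable checks gives Codex a concrete stopping condition instead of a vague "make it work".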
Continuous Improvement
① Same mistake twice → update AGENTS.md
② Skill stable → set up Automation
③ Periodic session review → update config
The Full Development Loop
"Codex shouldn't just generate code. With the right instructions, it can also help test it, check it, and review it."
Don't just ask Codex to write code. Make it complete the loop:
Change code → Write/update tests → Run tests → Check lint/types → Verify behavior → Review diff → Find regressions
Three Things to Remember
If you take nothing else from this guide:
First: AGENTS.md is the single highest-impact thing you can do. Spend 10 minutes writing it, save 30 minutes every day. It turns Codex from a tool with amnesia into a teammate who remembers your project.
Second: Choose your model by task. Daily simple tasks → GPT-5.4 Mini + Low reasoning. Complex work → GPT-5.4/5.5 + High reasoning. Just this one habit can save 60-80% of your quota.
Third: One thread per task. Fork when it splits. Context bloat is the silent killer of Codex output quality.
Progression path: AGENTS.md → Config → Skills → Automations
Skills define the method. Automations define the rhythm. Take it one step at a time.
📖 Haven't read the beginner guide yet? Start here: First Time Using Codex? I Only Asked It to Do 3 Things — from download to first code change.
Based on OpenAI's official Codex documentation and community practices. Pricing as of June 2026 — check openai.com for the latest.
