HIROKI II

Posted on Jun 16

You're Only Using 30% of Codex — Here's the Other 70%

#agents #codex #devtools #tutorial

10-min read · For developers who already use Codex but feel it could do more
Focus: AGENTS.md • Skills • Automations • Session Management • Cost Optimization

Most people use Codex wrong.

Not because they're doing something obviously incorrect — but because the way Codex works is counterintuitive. You think it's a smarter ChatGPT. It's not. It's a new colleague who walks into your project directory every time with amnesia: knows how to code, but has no idea what your project looks like, what testing framework you use, or which directories are off-limits.

The key to unlocking Codex's full potential isn't better prompts. It's giving it persistent memory of your repo, your conventions, and your workflow.

OpenAI's official Codex documentation distills this into a six-pillar framework and eight common mistakes. I unpacked every single one with real examples. This is the guide I wish I had three months ago.

Before We Start: Understand the Mental Model

Codex doesn't need more instructions. It needs more stable context.

Every time Codex starts working on your project, it reads your files, understands the structure, and gets to work. But without persistent configuration, it forgets everything between sessions. The six pillars are a progressive system to fix that:

Context → AGENTS.md → Config → MCP → Skills → Automations

Each builds on the previous one. You don't need to implement all six on day one.

Pillar 1: Give Context for Every Task

Before every task, make sure Codex knows:

Project structure: Where is source code, tests, config?
Tech stack: Language, framework, build tools
Conventions: Code style, naming, past lessons
State: What's changed, what errors occurred

You can repeat this in every prompt, but it is far better to encode it once in AGENTS.md (Pillar 2).

Pillar 2: AGENTS.md — Your Project's Manual for AI

AGENTS.md is a README written for AI agents. Codex reads it at the start of every session, so writing one is the single highest-impact thing you can do.

What Goes In It

Category	Example
Directory structure	`src/` is source, `db/schema/` is off-limits
Start commands	`pnpm dev`, `docker compose up`
Test/build/lint	`pnpm test`, `pnpm build`, `pnpm lint`
Engineering rules	No `any` in TypeScript
Constraints	Don't modify `db/schema/`
Acceptance criteria	"Pass `pnpm test` to be done"
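To make the table concrete, here is a minimal sketch that writes a starter AGENTS.md assembled from the example rows above. The helper name `write_starter_agents_md` and the section headings are assumptions for illustration; Codex does not require this exact layout.

```python
from pathlib import Path

# Starter content assembled from the example rows in the table above.
# The headings are illustrative; adjust commands and paths to your project.
STARTER = """\
# AGENTS.md

## Directory structure
- `src/` is source code
- `db/schema/` is off-limits

## Commands
- Start: `pnpm dev`
- Test: `pnpm test`
- Build: `pnpm build`
- Lint: `pnpm lint`

## Engineering rules
- No `any` in TypeScript

## Done when
- `pnpm test` passes
"""


def write_starter_agents_md(repo_root: Path) -> bool:
    """Create AGENTS.md in repo_root unless one already exists.

    Returns True if the file was written, False if it was left untouched.
    """
    target = repo_root / "AGENTS.md"
    if target.exists():
        return False
    target.write_text(STARTER, encoding="utf-8")
    return True
```

The function refuses to overwrite an existing file, so it is safe to run in a repo where /init has already produced a version you have customized.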

Three Priority Levels

~/.codex/AGENTS.md         → Personal global defaults
repo-root/AGENTS.md        → Team shared standards
subdirectory/AGENTS.md     → Local rules (highest priority)

Files closer to the working directory take priority. You can have different rules for frontend and backend.
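One way to picture the layering is a lookup that collects instruction files from the most general to the most specific. The sketch below only illustrates the three levels listed above; it is not Codex's actual discovery logic, and the function name `collect_agents_files` is hypothetical.

```python
from pathlib import Path


def collect_agents_files(global_dir: Path, repo_root: Path, workdir: Path) -> list[Path]:
    """Return existing AGENTS.md files, ordered from lowest to highest priority.

    Hypothetical helper: global defaults first, then the repo root,
    then each subdirectory down to the working directory.
    """
    repo_root = repo_root.resolve()
    workdir = workdir.resolve()
    if workdir != repo_root and repo_root not in workdir.parents:
        raise ValueError("workdir must be inside repo_root")

    # Directories from the working directory up to the repo root...
    chain = [workdir, *workdir.parents]
    chain = chain[: chain.index(repo_root) + 1]
    # ...reversed so the repo root comes first and the working directory last.
    chain.reverse()

    candidates = [global_dir / "AGENTS.md"] + [d / "AGENTS.md" for d in chain]
    return [p for p in candidates if p.is_file()]
```

Reading the returned list in order and letting later files win matches the rule that the file closest to the working directory has the final say.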

Getting Started

Run /init in Codex CLI — it auto-generates an initial version. Then customize it for your project.

How to Evolve It

A short, precise AGENTS.md is more useful than a long, vague one.

Start with the basics. Only add rules when you notice Codex repeating the same mistake.

Real example: Codex uses the `any` type twice in your TypeScript project. Add this to AGENTS.md:

Fix: Don't use `any` type. Use `unknown` with type guards instead.

Rule of thumb: If you've corrected Codex on the same thing twice, it belongs in AGENTS.md.

Pillar 3: Configuration for Consistency

AGENTS.md tells Codex how your project works. Config tells Codex how it should behave.

Three-layer system:

~/.codex/config.toml       → Personal defaults
.codex/config.toml         → Repo-specific
CLI flags                  → One-time overrides
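The three layers behave like dictionaries merged in order, with later layers overriding earlier ones. The sketch below shows that precedence only; the keys `model` and `reasoning` are placeholder names for illustration, not the real config.toml keys.

```python
def resolve_config(personal: dict, repo: dict, cli_flags: dict) -> dict:
    """Merge settings so that each later layer overrides the earlier ones."""
    merged: dict = {}
    for layer in (personal, repo, cli_flags):
        merged.update(layer)
    return merged


# Placeholder keys and values, chosen only to show the override order.
personal = {"model": "gpt-5.4-mini", "reasoning": "medium"}
repo = {"model": "gpt-5.4"}
cli_flags = {"reasoning": "high"}

print(resolve_config(personal, repo, cli_flags))
# prints {'model': 'gpt-5.4', 'reasoning': 'high'}
```

The repo setting replaces the personal model, and the one-time CLI flag replaces the reasoning level for that run only.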

Configure: default model, reasoning level, sandbox mode, approval strategy, MCP servers.

Pillar 4: MCP for External Connections

MCP (Model Context Protocol) connects Codex to databases, APIs, Jira, Notion — anything external.

Golden rule: Start with 1-2 high-value MCP servers. Don't connect a dozen at once.

Pillar 5: Skills — Package Repeatable Workflows

Once a workflow becomes repetitive, stop relying on long prompts. Wrap it in a Skill — a SKILL.md file that works in CLI, IDE, and Codex App.

Scenario	Skill Name
Standard debugging	`debug-standard`
PR review checklist	`pr-review`
Release notes	`generate-release-notes`
Log analysis	`analyze-logs`
Migration planning	`migration-plan`

Storage:

Personal: $HOME/.agents/skills
Team: .agents/skills in repo (committable, clone-and-use)
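As a rough sketch of what packaging a workflow looks like, the helper below renders the text of a SKILL.md. It assumes the file opens with a small front-matter block holding a name and a description followed by the steps; treat that layout and the helper name `render_skill` as assumptions and check the Codex docs for the exact fields.

```python
def render_skill(name: str, description: str, steps: list[str]) -> str:
    """Render SKILL.md text: assumed front matter, then numbered steps."""
    if not steps:
        raise ValueError("a skill needs at least one step")
    lines = [
        "---",
        f"name: {name}",
        f"description: {description}",
        "---",
        "",
        "Steps:",
    ]
    lines += [f"{i}. {step}" for i, step in enumerate(steps, start=1)]
    return "\n".join(lines) + "\n"


# Example content is illustrative, using a skill name from the table above.
print(render_skill(
    "pr-review",
    "Review a pull request against the team checklist.",
    ["Read the diff.", "Run `pnpm test`.", "List risks and missing tests."],
))
```

Saving the result under `.agents/skills` in the repo would make it available to anyone who clones the project, as described above.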

Pillar 6: Automations — Run on Autopilot

Once a workflow is stable, let Codex run it automatically.

In the Codex App, open the Automations tab and configure the project, the prompt (which can call Skills), the frequency, and whether the run uses a worktree or your local checkout.

Task	Frequency
Summarize recent commits	Daily
Scan for potential bugs	Weekly
Draft release notes	Per release
Check CI failures	Per CI run
Generate standup summary	Daily

Skills define the method. Automations define the rhythm.

The 8 Common Mistakes

Mistake 1: Putting Persistent Rules in Prompts

Rewriting conventions in every prompt. Wasteful — forget once, and Codex goes rogue.

✅ Rules go in AGENTS.md. Write once, permanent effect.

Mistake 2: Not Telling Codex How to Build and Test

No verification commands → Codex works blind.

✅ Write build/test commands in AGENTS.md. Include a verification step in the "Done when" part of every prompt.

Mistake 3: Skipping Planning on Complex Tasks

Codex implements first, asks questions never → wrong direction → full revert.

✅ Use Plan mode (/plan) or ask Codex to explain before implementing.

Mistake 4: Full Access From Day One

Defaulting to complete system access.

✅ Start restricted. Loosen deliberately as you understand the workflow.

Mistake 5: No Worktree Isolation

Multiple sessions modifying the same files → conflicts, confusion.

✅ Each parallel thread in its own git worktree.
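A small sketch of how that isolation can be scripted: the helper builds a `git worktree add` command that gives each task its own branch and directory. The `codex/` branch prefix and the `wt-` directory naming are hypothetical conventions, not anything Codex requires.

```python
import re


def worktree_command(task: str, base_dir: str = "..") -> list[str]:
    """Build a `git worktree add` command that gives one task its own checkout."""
    # Turn the task name into a branch-safe slug, e.g. "Fix login bug" -> "fix-login-bug".
    slug = re.sub(r"[^a-z0-9]+", "-", task.lower()).strip("-")
    if not slug:
        raise ValueError("task name must contain letters or digits")
    branch = f"codex/{slug}"          # hypothetical branch naming scheme
    path = f"{base_dir}/wt-{slug}"    # hypothetical directory naming scheme
    return ["git", "worktree", "add", "-b", branch, path]


print(" ".join(worktree_command("Fix login bug")))
# prints git worktree add -b codex/fix-login-bug ../wt-fix-login-bug
```

Passing the returned list to `subprocess.run` from inside the repository creates the worktree; each parallel Codex thread then works in its own directory.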

Mistake 6: Automating Before Validation

Turning an unstable workflow into an Automation → scaled chaos.

✅ Stabilize as a Skill first. Validate manually. Then automate.

Mistake 7: Micromanaging Every Step

Watching every move defeats Codex's asynchronous advantage.

✅ Launch tasks and walk away. Treat Codex as a background task, not a real-time conversation.

Mistake 8: One Thread Per Project

Accumulating everything in one thread → context bloat → quality decay.

✅ One thread per task. Use /fork to branch while preserving context.

Cost Optimization: Stretch Your Plus Subscription 6x

For ChatGPT Plus/Pro Users: Credits Matter More Than Dollars

Most people aren't paying API token costs. They're using ChatGPT Plus ($20/mo) or Pro ($200/mo) subscriptions. What matters isn't dollar figures — it's how fast you burn through your credit quota.

Each model consumes credits at a different rate:

Model	Input Credits (per 1K tokens)	Output Credits (per 1K tokens)	Relative Cost
GPT-5.5 🏆	125	750	Baseline (100%)
GPT-5.4 ⭐	62.5	375	Half of 5.5
GPT-5.4 Mini 🏃	18.75	113	~30% of 5.4, ~15% of 5.5

Here's the math that matters: GPT-5.5 costs 125 input credits per 1K tokens against 18.75 for GPT-5.4 Mini, a ratio of about 6.7. Switching from GPT-5.5 to GPT-5.4 Mini therefore stretches the same monthly quota roughly 6x further.

The smart strategy:

Daily simple tasks → GPT-5.4 Mini (reading code, small bugs, tests, questions)
Medium complexity → GPT-5.4 (refactoring, debugging, features)
Only high-stakes → GPT-5.5 (architecture review, security audit, complex cross-module work)
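The multiplier can be checked directly from the credit table. This sketch copies the rates from the table above and divides the baseline by each model's rate; the function name `quota_multiplier` is just an illustrative label.

```python
# (input, output) credits per 1K tokens, copied from the table above.
CREDITS = {
    "GPT-5.5": (125, 750),
    "GPT-5.4": (62.5, 375),
    "GPT-5.4 Mini": (18.75, 113),
}


def quota_multiplier(model: str, baseline: str = "GPT-5.5") -> tuple[float, float]:
    """Return how many times further the quota goes on input and output tokens."""
    base_in, base_out = CREDITS[baseline]
    model_in, model_out = CREDITS[model]
    return base_in / model_in, base_out / model_out


for name in CREDITS:
    on_input, on_output = quota_multiplier(name)
    print(f"{name}: {on_input:.1f}x on input, {on_output:.1f}x on output")
```

For GPT-5.4 Mini both ratios land between 6 and 7, which is where the rounded "6x" figure comes from.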

For API Users: Dollar Pricing

Model	Input (USD per 1M tokens)	Output (USD per 1M tokens)	Best For
GPT-5.5	$5.00	$30.00	Architecture review, security
GPT-5.4	$2.50	$15.00	Complex refactoring, debugging
GPT-5.4 Mini	$0.75	$4.50	Reading code, simple bugs, tests
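To compare models on a concrete job, multiply token counts by the per-million rates. The sketch below uses the prices from the table above; the 800K input / 200K output workload is an invented example, not a measured one.

```python
# (input, output) USD per 1M tokens, copied from the table above.
PRICES = {
    "GPT-5.5": (5.00, 30.00),
    "GPT-5.4": (2.50, 15.00),
    "GPT-5.4 Mini": (0.75, 4.50),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the API cost in USD for one workload, ignoring caching."""
    in_rate, out_rate = PRICES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


# Hypothetical workload: 800K input tokens and 200K output tokens.
for name in PRICES:
    print(f"{name}: ${estimate_cost(name, 800_000, 200_000):.2f}")
```

On that hypothetical workload the three models come out at $10.00, $5.00, and $1.50, and the output tokens account for more than half of each total even though they are only a fifth of the volume.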

Reasoning Levels

Level	When to Use
Low 🏃	Quick, well-defined tasks
Medium ⭐	Daily default — best value
High 🧠	Complex changes, debugging
Extra High 🚀	Long-running agentic sessions

Cache Trick: Almost Free

Reusing the same system prompt and project context triggers the cache, which can cut input costs by up to 90%.
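How much you actually save depends on what share of the input is served from cache. The sketch below models only the input side and assumes cached tokens are billed at a 90% discount, a value taken from the "up to 90%" figure; the function name and the hit rates are illustrative.

```python
def input_cost_usd(
    input_tokens: int,
    rate_per_million: float,
    cache_hit_rate: float,
    cached_discount: float = 0.90,
) -> float:
    """Estimate input cost when a share of the tokens is served from cache.

    cached_discount=0.90 is an assumption based on the "up to 90%" claim;
    the real discount may differ by model.
    """
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    cached = input_tokens * cache_hit_rate
    fresh = input_tokens - cached
    cached_rate = rate_per_million * (1.0 - cached_discount)
    return (fresh * rate_per_million + cached * cached_rate) / 1_000_000


# 1M input tokens at the GPT-5.5 input rate of $5.00 per 1M tokens.
for hit in (0.0, 0.5, 1.0):
    print(f"hit rate {hit:.0%}: ${input_cost_usd(1_000_000, 5.00, hit):.2f}")
```

Under these assumptions the cost falls from $5.00 with no cache hits to $2.75 at a 50% hit rate and $0.50 when everything is cached, so the full 90% saving is a ceiling rather than a typical result.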

Example figures (10M tokens processed; the actual cost depends on the input/output split and the cache hit rate):

GPT-5.5 no cache: ~$22.00
GPT-5.5 high cache: ~$1.25 ← order of magnitude cheaper
GPT-5.4 Mini high cache: ~$0.24 ← practically free

Within a single task, stay in the same thread so the cached context keeps being reused. Don't start a new session mid-task without a reason.

Session Management: One Task, One Thread

Core principle: one thread per task, not one thread per project.

Context bloat is the #1 silent quality killer. The more you accumulate in one thread, the harder it is for Codex to find the relevant information.

Key Commands

Command	Use Case
`/fork`	Branch from current thread when work splits
`/compact`	Compress context when it gets too long
`/plan`	Enter planning mode
`/status`	View session state

Parallel Work: Good vs Bad Splits

✅ Good	❌ Bad
Backend changes + docs update	Multiple agents modifying same files
One writes tests, one investigates	Unconfirmed requirements, multiple implementations
One implements, one reviews	Schema + callers changed simultaneously

The Complete Mature Workflow

Setup (One-Time)

① /init → auto-generate AGENTS.md
② Edit AGENTS.md → build commands, constraints, patterns
③ Configure ~/.codex/config.toml → model, reasoning, approval
④ Add 1-2 high-value MCP servers

Daily Tasks

① New thread per task
② 4-element prompt (Goal → Context → Constraints → Done when)
③ Complex task → Plan mode
④ Launch and go work on something else
⑤ Come back, review diff
⑥ Stable pattern → package as Skill
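The 4-element prompt in step ② is easy to standardize. The helper below is a minimal sketch that assembles the four parts in order and rejects an empty one; the name `build_prompt` and the example task are hypothetical.

```python
def build_prompt(goal: str, context: str, constraints: str, done_when: str) -> str:
    """Assemble a prompt in the order Goal, Context, Constraints, Done when."""
    sections = [
        ("Goal", goal),
        ("Context", context),
        ("Constraints", constraints),
        ("Done when", done_when),
    ]
    for label, text in sections:
        if not text.strip():
            raise ValueError(f"'{label}' must not be empty")
    return "\n\n".join(f"{label}: {text.strip()}" for label, text in sections)


# Hypothetical task, reusing commands and paths from the examples in this guide.
print(build_prompt(
    goal="Fix the failing login test.",
    context="The failure started after the last change under src/.",
    constraints="Don't modify db/schema/. No `any` in TypeScript.",
    done_when="`pnpm test` and `pnpm lint` both pass.",
))
```

Forcing every prompt through the same four fields makes a missing "Done when" obvious before the task is launched.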

Continuous Improvement

① Same mistake twice → update AGENTS.md
② Skill stable → set up Automation
③ Periodic session review → update config

The Full Development Loop

"Codex shouldn't just generate code. With the right instructions, it can also help test it, check it, and review it."

Don't just ask Codex to write code. Make it complete the loop:

Change code → Write/update tests → Run tests → Check lint/types → Verify behavior → Review diff → Find regressions
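The verification part of that loop can be scripted so that "done" means the same thing for you and for Codex. This is a minimal sketch that runs the project's checks in order and stops at the first failure; the commands are the `pnpm` examples used earlier in this guide, and the function names are illustrative.

```python
import subprocess
from typing import Callable, Optional, Sequence

# Checks to run in order; these mirror the pnpm commands used as examples above.
CHECKS: list[list[str]] = [["pnpm", "test"], ["pnpm", "lint"], ["pnpm", "build"]]


def run_command(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def first_failure(
    checks: Sequence[Sequence[str]] = CHECKS,
    run: Callable[[Sequence[str]], int] = run_command,
) -> Optional[Sequence[str]]:
    """Return the first command that exits non-zero, or None if all pass."""
    for cmd in checks:
        if run(cmd) != 0:
            return cmd
    return None
```

Calling `first_failure()` in a project where `pnpm` is installed returns `None` when every check passes; the `run` parameter exists so the logic can be exercised without invoking real commands.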

Three Things to Remember

If you take nothing else from this guide:

First: AGENTS.md is the single highest-impact thing you can do. Spend 10 minutes writing it, save 30 minutes every day. It turns Codex from a tool with amnesia into a teammate who remembers your project.

Second: Choose your model by task. Daily simple tasks → GPT-5.4 Mini + Low reasoning. Complex work → GPT-5.4/5.5 + High reasoning. Just this one habit can save 60-80% of your quota.

Third: One thread per task. Fork when it splits. Context bloat is the silent killer of Codex output quality.

Progression path: AGENTS.md → Config → MCP → Skills → Automations

Skills define the method. Automations define the rhythm. Take it one step at a time.

📖 Haven't read the beginner guide yet? Start here: First Time Using Codex? I Only Asked It to Do 3 Things — from download to first code change.

Based on OpenAI's official Codex documentation and community practices. Pricing as of June 2026 — check openai.com for the latest.
