Hunter G

Posted on May 28

Codex and Claude Code's /goal Command in Practice

#claudecode #codex #ai #agents

/goal is a new command that OpenAI Codex CLI (April 30) and Anthropic Claude Code (May 12) shipped 12 days apart.

The idea is simple. You give the agent a completion condition, and it keeps running on its own until that condition is met.

Before /goal, AI coding agents stopped after every turn and waited for you to hit Enter. Even if your prompt said "keep going until X," it would still pause. /goal is the fix.

How the two implementations differ

Codex stores the task locally. Close your laptop or reboot, and the task persists; you pick it back up with /goal resume. The controls are create, pause, resume, and clear, and the workflow is persisted on the app-server.
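To make "state machine plus storage" concrete, here is a minimal sketch in Python. It is not Codex's actual code: the state names, the JSON file, and the GoalStore class are all assumptions made for illustration. Only the four controls (create, pause, resume, clear) come from the description above.

```python
import json
from pathlib import Path

# Hypothetical transition table; the real Codex states are not documented here.
TRANSITIONS = {
    ("none", "create"): "active",
    ("active", "pause"): "paused",
    ("paused", "resume"): "active",
    ("active", "clear"): "none",
    ("paused", "clear"): "none",
}


class GoalStore:
    """Keeps one goal record in a JSON file so it survives restarts."""

    def __init__(self, path):
        self.path = Path(path)

    def load(self):
        # A missing file means no goal has been created yet.
        if self.path.exists():
            return json.loads(self.path.read_text())
        return {"state": "none", "objective": None}

    def apply(self, command, objective=None):
        record = self.load()
        key = (record["state"], command)
        if key not in TRANSITIONS:
            raise ValueError(f"cannot {command} a goal in state {record['state']}")
        record["state"] = TRANSITIONS[key]
        if command == "create":
            record["objective"] = objective
        elif command == "clear":
            record["objective"] = None
        # Write after every transition so a reboot loses nothing.
        self.path.write_text(json.dumps(record))
        return record
```

Because every transition is written to disk before returning, a fresh GoalStore pointed at the same file sees the same goal, which is the property that makes "close your laptop, reboot, resume" possible.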

Claude Code takes another route: a cheaper small model (Haiku, by default) acts as a supervisor. After each turn, the supervisor reads the transcript and answers one question: "is the goal met?" If no, keep going. If yes, stop and hand back. Token-wise, the supervisor is billed separately and doesn't eat into the main model's budget.
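The supervisor pattern can be sketched as a plain loop. This is an illustration under assumptions, not Anthropic's implementation: worker_turn and goal_met are hypothetical callables standing in for the main model and the small supervisor model, and the max_turns safety cap is my addition rather than something the source describes.

```python
from typing import Callable, List


def run_until_goal(
    goal: str,
    worker_turn: Callable[[List[str]], str],
    goal_met: Callable[[str, List[str]], bool],
    max_turns: int = 50,
) -> List[str]:
    """Run worker turns until the supervisor says the goal is met.

    worker_turn: hypothetical stand-in for one turn of the main model.
    goal_met: hypothetical stand-in for the supervisor's yes/no check.
    max_turns: assumed safety cap so a never-satisfied goal cannot loop forever.
    """
    transcript: List[str] = []
    for _ in range(max_turns):
        # The main model does one unit of work.
        transcript.append(worker_turn(transcript))
        # The supervisor reads the whole transcript and answers one question.
        if goal_met(goal, transcript):
            break
    return transcript
```

The main model never evaluates the goal itself; it only produces turns. The stop decision lives entirely in goal_met, which is why that check can be handed to a cheaper model.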

Two approaches to the same problem: letting the agent decide whether it's done.

Three before/after scenarios

1. Running a data scrape

Before: You ask it to scrape products from three brands. Finishes brand 1, stops, asks "OK to continue?" You hit continue. Finishes brand 2, stops again. Brand 3 ends at 92% success and you have to manually retry the failures. You spent the whole evening hitting Enter.

After: You say "Scrape all three brands. Auto-retry failures 3 times. Below 95% doesn't count." Close laptop, go eat. Come back: first pass 92%, it retried once on its own, hit 96%, done.

The difference is "95% counts as done." /goal turns that into the agent's call instead of yours.
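As a rough sketch of what that completion condition amounts to, the function below retries only the failed items until the success rate clears the threshold or the retry budget runs out. The function name and the fetch callable are hypothetical; the 95% threshold and the limit of 3 retries come from the scenario above.

```python
from typing import Callable, Hashable, Iterable, List, Tuple


def scrape_with_retries(
    items: Iterable[Hashable],
    fetch: Callable[[Hashable], bool],
    threshold: float = 0.95,
    max_retries: int = 3,
) -> Tuple[float, int, List[Hashable]]:
    """Return (success_rate, retries_used, still_failed).

    fetch is a hypothetical callable that returns True when one item
    was scraped successfully and False otherwise.
    """
    items = list(items)
    total = len(items)
    if total == 0:
        return 1.0, 0, []

    # First pass over everything.
    failed = [item for item in items if not fetch(item)]
    retries = 0

    def rate() -> float:
        return (total - len(failed)) / total

    # Retry only the failures, and stop as soon as the threshold is met.
    while rate() < threshold and retries < max_retries:
        failed = [item for item in failed if not fetch(item)]
        retries += 1

    return rate(), retries, failed
```

If the rate is still under the threshold after the last retry, the caller gets the list of failures back, so "below 95% doesn't count" stays visible rather than being silently reported as success.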

2. Packaging a Mac app

Before: Build fails, Google, find a 2014 Stack Overflow thread, try it, new error, Google again, try again. 20+ rounds. It's 2 AM and you're still at the keyboard.

After: "Get the build script to produce a shippable package. On failure, read the error yourself, search for fixes, retry until it ships." Close laptop, sleep. Morning: 4 rounds, each diagnosing and fixing a different problem. Package built.

/goal swallows the "read error → search → patch → rerun" human loop.

3. Server down during a business trip

Before: Server crashes mid-trip. Phone can't connect, you can't push fixes, you refresh the monitoring page every 10 minutes hoping it self-healed.

After: Open the laptop and say "Remote into the server, figure out why it crashed, fix it, confirm the endpoint is back." Continue the trip. Get home, already done.

It logged in, read the logs, found the culprit file, deleted it, restarted, verified. Left a note: "Don't let this kind of file slip in again."

This used to be human-only work.

Why this matters

Engineering-wise, /goal isn't hard. Codex is a state machine plus storage. Claude Code is a secondary LLM call.

But it solves a problem that none of the AI coding agents fixed in the last two years:

The main model can't tell if it's done.

It's zoomed in on writing code, and judging "done" requires zooming out. Anthropic's call, letting Haiku do the zooming out, is a beautiful division of labor.

Same week, Anthropic also shipped /loop, /batch, /background. /loop runs N times. /batch parallelizes tasks. /background runs in the background. All three still leave "when to stop" up to you. Only /goal hands that to the agent.

The transfer of the stop bit is the real paradigm shift.

What developers should think about

/goal moves your effort from micromanaging the process to defining the goal.

Four things to write into a goal:

Stop condition — what counts as "done"
Verification — how to prove it's done
Untouchable boundaries — what not to change
Success metric — quantify it (e.g. ≥95% success)
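One way to keep yourself honest about those four parts is to write the goal as a small structure and refuse to render it while any part is blank. This is only a sketch: GoalSpec and its field names are hypothetical and are not part of either tool, and the rendered text is simply what you would then give to /goal.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class GoalSpec:
    """Hypothetical checklist for the four parts of a goal."""

    stop_condition: str   # what counts as "done"
    verification: str     # how to prove it's done
    boundaries: str       # what not to change
    success_metric: str   # the quantified bar, e.g. ">=95% success"

    def missing(self) -> List[str]:
        # Names of any parts left empty or whitespace-only.
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def render(self) -> str:
        gaps = self.missing()
        if gaps:
            raise ValueError("goal is too vague, missing: " + ", ".join(gaps))
        return "\n".join(
            [
                f"Done when: {self.stop_condition}",
                f"Verify by: {self.verification}",
                f"Do not change: {self.boundaries}",
                f"Success metric: {self.success_metric}",
            ]
        )
```

For the scraping scenario, the stop condition would be "all three brands scraped", the verification a count of rows per brand, the boundary "do not modify the existing output schema", and the metric ">=95% success"; the last three are example values, not taken from the original task.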

A vague prompt wastes a turn. A vague goal can waste 6 hours.

But a sharp goal lets the agent run for 8 hours without your input.
