Hassann

Posted on • Originally published at apidog.com

The /goal Command: How to Run Codex and Claude Code as 24/7 Autonomous Agents

Every major AI lab recently shipped the same primitive: /goal. Anthropic added it to Claude Code, OpenAI shipped it in Codex CLI and the Codex desktop app, and Nous Research wired it into Hermes. The naming is deliberate: the industry is converging on a shared interface for agents that run in a closed loop until a measurable end state is reached, without asking for approval at every step.


If you have been doing the manual “approve, send prompt, tell the agent to continue, repeat” workflow, /goal is the slash command that removes the babysitting. You give the agent a target, it works against that target, and it returns when the target is reached or when it hits a constraint.

This guide is for developers and API builders. You will learn what /goal does under the hood, how to set it up in Codex and Claude Code, how to write goal prompts that actually converge, and how to connect the workflow to API development with Apidog.

You can download Apidog for free if you want to follow along with the API examples.

What /goal actually does

In one line:

/goal lets an AI agent loop on a task until a stop condition is met.

Instead of stopping after one response, the agent repeatedly:

  1. Plans the next step.
  2. Executes the step.
  3. Checks whether the goal has been met.
  4. Continues or stops based on that check.

The important part is the validator. A smaller, faster model evaluates the agent’s work after each step and answers:

Has the goal been met?

If the answer is no, the main model continues. If yes, the loop closes and the agent reports back.

This is the same pattern popularized by early “Ralph loop” workflows, but now it is built into official tools as a first-class command.

The difference from normal agent usage is simple:

  • Without /goal: you are the loop. You read output, decide whether it is correct, prompt the next step, approve tool calls, and repeat.
  • With /goal: the agent owns the loop. It plans, executes, validates, and only surfaces when it finishes, hits a constraint, or runs out of budget.

Example:

/goal create a landing page until it builds successfully and renders without console errors

That goal can trigger scaffolding, styling, debugging, browser verification, and a final summary in one continuous run.

Why /goal is showing up everywhere

Long-running agent tasks usually fail in two ways:

  1. Drift: the model starts with the right task, then gradually wanders away from the original target.

  2. Babysitting: the model can do the work, but the user still has to supervise every iteration.

A validator model helps with both. It gives the loop a stop condition and continuously checks the work against the original target.

The implementation is relatively simple:

main model -> action -> validator -> continue or stop

That pattern is cheap enough to run because the validator can be a smaller model with a narrow prompt. Once vendors saw the pattern working, they shipped it under the same command name.
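The loop itself fits in a few lines. Below is an illustrative sketch only: `planAndExecute`, `validatorSaysDone`, and the stop condition are hypothetical stand-ins, not vendor APIs.

```typescript
type StepResult = { output: string };

// Stand-in for the main model taking one step toward the goal.
function planAndExecute(goal: string, history: StepResult[]): StepResult {
  return { output: `attempt ${history.length + 1} at: ${goal}` };
}

// Stand-in for the smaller validator model. A real validator answers
// a narrow prompt: "Has the goal been met? Yes or no."
function validatorSaysDone(goal: string, result: StepResult): boolean {
  return result.output.includes("attempt 3"); // stubbed stop condition
}

// The closed loop: plan, execute, validate, continue or stop.
function runGoal(goal: string, maxIterations = 10): StepResult[] {
  const history: StepResult[] = [];
  while (history.length < maxIterations) {
    const result = planAndExecute(goal, history);
    history.push(result);
    if (validatorSaysDone(goal, result)) break; // loop closes
  }
  return history;
}

const steps = runGoal("fix failing tests");
console.log(steps.length); // 3: the stub validator accepts the third attempt
```

The `maxIterations` cap matters in practice: it is the budget that keeps a non-converging loop from running forever.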

Setting up /goal in Codex

The Codex CLI gives you the most control.

1. Enable goals in Codex desktop

Open Codex desktop and go to:

Settings → Configuration

Set:

goals = true

The CLI inherits this configuration.

2. Launch Codex in full-auto mode

Use full-auto mode if you want to avoid approval prompts during the loop:

codex --approval-mode full-auto

3. Set a goal

Inside the Codex session, run:

/goal [your goal here]

Example:

/goal fix every failing test until npm test exits 0 without modifying files outside the /auth directory

Codex will confirm that the goal is registered, then start running.

[Screenshot: Codex /goal setup]

If you do not want to work in the terminal, use the Codex desktop app. The functionality is similar, but the UI makes it easier to pause, clear goals, and monitor token usage.

Setting up /goal in Claude Code

Claude Code works similarly.

Launch the CLI and run:

/goal [task description]

Example:

/goal update the checkout API tests until every payment scenario passes without changing production credentials

The official docs are available on the Claude Code documentation site.

[Screenshot: Claude Code /goal]

If Claude Code fails during setup, this guide to fixing the invalid custom3p enterprise config error covers a common failure mode.

For multi-agent workflows alongside /goal, see this breakdown of Ruflo, a multi-agent layer on top of Claude Code.

One practical tip: Claude Code shows a live token count and progress bar while a goal is running. Watch the token count. If the agent burns tokens without visible progress, the loop is probably not converging on the stop condition. In that case, use:

/pause

or:

/goal clear

Then rewrite the goal with clearer success criteria.

The prompt structure that works

The syntax is easy. The hard part is writing a goal that can be validated.

A useful /goal prompt has three parts:

  1. Work: what should be done.
  2. Measurable end state: what “done” means.
  3. Constraints: rules the agent must not violate.

Use this structure:

/goal [do the work] until [measurable end state] without [constraints]

Example:

/goal fix every failing test until npm test exits 0 without modifying any file outside the /auth directory

Why this works:

  • npm test exits 0 is measurable.
  • without modifying any file outside /auth is enforceable.
  • The agent cannot claim success unless the validator can verify it.
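"npm test exits 0" works as a criterion precisely because it is machine-checkable. A minimal sketch of that kind of check, using `node --version` as a harmless stand-in for the real test command:

```typescript
import { spawnSync } from "node:child_process";

// Returns true when the command exits 0 -- the deterministic signal
// a /goal validator can rely on. A real loop would run `npm test`.
function commandSucceeds(cmd: string, args: string[]): boolean {
  const result = spawnSync(cmd, args, { encoding: "utf8" });
  return result.status === 0;
}

console.log(commandSucceeds("node", ["--version"])); // true wherever Node is installed
```

An exit code is binary and reproducible, which is exactly what a vague criterion like "the code is good" is not.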

Avoid vague goals like:

/goal make this UI feel modern

That has no reliable stop condition.

Rewrite it as:

/goal improve the dashboard UI until Lighthouse accessibility score is 90+ and all existing Playwright visual checks pass

That gives the validator something concrete to check.

A stronger format for long-running goals

For larger tasks, use a block format:

/goal
Objective: [one-line goal]

Success criteria:
  - [measurable criterion 1]
  - [measurable criterion 2]
  - [measurable criterion 3]

Constraints:
  - [boundary 1]
  - [boundary 2]

Context:
  - [files, repos, commands, API specs, or keys the agent should know about]

Example:

/goal
Objective: Implement the password reset API flow.

Success criteria:
  - POST /auth/password-reset returns 200 for a valid registered email.
  - Invalid emails return the documented error schema.
  - npm test exits 0.
  - The OpenAPI spec remains valid.

Constraints:
  - Do not modify files outside /auth and /tests/auth.
  - Do not log tokens or reset links.
  - Do not call production email services.

Context:
  - OpenAPI spec: ./openapi.yaml
  - Test command: npm test -- auth
  - Mock email service is available in ./tests/mocks/email.ts

This format works because each loop iteration can be checked against explicit criteria.
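One way to picture the block format: the success criteria become a checklist the validator walks on every iteration. A hedged sketch, where the `Criterion` type and the stubbed checks are illustrative only (real checks would shell out to `npm test`, a schema validator, and so on):

```typescript
// Success criteria as an explicit, machine-walkable checklist.
type Criterion = { description: string; check: () => boolean };

const criteria: Criterion[] = [
  { description: "POST /auth/password-reset returns 200", check: () => true },
  { description: "npm test exits 0", check: () => true },
  { description: "OpenAPI spec remains valid", check: () => false }, // still failing
];

// The loop continues while any criterion is unmet.
const unmet = criteria.filter((c) => !c.check()).map((c) => c.description);
const goalMet = unmet.length === 0;
console.log(goalMet, unmet); // goalMet stays false while one criterion fails
```

Listing the unmet criteria, rather than returning a bare yes/no, also gives the main model a concrete target for its next iteration.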

/goal examples you can reuse

Research

/goal collect every public benchmark for Claude Opus 4.7 published since April 2026, save sources, and produce a markdown table sorted by date until the table covers at least 10 distinct benchmarks

Repo maintenance

/goal find dead code, unused dependencies, and stale files in this repo, then propose a PR description listing safe removals until every item has a justification

Documentation

/goal rewrite README.md so a new contributor can install, run, test, and understand the project until each of those four steps has a working command and an expected output

Feature work

/goal add a dark/light theme toggle, persist the choice in localStorage, update styles for both themes, and verify in the browser until the toggle works without a page reload and survives a refresh

The pattern is the same in every example: define a verifiable end state.

Pairing /goal with API development

API work is a strong fit for /goal because “done” is usually testable.

For an endpoint, completion can mean:

  • The request returns the expected status code.
  • The response schema matches the contract.
  • Required error cases are covered.
  • The OpenAPI document is valid.
  • The test suite passes.

That gives the validator a concrete signal.
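As an illustration, a contract check for a single endpoint can be as small as a status code plus a shape test. The `UserResponse` shape below is hypothetical; in practice the shape comes from the OpenAPI file you exported:

```typescript
// Hypothetical contract for POST /users -- in a real project this
// shape is derived from ./openapi.yaml, not hand-written.
type UserResponse = { id: number; email: string };

function matchesContract(status: number, body: unknown): boolean {
  if (status !== 200) return false; // expected status code
  const b = body as Partial<UserResponse>;
  // minimal schema check: required fields with the right types
  return typeof b?.id === "number" && typeof b?.email === "string";
}

console.log(matchesContract(200, { id: 1, email: "a@b.co" })); // true
console.log(matchesContract(500, { id: 1, email: "a@b.co" })); // false
console.log(matchesContract(200, { id: "1" }));                // false
```

A real validator would run the full Apidog test cases instead of a hand-rolled check, but the principle is the same: pass/fail signals the agent cannot argue with.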

A practical workflow looks like this:

1. Design the contract first

Define the endpoint, request schema, response schema, and example payloads in Apidog.

This contract becomes the source of truth.

2. Export the API spec

Export the OpenAPI 3.x spec from Apidog and add it to your repo, for example:

./openapi.yaml

Give this file to Codex or Claude Code as context.

3. Run /goal

Use a goal that points at the contract and tests:

/goal implement POST /users from ./openapi.yaml until every Apidog test case passes and npm test exits 0 without changing the OpenAPI contract

4. Let the validator run the test command

Each iteration should run the API test suite against the local service. The agent should only finish when the tests are green.

This is better than letting the agent invent tests after the fact. The contract already exists, so the agent has to implement against it instead of redefining success.

If you are new to this workflow, start with the design-first API workflow guide. For test structure, see the API testing tool overview for QA engineers.

If you are working with MCP servers, the same idea applies. See MCP server testing with Apidog for a setup that lets /goal agents safely run against a local MCP server.

Production tips for /goal

After using /goal on real work, a few practices help.

Use one goal at a time

Codex and Claude Code are designed around a single active goal.

Before starting a new one, clear the current goal:

/goal clear

Plan before running

Use a planning step first:

/plan implement the password reset API based on ./openapi.yaml

Review the plan, then turn it into a goal:

/goal implement the reviewed password reset plan until all auth tests pass without modifying files outside /auth

This reduces wasted iterations because the agent is less likely to redesign the solution mid-loop.

Use a scratchpad file

Ask the agent to maintain a progress file:

Maintain progress.md with:
- completed steps
- current blocker
- next action
- test results

This gives you an audit trail and gives the agent persistent context across iterations.

Let the model write the goal

If your initial idea is rough, ask the model to convert it into a good /goal prompt:

Turn this task into a /goal command with measurable success criteria and strict constraints:
[task]

This often produces better stop conditions than writing the goal manually.

Watch the validator signal

If the loop does not close, do not keep retrying the same goal. Tighten the success criteria.

Bad:

until the code is good

Better:

until npm test exits 0, npm run lint exits 0, and the response matches ./openapi.yaml

Use /goal for long-horizon work

For a one-line refactor, a normal prompt is usually faster. /goal has overhead because it runs a loop and validation steps.

Use it when the task needs multiple iterations.

When /goal will let you down

/goal is useful, but it is not magic.

Cost

Long-running loops burn tokens. The validator may be cheaper than the main model, but the main model is still doing the work.

Set a budget, monitor usage, and pause when needed.

Tasks with weak validation

Some tasks do not have clean stop conditions:

  • UX polish
  • visual taste
  • prose tone
  • “make it better”
  • subjective design feedback

For these, use normal prompts or define a measurable proxy.

External side effects

Be careful with goals that can trigger real-world actions:

  • sending email
  • charging cards
  • calling production APIs
  • deleting data
  • modifying customer records

Add hard constraints:

without calling production APIs
without sending real emails
without using live payment credentials
without deleting user data

If you are designing access control around AI agents calling APIs, this GitHub Copilot usage and billing API for teams writeup shows how major vendors approach usage and control.

Stale context

Long-running goals can drift if the repo changes while the agent is working. If the codebase changes mid-loop, pause and restart with fresh context.

What this changes for AI-assisted development

/goal shifts the developer workflow from:

write every instruction manually

to:

define success criteria, constraints, and validation commands

The command itself is small, but the workflow change is large.

Teams that benefit most already have:

  • clear specs
  • strong CI
  • executable tests
  • API contracts
  • deterministic validation commands

If your API has an OpenAPI document and a test suite, a /goal agent can work against a real target. If the API only exists in someone’s head, the agent has no reliable validator.

That is why API platforms become important in AI workflows. Apidog supports design-first API development, which gives agents a contract to implement against and tests to run.

If you want to try the workflow, download Apidog and set up a contract-first API project before running /goal.

FAQ

Does /goal work in the Codex web app?

Yes. It works in the Codex CLI, the Codex desktop and web apps, and the Claude Code CLI. Hermes also supports the same command.

How is /goal different from a regular prompt?

A regular prompt runs once and stops.

/goal runs in a loop. A validator model checks the stop condition after every step, and the agent decides whether to continue or finish.

Can the agent break out of the constraints I set?

The validator checks constraints on each iteration, but constraint quality matters.

Use explicit constraints:

without modifying any file outside /auth

Avoid vague constraints:

without breaking anything

The first is enforceable. The second is not.

Will /goal cost more than a normal Claude or Codex session?

Usually, yes. The loop runs multiple iterations, and the main model continues working until the goal is met or stopped.

Use budgets, token monitoring, and /pause to control spend.

What if I want to test the agent’s output against a real API?

Use a tool like Apidog to define the API contract and run real test cases against the implementation.

The validator can call the test runner or Apidog CLI, which gives /goal a measurable end state.

If you are bootstrapping a Claude-powered service on a limited budget, see the free Claude API guide.
