DEV Community: Ramsis Hammadi

OpenAI Agents SDK: Sandbox Execution and Model-Native Harness in 2026

Ramsis Hammadi — Sat, 16 May 2026 10:30:00 +0000

OpenAI Agents SDK: Sandbox Execution and Model-Native Harness in 2026

TL;DR Summary

The OpenAI Agents SDK now includes sandbox execution — agents run code, access files, and use shell commands in isolated container-based workspaces
A model-native harness replaces custom orchestration code: the SDK handles tool dispatch, state persistence, and multi-step workflows
Sandboxes support filesystem, shell, package installs, Git repos, mounted storage (S3/GCS/R2), exposed ports, snapshots, and resumable state
The agent and sandbox are deliberately separate — harness owns the control plane (model calls, tool routing, approvals), sandbox owns execution (files, commands)
Deploy on Unix-local (dev), Docker (local container), or hosted providers (Cloudflare, Vercel) with the same agent definition

Direct Answer Block

The OpenAI Agents SDK is a code-first framework for building production AI agents in TypeScript or Python. Its sandbox feature gives agents an isolated Unix-like workspace with filesystem, shell, mounted data, and resumable state. The model-native harness handles tool dispatch, multi-step execution, and state persistence — replacing the custom orchestration code you'd otherwise write yourself.

Introduction

Before the Agents SDK's sandbox update, building a production AI agent that could safely execute code required stitching together: a model API client, a container runtime, credential isolation, state persistence, tool routing, and approval logic. Each piece was custom code. The SDK collapses that stack: define your agent with a manifest describing the workspace, attach capabilities (shell, filesystem, skills, memory), and pick a sandbox client. The harness handles everything between model turns.

What is the OpenAI Agents SDK's "model-native harness" and how does it change agent development?

The model-native harness is a runtime layer that matches how models naturally use tools and context. According to the newsletter reporting OpenAI's announcement, it "runs agents in a way that matches how models naturally use tools and context."

In practice, this means the harness owns:

Tool dispatch: when the model calls shell or file_read, the harness routes the call to the correct sandbox tool
State persistence: conversation state, tool results, and workspace state survive across model turns
Multi-step execution: the agent loop continues across turns, with each step observable and cancellable
Streaming: responses stream back to the application as the agent works
Recovery: if a sandbox session stops, the harness can resume from serialized state

The pre-harness approach required developers to write this orchestration themselves — wrapping every tool call, managing conversation state, handling tool errors, and building resumption logic. The harness replaces that with a structured runtime.

OpenAI's Agents SDK documentation positions it as the code-first path: "use the SDK track when your server owns orchestration, tool execution, state, and approvals." For hosted workflow creation without code, use Agent Builder. For direct model API access, use the client libraries.

The SDK separates agent definitions from execution boundaries. A SandboxAgent is still an Agent — it keeps instructions, prompt, tools, handoffs, MCP servers, model settings, and hooks. What changes is where execution happens: a live sandbox session with its own filesystem, commands, and ports.

How does sandbox execution work — and how does it keep agent code safe in production?

The sandbox is an isolated, Unix-like execution environment with filesystem, shell, installed packages, mounted data, exposed ports, and resumable state. The key architectural decision: the agent harness and sandbox compute are separate.

"The key split is the boundary between the harness and compute. The harness is the control plane around the model: it owns the agent loop, model calls, tool routing, handoffs, approvals, tracing, recovery, and run state. Compute is the sandbox execution plane where model-directed work reads and writes files, runs commands, installs dependencies, uses mounted storage, exposes ports, and snapshots state." — OpenAI Sandbox Agents documentation

This separation matters for production safety:

Control plane stays in trusted infrastructure — the harness keeps auth, billing, audit logs, human review, and recovery state outside any single container
Sandbox is an execution environment, not the control plane — it runs commands and edits files but doesn't own model decisions
Credentials isolate from agent code — sandbox credentials are runtime configuration, not prompt content. OpenAI's docs explicitly warn: "Treat sandbox credentials as runtime configuration, not prompt content."

The difference between running the harness inside the sandbox vs separate from it is a product decision. Inside-sandbox is convenient for prototypes. Separate-sandbox is the production pattern — the harness keeps sensitive control plane operations in your infrastructure while sandboxes handle provider-specific execution.

According to the newsletter, the SDK "keeps credentials outside execution environments where model-generated code runs" — a critical security boundary when agents can generate and execute arbitrary code.

Sandbox clients

Client	Use case
UnixLocal	Local development on macOS/Linux. Creates temp workspace, cleans up after run
Docker	Local container isolation with custom images
Hosted providers	Cloudflare, Vercel — production deployment with provider-specific isolation

The sandbox client is part of run configuration, not agent definition. Keep the agent, manifest, and capabilities stable, then swap the client per environment.

What file system tools, MCP integration, and storage systems does the SDK support?

File system tools

The SDK provides file system primitives that the agent uses to interact with workspace files:

File reads and writes — read project directories, edit source files, create new files
Apply patch — apply diffs to workspace files
View image — inspect local images in the sandbox
Shell commands — execute arbitrary commands with interactive input support

MCP integration

MCP (Model Context Protocol) enables structured tool use for external APIs and services. According to the newsletter, "MCP enables structured tool use for external APIs and services."

MCP servers connect through the SDK's integration layer, allowing agents to use tools from:

Communication (Slack, Discord)
Project management (Linear, Jira)
Data sources (databases, Google Drive)
Custom APIs (your internal services)

Storage systems

The manifest supports mounting external storage directly into the sandbox:

Mount type	Use case
S3 Mount	Data room files, generated artifacts
GCS Mount	Google Cloud Storage datasets
R2 Mount	Cloudflare storage
Azure Blob	Azure data
Box Mount	Box cloud storage
S3 Files Mount	Individual files from S3

OpenAI's docs recommend: "Keep mounted storage scoped to the inputs the agent should read or write. Treat mount entries as ephemeral workspace entries."

Manifest

The manifest describes the workspace contract for a fresh sandbox session — files, repos, input artifacts, output directories, environment variables, and OS users/groups. It's treated as a starting-point contract, not the full source of truth.

How do you define an agent manifest with inputs, outputs, directory structure, and provider config?

A manifest defines what the agent sees when a sandbox session starts. Here's a practical example from OpenAI's sandbox quickstart:

TypeScript:

const manifest = new Manifest({
  entries: {
    "account_brief.md": file({
      content: "# Northwind Health\n" +
        "- Segment: Mid-market healthcare analytics provider.\n" +
        "- Renewal date: 2026-04-15.\n",
    }),
    "implementation_risks.md": file({
      content: "# Delivery risks\n" +
        "- Security questionnaire is not complete.\n" +
        "- Procurement requires final legal language by April 1.\n",
    }),
  },
});

Python:

manifest = Manifest(
    entries={
        "account_brief.md": File(
            content=b"# Northwind Health\n...\n"
        ),
        "implementation_risks.md": File(
            content=b"# Delivery risks\n...\n"
        ),
    }
)

Manifest inputs cover:

Input type	What it provides
`File` / `Dir`	Synthetic inputs, helper files, output directories
Local file/directory	Host files materialized into sandbox
Git repo	Repository cloned into workspace
Storage mounts	S3, GCS, R2, Azure Blob, Box
`environment`	Startup environment variables
`users` / `groups`	Sandbox-local OS accounts

Design rules from OpenAI's docs:

Put repos, input artifacts, and output directories in the manifest
Put task specs and instructions in workspace files (repo/task.md, AGENTS.md)
Use relative workspace paths in instructions
Keep mounts scoped to inputs the agent should use
Avoid saving secrets, tokens, or sensitive files in the manifest

How does credential isolation work across Cloudflare, Vercel, and custom deployment environments?

Credential isolation is a first-class design concern in the sandbox architecture. The principle: credentials are runtime configuration, not prompt content.

OpenAI's sandbox docs specify three rules:

Prefer provider-native secret systems for hosted sandbox providers
Keep cloud storage credentials scoped to the specific mount or provider option
Use Manifest.environment for startup values, marking sensitive entries as ephemeral

According to the newsletter, the SDK "keeps credentials outside execution environments where model-generated code runs." This means:

The agent prompt never contains API keys, tokens, or secrets
Sandbox environment variables are injected by the provider, not by the model
Cloud provider deployments (Cloudflare Workers, Vercel Functions) isolate credentials from sandbox compute

The provider is part of run configuration, not agent definition. The same agent with the same manifest can run on UnixLocal for development, Docker for local container testing, and a hosted provider for production — credentials are configured per provider, per environment.

OpenAI's documentation warns: "Review artifacts before moving them out of the sandbox, especially when the agent can read private documents or mounted storage." The sandbox can access mounted data — your application should verify what comes out.

How do you orchestrate multi-agent workflows with handoffs, guardrails, and human-in-the-loop approvals?

The Agents SDK includes orchestration primitives that layer on top of the sandbox foundation:

Handoffs

When a task requires multiple specialists, handoffs transfer control between agents. Each agent owns its domain. The harness routes based on the handoff target.

Guardrails

Guardrails run before or after model turns to validate output or block unsafe actions. According to the SDK docs, guardrails and human review "block or pause before risky work continues."

Human-in-the-loop

For high-risk operations, the workflow pauses for human approval. The sandbox state persists during the pause — when approved, the agent continues in the same workspace with the same files and context.

Capabilities

Each sandbox agent gets capabilities attached to its definition:

Capability	What it adds
Shell	Command execution with interactive input
Filesystem	File edits (apply_patch) and image viewing
Skills	Skill discovery and materialization from local dirs or Git repos
Memory	Persist memory artifacts across runs (requires Shell + Filesystem)
Compaction	Context trimming for long-running flows

By default, a SandboxAgent includes filesystem, shell, and compaction. If you pass a custom capabilities list, it replaces the defaults — include them explicitly if needed.

Advanced patterns (from OpenAI's examples)

Data room Q&A: Answer questions over mounted documents
Repository code review: Clone a repo, inspect it, produce review artifacts
Vision website clone: Clone a website using Vision API and screenshot feedback
Sandbox resume: Resume work in a pre-existing sandbox session

Frequently Asked Questions

Q: Do I need a sandbox for every agent?

No. If your agent only needs model responses without files, commands, or persistent state, use the Responses API directly or the basic Agents SDK runtime. Sandboxes are for when the answer depends on workspace work.

Q: Can I use the Agents SDK with non-OpenAI models?

The SDK supports provider configuration, allowing different model providers per agent. Sandbox execution is independent of model choice — the harness handles tool routing regardless of which model generates the tool calls.

Q: How much do sandbox runs cost?

Sandbox pricing depends on the provider (UnixLocal is free, hosted providers bill per session). OpenAI's API usage is separate from sandbox compute costs. Check provider-specific pricing.

Q: Can sandbox state survive between runs?

Yes. Three persistence levels: RunState (harness-side state), serialized session state (reconnect to same sandbox), and snapshots (save workspace contents to seed a fresh session). Use snapshots to skip dependency installation on subsequent runs.

Q: Is sandbox execution available in both TypeScript and Python SDKs?

Yes. Both SDKs support the same sandbox primitives with language-idiomatic APIs. Official examples exist for both.

Q: How does this differ from Claude Code's sandbox approach?

Both separate agent from execution, but OpenAI's SDK is a code-first framework you integrate into your application, while Claude Code is a product you run. OpenAI's approach gives you programmatic control over the harness, manifests, and provider selection.

Glossary

Model-native harness: The SDK runtime layer that handles tool dispatch, state persistence, and multi-step execution in a way that matches model behavior
Sandbox: An isolated, Unix-like execution environment with filesystem, shell, packages, mounts, ports, and resumable state
Manifest: The workspace contract describing what files, repos, mounts, and environment variables a fresh sandbox session starts with
Capabilities: Sandbox-native behaviors attached to an agent (shell, filesystem, skills, memory, compaction)
Handoff: Transfer of control between specialized agents within a multi-agent workflow
Snapshot: A saved workspace state used to seed a fresh sandbox session, skipping redundant setup

Author

Ramsis Hammadi — AI/ML engineer specializing in GenAI, LLM engineering, and automation. Full bio →

CLAUDE.md Rules: How to Cut AI Coding Mistakes from 40% to 3% in 2026

Ramsis Hammadi — Fri, 15 May 2026 06:21:00 +0000

CLAUDE.md Rules: How to Cut AI Coding Mistakes from 40% to 3% in 2026

TL;DR Summary

Andrej Karpathy's original 4-rule CLAUDE.md cut Claude coding errors from ~40% to ~11% by enforcing clarification, simplicity, surgical scope, and verification
The 12-rule extension (claude-code-pro-pack) adds 8 more rules targeting agent-orchestration failures and pushes error rates to ~3% — a ~10x improvement over no rules
Two leading open-source implementations exist: the 12-Rule Pro Pack (~700 tokens, 5 skill templates, Karpathy-provenance) and Ten Commandments for Coding Agents (~400 tokens, portable across all agents.md tools)
The key insight: past ~200 lines of CLAUDE.md, compliance drops sharply — rules get buried. 12 rules with minimal boilerplate is the sweet spot
These are drop-in files. Copy one into your project root. The agent picks it up on the next run. No framework, no config.

Direct Answer Block

CLAUDE.md is a markdown file in your project root that AI coding agents read at session start. Karpathy's original 4 rules addressed the highest-frequency failure modes: silent assumptions, overbuilt code, unintended edits, and unverified claims. The 12-rule extension layers agent-orchestration safeguards: token budget limits to stop debugging spirals, conflict-surfacing to prevent "averaging" two codebase patterns, and read-before-write to block uninformed edits. Together they form a behavioral contract between you and the AI agent — and the data says it works.

Introduction

You've experienced it: you ask an AI coding agent to fix a one-line bug, and it rewrites three functions, reformats adjacent code, adds a "helpful" abstraction layer, and introduces two new edge cases. The problem isn't the model — it's the absence of constraints. AI coding agents are prompt-optimizers: they fill ambiguity with creativity. CLAUDE.md removes the ambiguity. It replaces "be careful" with concrete, actionable, negative-example-rich directives that survive long conversational contexts. This article breaks down the rules that actually work, the failure mode each one closes, and how to choose between the two leading implementations.

Why do AI coding agents keep making the same mistakes — and how does CLAUDE.md fix this at the system level?

AI coding agents fail in predictable patterns. The Claude Code Pro Pack's documentation — built from real-world agent failures across 30+ codebases — identifies four root causes:

Silent assumptions: The agent guesses your intent when requirements are vague. It builds what it thinks you want, not what you actually want.
Overbuilt code: A simple feature request triggers a cascade of "while I'm here" improvements — abstractions, refactors, helper utilities — none of which you asked for.
Unintended edits: The agent touches adjacent code, renames variables, reformats files, and cleans up "messy" patterns that were intentional.
Scope creep: A focused task ("add error logging to the payment handler") expands into a system-wide logging framework with configurable backends.

CLAUDE.md works as a behavioral control layer rather than a prompt. Traditional prompting says "please do X carefully." CLAUDE.md says:

"Surface uncertainty — if requirements are unclear, ask"
"Keep changes surgical — touch only what the task requires"
"Choose simplicity — write the minimum code that correctly solves the problem"

The difference is specificity. "Be careful" doesn't survive 50 turns of conversation. "Do not refactor, rename, reformat, or clean unrelated code" does.

"Past ~200 lines of CLAUDE.md, compliance drops sharply — rules get buried. The pack holds at 12 rules + minimal boilerplate so the agent actually reads and follows the file." — claude-code-pro-pack README

This token-efficiency constraint is underappreciated. CLAUDE.md is prepended to every agent context. Every line costs tokens on every call. The 12-rule pack clocks at ~700 tokens total — roughly the cost of a single paragraph of prose. The Ten Commandments version is even leaner at ~400 tokens.

According to Anthropic's Claude Code documentation, CLAUDE.md is one of the primary customization mechanisms alongside skills, hooks, and MCP servers. It's the first thing Claude reads when a session starts. The file sits in your project root or ~/.claude/ and is automatically loaded — no plugin, no /import, no configuration.

What were Karpathy's original 4 rules, and how did they cut error rates from 40% to 11%?

Karpathy's original CLAUDE.md established four rules as the minimum viable constraint set:

Rule 1: Clarify before implementing

The agent must restate the problem, goal, and expected outcome before writing code. This blocks the silent assumption failure mode. If the agent restates something wrong, you catch it before a single file changes.

Rule 2: Simplicity first

The agent must write the minimum code that solves the problem. No speculative features, no generic abstractions, no "future-proofing." This blocks overbuilt code.

Rule 3: Surgical changes only

The agent must touch only what the task requires. Match existing style. Do not refactor, rename, reformat, or clean unrelated code. This blocks unintended edits.

Rule 4: Verify before claiming success

The agent must run tests, lint, type checks, and confirm output before reporting completion. This blocks the "I fixed it" (didn't run anything) failure.

The 4 rules cut error rates from ~40% to ~11% because they target the four highest-frequency failure categories. Each rule is a negative constraint — it tells the agent what NOT to do — which research shows is more effective than positive guidance ("be helpful") for AI behavior control.

The 11% remaining errors come from failure modes the original rules don't cover: debugging spirals (the agent loops on a bug, burning tokens), pattern pollution (the agent sees two codebase patterns and averages them), silent partial failures (the agent catches one error but misses its downstream effects), and duplicate-function drift (creating near-identical functions in different files).

What 8 additional rules does the 12-rule pro pack add, and which failure mode does each address?

The claude-code-pro-pack extends Karpathy's 4 rules with 8 more, each targeting a specific agent-orchestration failure:

Rule	What it addresses	The failure it closes
5. Hard token budget	Token-spiral debugging	Agent loops 20+ iterations on a bug, burning 100K tokens
6. Surface conflicts, don't average	Two-pattern pollution	Agent sees two conventions in codebase and produces a third
7. Read before you write	Uninformed edits	Agent modifies a function without understanding its callers
8. Tests gated by correctness, not "pass"	Fake green tests	Agent writes a test that passes trivially but doesn't verify the fix
9. Long-running operations need checkpoints	Lost progress on failure	A 50-file refactor fails at file 47 with no saved state
10. Convention beats novelty	Inconsistent codebase	Agent introduces new patterns that clash with existing conventions
11. Fail visibly, not silently	Silent partial failures	Error swallowed by try/catch, agent reports success
12. Don't make the model do non-language work	Inefficient task routing	Agent uses LLM loop for retries/validation instead of deterministic code

The most impactful of these in practice is rule 5 — hard token budget. The agent's natural response to a failing test is "try again." Without a budget, this becomes a spiral: try, fail, try differently, fail, until context exhaustion. The rule forces the agent to stop after a defined number of attempts and surface the impasse to the user.

Rule 7 — read before you write — prevents the most common "confident wrong answer" scenario: the agent modifies a function signature without checking its call sites, breaking the build in files it never touched.

The full rationale for each rule is documented in the pro pack's docs/why-12-rules.md, with every rule citing a real failure it closes rather than a preference.

How do the "Ten Commandments for Coding Agents" differ from the 12-rule approach — and which should you use?

Both approaches are drop-in, open-source, MIT-licensed constraint files. They differ in philosophy, scope, and tooling:

	12-Rule Pro Pack	Ten Commandments
Rule count	12	10
Token cost	~700 tokens	~400 tokens
Philosophy	Extension of Karpathy's work	"Smallest set of rules that blocks all failures"
Skill templates	5 example skills (TDD, debugging, PR workflow, etc.)	None — rules only
Install method	Copy file or GitHub Action	curl one-liner or git clone + symlink
Cross-tool support	Claude, Codex, Cursor, Hermes, Copilot	All agents.md readers (Claude, Codex, Gemini CLI, OpenCode, Cursor)
Negative examples	Per-rule failure modes in separate doc	Inline within some rules
Repository rules section	Project-specific block at bottom (edit for your team)	Same — project conventions section
Standout feature	Includes `docs/adoption-guide.md` for 10-min team setup	Symlink strategy for single-source-of-truth across multiple CLIs

Which should you use?

Use the 12-Rule Pro Pack if:

You want the most comprehensive coverage (every known failure mode addressed)
You want skill templates (TDD loop, systematic debugging, PR workflow) included
Your team is 3+ developers and needs a shared behavior baseline
You want explicit Karpathy provenance (built on the original 4 rules)

Use the Ten Commandments if:

You use multiple AI coding tools across your workflow (the symlink trick is elegant)
Token efficiency matters — 400 tokens is about half the cost of the 12-rule pack
You prefer the "commandments" framing — imperative directives with named failure modes inline
You're a solo developer who wants minimal overhead

Both work. The Ten Commandments author themselves note: "If your fork grows past ~20 rules, you have a wiki, not a system prompt." The 12-rule pack author says: "Use all three — pack for behavior, anthropic/skills for domain tasks, addyosmani/agent-skills for lifecycle flow."

What does a surgical change look like in practice (and what happens when agents ignore rule #5)?

Rule 5 (12-rule pack) / Rule 5 (Ten Commandments) both address the same failure: keep changes surgical. Here's a concrete example.

Task: Fix a null-pointer exception in PaymentService.processRefund() when transaction.merchant is null after a failed payment gateway response.

Agent without surgical-change rules (what happens):

- public RefundResult processRefund(Transaction transaction) {
+ public RefundResult processRefund(Transaction transaction) {
+     // Added null safety
+     if (transaction == null) throw new IllegalArgumentException("transaction required");
      RefundRequest request = buildRequest(transaction);
-     GatewayResponse response = gateway.refund(request);
+     GatewayResponse response = gateway.refund(request);
+     // Extracted to separate method for testability
+     return handleGatewayResponse(response, transaction);
+ }
+
+ private RefundResult handleGatewayResponse(GatewayResponse response, Transaction tx) {
+     if (response.isSuccess()) {
+         return RefundResult.success(tx.getId());
+     }
+     // Added merchant null handling
+     if (tx.getMerchant() == null) {
+         log.warn("Merchant information missing for transaction {}", tx.getId());
+     }
+     return RefundResult.failure(response.getError());
  }

Three things happened that weren't asked for: (1) the method was split into two, (2) a new null check was added at the top, (3) the gateway.refund() variable was renamed. This touches 4 lines that didn't need changing and introduces a new method the team didn't agree on.

Agent with surgical-change rules (what was asked for):

  public RefundResult processRefund(Transaction transaction) {
      RefundRequest request = buildRequest(transaction);
      GatewayResponse response = gateway.refund(request);
-     return RefundResult.success(transaction.getId());
+     if (response.isSuccess()) {
+         return RefundResult.success(transaction.getId());
+     }
+     if (transaction.getMerchant() == null) {
+         log.warn("Merchant information missing for transaction {}", transaction.getId());
+     }
+     return RefundResult.failure(response.getError());
  }

One change, directly addressing the null pointer. No extracted methods, no input validation refactor, no variable renames.

The surgical approach isn't about writing worse code — it's about scope discipline. The refactored version might be genuinely better code. But when an AI agent introduces structural changes you didn't ask for, you lose the ability to reason about what else might have changed. The surgical rule preserves your ability to review the diff with confidence that everything you see is intentional.

How do you customize CLAUDE.md rules for your specific stack without breaking the system?

Customization follows two levels: repository rules and fork-and-extend.

Level 1: Repository rules (edit in place)

Both the 12-rule pack and Ten Commandments include a "Repository Rules" section at the bottom for project-specific conventions. Edit these without touching the core rules:

## Repository Rules
- We use pnpm, not npm or yarn. Use `pnpm install`, `pnpm test`, etc.
- Never modify `schema.prisma` directly — use `pnpm db migrate`
- Test files live next to their source files, not in a `__tests__` directory
- Prefer server components over client components. Only add 'use client' when necessary
- Auth is handled by NextAuth.js with the credentials provider. Do not add new auth libraries

These should be imperative directives, not descriptions. "Use X, not Y" works. "We use X for Y" gets ignored by the agent after 20 turns of context.

Level 2: Fork and extend (add custom rules)

If you encounter a failure mode the existing rules don't cover, fork and add a rule. The criterion for adding a new rule:

One sentence
Maps to a real incident (not a hypothetical preference)
Does not duplicate an existing rule

Example of a good custom rule:

13. Never import from barrel files in package internals. Use direct imports to avoid circular dependency cycles.

This maps to a real incident (your build broke from a circular dependency), is one sentence, and doesn't duplicate any existing rule.

Example of a bad custom rule:

13. Write good code that follows best practices and is maintainable over time.

This is a preference, not a directive. It doesn't map to a specific failure mode. The agent will ignore it.

Anti-patterns to avoid

Don't add tool-specific rules: "Use npm test not jest" belongs in Repository Rules, not as a new commandment
Don't add style rules: Prettier and ESLint handle formatting; CLAUDE.md shouldn't
Don't go past ~15 rules: If you have 20 rules, audit them. Cut the ones that haven't prevented a real incident
Don't describe your architecture: "We use hexagonal architecture with domain-driven design" is a wiki page, not a behavioral constraint

The cc-audit tool (from the pro pack ecosystem) scores any CLAUDE.md against the 12-rule baseline — use it in CI to enforce rule quality across your team.

Frequently Asked Questions

Q: Does CLAUDE.md work with non-Claude tools like Cursor or Codex?

Yes. Both Cursor and Codex read AGENTS.md or CLAUDE.md from your project root. The Ten Commandments maintain identical content in both file formats specifically for cross-tool compatibility. The 12-rule pack provides both CLAUDE.md and AGENTS.md variants.

Q: Can CLAUDE.md rules conflict with my existing .cursorrules or copilot-instructions?

They can. If your .cursorrules says "add comprehensive error handling" and your CLAUDE.md says "choose simplicity," the agent may produce inconsistent output. Pick one behavioral baseline and use it everywhere. The arai tool can enforce instruction files via hooks to prevent conflicts.

Q: Will these rules make the agent too conservative and miss edge cases?

No. The rules block unwanted behavior, not necessary behavior. An agent with surgical-change rules will still handle edge cases — it just won't restructure your codebase while doing it. The hard token budget rule prevents spiraling, not standard error handling.

Q: How do I verify my CLAUDE.md is actually working?

Watch for reduced chatter. An effective CLAUDE.md produces fewer clarifying questions, shorter diffs, and higher first-attempt success rates. The cc-audit tool provides quantitative scoring. Empirically, if your agent produces 3-line diffs instead of 30-line diffs for bug fixes, the rules are working.

Q: Can I use different rule sets per project?

Yes. CLAUDE.md files are project-scoped. Have a strict 12-rule set for your production monorepo and a lightweight 4-rule set for your experimental side projects. You can also have a global ~/.claude/CLAUDE.md with baseline rules that all projects inherit.

Q: Do these rules work with non-English prompts?

The rules are language-agnostic — they constrain behavior, not output language. The Ten Commandments repository includes a Korean translation (README.ko.md) demonstrating cross-language applicability.

Glossary

CLAUDE.md: A markdown file in your project root or ~/.claude/ that Claude Code reads at the start of every session, containing behavioral rules and project conventions
AGENTS.md: The emerging cross-tool equivalent of CLAUDE.md, read by Codex, Gemini CLI, OpenCode, and Cursor
Surgical change: A code modification that touches only what the task requires, matching existing style without refactoring adjacent code
Token budget: A hard limit on consecutive debugging attempts, preventing the agent from spiraling into infinite retry loops
Two-pattern pollution: When an agent encounters two different conventions in a codebase and produces a third, averaging them instead of picking one
Rule compliance cliff: The threshold (~200 lines or ~15 rules) beyond which AI agents stop consistently following CLAUDE.md directives

Author

Ramsis Hammadi — AI/ML engineer specializing in GenAI, LLM engineering, and automation. Full bio →

Claude Code Ultraplan: Cloud-Based AI Planning in 2026 — A Hands-On Tutorial

Ramsis Hammadi — Thu, 14 May 2026 08:33:19 +0000

Claude Code Ultraplan: Cloud-Based AI Planning in 2026 — A Hands-On Tutorial

TL;DR Summary

Ultraplan offloads Claude Code's planning phase to a cloud session, keeping your terminal free while a structured plan is drafted remotely
You review plans in your browser with inline comments, emoji reactions, and section-level navigation — a richer surface than terminal text
Three ways to launch: the /ultraplan command, the ultraplan keyword in any prompt, or from a local plan's approval dialog
You choose where to execute: in the cloud (with PR creation) or teleport back to your terminal (with full local environment access)
Requires Claude Code v2.1.91+, a GitHub repo, and a Claude.ai account. Not available on Bedrock, Vertex, or Foundry

Direct Answer Block

Ultraplan is Anthropic's research preview feature that separates AI planning from execution by drafting structured plans in a cloud-based Claude Code session. You type a task in your CLI, Claude researches and drafts a plan remotely on Anthropic's infrastructure, and you review the plan in your browser — commenting on specific sections, asking for revisions, then choosing whether to execute in the cloud or pull the plan back to your terminal.

Introduction

Most AI coding tools conflate planning and execution. You describe a task, the agent starts editing files immediately, and if the plan is wrong you discover it 15 minutes into a broken implementation. Ultraplan breaks that cycle. It gives you a browser-based review surface where you can inspect every section of a plan, comment on specific parts, and iterate before a single line of code changes. For complex multi-step changes — migrations, refactors, architectural shifts — this changes the review dynamic entirely.

What is Anthropic Ultraplan and what problem does it actually solve?

Ultraplan is not just "plan mode in the cloud." It's a structural separation of planning from execution that solves a specific terminal UX limitation: long plans are hard to review in a 24-line scrollable prompt window.

"You write a task in the CLI, and Claude drafts a structured plan in a cloud session you can review in your browser. This separates planning from execution and gives you a clearer way to inspect and edit multi-step changes before running them." — AlphaSignal summary of Anthropic's Ultraplan announcement

The core problem it addresses is plan review friction. In local plan mode, Claude produces a plan in the terminal — you read paragraphs of text, type a response, and Claude re-drafts. If you want to comment on a specific section, you have to quote it or describe its position. With Ultraplan, the plan appears in a web interface where you can:

Highlight any passage and leave an inline comment for Claude to address
React with emojis (thumbs up, thinking face) to signal approval or concern without writing full feedback
Jump between sections via an outline sidebar

This matters most for changes spanning 5+ files with architectural implications. The terminal review of such a plan takes patience; the browser review takes seconds.

According to Anthropic's docs, the cloud session runs on your account's default cloud environment. If you don't have one, Ultraplan creates it automatically on first launch.

How do you launch Ultraplan from the CLI (and what are the three ways to trigger it)?

There are three trigger methods, from explicit to incidental:

Method 1: The `/ultraplan` command (explicit)

/ultraplan migrate the auth service from sessions to JWTs

This is the most intentional path. Type the slash command with your task, confirm the dialog, and Claude launches a remote session. The CLI shows a status indicator:

Status	Meaning
`◇ ultraplan`	Claude is researching your codebase and drafting the plan
`◇ ultraplan needs your input`	Claude has a clarifying question; open the session link
`◆ ultraplan ready`	The plan is ready to review in your browser

Method 2: The `ultraplan` keyword (implicit)

Include the word "ultraplan" anywhere in a normal prompt:

Help me plan a refactor of the payment service — use ultraplan

Same result, less typing. This path also shows a confirmation dialog before launching.

Method 3: From a local plan (iterative)

When Claude finishes a local plan and presents the approval dialog, select No, refine with Ultraplan on Claude Code on the web. This sends your existing draft to the cloud for richer iteration. This path skips the confirmation dialog since selecting the option is already consent.

Run /tasks at any point to see the Ultraplan entry, open detail view with the session link, or stop the plan. Stopping archives the cloud session and clears the indicator; nothing is saved to your terminal.

Important constraint: If Remote Control is active, it disconnects when Ultraplan starts because both features occupy the claude.ai/code interface — only one can be connected at a time.

How does reviewing and iterating on a plan work in the browser?

When the status changes to ◆ ultraplan ready, open the session link. The plan appears in three zones:

The plan document — a structured breakdown of the proposed changes, organized by sections (migration steps, file changes, testing strategy, risks)
Inline comments — highlight any text, leave a comment, and Claude revises that specific section in response
Outline sidebar — navigable section index for jumping between parts of the plan

Here's the iteration loop:

You highlight a section like "Proposed database migration: 3-step rollout with rollback" and comment: This assumes zero-downtime. Can we add a step for staging validation first?
Claude revises the plan, inserting the staging validation step
You react with a thinking face emoji on the rollback strategy, signaling uncertainty
Claude proposes an alternative rollback mechanism
You approve

This is fundamentally different from terminal-based iteration. In the terminal, you'd need to: read the full plan, type a revision request covering multiple sections, hope Claude understood which sections you meant. In the browser, your feedback is surgically attached to specific text.

The plan document also supports emoji reactions at the section level. These are lightweight signals — a thumbs up means "this section looks right," a thinking face means "reconsider this" — that let you communicate without typing.

Should you execute your Ultraplan in the cloud or teleport it back to your terminal?

When the plan is approved, you pick from two execution paths from the browser:

Execute on the web

Select Approve Claude's plan and start coding to have Claude implement the plan in the same cloud session. Your terminal shows a confirmation, the status indicator clears, and work continues in the cloud. When done, you review the diff and create a pull request from the web interface.

Best for: When you don't need local access (environment variables, private dependencies, local services). The cloud session has your repo but not your machine's runtime.

Teleport back to terminal

Select Approve plan and teleport back to terminal to pull the plan into your local CLI session. The cloud session archives, and your terminal shows three options:

Implement here: inject the plan into your current conversation
Start new session: clear context, begin fresh with only the plan
Cancel: save the plan to a file without executing (Claude prints the file path)

If you start a new session, Claude prints a claude --resume command so you can return to your previous conversation.

Best for: When you need local environment access, private dependencies, or are running integration tests against local services. The plan lands in your terminal with full access to your machine.

Factor	Web Execution	Terminal Execution
Environment access	GitHub repo only	Full local machine
PR creation	Built-in from web UI	Manual
Terminal stays free	Yes	No (implementation uses it)
Review surface	Browser diff view	Terminal or IDE
Context preservation	Cloud session	Local session

When should you use Ultraplan instead of local plan mode (and when should you NOT)?

Use Ultraplan when:

The change spans 5+ files with architectural implications — you need a rich review surface
You want hands-off drafting — Ultraplan runs remotely, your terminal stays free for other work
You're on a team and want async plan review — share the browser link, get comments before execution
The plan needs multiple iterations — inline comments are faster than terminal-based revision prompts

Use local plan mode (`/plan` or `Shift+Tab` into plan mode) when:

The change is small and self-contained — 2-3 files, quick to review in terminal
You're on Bedrock, Vertex, or Foundry — Ultraplan requires Anthropic direct API and is not available on these providers
You don't have a Claude.ai web account — Ultraplan runs on Claude Code on the web infrastructure
You want instant iteration — local plan mode has no remote session startup time (typically 15-30 seconds for Ultraplan)

Do NOT use Ultraplan when:

Your organization requires Zero Data Retention (ZDR) — Ultraplan runs on cloud infrastructure where ZDR is not available
Your repository is not on GitHub — the cloud session needs a GitHub remote to clone and operate
You're working on sensitive code that cannot leave your machine — the repo is bundled and uploaded to Anthropic's cloud sandbox

How does Ultraplan compare to Ultrareview, and when should you use both?

Ultraplan and Ultrareview are siblings: one plans before work, the other reviews before merge.

	Ultraplan	Ultrareview
Stage	Before implementation	Before merge
Output	Structured plan document	Verified bug findings
Agents	Single cloud session	Fleet of reviewer agents
Verification	Human review of plan	Independent reproduction of each finding
Duration	1-5 minutes	5-10 minutes
Cost	Included in plan usage	Free runs then $5-$20/review
Trigger	`/ultraplan` or keyword	`/ultrareview` or `/ultrareview <PR#>`

The ideal workflow pairs both:

/ultraplan — plan a complex feature in the browser, iterate on architecture
Implement — execute in cloud or terminal
/ultrareview — run a multi-agent deep review before merging

Ultrareview has one distinct advantage over local /review: every reported finding is independently reproduced and verified by the agent fleet, so results focus on real bugs rather than style suggestions. It supports both branch diff mode (reviews changes against default branch) and PR mode (clones the PR from GitHub directly).

Ultrareview includes a non-interactive mode for CI: claude ultrareview runs headless, prints findings to stdout, and exits with code 0 on success or 1 on failure. Pass --json for raw output or --timeout <minutes> to limit wait time.

Frequently Asked Questions

Q: Do I need a paid Claude subscription to use Ultraplan?

Ultraplan requires a Claude Code on the web account, which is tied to a Claude subscription (Pro, Max, Team, or Enterprise). It's not available with API-key-only authentication.

Q: What happens to my code during Ultraplan?

Your repository state is bundled and uploaded to Anthropic's cloud sandbox for plan drafting. The sandbox is ephemeral — destroyed when the session ends. For Ultrareview in PR mode, the sandbox clones directly from GitHub rather than uploading your local state.

Q: Can I use Ultraplan without a GitHub repo?

No. The cloud session needs a GitHub remote to clone and operate on your codebase. If your repo is on GitLab or Bitbucket, Ultraplan is not available.

Q: How much does Ultraplan cost?

Ultraplan counts toward your plan's included usage. It does not bill as extra usage like Ultrareview does. You pay only in token consumption from the planning session.

Q: Can multiple people review the same Ultraplan?

Yes. Share the browser session link with teammates. They can view the plan, leave comments, and react to sections. Only one person's feedback drives Claude's revisions at a time, but multiple people can participate in the review.

Q: Is Ultraplan available on the VS Code extension?

Ultraplan is launched from the CLI. If you're using Claude Code inside VS Code's integrated terminal, the /ultraplan command works there too. The browser review interface is separate from VS Code.

Glossary

Ultraplan: A cloud-based planning feature that drafts Claude Code plans remotely, reviewable in a browser
Ultrareview: A cloud-based code review feature that uses multiple AI agents to find and verify bugs before merge
Plan mode: Local Claude Code mode (/plan or Shift+Tab) that researches and proposes changes without editing files
Teleport: The mechanism for pulling a cloud-drafted plan back into a local terminal session for execution
Cloud session: A Claude Code session running on Anthropic's managed infrastructure, not your local machine
Ultraplan ready: The status indicator confirming the cloud-drafted plan is available for browser review

Author

Ramsis Hammadi — AI/ML engineer specializing in GenAI, LLM engineering, and automation. Full bio →

DEV Community: Ramsis Hammadi

OpenAI Agents SDK: Sandbox Execution and Model-Native Harness in 2026

OpenAI Agents SDK: Sandbox Execution and Model-Native Harness in 2026

TL;DR Summary

Direct Answer Block

Introduction

What is the OpenAI Agents SDK's "model-native harness" and how does it change agent development?

How does sandbox execution work — and how does it keep agent code safe in production?

Sandbox clients

What file system tools, MCP integration, and storage systems does the SDK support?

File system tools

MCP integration

Storage systems

Manifest

How do you define an agent manifest with inputs, outputs, directory structure, and provider config?

How does credential isolation work across Cloudflare, Vercel, and custom deployment environments?

How do you orchestrate multi-agent workflows with handoffs, guardrails, and human-in-the-loop approvals?

Handoffs

Guardrails

Human-in-the-loop

Capabilities

Advanced patterns (from OpenAI's examples)

Frequently Asked Questions

Q: Do I need a sandbox for every agent?

Q: Can I use the Agents SDK with non-OpenAI models?

Q: How much do sandbox runs cost?

Q: Can sandbox state survive between runs?

Q: Is sandbox execution available in both TypeScript and Python SDKs?

Q: How does this differ from Claude Code's sandbox approach?

Glossary

Author

CLAUDE.md Rules: How to Cut AI Coding Mistakes from 40% to 3% in 2026

CLAUDE.md Rules: How to Cut AI Coding Mistakes from 40% to 3% in 2026

TL;DR Summary

Direct Answer Block

Introduction

Why do AI coding agents keep making the same mistakes — and how does CLAUDE.md fix this at the system level?

What were Karpathy's original 4 rules, and how did they cut error rates from 40% to 11%?

Rule 1: Clarify before implementing

Rule 2: Simplicity first

Rule 3: Surgical changes only

Rule 4: Verify before claiming success

What 8 additional rules does the 12-rule pro pack add, and which failure mode does each address?

How do the "Ten Commandments for Coding Agents" differ from the 12-rule approach — and which should you use?

Which should you use?

What does a surgical change look like in practice (and what happens when agents ignore rule #5)?

How do you customize CLAUDE.md rules for your specific stack without breaking the system?

Level 1: Repository rules (edit in place)

Level 2: Fork and extend (add custom rules)

Anti-patterns to avoid

Frequently Asked Questions

Q: Does CLAUDE.md work with non-Claude tools like Cursor or Codex?

Q: Can CLAUDE.md rules conflict with my existing .cursorrules or copilot-instructions?

Q: Will these rules make the agent too conservative and miss edge cases?

Q: How do I verify my CLAUDE.md is actually working?

Q: Can I use different rule sets per project?

Q: Do these rules work with non-English prompts?

Glossary

Author

Claude Code Ultraplan: Cloud-Based AI Planning in 2026 — A Hands-On Tutorial

Claude Code Ultraplan: Cloud-Based AI Planning in 2026 — A Hands-On Tutorial

TL;DR Summary

Direct Answer Block

Introduction

What is Anthropic Ultraplan and what problem does it actually solve?

How do you launch Ultraplan from the CLI (and what are the three ways to trigger it)?

Method 1: The /ultraplan command (explicit)

Method 2: The ultraplan keyword (implicit)

Method 3: From a local plan (iterative)

How does reviewing and iterating on a plan work in the browser?

Should you execute your Ultraplan in the cloud or teleport it back to your terminal?

Execute on the web

Teleport back to terminal

When should you use Ultraplan instead of local plan mode (and when should you NOT)?

Use Ultraplan when:

Use local plan mode (/plan or Shift+Tab into plan mode) when:

Do NOT use Ultraplan when:

How does Ultraplan compare to Ultrareview, and when should you use both?

The ideal workflow pairs both:

Frequently Asked Questions

Method 1: The `/ultraplan` command (explicit)

Method 2: The `ultraplan` keyword (implicit)

Use local plan mode (`/plan` or `Shift+Tab` into plan mode) when: