OpenAI Agents SDK: Sandbox Execution and Model-Native Harness in 2026

TL;DR Summary

  • The OpenAI Agents SDK now includes sandbox execution — agents run code, access files, and use shell commands in isolated container-based workspaces
  • A model-native harness replaces custom orchestration code: the SDK handles tool dispatch, state persistence, and multi-step workflows
  • Sandboxes support filesystem, shell, package installs, Git repos, mounted storage (S3/GCS/R2), exposed ports, snapshots, and resumable state
  • The agent and sandbox are deliberately separate — harness owns the control plane (model calls, tool routing, approvals), sandbox owns execution (files, commands)
  • Deploy on UnixLocal (dev), Docker (local container), or hosted providers (Cloudflare, Vercel) with the same agent definition

Quick Answer

The OpenAI Agents SDK is a code-first framework for building production AI agents in TypeScript or Python. Its sandbox feature gives agents an isolated Unix-like workspace with filesystem, shell, mounted data, and resumable state. The model-native harness handles tool dispatch, multi-step execution, and state persistence — replacing the custom orchestration code you'd otherwise write yourself.

Introduction

Before the Agents SDK's sandbox update, building a production AI agent that could safely execute code required stitching together: a model API client, a container runtime, credential isolation, state persistence, tool routing, and approval logic. Each piece was custom code. The SDK collapses that stack: define your agent with a manifest describing the workspace, attach capabilities (shell, filesystem, skills, memory), and pick a sandbox client. The harness handles everything between model turns.
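
Here is roughly what that flow looks like in TypeScript. This is a minimal sketch, not the definitive API: Manifest, file, and SandboxAgent come from the sandbox quickstart shown later in this post, but the import path, the run() sandbox option, and the "unix-local" identifier are assumptions that may differ in your SDK version.

import { SandboxAgent, Manifest, file, run } from "@openai/agents"; // assumed import path

// 1. Manifest: the workspace contract a fresh sandbox session starts with.
const manifest = new Manifest({
  entries: {
    "task.md": file({ content: "# Task\nSummarize the account brief.\n" }),
  },
});

// 2. Agent definition: instructions plus the manifest. Filesystem, shell,
//    and compaction capabilities are attached by default.
const agent = new SandboxAgent({
  name: "Workspace agent",
  instructions: "Do all work inside the sandbox using relative paths.",
  manifest,
});

// 3. The sandbox client is run configuration, not agent definition.
const result = await run(agent, "Read task.md and complete it.", {
  sandbox: "unix-local", // assumed option shape; see the sandbox clients section
});
console.log(result.finalOutput);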

What is the OpenAI Agents SDK's "model-native harness" and how does it change agent development?

The model-native harness is the SDK's runtime layer for driving agents the way models actually work mid-task. According to the newsletter reporting OpenAI's announcement, it "runs agents in a way that matches how models naturally use tools and context."

In practice, this means the harness owns:

  • Tool dispatch: when the model calls shell or file_read, the harness routes the call to the correct sandbox tool
  • State persistence: conversation state, tool results, and workspace state survive across model turns
  • Multi-step execution: the agent loop continues across turns, with each step observable and cancellable
  • Streaming: responses stream back to the application as the agent works
  • Recovery: if a sandbox session stops, the harness can resume from serialized state

The pre-harness approach required developers to write this orchestration themselves — wrapping every tool call, managing conversation state, handling tool errors, and building resumption logic. The harness replaces that with a structured runtime.
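
In application code, that shift mostly means you consume the loop instead of writing it. A sketch, assuming the core Agents SDK's streaming shape (the run() option and event names may differ by version):

import { run } from "@openai/agents";

// The harness owns the multi-step loop; the app just observes events.
// `agent` is the SandboxAgent from the earlier sketch.
const stream = await run(agent, "Install dependencies and run the tests.", {
  stream: true,
});

for await (const event of stream) {
  // Each step (model turn, tool call, tool output) surfaces as an event,
  // which is what makes the loop observable and cancellable.
  if (event.type === "run_item_stream_event") {
    console.log(event.name); // e.g. a shell tool call routed to the sandbox
  }
}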

OpenAI's Agents SDK documentation positions it as the code-first path: "use the SDK track when your server owns orchestration, tool execution, state, and approvals." For hosted workflow creation without code, use Agent Builder. For direct model API access, use the client libraries.

The SDK separates agent definitions from execution boundaries. A SandboxAgent is still an Agent — it keeps instructions, prompt, tools, handoffs, MCP servers, model settings, and hooks. What changes is where execution happens: a live sandbox session with its own filesystem, commands, and ports.

How does sandbox execution work — and how does it keep agent code safe in production?

[Diagram: how the sandbox isolates agent code execution from the host — file system tools, shell commands, network access, credential isolation]

The sandbox is an isolated, Unix-like execution environment with filesystem, shell, installed packages, mounted data, exposed ports, and resumable state. The key architectural decision: the agent harness and sandbox compute are separate.

"The key split is the boundary between the harness and compute. The harness is the control plane around the model: it owns the agent loop, model calls, tool routing, handoffs, approvals, tracing, recovery, and run state. Compute is the sandbox execution plane where model-directed work reads and writes files, runs commands, installs dependencies, uses mounted storage, exposes ports, and snapshots state." — OpenAI Sandbox Agents documentation

This separation matters for production safety:

  1. Control plane stays in trusted infrastructure — the harness keeps auth, billing, audit logs, human review, and recovery state outside any single container
  2. Sandbox is an execution environment, not the control plane — it runs commands and edits files but doesn't own model decisions
  3. Credentials are isolated from agent code — sandbox credentials are runtime configuration, not prompt content. OpenAI's docs explicitly warn: "Treat sandbox credentials as runtime configuration, not prompt content."

Whether the harness runs inside the sandbox or separately from it is a product decision. Inside the sandbox is convenient for prototypes; separate is the production pattern — the harness keeps sensitive control plane operations in your infrastructure while sandboxes handle provider-specific execution.

According to the newsletter, the SDK "keeps credentials outside execution environments where model-generated code runs" — a critical security boundary when agents can generate and execute arbitrary code.

Sandbox clients

  • UnixLocal: local development on macOS/Linux; creates a temp workspace and cleans it up after the run
  • Docker: local container isolation with custom images
  • Hosted providers (Cloudflare, Vercel): production deployment with provider-specific isolation

The sandbox client is part of run configuration, not agent definition. Keep the agent, manifest, and capabilities stable, then swap the client per environment.
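
In practice that can be as small as one conditional in run configuration. A sketch, where the string identifiers standing in for the UnixLocal, Docker, and hosted clients are assumptions; the real client types and their options (e.g. a custom Docker image) come from the sandbox clients documentation:

// Assumed client identifiers; check the SDK for the real client types.
const sandbox =
  process.env.NODE_ENV === "production"
    ? "cloudflare"   // hosted provider with provider-specific isolation
    : process.env.USE_DOCKER === "1"
      ? "docker"     // local container; image selection would be a client option
      : "unix-local"; // temp workspace, cleaned up after the run

// Same agent, same manifest, same capabilities; only the client changes.
const result = await run(agent, input, { sandbox });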

What file system tools, MCP integration, and storage systems does the SDK support?

File system tools

The SDK provides file system primitives that the agent uses to interact with workspace files:

  • File reads and writes — read project directories, edit source files, create new files
  • Apply patch — apply diffs to workspace files
  • View image — inspect local images in the sandbox
  • Shell commands — execute arbitrary commands with interactive input support

MCP integration

MCP (Model Context Protocol), as the newsletter puts it, "enables structured tool use for external APIs and services."

MCP servers connect through the SDK's integration layer, allowing agents to use tools from:

  • Communication (Slack, Discord)
  • Project management (Linear, Jira)
  • Data sources (databases, Google Drive)
  • Custom APIs (your internal services)
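
Wiring one up looks like standard Agents SDK MCP usage. A sketch using the core SDK's stdio MCP helper; the server command is a placeholder, and option names may vary by SDK version:

import { MCPServerStdio } from "@openai/agents"; // assumed export location

// Placeholder MCP server command; substitute the server you actually run.
const projectTracker = new MCPServerStdio({
  name: "Project tracker",
  fullCommand: "npx -y some-mcp-server", // hypothetical package
});
await projectTracker.connect();

// SandboxAgent keeps MCP servers, so sandbox tools and MCP tools coexist.
// `manifest` as in the earlier sketches.
const agentWithMcp = new SandboxAgent({
  name: "Project agent",
  instructions: "Check open issues via MCP before editing workspace files.",
  mcpServers: [projectTracker],
  manifest,
});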

Storage systems

The manifest supports mounting external storage directly into the sandbox:

  • S3 Mount: data room files, generated artifacts
  • GCS Mount: Google Cloud Storage datasets
  • R2 Mount: Cloudflare R2 storage
  • Azure Blob Mount: Azure Blob data
  • Box Mount: Box cloud storage
  • S3 Files Mount: individual files from S3

OpenAI's docs recommend: "Keep mounted storage scoped to the inputs the agent should read or write. Treat mount entries as ephemeral workspace entries."
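
As a sketch, a scoped S3 mount in a manifest might look like the following. The s3Mount and dir helpers are hypothetical stand-ins; the sandbox docs define the real mount entry types and their options.

const dataRoomManifest = new Manifest({
  entries: {
    // Scope the mount to exactly the prefix the agent should read.
    "data_room/": s3Mount({ bucket: "acme-data-room", prefix: "northwind/2026/" }), // hypothetical helper
    // Ephemeral output directory; review artifacts before moving them out.
    "out/": dir(), // hypothetical helper
  },
});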

Manifest

The manifest describes the workspace contract for a fresh sandbox session — files, repos, input artifacts, output directories, environment variables, and OS users/groups. It's treated as a starting-point contract, not the full source of truth.

How do you define an agent manifest with inputs, outputs, directory structure, and provider config?

A manifest defines what the agent sees when a sandbox session starts. Here's a practical example from OpenAI's sandbox quickstart:

TypeScript:

const manifest = new Manifest({
  entries: {
    "account_brief.md": file({
      content: "# Northwind Health\n" +
        "- Segment: Mid-market healthcare analytics provider.\n" +
        "- Renewal date: 2026-04-15.\n",
    }),
    "implementation_risks.md": file({
      content: "# Delivery risks\n" +
        "- Security questionnaire is not complete.\n" +
        "- Procurement requires final legal language by April 1.\n",
    }),
  },
});

Python:

manifest = Manifest(
    entries={
        "account_brief.md": File(
            content=(
                b"# Northwind Health\n"
                b"- Segment: Mid-market healthcare analytics provider.\n"
                b"- Renewal date: 2026-04-15.\n"
            )
        ),
        "implementation_risks.md": File(
            content=(
                b"# Delivery risks\n"
                b"- Security questionnaire is not complete.\n"
                b"- Procurement requires final legal language by April 1.\n"
            )
        ),
    }
)

Manifest inputs cover:

  • File / Dir: synthetic inputs, helper files, output directories
  • Local file/directory: host files materialized into the sandbox
  • Git repo: repository cloned into the workspace
  • Storage mounts: S3, GCS, R2, Azure Blob, Box
  • environment: startup environment variables
  • users / groups: sandbox-local OS accounts

Design rules from OpenAI's docs:

  • Put repos, input artifacts, and output directories in the manifest
  • Put task specs and instructions in workspace files (repo/task.md, AGENTS.md)
  • Use relative workspace paths in instructions
  • Keep mounts scoped to inputs the agent should use
  • Avoid saving secrets, tokens, or sensitive files in the manifest
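
Put together, a manifest that follows these rules might look like this sketch. The gitRepo and dir entry helpers are hypothetical stand-ins for the SDK's real Git repo and directory entry types:

const reviewManifest = new Manifest({
  entries: {
    // Repo and output directory belong in the manifest...
    "repo/": gitRepo({ url: "https://github.com/acme/widgets" }), // hypothetical helper
    "review/": dir(), // hypothetical helper
    // ...while the task spec lives in a workspace file, using relative paths.
    "repo/task.md": file({
      content: "# Task\nReview recent changes and write findings to ../review/notes.md.\n",
    }),
    // No secrets, tokens, or sensitive files anywhere in these entries.
  },
});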

How does credential isolation work across Cloudflare, Vercel, and custom deployment environments?

Credential isolation is a first-class design concern in the sandbox architecture. The principle: credentials are runtime configuration, not prompt content.

OpenAI's sandbox docs specify three rules:

  1. Prefer provider-native secret systems for hosted sandbox providers
  2. Keep cloud storage credentials scoped to the specific mount or provider option
  3. Use Manifest.environment for startup values, marking sensitive entries as ephemeral
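
A sketch of rule 3, using the Manifest.environment field the docs name. The object shape for marking an entry ephemeral is an assumption; check the sandbox docs for the exact syntax.

const prodManifest = new Manifest({
  entries: { /* repos, inputs, output dirs */ },
  environment: {
    // Plain startup value, fine to persist with workspace state.
    API_BASE_URL: "https://internal.example.com",
    // Injected from your infrastructure at run time, never from the prompt;
    // the `ephemeral` flag shown here is an assumed shape.
    SERVICE_TOKEN: { value: process.env.SERVICE_TOKEN ?? "", ephemeral: true },
  },
});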

According to the newsletter, the SDK "keeps credentials outside execution environments where model-generated code runs." This means:

  • The agent prompt never contains API keys, tokens, or secrets
  • Sandbox environment variables are injected by the provider, not by the model
  • Cloud provider deployments (Cloudflare Workers, Vercel Functions) isolate credentials from sandbox compute

The provider is part of run configuration, not agent definition. The same agent with the same manifest can run on UnixLocal for development, Docker for local container testing, and a hosted provider for production — credentials are configured per provider, per environment.

OpenAI's documentation warns: "Review artifacts before moving them out of the sandbox, especially when the agent can read private documents or mounted storage." The sandbox can access mounted data — your application should verify what comes out.

How do you orchestrate multi-agent workflows with handoffs, guardrails, and human-in-the-loop approvals?

The Agents SDK includes orchestration primitives that layer on top of the sandbox foundation:

Handoffs

When a task requires multiple specialists, handoffs transfer control between agents. Each agent owns its domain. The harness routes based on the handoff target.
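
A minimal sketch using the core Agents SDK handoff shape (a SandboxAgent keeps handoffs, so the same pattern applies to sandbox-backed specialists):

import { Agent, run } from "@openai/agents";

const billing = new Agent({
  name: "Billing agent",
  instructions: "Handle billing questions only.",
});
const refunds = new Agent({
  name: "Refund agent",
  instructions: "Handle refund requests only.",
});

// The triage agent owns routing; the harness transfers control on handoff.
const triage = new Agent({
  name: "Triage agent",
  instructions: "Route each request to the right specialist.",
  handoffs: [billing, refunds],
});

const result = await run(triage, "I was double-charged last month.");
console.log(result.finalOutput);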

Guardrails

Guardrails run before or after model turns to validate output or block unsafe actions. According to the SDK docs, guardrails and human review "block or pause before risky work continues."
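
A sketch of an input guardrail, following the core SDK's guardrail shape: a named check whose tripwire stops the run before the model acts on the input. The secret-leak pattern is a toy example.

import { Agent, run } from "@openai/agents";

const noSecretsGuardrail = {
  name: "No secrets in input",
  // Returns a tripwire verdict before the model turn proceeds.
  execute: async ({ input }: { input: unknown }) => {
    const text = typeof input === "string" ? input : JSON.stringify(input);
    const leaked = /sk-[A-Za-z0-9]{20,}/.test(text); // toy secret pattern
    return { outputInfo: { leaked }, tripwireTriggered: leaked };
  },
};

const guarded = new Agent({
  name: "Guarded agent",
  instructions: "Answer questions about the workspace.",
  inputGuardrails: [noSecretsGuardrail],
});

const userInput = "Here is my key sk-abc123... can you use it?";
try {
  await run(guarded, userInput);
} catch (err) {
  // A triggered tripwire surfaces as a guardrail error; the run is blocked.
}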

Human-in-the-loop

For high-risk operations, the workflow pauses for human approval. The sandbox state persists during the pause — when approved, the agent continues in the same workspace with the same files and context.
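
A sketch of the pause/resume flow, following the core SDK's human-in-the-loop shape (tool-level needsApproval plus a resumable run state). The deploy tool is hypothetical; the point is that the run resumes from the same state, with the sandbox workspace persisting alongside it.

import { Agent, run, tool } from "@openai/agents";
import { z } from "zod";

// Hypothetical high-risk tool that requires human sign-off before executing.
const deploy = tool({
  name: "deploy",
  description: "Deploy the built artifact to an environment.",
  parameters: z.object({ target: z.string() }),
  needsApproval: true,
  execute: async ({ target }) => `deployed to ${target}`,
});

const releaseAgent = new Agent({
  name: "Release agent",
  instructions: "Build, test, then deploy when approved.",
  tools: [deploy],
});

let result = await run(releaseAgent, "Ship the current build to staging.");
while ((result.interruptions ?? []).length > 0) {
  for (const interruption of result.interruptions) {
    // A human reviews the pending tool call here; approve or reject it.
    result.state.approve(interruption);
  }
  result = await run(releaseAgent, result.state); // resumes with the same context
}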

Capabilities

Each sandbox agent gets capabilities attached to its definition:

  • Shell: command execution with interactive input
  • Filesystem: file edits (apply_patch) and image viewing
  • Skills: skill discovery and materialization from local dirs or Git repos
  • Memory: persists memory artifacts across runs (requires Shell + Filesystem)
  • Compaction: context trimming for long-running flows

By default, a SandboxAgent includes filesystem, shell, and compaction. If you pass a custom capabilities list, it replaces the defaults — include them explicitly if you still need them, as in the sketch below.
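
A sketch of that replacement behavior. The capability constructor names below are assumptions; the point is that adding Memory means re-listing the defaults you still want:

// Assumed capability class names; consult the capabilities docs for the
// real identifiers. Memory requires Shell and Filesystem.
const memoryAgent = new SandboxAgent({
  name: "Memory agent",
  instructions: "Keep durable notes about this account across runs.",
  manifest,
  capabilities: [
    new ShellCapability(),      // re-added: custom lists replace the defaults
    new FilesystemCapability(), // re-added: required by Memory
    new CompactionCapability(), // re-added: keep context trimming
    new MemoryCapability(),     // the new behavior this agent needs
  ],
});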

Advanced patterns (from OpenAI's examples)

  • Data room Q&A: Answer questions over mounted documents
  • Repository code review: Clone a repo, inspect it, produce review artifacts
  • Vision website clone: Clone a website using Vision API and screenshot feedback
  • Sandbox resume: Resume work in a pre-existing sandbox session

Frequently Asked Questions

Q: Do I need a sandbox for every agent?

No. If your agent only needs model responses without files, commands, or persistent state, use the Responses API directly or the basic Agents SDK runtime. Sandboxes are for when the answer depends on workspace work.

Q: Can I use the Agents SDK with non-OpenAI models?

The SDK supports provider configuration, allowing different model providers per agent. Sandbox execution is independent of model choice — the harness handles tool routing regardless of which model generates the tool calls.

Q: How much do sandbox runs cost?

Sandbox pricing depends on the provider (UnixLocal is free, hosted providers bill per session). OpenAI's API usage is separate from sandbox compute costs. Check provider-specific pricing.

Q: Can sandbox state survive between runs?

Yes. Three persistence levels: RunState (harness-side state), serialized session state (reconnect to same sandbox), and snapshots (save workspace contents to seed a fresh session). Use snapshots to skip dependency installation on subsequent runs.

Q: Is sandbox execution available in both TypeScript and Python SDKs?

Yes. Both SDKs support the same sandbox primitives with language-idiomatic APIs. Official examples exist for both.

Q: How does this differ from Claude Code's sandbox approach?

Both separate agent from execution, but OpenAI's SDK is a code-first framework you integrate into your application, while Claude Code is a product you run. OpenAI's approach gives you programmatic control over the harness, manifests, and provider selection.

Glossary

  • Model-native harness: The SDK runtime layer that handles tool dispatch, state persistence, and multi-step execution in a way that matches model behavior
  • Sandbox: An isolated, Unix-like execution environment with filesystem, shell, packages, mounts, ports, and resumable state
  • Manifest: The workspace contract describing what files, repos, mounts, and environment variables a fresh sandbox session starts with
  • Capabilities: Sandbox-native behaviors attached to an agent (shell, filesystem, skills, memory, compaction)
  • Handoff: Transfer of control between specialized agents within a multi-agent workflow
  • Snapshot: A saved workspace state used to seed a fresh sandbox session, skipping redundant setup

Author

Ramsis Hammadi — AI/ML engineer specializing in GenAI, LLM engineering, and automation.
