Most AI agents that execute code do it the slow way: spin up a container, wait for it to boot, run the code, tear it down. That works, but containers are heavy — hundreds of milliseconds of startup time and 100+ MB of memory per sandbox.
Cloudflare's answer is Dynamic Workers, which entered open beta on March 24, 2026, and took center stage at Cloudflare Agents Week 2026 (April 13–17). The core idea: instead of containers, use V8 isolates — the same sandboxing primitive that has powered Cloudflare Workers since 2018 — and create one per AI request at runtime.
Effloow Lab ran a sandbox PoC to verify the npm packages, inspect the API surface, and map out the constraints developers need to know before building on this infrastructure.
What Dynamic Workers Actually Do
Standard Cloudflare Workers are deployed statically: you run wrangler deploy, and Cloudflare distributes your code to its edge network. Dynamic Workers flip this model. A parent Worker receives code at runtime — from an LLM, from user input, from an orchestration system — and instantiates a new isolated Worker with that code on the fly.
The technical primitive is the Worker Loader binding. You declare it in wrangler.toml:
[[worker_loader]]
binding = "LOADER"
Inside your parent Worker, the binding exposes a run() method:
export default {
async fetch(request: Request, env: Env): Promise<Response> {
const generatedCode = await getLLMGeneratedCode(request);
const result = await env.LOADER.run(generatedCode, {
// Pass bindings the dynamic worker can use
bindings: {
SOME_KV: env.SOME_KV,
},
});
return new Response(JSON.stringify(result));
},
};
The dynamic Worker runs in its own V8 isolate. It has no knowledge of the parent Worker or other tenants. When execution completes, the isolate is discarded.
The V8 Isolate Advantage
Cloudflare's announcement post frames the performance story plainly: isolates start in a few milliseconds and use a few megabytes of memory — roughly 100× faster to boot and 10–100× more memory efficient than typical containers.
This isn't marketing math. V8 isolates are the same engine Chrome uses to run JavaScript tabs, hardened for multi-tenant server execution over eight years. Each isolate is a completely separated JavaScript context — code, heap, and execution state — but it runs inside a shared process. That shared-process model is what makes them fast: no OS process creation, no container filesystem mount, no network namespace setup.
For AI agents that might execute thousands of code snippets per hour, this changes the economics. A microVM-based solution like E2B uses Firecracker, which starts in around 150ms and costs roughly $0.01 per execution. Dynamic Workers start in milliseconds and cost around $0.002 per execution at Workers Standard rates, according to VentureBeat's April 2026 coverage.
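Those figures can be sanity-checked with a quick back-of-envelope script. This is an illustrative calculation using the approximate numbers quoted above, not Cloudflare's or E2B's actual billing formula:

```typescript
// Back-of-envelope comparison of sandbox economics, using the rough figures
// quoted above. Illustrative only: real billing depends on plan, CPU time,
// and request mix.
interface SandboxProfile {
  name: string;
  startupMs: number;      // approximate cold-start latency
  costPerExecUsd: number; // approximate per-execution cost
}

const profiles: SandboxProfile[] = [
  { name: "V8 isolate (Dynamic Workers)", startupMs: 5, costPerExecUsd: 0.002 },
  { name: "Firecracker microVM (e.g. E2B)", startupMs: 150, costPerExecUsd: 0.01 },
];

// Daily cost and cumulative cold-start latency for a given execution volume.
function dailyFootprint(p: SandboxProfile, execsPerDay: number) {
  return {
    name: p.name,
    usdPerDay: p.costPerExecUsd * execsPerDay,
    startupSecondsPerDay: (p.startupMs * execsPerDay) / 1000,
  };
}

for (const p of profiles) {
  console.log(dailyFootprint(p, 10_000));
}
```

At 10,000 executions per day, the microVM column spends over 20 minutes of wall-clock time per day just booting sandboxes; the isolate column spends under a minute.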
Security Model: What the Sandbox Prevents
The V8 isolate is not a Linux container — it doesn't pretend to be one. These are the hard constraints every developer needs to internalize:
No filesystem access. The fs module is not exposed. There is no /tmp, no home directory, no shared disk. If LLM-generated code tries to read or write files, it throws.
No environment variables. process.env is unavailable. Secrets and configuration must be passed explicitly as Worker bindings — they cannot leak through the environment.
No network by default. The DynamicWorkerExecutor in @cloudflare/codemode sets globalOutbound: null out of the box. fetch() and WebSocket connections are blocked unless you explicitly allow an outbound binding. This is the right default for running untrusted AI-generated code.
No native modules. .node addons cannot load. Pure JavaScript and WASM packages work; anything requiring native compilation does not.
Cloudflare's broader security layer adds:
- Automatic V8 security patch deployment to production within hours of disclosure
- A second-layer custom sandbox using Linux namespaces and seccomp
- Hardware-level isolation via Memory Protection Keys (MPK)
- Spectre mitigations developed with academic researchers
The constraint list matters because it defines what LLMs can and cannot do inside the sandbox. Data transformation, string processing, JSON manipulation, algorithm execution, API calls (if outbound is enabled) — all fine. File I/O, subprocess spawning, native package usage — not possible.
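Because the unavailable APIs are a small, fixed list, a parent Worker can cheaply pre-screen generated code before spending an isolate invocation on it. The helper below is a hypothetical heuristic of my own, not part of any Cloudflare package, and pattern matching is NOT a security boundary; the isolate enforces the real constraints. Its only job is to hand the model a fast, specific error to retry against:

```typescript
// Heuristic pre-flight check for LLM-generated code. NOT a security
// boundary (the isolate enforces the real constraints); it just catches
// obvious uses of unavailable APIs before an isolate is spun up.
// Illustrative helper, not part of any Cloudflare package.
const UNAVAILABLE_APIS: Array<[RegExp, string]> = [
  [/\bfrom\s+['"](node:)?fs['"]|\brequire\(['"](node:)?fs['"]\)/, "no filesystem: fs is not exposed"],
  [/\bprocess\.env\b/, "no environment variables: pass config as bindings"],
  [/\bchild_process\b/, "no subprocess spawning"],
  [/\.node['"]/, "no native .node addons"],
];

function preflight(code: string): string[] {
  const problems: string[] = [];
  for (const [pattern, reason] of UNAVAILABLE_APIS) {
    if (pattern.test(code)) problems.push(reason);
  }
  return problems;
}

console.log(preflight(`const key = process.env.API_KEY;`));
// flags the process.env reference
```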
The Supporting Package Ecosystem
Effloow Lab verified three npm packages that make Dynamic Workers practical for AI agent workflows as of May 8, 2026:
@cloudflare/codemode (v0.3.4)
This is the high-level abstraction for the "Code Mode" pattern: instead of giving an LLM dozens of individual tool calls, you give it one tool — "write code that uses these typed APIs" — and execute the code in a Dynamic Worker sandbox.
The package generates TypeScript type definitions from your existing tool functions. The LLM sees those types, writes a JavaScript snippet that calls the tools in whatever order and combination makes sense, and DynamicWorkerExecutor runs it.
import { createCodeTool, DynamicWorkerExecutor } from "@cloudflare/codemode";
const executor = new DynamicWorkerExecutor(env.LOADER, {
globalOutbound: null, // no outbound network
timeout: 30_000, // 30 second max execution
});
const codeTool = createCodeTool({
tools: myToolDefinitions, // your existing typed functions
executor,
});
// codeTool is a single AI tool that the LLM uses to write + execute code
Peer dependencies: ai (Vercel AI SDK), zod, @tanstack/ai, and @modelcontextprotocol/sdk — all optional depending on which AI framework you're integrating with.
Cloudflare claims this pattern can reduce inference tokens by up to 80% for multi-step tasks, because the LLM writes a compact program once rather than calling tools iteratively with back-and-forth reasoning.
@cloudflare/worker-bundler (v0.1.3)
Dynamic Workers execute code strings — but real-world LLM-generated code often imports npm packages. @cloudflare/worker-bundler solves this by resolving npm dependencies and bundling everything with esbuild-wasm at runtime, entirely in-process. No native binaries required.
import { createWorker } from "@cloudflare/worker-bundler";
const { mainModule, modules } = await createWorker({
files: {
"index.js": generatedCode,
"package.json": JSON.stringify({ dependencies: { lodash: "^4.17.21" } }),
},
});
// Pass mainModule + modules to the Worker Loader binding
await env.LOADER.run(mainModule, { modules });
The use of esbuild-wasm (versus a native esbuild binary) is notable: it means bundling can happen inside a Worker itself, enabling fully serverless code compilation pipelines. The @typescript/vfs dependency provides in-memory TypeScript type-checking without touching the filesystem.
@cloudflare/shell (virtual filesystem)
Since Dynamic Workers have no real filesystem, @cloudflare/shell provides a POSIX-compatible in-memory abstraction. Files created inside a Dynamic Worker via the shell API exist only for the duration of that execution.
For persistence beyond a single execution, the package backs storage with SQLite (via Durable Objects) and R2. An AI agent can write results to @cloudflare/shell, and those results survive across multiple Dynamic Worker invocations for the same session.
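The mechanics are easy to picture with a stripped-down sketch. The class below illustrates the virtual-filesystem idea only; it is not @cloudflare/shell's actual API, and the snapshot method stands in for the package's Durable Objects / R2 persistence layer:

```typescript
// Minimal sketch of the virtual-filesystem idea: paths map to contents held
// in memory for the lifetime of one execution, with a snapshot hook standing
// in for durable storage (Durable Objects / R2). Illustrative only; this is
// not @cloudflare/shell's real API.
class MemFS {
  private files = new Map<string, string>();

  writeFile(path: string, contents: string): void {
    this.files.set(path, contents);
  }

  readFile(path: string): string {
    const data = this.files.get(path);
    if (data === undefined) throw new Error(`ENOENT: ${path}`);
    return data;
  }

  // Everything the agent wrote this run, ready to flush to durable storage.
  snapshot(): Record<string, string> {
    return Object.fromEntries(this.files);
  }
}

const vfs = new MemFS();
vfs.writeFile("/results/run1.json", JSON.stringify({ ok: true }));
console.log(vfs.snapshot());
```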
Code Mode Pattern: Why This Matters for Agent Design
The Code Mode pattern deserves a section of its own because it changes how you structure agentic systems.
In a standard tool-call loop, an LLM calls one tool at a time: look up a user, then check their subscription, then fetch their recent orders. Each step requires a round-trip through the model — which means latency and token cost stack up linearly.
With Code Mode, you give the LLM your typed APIs and ask it to write a program:
Given these typed functions:
getUser(id: string): User
getSubscription(userId: string): Subscription
getRecentOrders(userId: string, limit: number): Order[]
Write JavaScript to fetch user 'u_123', check if their subscription is active,
and return their last 5 orders if it is.
The LLM writes something like:
const user = await getUser("u_123");
const subscription = await getSubscription(user.id);
if (subscription.active) {
return await getRecentOrders(user.id, 5);
}
return [];
Dynamic Workers execute this snippet with real access to your tool functions. Three tool calls happen in one Dynamic Worker invocation, with the logic handled by generated code rather than a reasoning loop. The model touches the problem once, writes the solution, and the sandbox runs it.
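Conceptually, the executor's job is to run the generated snippet with the tool functions in scope. The local sketch below shows that data flow using the AsyncFunction constructor; unlike a Dynamic Worker it provides no isolation whatsoever, so treat it as a diagram in code, never something to run on untrusted input. The tool implementations are hypothetical stand-ins:

```typescript
// Conceptual sketch of Code Mode execution: the host exposes tool functions
// to a generated snippet as local identifiers and runs the snippet once.
// A real Dynamic Worker does this inside a V8 isolate; this version uses the
// AsyncFunction constructor purely to show the data flow. It has NO
// isolation -- never run untrusted code this way in production.
type Tool = (...args: any[]) => Promise<any>;

async function runSnippet(snippet: string, tools: Record<string, Tool>): Promise<unknown> {
  const names = Object.keys(tools);
  const impls = names.map((n) => tools[n]);
  const AsyncFunction = Object.getPrototypeOf(async () => {}).constructor;
  const fn = new AsyncFunction(...names, snippet); // snippet becomes the body
  return fn(...impls);
}

// Hypothetical tool implementations matching the prompt above.
const tools: Record<string, Tool> = {
  getUser: async (id: string) => ({ id, name: "Ada" }),
  getSubscription: async (_userId: string) => ({ active: true }),
  getRecentOrders: async (_userId: string, limit: number) =>
    Array.from({ length: limit }, (_, i) => ({ order: i + 1 })),
};

const snippet = `
  const user = await getUser("u_123");
  const sub = await getSubscription(user.id);
  return sub.active ? await getRecentOrders(user.id, 5) : [];
`;

runSnippet(snippet, tools).then((orders) => console.log(orders));
// logs an array of five orders
```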
Architecture Decision: When to Use Dynamic Workers
Dynamic Workers are not a universal replacement for containers or microVMs. Here's how to think about the choice:
| Scenario | Dynamic Workers | Containers / microVMs |
|---|---|---|
| Startup latency | Milliseconds | 150ms–2s |
| Memory per sandbox | Few MB | 100–500 MB |
| Cost per execution | ~$0.002 | ~$0.01+ |
| Filesystem access | In-memory only | Full |
| Native packages (.node) | Not supported | Supported |
| Language support | JS / TS / WASM | Any language |
| Network control | Blocked by default | Full access |
| Best for | JS/TS agents, high-frequency ephemeral execution | POSIX workloads, compilers, system tools |
If your AI agent generates JavaScript or TypeScript to orchestrate your APIs, Dynamic Workers are the right infrastructure. If it generates Python for data analysis, runs system commands, or needs POSIX filesystem semantics, Firecracker-based microVMs like E2B or Cloudflare Sandboxes (their separate GA product) are more appropriate.
Pricing and Availability
Dynamic Workers require the Workers Paid plan at $5/month. There is no free tier. The plan includes 10 million requests and 30 million CPU milliseconds, with pay-as-you-go overages at Workers Standard rates.
Two billing dimensions apply to Dynamic Workers specifically:
Requests and CPU time — billed at Workers Standard rates (same as regular Workers). CPU time includes startup time, not just execution time.
Daily unique Dynamic Workers created — Cloudflare has announced this metric and pricing structure but as of this writing, it is not yet billed. A "unique" Dynamic Worker is identified by its Worker ID and code hash; if either changes, it counts as a new one.
The practical implication: at current pricing, Dynamic Workers are economically viable for high-frequency agent pipelines. An agent that executes 10,000 code snippets per day costs roughly $20 per day (about $600 per month) at Workers Standard rates, a fifth of what the same volume would cost at $0.01 per execution on a microVM platform, with a fraction of the isolation overhead.
Getting Started
The official getting started guide walks through the minimal setup:
Step 1 — Declare the Worker Loader binding
# wrangler.toml
name = "my-ai-agent"
main = "src/index.ts"
compatibility_date = "2026-03-01"
[[worker_loader]]
binding = "LOADER"
Step 2 — Install the supporting packages
npm install @cloudflare/codemode @cloudflare/worker-bundler
Step 3 — Create the parent Worker
import { DynamicWorkerExecutor, createCodeTool } from "@cloudflare/codemode";
import { z } from "zod";
interface Env {
LOADER: WorkerLoader; // typed by @cloudflare/workers-types
}
// Define your tools
const tools = {
fetchWeather: {
description: "Fetch weather for a city",
parameters: z.object({ city: z.string() }),
execute: async ({ city }) => {
// your actual implementation
return { temperature: 22, condition: "sunny" };
},
},
};
export default {
async fetch(request: Request, env: Env): Promise<Response> {
const executor = new DynamicWorkerExecutor(env.LOADER, {
globalOutbound: null,
timeout: 30_000,
});
const codeTool = createCodeTool({ tools, executor });
// Pass codeTool to your AI model
// The LLM writes code; executor runs it in a V8 isolate
return new Response("Agent ready");
},
};
Step 4 — Deploy
wrangler deploy
Dynamic Workers are available in all regions where Workers runs. No additional configuration is required beyond the wrangler.toml binding declaration.
Common Mistakes to Avoid
Assuming filesystem access exists. LLM-generated code that writes to /tmp will throw immediately. If your use case requires scratch file operations, add @cloudflare/shell and pass the shell instance to the dynamic Worker context.
Forgetting startup CPU billing. Dynamic Workers bill CPU time from the moment the isolate starts, not just when your code runs. Very short executions with complex import graphs may have a higher-than-expected startup cost. Use @cloudflare/worker-bundler to pre-bundle dependencies and minimize parse time.
Enabling network without scoping it. If you need outbound fetch inside a Dynamic Worker, configure a specific binding rather than enabling unrestricted outbound. This limits blast radius if LLM-generated code tries to exfiltrate data or make unexpected external calls.
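One way to scope outbound access is to hand generated code a wrapped fetch that only reaches an explicit allowlist of hosts. The sketch below is a userland illustration, not the Worker Loader's globalOutbound configuration:

```typescript
// Sketch of scoping outbound network access: generated code receives a
// wrapped fetch that can only reach an explicit allowlist of hosts.
// Illustrative userland wrapper, not the Worker Loader's globalOutbound API.
function hostAllowed(rawUrl: string, allowedHosts: string[]): boolean {
  return allowedHosts.includes(new URL(rawUrl).hostname);
}

function makeScopedFetch(allowedHosts: string[]) {
  return async (input: string, init?: object) => {
    if (!hostAllowed(input, allowedHosts)) {
      throw new Error(`outbound call blocked: ${new URL(input).hostname}`);
    }
    // Delegate to the platform fetch only after the allowlist check passes.
    return (globalThis as any).fetch(input, init);
  };
}

// Generated code would be given this instead of the global fetch.
const scopedFetch = makeScopedFetch(["api.example.com"]);
```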
Treating unique worker count as free forever. The daily unique Dynamic Workers metric exists and pricing is announced; billing will activate at some point. Design your system to reuse Worker code where possible rather than generating unique scripts for every execution.
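Since a unique Dynamic Worker is keyed by Worker ID plus code hash, caching generated code by content hash makes reuse explicit. A minimal sketch, with an id scheme invented purely for illustration:

```typescript
// Reuse generated code by content hash: identical code maps to one worker
// id, so repeated executions of the same snippet do not mint new "unique"
// Dynamic Workers. The id scheme here is made up for illustration.
import { createHash } from "node:crypto";

const deployedByHash = new Map<string, string>(); // sha256 -> worker id

function workerIdFor(code: string): { id: string; reused: boolean } {
  const hash = createHash("sha256").update(code).digest("hex");
  const existing = deployedByHash.get(hash);
  if (existing !== undefined) return { id: existing, reused: true };
  const id = `dw-${hash.slice(0, 12)}`;
  deployedByHash.set(hash, id);
  return { id, reused: false };
}
```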
FAQ
Q: Can Dynamic Workers run Python or other languages?
No. Dynamic Workers run V8-based JavaScript and TypeScript. For polyglot agent pipelines, Cloudflare Sandboxes (a separate product, also GA) supports additional runtimes via microVMs. Dynamic Workers are specifically optimized for JS/TS at high frequency.
Q: Are Dynamic Workers available on the free Workers plan?
No. Dynamic Workers require the Workers Paid plan ($5/month). The feature is not available on the free tier.
Q: How does Dynamic Workers relate to Cloudflare Sandboxes?
They are complementary products. Cloudflare Sandboxes provides isolated microVM environments (Firecracker-based) for workloads that need a full OS environment. Dynamic Workers use V8 isolates for JavaScript-only workloads at lower latency and cost. Choose based on the language and filesystem requirements of your agent's generated code.
Q: Is the Code Mode pattern limited to Cloudflare's AI models?
No. @cloudflare/codemode integrates via the Vercel AI SDK (ai package) and has peer dependencies on @modelcontextprotocol/sdk and @tanstack/ai. You can use it with any AI model that supports tool calls — Anthropic Claude, OpenAI GPT, Google Gemini, or local models via compatible SDKs.
Q: What happens to the Dynamic Worker after execution?
The V8 isolate is discarded. No state persists unless you explicitly write to a Durable Object or R2 bucket before the Worker exits. For session-persistent state, @cloudflare/shell with R2 backing is the recommended pattern.
Key Takeaways
Cloudflare Dynamic Workers address a real infrastructure problem: AI agents that execute code need sandboxes, and containers are too slow and expensive for high-frequency use. V8 isolates have been hardened over eight years for exactly this multi-tenant, untrusted-code execution scenario.
The constraints — no filesystem, no environment variables, no network by default — are not limitations to work around. They are the security model. Design your agent tools with these boundaries in mind from the start, and the Code Mode pattern becomes a clean architecture: typed tools, LLM-written orchestration code, and isolated execution with no cross-tenant leakage.
The open beta is live as of March 24, 2026, behind the Workers Paid plan. The @cloudflare/codemode (v0.3.4), @cloudflare/worker-bundler (v0.1.3), and @cloudflare/agents (v0.0.16) packages are available on npm today.
Bottom Line
Dynamic Workers are the right infrastructure for AI agents that generate JavaScript. V8 isolates boot in milliseconds at a fraction of container cost, and the Code Mode pattern built on top of them trades LLM reasoning loops for generated programs, reducing both latency and inference spend. The hard constraint is language: if your agents generate Python or need POSIX tools, look at microVM alternatives instead.