AWS Lambda MicroVMs: Your AI Agent Can Write Code. But Should It Execute It?

Amazon Web Services (AWS) has launched Lambda MicroVMs, a new serverless compute primitive that provides VM-level isolation, near-instant startup, suspend/resume, and state preservation for running untrusted or AI-generated code.

This post explains why we need Lambda MicroVMs and how they work, then walks through a demo that builds ephemeral sandboxes for safely executing code written by an AI agent.

You can also jump straight to the Demo: Build Ephemeral Sandboxes for AI Agent Code Execution section. There is working code as well for building a Lambda MicroVM image and executing agent code in the MicroVM.

At the end, I share a Developer Perspective on how this changes the way we build modern applications. It's a must-read.


  1. Architecture

  2. The Developer's Problem in 2026

  3. The trade off before Lambda MicroVMs

  4. What is Lambda MicroVM

  5. Core Concepts for Lambda MicroVMs

  6. Demo: Build Ephemeral Sandboxes for AI Agent Code Execution

  7. Cost and Sizing

  8. Developer Perspective


Architecture

[Image: architecture diagram]

The Developer's Problem in 2026

If you've ever built an application where users run code you didn't write, such as a data analytics platform, an interactive coding environment, a CI/CD runner, or even a coding assistant, you'll realise these types of applications have one thing in common.

They're executing code that isn't part of your application.

That code is supplied by end users, generated by AI, or pulled from external sources. And all of these applications share the same three requirements:

  1. Isolation: One user's execution can't see, affect, or access another's

  2. Speed: Environments need to start instantly because users won't wait

  3. State that survives: Real work is multi-step. A user installs a package in step 1, loads data in step 2, and builds on that in step 3. If state resets between steps, the experience is broken. And when users go idle (switching tabs, taking a break), the environment should pause, not die. When they come back, everything should be exactly where they left it.

The trade off before Lambda MicroVMs

| Option | Isolation | Startup Speed | State Across Steps | State Preserved on Idle (Suspend/Resume) |
| --- | --- | --- | --- | --- |
| EC2 instance per user | Strong (full VM) | 30-60 seconds | Yes | Manual (hibernate setup, EBS backed only) |
| Containers (ECS/Fargate) | Moderate (shared kernel) | 5-10 seconds | Within container lifetime | No: container stops, state gone |
| Lambda Functions | Strong (Firecracker) | Sub-second | Stateless between invocations | No: execution env frozen/recycled unpredictably |
| Lambda MicroVMs | Strong (Firecracker) | Sub-second | Up to 8 hours | Auto suspend/resume: memory + disk snapshotted |

  • EC2 provides strong isolation but starts slowly, and hibernation is a hack.

  • Containers are fast, but sharing a kernel with untrusted code means significant hardening work.

  • Lambda Functions are instant and isolated, but they're designed for event-driven request/response, not multi-step sessions where step 2 depends on what was installed in step 1.

  • Lambda MicroVMs give you all four: strong isolation, sub-second start, persistent state across steps, and automatic suspend/resume that preserves everything while you stop paying for compute.

What is Lambda MicroVM

Think of it as a serverless compute environment that provides:

  • VM-level isolation: Each sandbox runs its own kernel (Firecracker) with full operating system capabilities (e.g. installing system packages or mounting filesystems). No shared memory, no shared filesystem.
  • Snapshot-based rapid startup.
  • State persistence up to 8 hours: Memory, disk, and running processes are all preserved across interactions within a session.
  • Auto suspend/resume: When idle, MicroVMs suspend. When traffic returns, they resume with everything intact.
  • Fine-grained control over ingress networking (port access, support for HTTP/2, gRPC, WebSockets) and egress networking (public internet access and VPC access).

Note: It's not a replacement for Lambda Functions but a sibling designed for a different workload. It is still serverless, so there is no infrastructure to manage.

Core Concepts for Lambda MicroVMs

AWS Lambda MicroVMs uses several resource types that we, as developers, need to create and manage. Understanding the following concepts provides the foundation for building applications with MicroVMs.

MicroVM: An isolated compute environment for a single tenant, user session, or job, with full operating system capabilities and near-instant launch and resume.

MicroVM Image: Defines the application environment of a MicroVM. When you create an image, Lambda builds it into a snapshot that enables near-instant launch.

Snapshot: A freeze of memory + disk + running processes + network connections.

Network Connectors: Control inbound and outbound traffic for a MicroVM. You associate connectors with a MicroVM at run time. Traffic can be routed to a VPC with a custom connector.

Auth Token: A short-lived JWE token scoped to a specific MicroVM + allowed ports + expiry. Required in X-aws-proxy-auth header for every inbound request.
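Because the token is short-lived but needed on every inbound request, one option is to reuse it until shortly before it expires rather than minting a fresh one per call. The sketch below is only an illustration of that idea: AuthTokenCache and mint_token are hypothetical names, and mint_token stands for whatever function in your code calls the token API and returns the token string.

```python
import time
from typing import Callable, Dict, Tuple


class AuthTokenCache:
    """Reuse a MicroVM auth token until shortly before it expires."""

    def __init__(
        self,
        mint_token: Callable[[str], str],
        ttl_seconds: float = 300.0,
        safety_margin_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._mint_token = mint_token         # hypothetical: wraps your token API call
        self._ttl = ttl_seconds               # should match the expiry you request
        self._margin = safety_margin_seconds  # refresh a little early
        self._clock = clock
        self._tokens: Dict[str, Tuple[str, float]] = {}

    def get(self, microvm_id: str) -> str:
        """Return a cached token, minting a new one when the old one is stale."""
        now = self._clock()
        cached = self._tokens.get(microvm_id)
        if cached is not None and now < cached[1]:
            return cached[0]
        token = self._mint_token(microvm_id)
        self._tokens[microvm_id] = (token, now + self._ttl - self._margin)
        return token

    def forget(self, microvm_id: str) -> None:
        """Drop the cached token, for example after the MicroVM is terminated."""
        self._tokens.pop(microvm_id, None)
```

Since each token is scoped to one MicroVM, the cache is keyed by MicroVM ID, and calling forget() on termination keeps stale entries from piling up.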

How the MicroVM Image Build Works

This is the part that makes sub-second startup possible. When you create or update a MicroVM image, Lambda doesn't just store your Dockerfile. It runs through the entire build, starts your app, and freezes everything in place.

Here's what happens behind the scenes:

  1. Lambda spins up a fresh VM using the base image you specified (al2023-minimal)
  2. Runs your Dockerfile instructions: installs packages, copies files, configures the environment
  3. Starts your application
  4. Optionally waits for your app to signal it's ready (HTTP 200 on the /ready hook)
  5. Takes a full Firecracker snapshot of memory, disk, running processes, and network connections

The resulting snapshot becomes your image. Future MicroVMs restore directly from it instead of booting from scratch, which is why startup is nearly instantaneous.

The gotcha: Since all MicroVMs from the same image start from identical state, anything unique you create during build (UUIDs, secrets, random seeds, DB connections) gets shared across every MicroVM. The fix: don't generate unique stuff at build time. Do it in the /run lifecycle hook, which runs after each individual MicroVM starts.
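As a rough illustration of that fix, the sketch below keeps per-instance values empty at import time and only fills them in when the /run hook fires. It assumes the hook reaches your application as an HTTP POST to the /run path; the HTTP method, the port number, and the names INSTANCE_STATE, initialize_instance_state, and serve_hooks are assumptions made for this example, not documented behaviour.

```python
import json
import os
import uuid
from http.server import BaseHTTPRequestHandler, HTTPServer

# Per-instance state. Deliberately empty at import time so nothing unique
# is baked into the snapshot taken during the image build.
INSTANCE_STATE: dict = {}


def initialize_instance_state() -> dict:
    """Create values that must differ for every MicroVM restored from the image."""
    INSTANCE_STATE["instance_id"] = str(uuid.uuid4())
    INSTANCE_STATE["session_secret"] = os.urandom(32).hex()
    return INSTANCE_STATE


class HookHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        # Assumption: the /run hook arrives as an HTTP POST on this path.
        if self.path == "/run":
            state = initialize_instance_state()
            body = json.dumps({"instance_id": state["instance_id"]}).encode()
            self.send_response(200)
        else:
            body = b'{"error": "unknown hook"}'
            self.send_response(404)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve_hooks(port: int = 9000) -> None:
    """Blocking call that serves hook requests; the port number is a placeholder."""
    HTTPServer(("0.0.0.0", port), HookHandler).serve_forever()
```

The point of the design is that the build-time snapshot only ever captures an empty INSTANCE_STATE, so two MicroVMs restored from the same image end up with different IDs and secrets.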

MicroVM Lifecycle States & Hooks (important for building reliable applications)

A MicroVM transitions through these states: PENDING → RUNNING → SUSPENDED → RESUME → TERMINATED

Each transition has a corresponding lifecycle hook, places where you plug in your setup or cleanup code:

| Hook | When it runs | Use it for |
| --- | --- | --- |
| /run | After snapshot restore, before accepting traffic | Generate per-instance unique IDs, reestablish DB connections, seed random values/payload that shouldn't be shared across MicroVMs from the same image |
| /resume | After waking from SUSPENDED state | Refresh expired tokens, reconnect pools that timed out during suspend |
| /suspend | Before memory + disk gets checkpointed | Flush write buffers, gracefully close connections you don't want frozen mid-handshake |
| /terminate | Before resources are released | Persist final results to S3 or external DB, send completion callbacks |

Note: If your /run hook fails or times out, the MicroVM goes straight to TERMINATED without ever reaching RUNNING, and your users won't see an error. Always add timeout handling and logging inside your hooks.
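One way to add that timeout handling and logging is to wrap each piece of hook work in a small helper with its own deadline. This is a minimal sketch using only the Python standard library; run_hook_step is a hypothetical helper name, and the deadline you pass should be shorter than whatever limit the platform applies to the hook.

```python
import logging
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout
from typing import Callable

logger = logging.getLogger("hooks")


def run_hook_step(name: str, step: Callable[[], None], deadline_seconds: float) -> bool:
    """Run one piece of hook work with a deadline; log failures instead of hanging."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(step)
    try:
        future.result(timeout=deadline_seconds)
        logger.info("hook step %s succeeded", name)
        return True
    except FutureTimeout:
        logger.error("hook step %s exceeded %.1fs", name, deadline_seconds)
        return False
    except Exception:
        logger.exception("hook step %s failed", name)
        return False
    finally:
        # Do not block on a stuck worker thread; it cannot be force-killed.
        pool.shutdown(wait=False)
```

A hook handler could call something like run_hook_step("reconnect-db", reconnect, 2.0), where reconnect is your own function, and decide from the boolean whether to report success or failure. Note that a timed-out step keeps running in its thread, so the work itself should also use its own network timeouts.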

Demo: Build Ephemeral Sandboxes for AI Agent Code Execution

I have created a repo with a working script for trying the following demo. The choice is yours: you can use the console or the script.

┌───────────────────────────────────────────────────────────────┐
│              YOUR APPLICATION                                   │
│                                                                │
│   1. create sandbox  ─── boto3 API call ──▶  Lambda Control   │
│   2. send work       ─── HTTPS POST ──────▶  MicroVM endpoint │
│   3. destroy sandbox ─── boto3 API call ──▶  Lambda Control   │
│                                                                │
└───────────────────────────────────────────────────────────────┘
                                                     │
                     ┌───────────────────────────────▼───────────┐
                     │         Firecracker MicroVMs              │
                     │                                           │
                     │  ┌──────────┐  ┌──────────┐ ┌──────────┐│
                     │  │ User A   │  │ User B   │ │ Job X    ││
                     │  │ (Python  │  │ (Node.js │ │ (CI run) ││
                     │  │  REPL)   │  │  IDE)    │ │          ││
                     │  └──────────┘  └──────────┘ └──────────┘│
                     │  Separate kernel, memory, disk per VM    │
                     └──────────────────────────────────────────┘

For this demo, I'm building an AI coding assistant that creates sandboxes on demand, executes generated code in them, and destroys them afterwards. The same pattern works for any use case, such as a web app backend, CI runner, or interactive platform.

Pre-requisites:

  • AWS account with access to a supported region (us-east-1, us-east-2, us-west-2, ap-northeast-1, eu-west-1)
  • Python 3.12+ with the strands-agents SDK installed (pip install strands-agents)
  • S3 bucket for the code artifact
  • AWS CLI updated to version 2.35.11

Step 1: The Sandbox Image (Dockerfile)

This defines what's available inside every sandbox. Think of it like setting up a workspace with all the tools someone might need. You don't know what they'll build but you know what they'll have access to.

FROM public.ecr.aws/lambda/microvms:al2023-minimal

# System dependencies (curl-minimal is already in the base image)
RUN microdnf install -y python3 python3-pip git && microdnf clean all

# Python packages available to AI-generated code
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Code executor
WORKDIR /sandbox
COPY executor.py .

EXPOSE 8080
CMD ["python3", "executor.py"]

Step 2: The Executor (Runs Inside the MicroVM)

An HTTP server that receives code, runs it as a subprocess, and returns the output. This is the "interface" between the outside world and the sandbox interior.

import subprocess
import os
import logging
from flask import Flask, request, jsonify

app = Flask(__name__)
logging.basicConfig(level=logging.INFO)

WORK_DIR = "/workspace"
os.makedirs(WORK_DIR, exist_ok=True)

@app.route("/execute", methods=["POST"])
def execute():
    payload = request.json
    code = payload.get("code", "")
    timeout = payload.get("timeout", 30)

    app.logger.info(f"=== Incoming request: {len(code)} chars, timeout: {timeout}s ===")
    app.logger.info(f"Code:\n{code}")

    code_file = os.path.join(WORK_DIR, "run.py")
    with open(code_file, "w") as f:
        f.write(code)

    try:
        result = subprocess.run(
            ["python3", code_file],
            capture_output=True,
            text=True,
            timeout=timeout,
            cwd=WORK_DIR,
        )
        return jsonify({
            "stdout": result.stdout,
            "stderr": result.stderr,
            "exit_code": result.returncode,
        })
    except subprocess.TimeoutExpired:
        return jsonify({
            "stdout": "",
            "stderr": f"Execution timed out after {timeout}s",
            "exit_code": -1,
        }), 408

@app.route("/health", methods=["GET"])
def health():
    return jsonify({"status": "ready"})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)

The security doesn't come from this code; it comes from the VM boundary around it. Even if submitted code achieves root inside the sandbox, it's root on an isolated Linux system with no visibility into other MicroVMs.
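Before building the image, it can help to smoke-test the executor on your own machine. The sketch below builds the JSON request that the /execute route above expects, using only the standard library. The helper names build_execute_request and execute_remote, and the localhost URL in the usage note, are assumptions for illustration.

```python
import json
import urllib.request


def build_execute_request(base_url: str, code: str, timeout: int = 30) -> urllib.request.Request:
    """Build the POST request that the executor's /execute route expects."""
    body = json.dumps({"code": code, "timeout": timeout}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/execute",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def execute_remote(base_url: str, code: str, timeout: int = 30) -> dict:
    """Send code to a running executor and return its JSON reply."""
    request = build_execute_request(base_url, code, timeout)
    # Allow a little extra time on top of the executor's own timeout.
    with urllib.request.urlopen(request, timeout=timeout + 5) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the executor running locally (python3 executor.py), calling execute_remote("http://localhost:8080", "print(2 + 2)") should return a dictionary with the stdout, stderr, and exit_code keys. Because the executor answers a timed-out run with HTTP 408, urlopen raises urllib.error.HTTPError in that case.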


Step 3: Build the MicroVM Image

Console way

You need to specify the S3 bucket that contains the zip file with your Dockerfile.

Note: If you are using hooks, you need to enable ports for these hooks in step 2 in your application code.

CLI method

Package, upload, and build:

# Package
zip sandbox-code.zip Dockerfile executor.py requirements.txt

# Upload to S3
aws s3 cp sandbox-code.zip s3://my-bucket/sandbox-code.zip

# Create the MicroVM image
aws lambda-microvms create-microvm-image \
  --name python-sandbox \
  --code-artifact uri=s3://my-bucket/sandbox-code.zip \
  --base-image-arn arn:aws:lambda:us-east-1:aws:microvm-image:al2023-1 \
  --build-role-arn arn:aws:iam::123456789012:role/MicroVMBuildRole \
  --region us-east-1

Lambda builds the container, starts Flask, confirms it's listening, and takes the Firecracker snapshot. Build logs go to CloudWatch under /aws/lambda-microvms/python-sandbox.

Once status is SUCCESSFUL, every future MicroVM launched from this image starts preinitialized.
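If you script the build, you will probably want to wait for that status before launching MicroVMs. The helper below is a generic polling sketch and deliberately makes no AWS calls itself: fetch_status is a callable you supply that returns the current image status string, and the "FAILED" status name is an assumption used for illustration.

```python
import time
from typing import Callable, Iterable


def wait_for_status(
    fetch_status: Callable[[], str],
    success: str = "SUCCESSFUL",
    failures: Iterable[str] = ("FAILED",),  # assumed failure status name
    poll_seconds: float = 10.0,
    max_wait_seconds: float = 900.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll fetch_status() until it reports success, a failure, or the wait runs out."""
    failure_set = set(failures)
    waited = 0.0
    while True:
        status = fetch_status()
        if status == success:
            return status
        if status in failure_set:
            raise RuntimeError(f"image build ended with status {status}")
        if waited >= max_wait_seconds:
            raise TimeoutError(f"still {status} after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds
```

Keeping the status lookup as an injected callable means the same loop works whether you read the status through the CLI, an SDK call, or a test double.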


Step 4: Wire It Into Your Application

Console

You can also run a MicroVM manually, configure it as per your needs, and even pass a payload to the /run hook. For our demo, I am controlling all of these aspects within the agent sandbox code.

Note: The execution role decides what the MicroVM can access within AWS. For this blog my role provides admin privileges, but for production the permissions need to be trimmed down.

CLI/Code way

Here's where your application creates, uses, and destroys sandboxes. I'm using the Strands Agents SDK for this demo; the agent directly manages the MicroVM lifecycle through three tool functions.

# agent.py
import boto3
import requests
from strands import Agent
from strands.tools import tool

# --- Configuration ---
REGION = "us-east-1"
IMAGE_NAME = "python-sandbox"
STACK_NAME = "microvm-sandbox-roles"

# Auto fetch ARNs from AWS
sts_client = boto3.client("sts", region_name=REGION)
cfn_client = boto3.client("cloudformation", region_name=REGION)

ACCOUNT_ID = sts_client.get_caller_identity()["Account"]
SANDBOX_IMAGE_ARN = f"arn:aws:lambda:{REGION}:{ACCOUNT_ID}:microvm-image:{IMAGE_NAME}"

stack_outputs = cfn_client.describe_stacks(StackName=STACK_NAME)["Stacks"][0]["Outputs"]
SANDBOX_ROLE_ARN = next(
    o["OutputValue"] for o in stack_outputs if o["OutputKey"] == "ExecutionRoleArn"
)

lambda_client = boto3.client("lambda-microvms", region_name=REGION)


@tool
def create_sandbox() -> dict:
    """Create an isolated MicroVM sandbox for executing code.
    Returns sandbox_id and endpoint URL."""
    response = lambda_client.run_microvm(
        imageIdentifier=SANDBOX_IMAGE_ARN,
        executionRoleArn=SANDBOX_ROLE_ARN,
        idlePolicy={
            "maxIdleDurationSeconds": 300,
            "suspendedDurationSeconds": 1800,
            "autoResumeEnabled": True,
        },
    )
    return {
        "sandbox_id": response["microvmId"],
        "endpoint": f"https://{response['endpoint']}",
    }


@tool
def run_code(sandbox_id: str, endpoint: str, code: str) -> dict:
    """Execute Python code in the isolated sandbox.
    State persists between calls: files and packages carry over."""
    token_response = lambda_client.create_microvm_auth_token(
        microvmIdentifier=sandbox_id,
        expirationInMinutes=5,
        allowedPorts=[{"port": 8080}],
    )
    token = token_response["authToken"]["X-aws-proxy-auth"]

    resp = requests.post(
        f"{endpoint}/execute",
        headers={
            "X-aws-proxy-auth": token,
            "X-aws-proxy-port": "8080",
        },
        json={"code": code, "timeout": 60},
        timeout=90,  # fail fast if the sandbox stops responding (60s run + margin)
    )
    return resp.json()


@tool
def destroy_sandbox(sandbox_id: str) -> str:
    """Terminate the sandbox. All memory, disk, and processes destroyed."""
    lambda_client.terminate_microvm(microvmIdentifier=sandbox_id)
    return "Sandbox terminated. All state destroyed."


# The agent — decides what code to write based on user's question
agent = Agent(
    model="us.anthropic.claude-sonnet-4-5-20250929-v1:0",
    system_prompt="""You are a data analysis assistant. When the user asks you
    to analyze data, write code, or compute anything:
    1. Create a sandbox with create_sandbox()
    2. Write Python code and execute it with run_code()
    3. You can call run_code() multiple times; state persists
    4. When finished, destroy the sandbox with destroy_sandbox()
    Always run code in the sandbox. Never just describe what code would do.""",
    tools=[create_sandbox, run_code, destroy_sandbox],
)

Note: The same pattern without an agent would be direct boto3 calls from your web backend. The three API calls are the same regardless of what's triggering them.
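For a backend without an agent, one possible shape is a context manager that guarantees the terminate call runs even when the work in between raises. This is a sketch under assumptions: the sandbox() helper is hypothetical, and the client argument is expected to expose the same run_microvm and terminate_microvm calls, with the same parameters and response keys, used in agent.py above.

```python
from contextlib import contextmanager
from typing import Any, Dict, Iterator


@contextmanager
def sandbox(
    client: Any, image_arn: str, role_arn: str, idle_policy: Dict[str, Any]
) -> Iterator[Dict[str, str]]:
    """Create a MicroVM, hand it to the caller, and always terminate it afterwards."""
    # Mirrors the run_microvm call shown in agent.py.
    response = client.run_microvm(
        imageIdentifier=image_arn,
        executionRoleArn=role_arn,
        idlePolicy=idle_policy,
    )
    microvm_id = response["microvmId"]
    try:
        yield {
            "sandbox_id": microvm_id,
            "endpoint": f"https://{response['endpoint']}",
        }
    finally:
        # Runs on success and on error, so no sandbox is left behind.
        client.terminate_microvm(microvmIdentifier=microvm_id)
```

Inside a request handler this reads as `with sandbox(lambda_client, SANDBOX_IMAGE_ARN, SANDBOX_ROLE_ARN, idle_policy) as sb:` followed by the HTTPS calls to sb["endpoint"].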


Step 5: See It Work (Multi-Step State Persistence)

Try these 3 prompts. Each one validates a different MicroVM capability:

Prompt 1: Prove it's running in an isolated VM (not locally)

You: Run code to print os.uname() and the contents of /etc/os-release

This proves VM-level isolation. The response shows Linux aarch64 and Amazon Linux 2023 instead of your local macOS.

Prompt 2: Prove filesystem state persists between calls

You: Write a file /workspace/state.txt with the current timestamp

     ... wait ~70 seconds (MicroVM auto-suspends after 60s idle) ...

You: Read /workspace/state.txt and print its contents

The MicroVM automatically suspends after maxIdleDurationSeconds.

When the user sends a prompt again, the suspended MicroVM resumes instantaneously to serve that request.

The MicroVM wrote a file, went idle, auto-suspended (memory + disk snapshotted), and when you came back with a new request, it auto-resumed with the file still there.

Note: This uses agent_suspend_demo.py, which has a 1-minute idle timeout (instead of the default 5 minutes) so you can see the suspend/resume cycle quickly. Run it with python agent_suspend_demo.py.

Prompt 3: Prove packages are preinstalled from the snapshot

You: Create a CSV file at /workspace/sales.csv with columns: product, revenue, region for 10 sample rows. Then read it back with pandas and tell me the top product by total revenue.

This proves snapshot-based startup. Pandas is immediately available because it was installed during the image build and captured in the Firecracker snapshot.

After each prompt, the agent calls destroy_sandbox(), proving lifecycle control (one API call, all state gone, no residual data).

Cost and Sizing

In our demo we used the default MicroVM size (2 GB / 1 vCPU). Lambda MicroVMs uses a baseline/peak sizing model. You configure a baseline, and the MicroVM can burst up to 4x during peak load.

The baseline is set via the memory parameter when creating your MicroVM image.

The minimum baseline size is 0.5 GB, and memory can go up to 8 GB.

What you actually pay:

| MicroVM State | What You Pay |
| --- | --- |
| Running (executing code) | Baseline compute rate per second |
| Burst (CPU exceeds baseline) | Per-second charge for resources above baseline |
| Suspended (idle, state snapshotted) | Snapshot storage only (nearly free) |
| Terminated | Nothing |
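To reason about a session's bill, the table can be turned into a small back-of-the-envelope formula. The sketch below is a simplified model, not the official pricing calculation: it assumes the burst charge is added on top of the baseline rate at a flat per-second price, and the Rates fields are placeholders that you fill in yourself from the pricing page. No real prices are included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Per-second prices; fill these in from the pricing page."""
    baseline_per_second: float
    burst_per_second: float      # simplification: flat extra charge while bursting
    suspended_per_second: float  # snapshot storage while suspended


def estimate_session_cost(
    rates: Rates,
    running_seconds: float,
    burst_seconds: float,
    suspended_seconds: float,
) -> float:
    """Estimate one session's cost: baseline + burst surcharge + suspended storage."""
    if burst_seconds > running_seconds:
        raise ValueError("burst time cannot exceed running time")
    return (
        running_seconds * rates.baseline_per_second
        + burst_seconds * rates.burst_per_second
        + suspended_seconds * rates.suspended_per_second
    )
```

The useful comparison is usually between a session that stays running while the user is idle and one that suspends: moving idle time from running_seconds to suspended_seconds shows how much auto-suspend saves.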

For detailed pricing, see Lambda MicroVMs pricing.

Developer Perspective

In this blog, we saw how AWS Lambda MicroVMs solve the fundamental problem of modern applications: WHERE DOES THAT UNTRUSTED CODE EXECUTE SAFELY?

With the rise of coding assistants, autonomous agents, and platforms that let users run arbitrary scripts, this problem isn't going away. It's getting worse. Lambda MicroVMs give you that answer without making you build the isolation layer yourself.

The scary cases haven't gone away: users discovering /proc, AI agents executing rm -rf /, malicious packages attempting container escapes. The difference is that these are now AWS's problem. The blast radius is a disposable VM that you terminate with one API call.

The fastest way to build with Lambda MicroVMs while following best practices is to add the skill provided by the Lambda team to your agent, such as Kiro or Claude:

npx skills add https://github.com/aws/agent-toolkit-for-aws/tree/main/skills/specialized-skills/serverless-skills/aws-lambda-microvms --yes --global

Let me know what you think in the comments and if you build something with Lambda MicroVMs, I'd genuinely love to see it.

I share AWS updates like this on DevOps, Kubernetes, and GenAI daily on LinkedIn and X. Follow me there so that I can make your life easier.

Top comments (1)

Simrat Chamkaur

Very insightful. However, what would be the cost structure for using Lambda microVM?