Jatin Mehrotra for AWS Community Builders

Posted on Jun 26

AWS Lambda MicroVMs: Your AI Agent Can Write Code. But Should It Execute It?

#ai #aws #agents #serverless

Amazon Web Services (AWS) has launched Lambda MicroVMs, a new serverless compute primitive that provides VM-level isolation, near-instant startup, suspend/resume, and state preservation for running untrusted or AI-generated code.

This post explains why we need Lambda MicroVMs and how they work, then walks through a demo that builds ephemeral sandboxes for safely executing code written by an AI agent.

You can also jump straight to the Demo: Build Ephemeral Sandboxes for AI Agent Code Execution. There is working code as well for building a Lambda MicroVM and executing agent code inside it.

At the end I share a Developer Perspective on how this changes the way we build modern applications. It's a must-read.

Architecture
The Developer's Problem in 2026
The trade off before Lambda MicroVMs
What is Lambda MicroVM
Core Concepts for Lambda MicroVMs
Demo: Build Ephemeral Sandboxes for AI Agent Code Execution
Cost and Sizing
Developer Perspective

Architecture

The Developer's Problem in 2026

If you've ever built an application where users run code you didn't write, such as a data analytics platform, an interactive coding environment, a CI/CD runner, or even a coding assistant, you'll realise these types of applications have one thing in common.

They're executing code that isn't part of your application.

Code supplied by end users, generated by AI, or pulled from external sources. And they all share the same three requirements:

Isolation: One user's execution can't see, affect, or access another's
Speed: Environments need to start instantly because users won't wait
State that survives: Real work is multi-step. A user installs a package in step 1, loads data in step 2, and builds on that in step 3. If state resets between steps, the experience is broken. And when users go idle (switching tabs, taking a break), the environment should pause, not die. When they come back, everything should be exactly where they left it.

The trade off before Lambda MicroVMs

Option	Isolation	Startup Speed	State Across Steps	State Preserved on Idle (Suspend/Resume)
EC2 instance per user	Strong (full VM)	30-60 seconds	Yes	Manual (hibernate setup, EBS-backed only)
Containers (ECS/Fargate)	Moderate (shared kernel)	5-10 seconds	Within container lifetime	No (container stops, state gone)
Lambda Functions	Strong (Firecracker)	Sub-second	Stateless between invocations	No (execution environment frozen/recycled unpredictably)
Lambda MicroVMs	Strong (Firecracker)	Sub-second	Up to 8 hours	Auto suspend/resume (memory + disk snapshotted)

EC2 provides strong isolation but starts slowly and hibernation is a hack.
Containers are fast, but sharing a kernel with untrusted code means significant hardening work.
Lambda Functions are instant and isolated, but they're designed for event-driven request/response, not multi-step sessions where step 2 depends on what was installed in step 1.
Lambda MicroVMs give you all four: strong isolation, sub-second start, persistent state across steps, and automatic suspend/resume that preserves everything while you stop paying for compute.

What is Lambda MicroVM

Think of it as a serverless compute environment that provides:

VM-level isolation: Each sandbox runs its own kernel (Firecracker) with full operating system capabilities (e.g. installing system packages or mounting filesystems). No shared memory, no shared filesystem.
Snapshot-based rapid startup.
State persistence up to 8 hours: Memory, disk, and running processes are all preserved across interactions within a session.
Auto suspend/resume: When idle, MicroVMs suspend. When traffic returns, they resume with everything intact.
Fine-grained control over ingress networking (port access, support for HTTP/2, gRPC, WebSockets) and egress networking (public internet access and VPC access).

Note: It's not a replacement for Lambda Functions, but a sibling designed for a different workload. Like Lambda Functions, it leaves you with no infrastructure to manage.

Core Concepts for Lambda MicroVMs

AWS Lambda MicroVMs uses several resource types that we, as developers, need to create and manage. Understanding the following provides the foundation for building applications with MicroVMs.

MicroVM: An isolated compute environment for a single tenant, user session, or job with full operating system capabilities, providing near-instant launch and resume.

MicroVM Image: Defines the application environment of a MicroVM. When you create an image, Lambda builds it into a snapshot that enables near instant launch

Snapshot: A freeze of memory + disk + running processes + network connections.

Network Connectors: Control inbound and outbound traffic for a MicroVM. You associate connectors with a MicroVM at run time. Traffic can be routed into a VPC with a custom connector.

Auth Token: A short-lived JWE token scoped to a specific MicroVM + allowed ports + expiry. Required in the X-aws-proxy-auth header for every inbound request.

How the MicroVM Image Build Works

This is the part that makes sub-second startup possible. When you create or update a MicroVM image, Lambda doesn't just store your Dockerfile: it runs through the entire build, starts your app, and freezes everything in place.

Here's what happens behind the scenes:

Lambda spins up a fresh VM using the base image you specified (al2023-minimal)
Runs your Dockerfile instructions: installs packages, copies files, configures the environment
Starts your application
Optionally waits for your app to signal it's ready (HTTP 200 on the /ready hook)
Takes a full Firecracker snapshot of memory, disk, running processes, network connections

The resulting snapshot becomes your image. Future MicroVMs restore directly from it instead of booting from scratch, which is why startup is nearly instantaneous.

The gotcha: Since all MicroVMs from the same image start from identical state, anything unique you create during build (UUIDs, secrets, random seeds, DB connections) gets shared across every MicroVM. The fix: don't generate unique stuff at build time. Do it in the /run lifecycle hook which runs after each individual MicroVM starts.
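
To make the gotcha concrete, here is a minimal simulation in plain Python. It does not call any Lambda MicroVMs API: the snapshot is modelled as a dictionary, a restore is modelled as a deep copy, and `on_run` is a hypothetical stand-in for whatever your /run hook does. All function names are assumptions made for illustration.

```python
import copy
import uuid


def build_image_state() -> dict:
    """Simulates build time: whatever is created here ends up in the snapshot."""
    return {"instance_id": str(uuid.uuid4()), "packages": ["pandas"]}


def restore_from_snapshot(snapshot: dict) -> dict:
    """Simulates launching a MicroVM: every launch gets an identical copy."""
    return copy.deepcopy(snapshot)


def on_run(state: dict) -> dict:
    """Hypothetical /run hook body: regenerate anything that must be unique."""
    state["instance_id"] = str(uuid.uuid4())
    return state


snapshot = build_image_state()

# Without a /run hook, both "MicroVMs" share the build-time ID.
vm_a = restore_from_snapshot(snapshot)
vm_b = restore_from_snapshot(snapshot)
shared = vm_a["instance_id"] == vm_b["instance_id"]

# With the hook, each one gets its own ID and keeps the preinstalled packages.
vm_c = on_run(restore_from_snapshot(snapshot))
vm_d = on_run(restore_from_snapshot(snapshot))
unique = vm_c["instance_id"] != vm_d["instance_id"]

print(f"shared without hook: {shared}, unique with hook: {unique}")
```

The same reasoning applies to secrets, random seeds, and open connections: anything that exists at snapshot time is duplicated, so it has to be recreated per instance.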

MicroVM Lifecycle States & Hooks (important for building reliable applications)

A MicroVM transitions through these states: PENDING → RUNNING → SUSPENDED → RESUME → TERMINATED

Each transition has a corresponding lifecycle hook, places where you plug in your setup or cleanup code:

Hook	When it runs	Use it for
`/run`	After snapshot restore, before accepting traffic	Generate per instance unique IDs, reestablish DB connections, seed random values/payload that shouldn't be shared across MicroVMs from the same image
`/resume`	After waking from SUSPENDED state	Refresh expired tokens, reconnect pools that timed out during suspend
`/suspend`	Before memory+disk gets checkpointed	Flush write buffers, gracefully close connections you don't want frozen mid handshake
`/terminate`	Before resources are released	Persist final results to S3 or external DB, send completion callbacks

Note: If your /run hook fails or times out, the MicroVM goes straight to TERMINATED without ever reaching RUNNING, and your users won't see an error. Always add timeout handling and logging inside your hooks.
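
One way to add that timeout handling and logging is to wrap each hook body in a small guard. The sketch below uses only the standard library. How hooks are registered with the platform is not covered here, so `run_hook` and the two example bodies are hypothetical names that only show the guarding pattern.

```python
import logging
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("hooks")


def run_hook(name, fn, timeout_s: float) -> bool:
    """Run a hook body with a deadline; log the outcome and return success."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn)
    try:
        future.result(timeout=timeout_s)
        log.info("hook %s completed", name)
        return True
    except FutureTimeout:
        log.error("hook %s exceeded %.1fs", name, timeout_s)
        return False
    except Exception:
        log.exception("hook %s raised", name)
        return False
    finally:
        # Do not block on a stuck worker thread; let it finish in the background.
        pool.shutdown(wait=False)


def reconnect_db():
    time.sleep(0.01)  # stand-in for real setup work


def slow_setup():
    time.sleep(0.3)  # stand-in for a dependency that hangs


ok = run_hook("run", reconnect_db, timeout_s=1.0)
timed_out = run_hook("run", slow_setup, timeout_s=0.05)
```

Returning a boolean instead of raising lets the caller decide what to do next, and the log line is what you will look for when a MicroVM never reaches RUNNING.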

Demo: Build Ephemeral Sandboxes for AI Agent Code Execution

I have created a repo with a working script for the following demo. The choice is yours: you can use the console or the script.

┌───────────────────────────────────────────────────────────────┐
│              YOUR APPLICATION                                   │
│                                                                │
│   1. create sandbox  ─── boto3 API call ──▶  Lambda Control   │
│   2. send work       ─── HTTPS POST ──────▶  MicroVM endpoint │
│   3. destroy sandbox ─── boto3 API call ──▶  Lambda Control   │
│                                                                │
└───────────────────────────────────────────────────────────────┘
                                                     │
                     ┌───────────────────────────────▼───────────┐
                     │         Firecracker MicroVMs              │
                     │                                           │
                     │  ┌──────────┐  ┌──────────┐ ┌──────────┐│
                     │  │ User A   │  │ User B   │ │ Job X    ││
                     │  │ (Python  │  │ (Node.js │ │ (CI run) ││
                     │  │  REPL)   │  │  IDE)    │ │          ││
                     │  └──────────┘  └──────────┘ └──────────┘│
                     │  Separate kernel, memory, disk per VM    │
                     └──────────────────────────────────────────┘

For this demo, I'm building an AI coding assistant that creates sandboxes on demand, executes generated code in them, and destroys them afterwards. But the same pattern works for any use case, such as a web app backend, CI runner, or interactive platform.

Pre-requisites:

AWS account with access to a supported region (us-east-1, us-east-2, us-west-2, ap-northeast-1, eu-west-1)
Python 3.12+, strands-agents SDK installed (pip install strands-agents)
S3 bucket for the code artifact
AWS CLI updated to version 2.35.11

Step 1: The Sandbox Image (Dockerfile)

This defines what's available inside every sandbox. Think of it like setting up a workspace with all the tools someone might need. You don't know what they'll build but you know what they'll have access to.

FROM public.ecr.aws/lambda/microvms:al2023-minimal

# System dependencies (curl-minimal is already in the base image)
RUN microdnf install -y python3 python3-pip git && microdnf clean all

# Python packages available to AI-generated code
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Code executor
WORKDIR /sandbox
COPY executor.py .

EXPOSE 8080
CMD ["python3", "executor.py"]

Step 2: The Executor (Runs Inside the MicroVM)

An HTTP server that receives code, runs it as a subprocess, and returns the output. This is the "interface" between the outside world and the sandbox interior.

import subprocess
import os
import logging
from flask import Flask, request, jsonify

app = Flask(__name__)
logging.basicConfig(level=logging.INFO)

WORK_DIR = "/workspace"
os.makedirs(WORK_DIR, exist_ok=True)

@app.route("/execute", methods=["POST"])
def execute():
    # Tolerate a missing or non-JSON body instead of failing on None.
    payload = request.get_json(silent=True) or {}
    code = payload.get("code", "")
    timeout = payload.get("timeout", 30)

    app.logger.info(f"=== Incoming request: {len(code)} chars, timeout: {timeout}s ===")
    app.logger.info(f"Code:\n{code}")

    code_file = os.path.join(WORK_DIR, "run.py")
    with open(code_file, "w") as f:
        f.write(code)

    try:
        result = subprocess.run(
            ["python3", code_file],
            capture_output=True,
            text=True,
            timeout=timeout,
            cwd=WORK_DIR,
        )
        return jsonify({
            "stdout": result.stdout,
            "stderr": result.stderr,
            "exit_code": result.returncode,
        })
    except subprocess.TimeoutExpired:
        return jsonify({
            "stdout": "",
            "stderr": f"Execution timed out after {timeout}s",
            "exit_code": -1,
        }), 408

@app.route("/health", methods=["GET"])
def health():
    return jsonify({"status": "ready"})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)

The security doesn't come from this code; it comes from the VM boundary around it. Even if submitted code achieves root inside the sandbox, it's root of an isolated Linux guest with no visibility into other MicroVMs.
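
If you want to check the execution logic before building an image, its core can be exercised locally without Flask. The sketch below mirrors the subprocess call from the executor with two deliberate differences, both assumptions for local testing: it uses `sys.executable` instead of `python3`, and a throwaway temporary directory instead of the persistent /workspace. `run_snippet` is a hypothetical helper name.

```python
import os
import subprocess
import sys
import tempfile


def run_snippet(code: str, timeout: int = 30) -> dict:
    """Write code to a scratch directory, run it in a child process, capture output."""
    with tempfile.TemporaryDirectory() as work_dir:
        code_file = os.path.join(work_dir, "run.py")
        with open(code_file, "w") as f:
            f.write(code)
        try:
            result = subprocess.run(
                [sys.executable, code_file],
                capture_output=True,
                text=True,
                timeout=timeout,
                cwd=work_dir,
            )
        except subprocess.TimeoutExpired:
            return {
                "stdout": "",
                "stderr": f"Execution timed out after {timeout}s",
                "exit_code": -1,
            }
        return {
            "stdout": result.stdout,
            "stderr": result.stderr,
            "exit_code": result.returncode,
        }


ok = run_snippet("print(1 + 1)")
failed = run_snippet("raise SystemExit(3)")
slow = run_snippet("import time; time.sleep(5)", timeout=1)
```

Running this on your own machine gives you none of the isolation described above, so only feed it code you trust.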

Step 3: Build the MicroVM Image

Console way

You need to specify the S3 bucket that holds the zip file containing the Dockerfile.

Note: If you are using hooks, you need to enable the ports for these hooks in step 2, in your application code.

CLI method

Package, upload, and build:

# Package
zip sandbox-code.zip Dockerfile executor.py requirements.txt

# Upload to S3
aws s3 cp sandbox-code.zip s3://my-bucket/sandbox-code.zip

# Create the MicroVM image
aws lambda-microvms create-microvm-image \
  --name python-sandbox \
  --code-artifact uri=s3://my-bucket/sandbox-code.zip \
  --base-image-arn arn:aws:lambda:us-east-1:aws:microvm-image:al2023-1 \
  --build-role-arn arn:aws:iam::123456789012:role/MicroVMBuildRole \
  --region us-east-1

Lambda builds the container, starts Flask, confirms it's listening, and takes the Firecracker snapshot. Build logs go to CloudWatch under /aws/lambda-microvms/python-sandbox.

Once status is SUCCESSFUL, every future MicroVM launched from this image starts preinitialized.
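
If you script the build, you will probably want to wait for that status before launching anything. Here is a generic polling sketch. The status lookup is injected as a callable because the exact API for reading image status is not shown in this post. The `FAILED` and `PENDING` values and the function name `wait_for_image` are assumptions; only `SUCCESSFUL` comes from the text above.

```python
import time
from typing import Callable


def wait_for_image(
    get_status: Callable[[], str],
    poll_s: float = 5.0,
    max_wait_s: float = 900.0,
) -> str:
    """Poll until the build reports a terminal status or the deadline passes."""
    deadline = time.monotonic() + max_wait_s
    while True:
        status = get_status()
        if status in ("SUCCESSFUL", "FAILED"):  # "FAILED" is an assumed name
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"image still {status} after {max_wait_s}s")
        time.sleep(poll_s)


# Stand-in for a real status lookup: two in-progress polls, then success.
fake_statuses = iter(["PENDING", "PENDING", "SUCCESSFUL"])
final = wait_for_image(lambda: next(fake_statuses), poll_s=0.01)
print(final)
```

In real use you would pass a function that calls the service and returns the image status string.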

Step 4: Wire It Into Your Application

Console

You can also run a MicroVM manually, configure it to suit your needs, and even pass a payload to the /run hook. For our demo I am controlling all of this from within the agent sandbox code.

Note: The execution role decides what the MicroVM can access within AWS. For this blog my role has admin privileges, but for production the permissions need to be trimmed down.

CLI/Code way

Here's where your application creates, uses, and destroys sandboxes. I'm using the Strands Agents SDK for this demo; the agent directly manages the MicroVM lifecycle through three tool functions.

# agent.py
import boto3
import requests
from strands import Agent
from strands.tools import tool

# --- Configuration ---
REGION = "us-east-1"
IMAGE_NAME = "python-sandbox"
STACK_NAME = "microvm-sandbox-roles"

# Auto fetch ARNs from AWS
sts_client = boto3.client("sts", region_name=REGION)
cfn_client = boto3.client("cloudformation", region_name=REGION)

ACCOUNT_ID = sts_client.get_caller_identity()["Account"]
SANDBOX_IMAGE_ARN = f"arn:aws:lambda:{REGION}:{ACCOUNT_ID}:microvm-image:{IMAGE_NAME}"

stack_outputs = cfn_client.describe_stacks(StackName=STACK_NAME)["Stacks"][0]["Outputs"]
SANDBOX_ROLE_ARN = next(
    o["OutputValue"] for o in stack_outputs if o["OutputKey"] == "ExecutionRoleArn"
)

lambda_client = boto3.client("lambda-microvms", region_name=REGION)


@tool
def create_sandbox() -> dict:
    """Create an isolated MicroVM sandbox for executing code.
    Returns sandbox_id and endpoint URL."""
    response = lambda_client.run_microvm(
        imageIdentifier=SANDBOX_IMAGE_ARN,
        executionRoleArn=SANDBOX_ROLE_ARN,
        idlePolicy={
            "maxIdleDurationSeconds": 300,
            "suspendedDurationSeconds": 1800,
            "autoResumeEnabled": True,
        },
    )
    return {
        "sandbox_id": response["microvmId"],
        "endpoint": f"https://{response['endpoint']}",
    }


@tool
def run_code(sandbox_id: str, endpoint: str, code: str) -> dict:
    """Execute Python code in the isolated sandbox.
    State persists between calls: files and packages carry over."""
    token_response = lambda_client.create_microvm_auth_token(
        microvmIdentifier=sandbox_id,
        expirationInMinutes=5,
        allowedPorts=[{"port": 8080}],
    )
    token = token_response["authToken"]["X-aws-proxy-auth"]

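    # The HTTP timeout is longer than the 60 s execution timeout so the
    # executor can report its own timeout before this request gives up.
    resp = requests.post(
        f"{endpoint}/execute",
        headers={
            "X-aws-proxy-auth": token,
            "X-aws-proxy-port": "8080",
        },
        json={"code": code, "timeout": 60},
        timeout=90,
    )
    return resp.json()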


@tool
def destroy_sandbox(sandbox_id: str) -> str:
    """Terminate the sandbox. All memory, disk, and processes destroyed."""
    lambda_client.terminate_microvm(microvmIdentifier=sandbox_id)
    return "Sandbox terminated. All state destroyed."


# The agent — decides what code to write based on user's question
agent = Agent(
    model="us.anthropic.claude-sonnet-4-5-20250929-v1:0",
    system_prompt="""You are a data analysis assistant. When the user asks you
    to analyze data, write code, or compute anything:
    1. Create a sandbox with create_sandbox()
    2. Write Python code and execute it with run_code()
    3. You can call run_code() multiple times; state persists
    4. When finished, destroy the sandbox with destroy_sandbox()
    Always run code in the sandbox. Never just describe what code would do.""",
    tools=[create_sandbox, run_code, destroy_sandbox],
)

Note: The same pattern without an agent would be a direct boto3 call from your web backend. The three API calls are the same regardless of what's triggering them.
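
The run_code tool above requests a fresh auth token on every call. For a chatty session you might reuse a token until it is close to expiry instead. The sketch below is one possible approach, not part of the demo repo: `TokenCache` and `proxy_headers` are hypothetical helpers, the token fetch is injected as a callable, and the clock is injectable so the behaviour can be checked without waiting. The header names are the ones used in run_code.

```python
import time
from typing import Callable, Optional


class TokenCache:
    """Reuse an auth token until it is close to expiring."""

    def __init__(
        self,
        fetch_token: Callable[[], str],
        ttl_s: float,
        refresh_margin_s: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._fetch = fetch_token
        self._ttl = ttl_s
        self._margin = refresh_margin_s
        self._clock = clock
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            self._token = self._fetch()
            self._expires_at = now + self._ttl
        return self._token


def proxy_headers(token: str, port: int) -> dict:
    """Headers that run_code sends with each request to the MicroVM."""
    return {"X-aws-proxy-auth": token, "X-aws-proxy-port": str(port)}


# Fake clock and fake fetch so the sketch runs without AWS access.
fake_now = [0.0]
issued = []


def fake_fetch() -> str:
    issued.append(f"token-{len(issued) + 1}")
    return issued[-1]


cache = TokenCache(fake_fetch, ttl_s=300, clock=lambda: fake_now[0])
first = cache.get()
fake_now[0] = 100.0
second = cache.get()  # still valid, reused
fake_now[0] = 280.0
third = cache.get()   # inside the 30 s margin, refreshed
```

Because a token is scoped to one MicroVM, keep one cache per sandbox rather than a global one.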

Step 5: See It Work (Multi-Step State Persistence)

Try these 3 prompts. Each one validates a different MicroVM capability:

Prompt 1: Prove it's running in an isolated VM (not locally)

You: Run code to print os.uname() and the contents of /etc/os-release

This proves VM-level isolation. The response will show Linux aarch64 and Amazon Linux 2023 instead of your local macOS.

Prompt 2: Prove filesystem state persists between calls

You: Write a file /workspace/state.txt with the current timestamp

     ... wait ~70 seconds (MicroVM auto-suspends after 60s idle) ...

You: Read /workspace/state.txt and print its contents

The MicroVM automatically suspends after maxIdleDurationSeconds.

When the user sends a prompt again, it resumes instantaneously.

After suspension, the MicroVM resumes to serve the request above.

The MicroVM wrote a file, went idle, auto-suspended (memory + disk snapshotted), and when you came back with a new request it auto-resumed with the file still there.

Note: This uses agent_suspend_demo.py which has a 1 minute idle timeout (instead of the default 5 minutes) so you can see the suspend/resume cycle quickly. Run it with python agent_suspend_demo.py.

Prompt 3: Prove packages are preinstalled from the snapshot

You: Create a CSV file at /workspace/sales.csv with columns: product, revenue, region for 10 sample rows. Then read it back with pandas and tell me the top product by total revenue.

This proves snapshot-based startup. Pandas is immediately available because it was installed during image build and captured in the Firecracker snapshot.

After each prompt the agent calls destroy_sandbox(), proving lifecycle control (one API call, all state gone, no residual data).
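
In the demo the agent is told to call destroy_sandbox() itself. If you drive the same flow from ordinary backend code, a context manager is one way to make sure cleanup happens even when the work in between fails. This is a sketch under assumptions: `sandbox` is a hypothetical helper, and it takes plain create and destroy callables with the same shape as the tool functions above, replaced here by fakes so it runs without AWS access.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def sandbox(
    create: Callable[[], dict],
    destroy: Callable[[str], object],
) -> Iterator[dict]:
    """Create a sandbox, hand it to the caller, and always destroy it afterwards."""
    info = create()
    try:
        yield info
    finally:
        destroy(info["sandbox_id"])


# Stand-ins so the sketch runs locally.
destroyed = []


def fake_create() -> dict:
    return {"sandbox_id": "mvm-123", "endpoint": "https://example.invalid"}


try:
    with sandbox(fake_create, destroyed.append) as sb:
        raise RuntimeError("generated code blew up")
except RuntimeError:
    pass

print(destroyed)
```

The `finally` block is what guarantees the destroy call, so a crash in your own code does not leave a MicroVM running.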

Cost and Sizing

In our demo we used the default MicroVM size (2 GB / 1 vCPU). Lambda MicroVMs uses a baseline/peak sizing model. You configure a baseline, and the MicroVM can burst up to 4x during peak load:

Baseline is set via the memory parameter when creating your MicroVM image.

The minimum baseline size is 0.5 GB, and the maximum memory is up to 8 GB.

What you actually pay:

MicroVM State	What You Pay
Running (executing code)	Baseline compute rate per second
Burst (CPU exceeds baseline)	Per second charge for resources above baseline
Suspended (idle, state snapshotted)	Snapshot storage only (nearly free)
Terminated	Nothing

For detailed pricing, see Lambda MicroVMs pricing.
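
To see how the billing states in the table combine, here is a rough back-of-the-envelope calculator. Every rate in it is a made-up placeholder, not an AWS price, and burst is simplified to a flat per-second surcharge, whereas the real charge depends on how far above baseline you go. `Rates` and `session_cost` are hypothetical names; plug in the numbers from the pricing page before drawing any conclusion.

```python
from dataclasses import dataclass


@dataclass
class Rates:
    """All values are made-up placeholders, not AWS prices."""

    baseline_per_s: float = 0.00002    # hypothetical baseline compute rate
    burst_per_s: float = 0.00001       # hypothetical surcharge while above baseline
    snapshot_per_s: float = 0.0000001  # hypothetical suspended storage rate


def session_cost(
    running_s: float, burst_s: float, suspended_s: float, rates: Rates
) -> float:
    """Sum the three billable states from the table; terminated time costs nothing."""
    return (
        running_s * rates.baseline_per_s
        + burst_s * rates.burst_per_s
        + suspended_s * rates.snapshot_per_s
    )


# A 30-minute session: 10 min running (2 of them bursting), 20 min suspended.
cost = session_cost(running_s=600, burst_s=120, suspended_s=1200, rates=Rates())

# The same 30 minutes with no suspend at all.
always_on = session_cost(running_s=1800, burst_s=120, suspended_s=0, rates=Rates())
```

Whatever the real rates are, the shape of the comparison is the point: idle time moves from the compute rate to the much smaller snapshot storage rate.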

Developer Perspective

In this post we saw how AWS Lambda MicroVMs solves a fundamental problem of modern applications: WHERE DOES THAT UNTRUSTED CODE EXECUTE SAFELY?

With the rise of coding assistants, autonomous agents, and platforms that let users run arbitrary scripts this problem isn't going away. It's getting worse. Lambda MicroVMs gives that answer without making you build the isolation layer yourself.

The scary cases haven't gone away: users discovering /proc, AI agents executing rm -rf /, malicious packages attempting container escapes. The difference is that these are now AWS's problem. The blast radius is a disposable VM that you terminate with one API call.

The fastest way to build with Lambda MicroVMs while following best practices is to add the skill provided by the Lambda team to your agent, such as Kiro or Claude:

npx skills add https://github.com/aws/agent-toolkit-for-aws/tree/main/skills/specialized-skills/serverless-skills/aws-lambda-microvms --yes --global

Let me know what you think in the comments and if you build something with Lambda MicroVMs, I'd genuinely love to see it.

I share AWS updates like this on DevOps, Kubernetes and GenAI daily on LinkedIn and X. Follow me there so that I can make your life easier.

Top comments (1)

Simrat Chamkaur • Jun 26

Very insightful. However, what would be the cost structure for using Lambda microVM?