- Book: AI Agents Pocket Guide: Patterns for Building Autonomous Systems with LLMs
- Also by me: Thinking in Go (2-book series) — Complete Guide to Go Programming + Hexagonal Architecture in Go
- My project: Hermes IDE | GitHub — an IDE for developers who ship with Claude Code and other AI coding tools
- Me: xgabriel.com | GitHub
Anthropic shipped Computer Use as a public beta in October 2024. The feature is still labelled beta in the official docs, and the tool definition has gone through several dated revisions since launch (consult the docs for the current computer_* identifier and the matching beta header). Across the public reports and the deployments I have watched up close, the same patterns keep surfacing once a Computer Use agent meets real traffic.
The five below are those patterns. None of them is exotic, and every one of them separates a Computer Use agent that runs unattended overnight from one you turn off the morning after launch.
Pattern 1: read-only first, full input only when you have to
The Computer Use tool exposes a screenshot action and a set of input actions: left_click, type, key, scroll, mouse_move, plus fine-grained left_mouse_down/up and hold_key. Newer tool revisions add a zoom action; check the Anthropic docs for which actions ship with the version you pin. The screenshot action is read-only. Everything else mutates state in the box you handed Claude.
The fastest way to de-risk an agent is to start without input actions at all. Build a tool wrapper that filters Claude's tool calls down to the non-mutating actions (screenshot, zoom, wait), and ship that version first to verify the model can see what you think it can see. Most production failures in the first weeks come from screenshot resolution issues, not from input mistakes. Claude clicking the wrong button is downstream of Claude misreading the screen.
READ_ONLY_ACTIONS = {"screenshot", "zoom", "wait"}

def filter_tool_call(action: str) -> bool:
    # Gate: only non-mutating actions pass during the read-only phase.
    return action in READ_ONLY_ACTIONS
Once the read-only loop runs cleanly across your traffic shapes, switch on input actions one bucket at a time: keyboard first, then clicks, then drags. Teams I have spoken with run a read-only week at staging before the input layer turns on, and that week tends to surface screenshot-scaling bugs that would have cost them clicks on the wrong widget in production.
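One way to stage that rollout is to treat the buckets as config and widen the allowed set per stage. A minimal sketch; the bucket names and groupings are mine, not an SDK concept, so check which actions your pinned tool version actually ships before copying them.

ACTION_BUCKETS = {
    "read_only": {"screenshot", "zoom", "wait"},
    "keyboard": {"key", "type"},
    "click": {"left_click", "double_click", "scroll"},
    "drag": {"mouse_move", "left_mouse_down", "left_mouse_up", "hold_key"},
}

def allowed_actions(stages):
    # Union of every enabled bucket, e.g. allowed_actions(["read_only", "keyboard"]).
    return set().union(*(ACTION_BUCKETS[s] for s in stages))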
Pattern 2: bound the loop
The Anthropic reference implementation ships a max_iterations parameter in its sample loop, and the surrounding docs call out an iteration limit as the way to prevent runaway API costs. The agent loop runs until Claude stops calling tools or until the limit fires, and there is no built-in cap on the API side.
A bounded loop is three counters and a human-in-the-loop hatch:
import time

import anthropic

client = anthropic.Anthropic()

MAX_ACTIONS = 30
MAX_SECONDS = 600
MAX_INPUT_TOKENS = 200_000

def run_session(user_prompt: str):
    messages = [{"role": "user", "content": user_prompt}]
    actions = 0
    started = time.monotonic()
    input_tokens = 0
    while True:
        if actions >= MAX_ACTIONS:
            return pause_for_human(messages, "action_budget_exceeded")
        if time.monotonic() - started > MAX_SECONDS:
            return pause_for_human(messages, "wall_clock_exceeded")
        if input_tokens > MAX_INPUT_TOKENS:
            return pause_for_human(messages, "token_budget_exceeded")
        response = client.beta.messages.create(
            model="claude-sonnet-4-5",  # use the current Computer-Use-eligible model id from the Anthropic console
            max_tokens=4096,
            messages=messages,
            tools=TOOLS,
            betas=["<current-computer-use-beta-header>"],  # see the Anthropic docs for the exact string
        )
        input_tokens += response.usage.input_tokens
        messages.append({"role": "assistant", "content": response.content})
        tool_results = handle_tool_calls(response)
        if not tool_results:
            return messages
        actions += sum(1 for r in tool_results if r["is_action"])
        # Strip the local is_action bookkeeping before the results go back to the API.
        messages.append({
            "role": "user",
            "content": [
                {k: v for k, v in r.items() if k != "is_action"}
                for r in tool_results
            ],
        })
A few notes on the snippet so it actually runs in your codebase. response.content is a list of SDK content-block objects, not a JSON-serializable list of dicts; if you persist or log it, convert with [b.model_dump() for b in response.content] first. TOOLS is the tool list you register, including the pinned computer_* tool definition. handle_tool_calls is whatever you write to map Claude's tool-use blocks into tool-result blocks for the next turn; the is_action flag is yours to set on anything that mutates state, and it gets stripped before the results return to the API. pause_for_human is also yours; a sketch follows after the next paragraph. Pin the model id and the beta header against the current Anthropic docs before shipping; both move on the Computer Use beta and the names above are placeholders.
Pick the limits to match the task. A "fill in this form" agent should never need more than 15 actions; a "research this topic across three browser tabs" agent might tolerate 60. The number does not matter. What matters is that the runaway case has a ceiling, and the ceiling fires before your invoice does. Pause the session, hand the trace to a human, let them either approve a continuation or kill it.
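What pause_for_human does depends on your stack. A minimal sketch, assuming infrastructure you own; review_queue and serialize_message are placeholders for your own queue and for the model_dump conversion described in the notes above.

import uuid

def pause_for_human(messages, reason: str):
    # Persist the full transcript and park the session for human review.
    session_ref = str(uuid.uuid4())
    review_queue.put({  # hypothetical: your queue, database, or ticket system
        "session_ref": session_ref,
        "reason": reason,
        "transcript": [serialize_message(m) for m in messages],
    })
    return {"status": "paused", "reason": reason, "session_ref": session_ref}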
Pattern 3: sandbox the host
This is the one Anthropic's docs are most explicit about. From the security section:
"Use a dedicated virtual machine or container with minimal privileges to prevent direct system attacks or accidents."
The reference implementation ships exactly that: a Linux Docker container running Xvfb, a window manager, and a small set of pre-installed apps. The agent loop runs inside the container; the host machine never sees Claude's clicks.
Three concrete shapes teams ship in production:
- Docker per session. Spin up a fresh container per agent run, mount nothing from the host, destroy on exit. Fast, cheap. Isolation is OK for most tasks.
- Firecracker microVMs. Stronger isolation than a container (kernel boundary, not a namespace). Higher boot cost. The right pick if the agent will visit untrusted URLs.
- Full QEMU/KVM VMs. The strongest isolation available, the slowest to start. Pick this only when you truly cannot trust the workload: adversarial form filling, payment flows, anything where a VM escape would be a "wake the CISO" event.
Whichever you pick, the rules are the same. No host filesystem mounts. No host network unless you whitelist domains (next pattern). No reuse of a sandbox across users. No persistent credentials inside the sandbox image. Credentials get injected at session start through the API, never baked in.
The shape of a per-session Docker sandbox is small:
docker run --rm \
  --read-only \
  --tmpfs /tmp:rw,size=512m \
  --tmpfs /home/agent:rw,size=512m \
  --network agent-net \
  --cpus 1.0 \
  --memory 1g \
  --pids-limit 200 \
  --security-opt no-new-privileges \
  --cap-drop ALL \
  agent-sandbox:latest
Read-only root, tmpfs for scratch space, capability drop, a custom network you control, hard CPU and memory caps. Every flag in there exists because someone got bitten by its absence. The --pids-limit line in particular catches a fork bomb that an adversarial page can trigger with a single iframe.
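If you drive the lifecycle from Python, a context manager keeps the spin-up/destroy contract honest. A sketch using the docker SDK, assuming the agent-sandbox:latest image and agent-net network from the command above; the kwargs map one-to-one onto the CLI flags.

import contextlib

import docker

@contextlib.contextmanager
def session_sandbox():
    # Fresh container per agent run: nothing mounted, destroyed on exit.
    client = docker.from_env()
    container = client.containers.run(
        "agent-sandbox:latest",
        detach=True,
        read_only=True,
        tmpfs={"/tmp": "rw,size=512m", "/home/agent": "rw,size=512m"},
        network="agent-net",
        nano_cpus=1_000_000_000,  # 1.0 CPU
        mem_limit="1g",
        pids_limit=200,
        security_opt=["no-new-privileges"],
        cap_drop=["ALL"],
    )
    try:
        yield container
    finally:
        container.remove(force=True)  # never reuse a sandbox across sessions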
Pattern 4: action whitelist
Sandboxing the host stops Claude from reading your ~/.aws/credentials. It does not stop Claude from clicking "send" on the wrong tab inside the sandbox. The fix is at a different layer: a whitelist of domains, applications, and actions the agent is allowed to interact with for a given task.
Two enforcement points work in tandem.
First, network egress filtering at the sandbox boundary: a Squid proxy or nftables ruleset that lets the sandbox reach *.example.com and nothing else. The agent can request any URL it wants; the proxy returns a clean 403 for everything outside the list. This is what the docs mean by "limit internet access to an allowlist of domains to reduce exposure to malicious content."
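Whichever enforcement tool you pick, the decision it makes per request is the same small function. A sketch of that decision, assuming a per-task allowlist; example.com is the placeholder domain from above.

ALLOWED_DOMAINS = {"example.com"}  # per-task allowlist, set at session start

def egress_allowed(host: str) -> bool:
    # Permit listed domains and their subdomains; everything else gets the 403.
    host = host.lower().rstrip(".")
    return host in ALLOWED_DOMAINS or any(
        host.endswith("." + domain) for domain in ALLOWED_DOMAINS
    )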
Second, action-level filtering before the tool call is executed. Assume tool_call.input is the dict-shaped tool input from the SDK; if you are working with the typed block, call .model_dump() first. Then inspect every tool call Claude returns and decide whether to run it:
ALLOWED_KEYS = {
    "Return", "Tab", "Escape", "BackSpace",
    "Up", "Down", "Left", "Right",
    "ctrl+a", "ctrl+c", "ctrl+v",
}
DENIED_KEY_COMBOS = {
    "ctrl+alt+t",
    "ctrl+alt+f1",
    "super",
}

def is_allowed(tool_call):
    action = tool_call.input.get("action")
    if action == "key":
        text = tool_call.input.get("text", "")
        if text in DENIED_KEY_COMBOS:
            return False, "denied_key_combo"
        if text not in ALLOWED_KEYS:
            return False, "unlisted_key"
    if action in {"left_click", "double_click"}:
        x, y = tool_call.input.get("coordinate", (0, 0))
        # in_app_window is yours: check against the geometry of the target app's window.
        if not in_app_window(x, y):
            return False, "click_outside_app"
    return True, None
Two checks do the work here. The keyboard whitelist stops Claude from opening a terminal window inside the desktop environment via Ctrl+Alt+T, which is useful even when bash is not in the tool list, because the underlying VM might have a terminal app installed. The coordinate check stops clicks that fall outside the app window the agent is supposed to be operating in, a common signal that the model has lost track of where it is on screen.
Denials feed back to Claude as tool errors, shaped like the sketch below. The model adapts (a denied tool result reads as "that key combo was denied, try the menu instead") and the loop continues. Denied actions also feed your audit trail, which is the next pattern.
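A minimal sketch of that feedback shape; denial_result is my name for it, and the message wording is yours to tune.

def denial_result(tool_use_id: str, reason: str) -> dict:
    # Claude sees the denial as a failed tool call and can try another route.
    return {
        "type": "tool_result",
        "tool_use_id": tool_use_id,
        "content": f"Action denied by policy: {reason}.",
        "is_error": True,
    }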
Pattern 5: audit trail per action
Every Computer Use session produces a log whether you write one or not. The screenshots, action requests, and tool results all live in your messages array for the duration of the loop. The pattern is to persist that log out-of-band, immutably, with enough metadata to replay or investigate any session after the fact.
Minimum fields per action row:
- session_id and step_index
- timestamp
- model and tool_version (the dated computer_* identifier you pinned)
- action_type and the full input JSON
- Screenshot hash (store the image in object storage, the hash in the log row)
- allowed boolean from the whitelist check, plus the denial reason if any
- Tool result text or is_error: true
- Token usage from response.usage
import hashlib
import json
import time

def log_action(session_id, step, tool_input, screenshot_bytes,
               result, allowed, deny_reason, usage,
               model_id, tool_version):
    # Store the screenshot itself in object storage; only its hash lives in the row.
    sha = hashlib.sha256(screenshot_bytes).hexdigest()
    row = {
        "session_id": session_id,
        "step_index": step,
        "ts": time.time(),
        "model": model_id,  # pin to the id you used in the API call
        "tool_version": tool_version,  # the computer_* identifier from the docs
        "action_type": tool_input.get("action"),
        "input": tool_input,
        "screenshot_sha256": sha,
        "allowed": allowed,
        "deny_reason": deny_reason,
        "result_is_error": result.get("is_error", False),
        "input_tokens": usage.input_tokens,
        "output_tokens": usage.output_tokens,
    }
    # audit_sink is your append-only writer: a file handle, an object-store client, whatever.
    audit_sink.write(json.dumps(row) + "\n")
Append-only, partition by session, retain per your compliance window. The day you need this log, you will need it badly. A user reports the agent did the wrong thing, a domain on the whitelist served unexpected content, a regulator asks how a payment got submitted. Without the trail, you have no answer. With it, you have a frame-by-frame replay of what the model saw and what it did.
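Reading the trail back is equally small. A sketch assuming the JSONL sink above, with the path as a placeholder:

import json

def load_session(path: str, session_id: str) -> list[dict]:
    # Rebuild the frame-by-frame trace for one session, in step order.
    rows = []
    with open(path) as f:
        for line in f:
            row = json.loads(line)
            if row["session_id"] == session_id:
                rows.append(row)
    return sorted(rows, key=lambda r: r["step_index"])

Join each row's screenshot_sha256 back to the image in object storage and you have the replay.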
The log is also where prompt-injection investigations live. Anthropic calls out that instructions on webpages or contained in images may override your instructions or cause Claude to make mistakes. Some teams run their own classifiers on screenshots to flag potential injections; when a flagged screenshot triggers a confirmation request, your log should record both the flag and the human decision.
Computer Use is still in beta. Anthropic ships meaningful tool revisions every few months, and newer revisions change how the model reads small UI elements, which feeds straight into Pattern 1 (the docs cover what each computer_* identifier adds). Pin the beta header version in your client, watch the changelog, and re-run your read-only smoke tests when you bump.
If this was useful
The AI Agents Pocket Guide walks through agent-loop patterns of the same flavour as the ones above (bounded iteration, sandbox boundaries, action-level guards, structured traces) applied across tool use, sub-agent orchestration, and recovery from partial failure. Computer Use is one shape of agent; the patterns generalise to any LLM that calls tools whose output you cannot fully trust.
