An AI agent without a browser is like an employee who can't open a website. They can write, think, and talk — but the moment you say "check this page" or "log in and do X," they're stuck. Web automation has always been the missing piece.
OpenClaw fixes this with a built-in browser tool. Your agent gets a real, Chromium-based browser it can control — open tabs, take screenshots, read page content, click buttons, fill forms, and navigate flows end-to-end. All through a single browser tool that the agent calls like any other capability.
Here's how it actually works, and how to set it up.
Two Modes: Isolated Agent Browser vs. Your Existing Chrome
OpenClaw gives you two ways to hand a browser to your agent, and the difference matters.
The openclaw Profile — Isolated and Managed
The default mode is a fully isolated browser that OpenClaw manages. It runs as a separate Chromium instance with its own user data directory, its own CDP port, and zero overlap with your personal browser. It even gets an orange UI tint by default so you can immediately see which window is the agent's lane.
This is what you want for most automation work. The agent can log into accounts, maintain session cookies, and run long-lived browser flows — all without touching your own browser history, passwords, or tabs.
openclaw browser --browser-profile openclaw start
openclaw browser --browser-profile openclaw open https://example.com
openclaw browser --browser-profile openclaw snapshot
The chrome Profile — Extension Relay to Your Real Browser
The second mode drives your existing Chrome (or any Chromium-based browser) via a local relay and a Chrome extension. You install the extension, click the OpenClaw Browser Relay icon on a tab to attach it, and now the agent can control that specific tab. The extension badge turns ON to confirm it's live.
This is useful when you need the agent to operate in a session you're already logged into — no re-authentication needed. You're essentially handing the agent your steering wheel for a specific tab.
openclaw browser extension install
# Then in Chrome: enable Developer mode, Load unpacked, pin extension
# Click the extension icon on the tab you want to attach
After that, the agent uses profile="chrome" to target it. Switch back to the openclaw profile for anything that should stay isolated.
Configuration
Browser settings live in ~/.openclaw/openclaw.json. Here's a practical baseline:
{
"browser": {
"enabled": true,
"defaultProfile": "openclaw",
"headless": false,
"profiles": {
"openclaw": { "cdpPort": 18800, "color": "#FF4500" }
}
}
}
If you want to use Brave instead of Chrome:
{
"browser": {
"executablePath": "/Applications/Brave Browser.app/Contents/MacOS/Brave Browser"
}
}
If you don't set executablePath, OpenClaw auto-detects Chrome → Brave → Edge → Chromium, in that order. On Linux it looks for google-chrome, brave, microsoft-edge, and chromium on your PATH.
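To make the detection order concrete, here is a minimal Python sketch of a first-match lookup over the Linux binary names listed above. It is an illustration of the idea, not OpenClaw's actual implementation; detect_browser and its parameters are hypothetical names.

```python
import shutil
from typing import Callable, Optional

# Binary names in the order the article lists for Linux.
LINUX_CANDIDATES = ["google-chrome", "brave", "microsoft-edge", "chromium"]


def detect_browser(
    override: Optional[str] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the override if given, else the first candidate found on PATH."""
    if override:
        # Mirrors the role of "executablePath": an explicit setting wins.
        return override
    for name in LINUX_CANDIDATES:
        path = which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    print(detect_browser() or "no Chromium-based browser found on PATH")
```

The which parameter is injectable only so the lookup can be exercised without any browser installed.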
What the Agent Can Actually Do
The browser tool exposes a full automation surface to the agent.
Navigate and Read Pages
The agent can open URLs and take a snapshot of the current page — a structured accessibility tree that tells it exactly what's on screen and what's interactive.
browser action="open" url="https://example.com"
browser action="snapshot"
The snapshot comes back as a readable text representation of the page — headings, buttons, inputs, links — all with reference IDs the agent uses to target actions. No brittle CSS selectors. No XPath archaeology.
Click, Type, Fill, Submit
browser action="act" kind="click" ref="e12"
browser action="act" kind="type" ref="e23" text="hello@example.com"
browser action="act" kind="press" key="Enter"
browser action="act" kind="select" ref="e9" values=["Option A"]
Refs from snapshots are stable within a page load. If the page navigates or updates, the agent re-snapshots and gets fresh refs.
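The re-snapshot pattern can be sketched in Python against an in-memory stand-in. Everything here is hypothetical: FakeBrowser, StaleRefError, and the "g0-e1" ref format exist only for this illustration, and how OpenClaw actually reports a stale ref is an assumption.

```python
from dataclasses import dataclass


class StaleRefError(Exception):
    """Raised by the stand-in browser when a ref comes from an older page load."""


@dataclass
class FakeBrowser:
    """In-memory stand-in for the browser tool; not OpenClaw's API."""

    generation: int = 0

    def snapshot(self) -> dict:
        # Map element label -> ref that is valid for the current page load only.
        return {"Submit": f"g{self.generation}-e1"}

    def navigate(self) -> None:
        # A navigation invalidates every ref handed out before it.
        self.generation += 1

    def click(self, ref: str) -> str:
        if not ref.startswith(f"g{self.generation}-"):
            raise StaleRefError(ref)
        return f"clicked {ref}"


def click_by_label(browser: FakeBrowser, label: str, refs: dict, retries: int = 1):
    """Click using cached refs; on a stale ref, re-snapshot and try again."""
    for attempt in range(retries + 1):
        try:
            return browser.click(refs[label]), refs
        except StaleRefError:
            if attempt == retries:
                raise
            refs = browser.snapshot()  # fresh refs for the new page load


if __name__ == "__main__":
    browser = FakeBrowser()
    cached = browser.snapshot()
    browser.navigate()  # cached refs are now stale
    print(click_by_label(browser, "Submit", cached))
```

The point of the sketch is the control flow: refs are treated as a cache tied to one page load, and a failed action triggers a new snapshot rather than a guess.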
Screenshots
browser action="screenshot"
browser action="screenshot" fullPage="true"
State Manipulation
- Cookies: read, set, or clear session cookies
- Local/session storage: get or set values directly
- Network simulation: toggle offline mode, inject custom headers
- Device emulation: simulate iPhone 14, custom viewports, timezones, locales
- Geolocation: set fake coordinates per origin
Multiple Profiles for Multiple Contexts
You can define more than one browser profile, each with its own CDP port and color tint:
{
"browser": {
"profiles": {
"openclaw": { "cdpPort": 18800, "color": "#FF4500" },
"work": { "cdpPort": 18801, "color": "#0066CC" },
"remote": { "cdpUrl": "http://10.0.0.42:9222", "color": "#00AA00" }
}
}
}
The remote profile points at a Chromium instance on another machine — useful if your agent runs headless but you want the browser on a display machine.
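Since each local profile needs its own CDP port, a quick check before starting several profiles can catch copy-paste mistakes. The Python sketch below flags duplicate cdpPort values in a config shaped like the one above. find_port_conflicts is a hypothetical helper, and OpenClaw may well run its own validation.

```python
import json


def find_port_conflicts(config: dict) -> list:
    """Return one message per profile whose cdpPort is already taken."""
    problems = []
    seen = {}  # cdpPort -> name of the first profile that claimed it
    profiles = config.get("browser", {}).get("profiles", {})
    for name, profile in profiles.items():
        port = profile.get("cdpPort")
        if port is None:
            continue  # e.g. a remote profile that uses cdpUrl instead
        if port in seen:
            problems.append(f"{name}: cdpPort {port} already used by {seen[port]}")
        else:
            seen[port] = name
    return problems


if __name__ == "__main__":
    sample = json.loads("""
    {
      "browser": {
        "profiles": {
          "openclaw": { "cdpPort": 18800, "color": "#FF4500" },
          "work": { "cdpPort": 18800, "color": "#0066CC" }
        }
      }
    }
    """)
    for problem in find_port_conflicts(sample):
        print(problem)
```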
Remote Browsers and Hosted CDP (Browserless)
A profile's cdpUrl can also point at a hosted CDP service such as Browserless:
{
"browser": {
"defaultProfile": "browserless",
"profiles": {
"browserless": {
"cdpUrl": "https://production-sfo.browserless.io?token=YOUR_API_KEY",
"color": "#00AA00"
}
}
}
}
The agent uses it identically to a local profile. From its perspective, a browser is a browser.
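The cdpUrl above embeds an API key in the config file. One way to keep the key out of version control, sketched here in Python, is to generate the profile from an environment variable. BROWSERLESS_TOKEN and browserless_profile are hypothetical names, and this says nothing about whether OpenClaw offers its own secret handling.

```python
import json
import os
from urllib.parse import quote


def browserless_profile(token: str, host: str = "production-sfo.browserless.io") -> dict:
    """Build a config fragment shaped like the article's Browserless example."""
    # URL-encode the token so unusual characters cannot break the query string.
    cdp_url = f"https://{host}?token={quote(token, safe='')}"
    return {
        "browser": {
            "defaultProfile": "browserless",
            "profiles": {
                "browserless": {"cdpUrl": cdp_url, "color": "#00AA00"},
            },
        }
    }


if __name__ == "__main__":
    # BROWSERLESS_TOKEN is an assumed variable name; the fallback is a placeholder.
    token = os.environ.get("BROWSERLESS_TOKEN", "YOUR_API_KEY")
    print(json.dumps(browserless_profile(token), indent=2))
```

The printed JSON can then be merged into ~/.openclaw/openclaw.json by whatever process you already use to manage that file.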
Node Proxy: Browser on a Different Machine
If your OpenClaw Gateway runs on a server but your browser lives on a desktop, the node proxy handles routing automatically. Run a node host on the machine with the browser, and the Gateway auto-routes browser tool calls there.
Debugging When Things Break
- Re-snapshot with interactive mode: openclaw browser snapshot --interactive gives you a flat list of every interactive element and its current ref.
- Highlight a ref: openclaw browser highlight e12 overlays a visual indicator on exactly what Playwright is targeting.
- Console and request logs: openclaw browser errors and openclaw browser requests --filter api
- Trace recording: Start a trace, reproduce the issue, stop it, load in Playwright's trace viewer.
openclaw browser snapshot --interactive
openclaw browser highlight e12
openclaw browser trace start
# ... reproduce the issue ...
openclaw browser trace stop
Security Boundaries
SSRF protection: for strict public-only browsing, disallow private networks and allowlist the hostnames the agent may reach:
{
"browser": {
"ssrfPolicy": {
"dangerouslyAllowPrivateNetwork": false,
"hostnameAllowlist": ["*.example.com"]
}
}
}
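To show what a policy like this is meant to express, here is a rough Python sketch of an allowlist check. It assumes glob-style matching for hostnameAllowlist and treats localhost and non-public IP literals as private. OpenClaw's real matching rules are not described here, and is_allowed is a hypothetical helper.

```python
import fnmatch
import ipaddress


def is_allowed(hostname: str, allowlist: list, allow_private: bool = False) -> bool:
    """Decide whether a hostname passes an allowlist-plus-private-network policy."""
    host = hostname.lower().rstrip(".")

    # Literal IP addresses: reject anything that is not globally routable
    # (loopback, RFC 1918 ranges, link-local) unless private access is enabled.
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        ip = None
    if ip is not None and not ip.is_global and not allow_private:
        return False
    if host == "localhost" and not allow_private:
        return False

    # Glob match: "*.example.com" covers subdomains but not "example.com" itself.
    return any(fnmatch.fnmatchcase(host, pattern.lower()) for pattern in allowlist)


if __name__ == "__main__":
    for candidate in ["api.example.com", "example.com", "127.0.0.1"]:
        print(candidate, is_allowed(candidate, ["*.example.com"]))
```

A string check like this is only part of the picture: a production-grade guard also has to validate the addresses a hostname resolves to, since a public-looking name can point at a private IP.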
JavaScript evaluation: The browser act kind=evaluate command executes arbitrary JS in the page context. Disable with browser.evaluateEnabled=false if you don't need it.
Real Use Cases Worth Building
- Automated form submission: expense reports, vendor portals, gov sites with no API
- Competitor monitoring: check pricing pages on a schedule, alert on changes
- Content scraping: pull structured data from pages that block API access
- QA automation: run through critical UI flows after deploys, screenshot results
- Social media management: post, reply, and engage via the actual browser UI
- Research pipelines: the agent searches, reads, extracts — all hands-free
The Short Version
OpenClaw browser automation gives your agent a real, isolated Chromium instance it controls through a stable API. Snapshots replace brittle selectors. The Chrome extension relay lets it work in your existing logged-in sessions. Multi-profile support handles parallel contexts. Remote CDP and node proxy handle cross-machine setups. And a full debug toolkit means you can actually fix things when they break.
It's not a toy. It's a browser your agent actually uses — the same way you do, but autonomously and at scale.
Originally published at openclawplaybook.ai.