Hex

Posted on Mar 30 • Originally published at openclawplaybook.ai

How to Give Your AI Agent a Browser (Web Automation with OpenClaw)

#ai #agents #automation #productivity

An AI agent without a browser is like an employee who can't open a website. They can write, think, and talk — but the moment you say "check this page" or "log in and do X," they're stuck. Web automation has always been the missing piece.

OpenClaw fixes this with a built-in browser tool. Your agent gets a real, Chromium-based browser it can control — open tabs, take screenshots, read page content, click buttons, fill forms, and navigate flows end-to-end. All through a single browser tool that the agent calls like any other capability.

Here's how it actually works, and how to set it up.

Two Modes: Isolated Agent Browser vs. Your Existing Chrome

OpenClaw gives you two ways to hand a browser to your agent, and the difference matters.

The `openclaw` Profile — Isolated and Managed

The default mode is a fully isolated browser that OpenClaw manages. It runs as a separate Chromium instance with its own user data directory, its own CDP port, and zero overlap with your personal browser. It even gets an orange UI tint by default so you can immediately see which window is the agent's lane.

This is what you want for most automation work. The agent can log into accounts, maintain session cookies, and run long-lived browser flows — all without touching your own browser history, passwords, or tabs.

openclaw browser --browser-profile openclaw start
openclaw browser --browser-profile openclaw open https://example.com
openclaw browser --browser-profile openclaw snapshot
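
If you'd rather script those three commands than type them, here is a minimal Python sketch. It only builds and prints the command lines shown above. The browser_cmd helper is a made-up name for illustration, not part of OpenClaw.

```python
import shlex

def browser_cmd(*args, profile="openclaw"):
    """Build the argv list for one `openclaw browser` call (hypothetical helper)."""
    return ["openclaw", "browser", "--browser-profile", profile, *args]

# The same start / open / snapshot sequence as the commands above.
steps = [
    browser_cmd("start"),
    browser_cmd("open", "https://example.com"),
    browser_cmd("snapshot"),
]

for argv in steps:
    # Printing only. To actually run a step: subprocess.run(argv, check=True)
    print(shlex.join(argv))
```

Keeping the profile as a parameter makes it harder to accidentally send a command to the wrong browser.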

The `chrome` Profile — Extension Relay to Your Real Browser

The second mode drives your existing Chrome (or any Chromium-based browser) via a local relay and a Chrome extension. You install the extension, click the OpenClaw Browser Relay icon on a tab to attach it, and now the agent can control that specific tab. The extension badge turns ON to confirm it's live.

This is useful when you need the agent to operate in a session you're already logged into — no re-authentication needed. You're essentially handing the agent your steering wheel for a specific tab.

openclaw browser extension install
# Then in Chrome: enable Developer mode, Load unpacked, pin extension
# Click the extension icon on the tab you want to attach

After that, the agent uses profile="chrome" to target it. Switch back to the openclaw profile for anything that should stay isolated.

Configuration

Browser settings live in ~/.openclaw/openclaw.json. Here's a practical baseline:

{
  "browser": {
    "enabled": true,
    "defaultProfile": "openclaw",
    "headless": false,
    "profiles": {
      "openclaw": { "cdpPort": 18800, "color": "#FF4500" }
    }
  }
}

If you want to use Brave instead of Chrome:

{
  "browser": {
    "executablePath": "/Applications/Brave Browser.app/Contents/MacOS/Brave Browser"
  }
}
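
An override like this is meant to sit alongside your other browser settings, not replace them. As a rough sketch of that idea, the Python below merges the Brave override into the baseline config from earlier. The deep_merge helper is illustrative only, and whether OpenClaw merges config files this way internally is not something this post covers.

```python
import json

def deep_merge(base, override):
    """Return a new dict with override merged into base, recursing into nested dicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

# Baseline settings, copied from the Configuration example.
baseline = {
    "browser": {
        "enabled": True,
        "defaultProfile": "openclaw",
        "headless": False,
        "profiles": {"openclaw": {"cdpPort": 18800, "color": "#FF4500"}},
    }
}

# The Brave override from the snippet above.
brave_override = {
    "browser": {
        "executablePath": "/Applications/Brave Browser.app/Contents/MacOS/Brave Browser"
    }
}

config = deep_merge(baseline, brave_override)
print(json.dumps(config, indent=2))
```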

OpenClaw auto-detects Chrome → Brave → Edge → Chromium in that order if you don't override. On Linux it looks for google-chrome, brave, microsoft-edge, and chromium on your PATH.
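
To see which binary a first-match search in that order would land on for your machine, a small sketch like this can help. It is not OpenClaw's detection code, just the same idea expressed with Python's shutil.which, and first_available is a name invented here.

```python
import shutil

# Binary names from the paragraph above, in the stated order.
LINUX_CANDIDATES = ["google-chrome", "brave", "microsoft-edge", "chromium"]

def first_available(candidates, which=shutil.which):
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = which(name)
        if path:
            return path
    return None

print(first_available(LINUX_CANDIDATES))
```

The which parameter exists only so the lookup can be swapped out in tests.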

What the Agent Can Actually Do

The browser tool exposes a full automation surface to the agent.

Navigate and Read Pages

The agent can open URLs and take a snapshot of the current page — a structured accessibility tree that tells it exactly what's on screen and what's interactive.

browser action="open" url="https://example.com"
browser action="snapshot"

The snapshot comes back as a readable text representation of the page — headings, buttons, inputs, links — all with reference IDs the agent uses to target actions. No brittle CSS selectors. No XPath archaeology.

Click, Type, Fill, Submit

browser action="act" kind="click" ref="e12"
browser action="act" kind="type" ref="e23" text="hello@example.com"
browser action="act" kind="press" key="Enter"
browser action="act" kind="select" ref="e9" values=["Option A"]

Refs from snapshots are stable within a page load. If the page navigates or updates, the agent re-snapshots and gets fresh refs.
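
Because refs can change after a page update, it is safer to look an element up by role and name in the latest snapshot than to cache a ref. Here is a minimal Python sketch of that lookup. It assumes snapshot lines look roughly like `- button "Continue" [ref=e12]`. That line format and the sample snapshots are assumptions for illustration, so check them against the output you actually get.

```python
import re

# ASSUMED line shape: - role "name" [ref=eNN]. Adjust to your real snapshot output.
LINE = re.compile(r'^\s*-\s+(?P<role>[\w-]+)\s+"(?P<name>[^"]*)"\s+\[ref=(?P<ref>\w+)\]')

def find_ref(snapshot_text, role, name):
    """Return the ref of the first element with this role and name, or None."""
    for line in snapshot_text.splitlines():
        match = LINE.match(line)
        if match and match["role"] == role and match["name"] == name:
            return match["ref"]
    return None

# Two made-up snapshots of the same page, before and after an update.
before = '''
- heading "Sign in" [ref=e3]
- textbox "Email" [ref=e23]
- button "Continue" [ref=e12]
'''
after = '''
- heading "Sign in" [ref=e5]
- textbox "Email" [ref=e31]
- button "Continue" [ref=e40]
'''

print(find_ref(before, "button", "Continue"))
print(find_ref(after, "button", "Continue"))
```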

Screenshots

browser action="screenshot"
browser action="screenshot" fullPage="true"

State Manipulation

Cookies: read, set, or clear session cookies
Local/session storage: get or set values directly
Network simulation: toggle offline mode, inject custom headers
Device emulation: simulate iPhone 14, custom viewports, timezones, locales
Geolocation: set fake coordinates per origin

Multiple Profiles for Multiple Contexts

You can define more than one browser profile, each with its own CDP port and color tint:

{
  "browser": {
    "profiles": {
      "openclaw": { "cdpPort": 18800, "color": "#FF4500" },
      "work":     { "cdpPort": 18801, "color": "#0066CC" },
      "remote":   { "cdpUrl": "http://10.0.0.42:9222", "color": "#00AA00" }
    }
  }
}

The remote profile points at a Chromium instance on another machine, which is useful if your agent runs on a headless server but you want the browser on a machine with a display.
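
Since each local profile needs its own CDP port, a quick check for accidental duplicates can save some head-scratching. This is a small sketch, not an OpenClaw feature. The duplicate_ports helper is hypothetical and skips profiles that use cdpUrl.

```python
def duplicate_ports(profiles):
    """Return a message for every profile whose cdpPort is already taken."""
    problems = []
    seen = {}
    for name, profile in profiles.items():
        port = profile.get("cdpPort")
        if port is None:
            continue  # remote profiles use cdpUrl and have no local port
        if port in seen:
            problems.append(f"{name}: cdpPort {port} is already used by {seen[port]}")
        else:
            seen[port] = name
    return problems

# The profiles from the example above.
profiles = {
    "openclaw": {"cdpPort": 18800, "color": "#FF4500"},
    "work": {"cdpPort": 18801, "color": "#0066CC"},
    "remote": {"cdpUrl": "http://10.0.0.42:9222", "color": "#00AA00"},
}

print(duplicate_ports(profiles))
```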

Remote Browsers and Hosted CDP (Browserless)

A profile can also point at a hosted CDP endpoint such as Browserless by setting cdpUrl:

{
  "browser": {
    "defaultProfile": "browserless",
    "profiles": {
      "browserless": {
        "cdpUrl": "https://production-sfo.browserless.io?token=YOUR_API_KEY",
        "color": "#00AA00"
      }
    }
  }
}

The agent uses it identically to a local profile. From its perspective, a browser is a browser.
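
The token in that cdpUrl is a secret, so you may prefer to build the URL from an environment variable and redact it before it reaches a log. Here is a sketch using only Python's standard library. The BROWSERLESS_API_KEY variable name and both helper functions are assumptions made for this example.

```python
import os
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def browserless_cdp_url(token, host="production-sfo.browserless.io"):
    """Build a cdpUrl in the same shape as the config example above."""
    return urlunsplit(("https", host, "", urlencode({"token": token}), ""))

def redact_token(url):
    """Replace the token query value so the URL is safe to print."""
    parts = urlsplit(url)
    query = [(k, "REDACTED" if k == "token" else v) for k, v in parse_qsl(parts.query)]
    return urlunsplit(parts._replace(query=urlencode(query)))

# BROWSERLESS_API_KEY is a hypothetical variable name; falls back to the placeholder.
token = os.environ.get("BROWSERLESS_API_KEY", "YOUR_API_KEY")
cdp_url = browserless_cdp_url(token)
print(redact_token(cdp_url))
```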

Node Proxy: Browser on a Different Machine

If your OpenClaw Gateway runs on a server but your browser lives on a desktop, the node proxy handles routing automatically. Run a node host on the machine with the browser, and the Gateway auto-routes browser tool calls there.

Debugging When Things Break

Re-snapshot with interactive mode: openclaw browser snapshot --interactive gives you a flat list of every interactive element and its current ref.
Highlight a ref: openclaw browser highlight e12 overlays a visual indicator on exactly what Playwright is targeting.
Console and request logs: openclaw browser errors and openclaw browser requests --filter api show page errors and matching network requests.
Trace recording: start a trace, reproduce the issue, stop it, then load the result in Playwright's trace viewer.

openclaw browser snapshot --interactive
openclaw browser highlight e12
openclaw browser trace start
# ... reproduce the issue ...
openclaw browser trace stop

Security Boundaries

SSRF protection: to keep the agent on public sites only, disallow private-network targets and allowlist the hostnames it may visit:

{
  "browser": {
    "ssrfPolicy": {
      "dangerouslyAllowPrivateNetwork": false,
      "hostnameAllowlist": ["*.example.com"]
    }
  }
}
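
To get a feel for what a policy like this lets through, here is a rough Python approximation. It is not OpenClaw's implementation. The glob-style hostname matching and the IP-literal check are assumptions, and it does not resolve DNS, so treat it as a way to reason about the config rather than a security control.

```python
import ipaddress
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

def is_allowed(url, policy):
    """Approximate check of a URL against an ssrfPolicy-shaped dict (illustrative only)."""
    host = (urlsplit(url).hostname or "").lower()
    if not host:
        return False
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        ip = None  # a hostname, not an IP literal
    if ip is not None and (ip.is_private or ip.is_loopback or ip.is_link_local):
        if not policy.get("dangerouslyAllowPrivateNetwork", False):
            return False
    allowlist = policy.get("hostnameAllowlist")
    if allowlist:
        # ASSUMPTION: patterns are shell-style globs such as *.example.com
        return any(fnmatchcase(host, pattern.lower()) for pattern in allowlist)
    return True

# The policy from the config above.
policy = {
    "dangerouslyAllowPrivateNetwork": False,
    "hostnameAllowlist": ["*.example.com"],
}

for url in ["https://app.example.com/login", "http://10.0.0.42:9222", "https://example.org"]:
    print(url, is_allowed(url, policy))
```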

JavaScript evaluation: The browser act kind=evaluate command executes arbitrary JS in the page context. Disable with browser.evaluateEnabled=false if you don't need it.

Real Use Cases Worth Building

Automated form submission: expense reports, vendor portals, gov sites with no API
Competitor monitoring: check pricing pages on a schedule, alert on changes
Content scraping: pull structured data from pages that block API access
QA automation: run through critical UI flows after deploys, screenshot results
Social media management: post, reply, and engage via the actual browser UI
Research pipelines: the agent searches, reads, extracts — all hands-free
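
For the competitor monitoring case, the "alert on changes" part can be as simple as comparing a hash of the text the agent extracted on each run. Here is a minimal sketch, assuming you already have the page text as a string. The sample pricing strings are invented.

```python
import hashlib

def fingerprint(text):
    """Hash page text after collapsing whitespace, so layout noise is ignored."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def has_changed(previous_fingerprint, current_text):
    """True when the current text no longer matches the stored fingerprint."""
    return previous_fingerprint != fingerprint(current_text)

# Made-up example: store the fingerprint from the last run, compare on the next.
stored = fingerprint("Pro plan  $29/month\n")
print(has_changed(stored, "Pro plan $29/month"))
print(has_changed(stored, "Pro plan $35/month"))
```

Storing only the fingerprint keeps the state small. Keep the previous text as well if you want the alert to say what changed.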

The Short Version

OpenClaw browser automation gives your agent a real, isolated Chromium instance it controls through a stable API. Snapshots replace brittle selectors. The Chrome extension relay lets it work in your existing logged-in sessions. Multi-profile support handles parallel contexts. Remote CDP and node proxy handle cross-machine setups. And a full debug toolkit means you can actually fix things when they break.

It's not a toy. It's a browser your agent actually uses — the same way you do, but autonomously and at scale.

