atani

Posted on Mar 23 • Edited on Jul 20 • Originally published at zenn.dev

16MB vs 1.2GB — Benchmarking 5 AI Browser Automation Tools

#ai #playwright #browserautomation #webdev

I was using Claude Code for browser automation and found myself stuck choosing between tools. There are five candidates, each with a completely different approach. I installed all of them and ran 10 tests — turns out the best choice depends entirely on your use case.

TL;DR: For auth management, go with playwright-CLI. For an agent operation backbone, agent-browser. For autonomous natural-language control, browser-use. For high-volume crawling, Lightpanda. For production infrastructure, steel-browser.

This article focuses on CLI-mode comparisons. Some tools like browser-use shine brightest in LLM agent mode — I plan to cover that angle in a separate post.

The 5 Tools Compared

| Tool | Stars | Language | License | Maintained by | In a nutshell |
| --- | --- | --- | --- | --- | --- |
| playwright-cli | 6.1K (+85K core) | TypeScript | Apache-2.0 | Microsoft | CLI strong on auth management and token efficiency |
| agent-browser | 24.1K | Rust | Apache-2.0 | Vercel | CLI purpose-built for agent development |
| browser-use | 82.2K | Python | MIT | browser-use | LLM autonomously operates the browser |
| Lightpanda | 23.6K | Zig | AGPL-3.0 | lightpanda-io | Ultra-low-memory lightweight browser (official claim: 9x faster) |
| steel-browser | 6.7K | TypeScript | Apache-2.0 | steel-dev | Production-grade browser infrastructure |

Test Environment

| Item | Value |
| --- | --- |
| OS | macOS Darwin 25.3.0 (Apple Silicon) |
| Node.js | v24.5.0 |
| Python | 3.13 |
| playwright-CLI | 0.1.1 |
| agent-browser | 0.21.4 |
| browser-use | 0.12.2 |
| Lightpanda | nightly (c1fc2b13) |
| steel-browser | Docker (latest) |

Note: Results depend on hardware and network conditions. Browser startup speed and memory usage vary significantly with CPU/RAM configuration, so treat these numbers as directional guidance. All tools are under active development — results may differ with newer versions.

Setup: Lightpanda Is a Single 12MB Binary

The first hurdle is installation. Here's how each tool goes from zero to running:

| Tool | Method | Dependencies | Steps |
| --- | --- | --- | --- |
| Lightpanda | Binary download (12MB) | None | 1 step |
| agent-browser | `npm i -g agent-browser` + `install` | Node.js + Chrome download | 2 steps |
| playwright-CLI | `npm i -g @playwright/cli` + `install-browser` | Node.js + browser download | 2 steps |
| browser-use | `pip install browser-use` (75 deps) | Python 3.11+ + LLM API key | 2 steps + config |
| steel-browser | `docker pull` | Docker | 1 step (assumes Docker) |

Lightpanda is just one binary download — zero dependencies, as simple as it gets. agent-browser and playwright-CLI are a single npm command each, but require a separate browser download. browser-use pulls in 75 Python packages and needs an LLM API key configured on top. steel-browser is one command if Docker is already running, but Docker itself is a prerequisite.

Raw Speed: steel-browser Clocks 0.45s

We opened httpbin.org/html, took a snapshot, and closed the browser — measured three times each.

| Tool | Run 1 | Run 2 | Run 3 | Average |
| --- | --- | --- | --- | --- |
| steel-browser | 0.70s | 0.41s | 0.25s | 0.45s |
| Lightpanda | 1.05s | 0.85s | 0.85s | 0.92s |
| agent-browser | 1.93s | 1.80s | 1.90s | 1.88s |
| playwright-CLI | 2.15s | 1.84s | 1.85s | 1.95s |
| browser-use | 2.30s | 10.21s | 2.22s | 4.91s |

steel-browser connects to an always-running Chromium inside Docker, making the second and third runs especially fast. Lightpanda's custom engine avoids Chrome's startup overhead and delivers consistently quick results. agent-browser and playwright-CLI both run in daemon mode with stable performance. browser-use carries Python runtime startup cost and occasionally spikes (10.21s).
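For anyone re-running this measurement, here is a minimal timing sketch. The helper names `time_runs` and `summarize` are hypothetical, and the command being timed is a placeholder rather than any of the five tools' real invocations. It also shows why a single spike matters: for the browser-use runs above, the median sits far below the mean.

```python
import statistics
import subprocess
import sys
import time


def time_runs(cmd, runs=3):
    """Run `cmd` `runs` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return (mean, median), each rounded to two decimals."""
    mean = round(statistics.mean(durations), 2)
    median = round(statistics.median(durations), 2)
    return mean, median


# browser-use timings from the table above: the 10.21s spike inflates the mean.
browser_use = [2.30, 10.21, 2.22]
print("browser-use (mean, median):", summarize(browser_use))

# Placeholder command so the sketch runs anywhere; swap in a real
# open/snapshot/close sequence for the tool being measured.
placeholder_cmd = [sys.executable, "-c", "pass"]
print("placeholder (mean, median):", summarize(time_runs(placeholder_cmd)))
```

Reporting the median alongside the mean makes a three-run sample less sensitive to one outlier.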

Memory: 16MB vs 1.2GB — a 75x Gap

We measured process memory (RSS) with a single page (httpbin.org/html) open.

| Tool | Daemon | Browser | Total | Processes |
| --- | --- | --- | --- | --- |
| Lightpanda | — | — | 16 MB | 1 |
| steel-browser | (in Docker) | (in Docker) | 581 MB | container |
| browser-use | 111 MB | 758 MB | 869 MB | 8 |
| playwright-CLI | 169 MB | 760 MB | 929 MB | 7 |
| agent-browser | 5 MB | 1,197 MB | 1,202 MB | 10 |

Lightpanda uses 16MB, roughly 75x less than agent-browser at 1,202MB. The official benchmark claims "9x less memory than Chrome," but in our single-page test the gap was even wider. The gap exists because Lightpanda's engine skips CSS rendering and focuses on DOM and JS execution. Results will vary depending on test conditions and page complexity. steel-browser bundles Chromium and Node.js inside a Docker container at 581MB; consolidating host processes into a container is a practical operational advantage. browser-use's Python daemon takes 111MB, but the lower Chrome process count keeps the total at 869MB. agent-browser's daemon itself is a lean 5MB Rust binary, but its many Chrome processes push the total to the highest of the five.
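The ratios can be recomputed directly from the Total column. This is only arithmetic on the figures in the table; the helper name `ratio_to_baseline` is hypothetical.

```python
# Total RSS in MB, copied from the memory table above.
total_rss_mb = {
    "Lightpanda": 16,
    "steel-browser": 581,
    "browser-use": 869,
    "playwright-CLI": 929,
    "agent-browser": 1202,
}


def ratio_to_baseline(totals, baseline):
    """Express each tool's memory as a multiple of the baseline tool's memory."""
    base = totals[baseline]
    return {tool: round(mb / base, 1) for tool, mb in totals.items()}


ratios = ratio_to_baseline(total_rss_mb, "Lightpanda")
for tool, ratio in ratios.items():
    print(f"{tool}: {ratio}x Lightpanda's footprint")
```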

SPA Support: All 5 Tools Loaded react.dev Successfully

We tested whether each tool could take a proper snapshot of react.dev (React's official site, a client-side rendered SPA).

| Tool | Success | Time | Output size | Format |
| --- | --- | --- | --- | --- |
| steel-browser | OK | 1.45s | 16 KB | Markdown |
| Lightpanda | OK | 4.20s | 18 KB | Markdown |
| agent-browser | OK | 4.83s | 34 KB | Accessibility tree (with ref IDs) |
| playwright-CLI | OK | 6.22s | 137 B | Snapshot file reference |
| browser-use | OK | 12.77s | Title only | `extract` requires LLM agent mode |

All five tools loaded the page successfully. steel-browser's always-on Chromium gives it the edge at 1.45s even for SPAs. Lightpanda returned 18KB of Markdown in 4.20s, handling react.dev-level SPAs just fine. agent-browser returns an accessibility tree with ref IDs, which maps directly to action commands. playwright-CLI outputs only a file reference (137 bytes) — a deliberate design for token efficiency. browser-use in CLI-only mode captures the page title; structured extraction requires LLM agent mode.

Auth Persistence: playwright-CLI's state-save/load Is the Most Complete

When an AI agent needs to operate internal tools behind SAML/SSO, being able to save and restore auth state is critical. We tested using httpbin.org's cookie endpoint.

playwright-CLI

# Set cookie → save → restore
playwright-cli open https://httpbin.org/cookies/set/saml_token/mock_abc123 --persistent
playwright-cli cookie-list
# → saml_token=mock_abc123 (domain: httpbin.org, path: /)

playwright-cli state-save /tmp/pw-auth.json
playwright-cli close

# Restore in a new session
playwright-cli open https://httpbin.org --persistent
playwright-cli state-load /tmp/pw-auth.json
playwright-cli cookie-list
# → saml_token=mock_abc123  ← restored successfully

With cookie-list/cookie-set/state-save/state-load commands and a --persistent flag that saves the browser profile to disk, playwright-CLI offers the most complete auth management of all five tools. Cookies and localStorage survive restarts.
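Before handing a saved state file to an agent, it can help to check that the expected cookie is actually in it. The sketch below assumes the file written by state-save follows Playwright's storage-state JSON layout (a `cookies` array plus an `origins` array). The article does not show the file contents, so treat that layout, the sample data, and the helper names as assumptions.

```python
import json
import time


def load_state(path):
    """Load a saved state file (assumed to be Playwright storage-state JSON)."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def find_cookie(state, name, domain):
    """Return the first cookie matching name and domain, or None."""
    for cookie in state.get("cookies", []):
        cookie_domain = cookie.get("domain", "").lstrip(".")
        if cookie.get("name") == name and cookie_domain == domain:
            return cookie
    return None


def is_expired(cookie, now=None):
    """Session cookies (expires of -1 in Playwright's format) never count as expired."""
    expires = cookie.get("expires", -1)
    if expires is None or expires < 0:
        return False
    current = time.time() if now is None else now
    return expires <= current


# Hypothetical sample mirroring the cookie set in the commands above.
sample_state = {
    "cookies": [
        {
            "name": "saml_token",
            "value": "mock_abc123",
            "domain": "httpbin.org",
            "path": "/",
            "expires": -1,
        }
    ],
    "origins": [],
}

cookie = find_cookie(sample_state, "saml_token", "httpbin.org")
print(cookie["value"], "expired" if is_expired(cookie) else "still valid")
```

In practice you would call `load_state("/tmp/pw-auth.json")` instead of using the inline sample.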

agent-browser

agent-browser open https://httpbin.org/cookies/set/saml_token/mock_abc123
agent-browser state save test-auth
agent-browser close

agent-browser open https://httpbin.org/cookies
agent-browser state load test-auth
# → Cookie restored

Supports state save/load commands and a --profile flag for persistence. The Auth vault feature encrypts credentials, allowing agents to log in without ever seeing the password.

browser-use

cookies export/import commands handle cookie serialization. The --profile flag preserves cookies and localStorage across sessions. Comprehensive state management is achieved by combining LLM agent mode with profile persistence.

Lightpanda

Each fetch runs as an independent process — the design is optimized for high-volume public page retrieval. For authenticated access, you can pass HTTP headers or cookies as command-line arguments.

steel-browser

Cookies persist via session ID reuse. We confirmed that a saml_token=mock_abc123 cookie survived session recreation. steel-browser manages sessions through a REST API to which Playwright/Puppeteer clients connect, so the actual auth logic lives on the client side.

Auth Persistence Summary

| Tool | Cookie management | State save/restore | Profile persistence |
| --- | --- | --- | --- |
| playwright-CLI | cookie-list/set/get built in | state-save/load | --persistent |
| agent-browser | Via state | state save/load + Auth vault | --profile |
| browser-use | cookies export/import | Profile-based | --profile |
| steel-browser | Session ID-based (verified) | REST API | Per-session |
| Lightpanda | Via HTTP headers | Per-fetch | Lightweight design — out of scope |

Parallel Execution: Lightpanda Estimated at ~1.6GB Even for 100 Concurrent Sessions

We spun up 3 sessions simultaneously and measured completion time and total memory.

| Tool | 3-session startup time | Total memory |
| --- | --- | --- |
| playwright-CLI | 4.90s | 559 MB |
| Lightpanda | 9.34s | ~48 MB |
| agent-browser | 10.58s | 4,165 MB |
| browser-use | — | — |
| steel-browser | — | — |

browser-use supports session management via its --session option, but we did not test it here. steel-browser can create multiple sessions concurrently through its REST API, but was also excluded from this measurement.

playwright-CLI manages parallel sessions with named sessions (-s s1, -s s2) and shares Chrome processes efficiently. Lightpanda runs each fetch as an independent process (~16MB each), so even 100 concurrent sessions would use an estimated 1.6GB (16MB x 100). agent-browser launches an independent Chrome instance per session, leading to higher memory consumption.
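The 1.6GB figure is a straight-line extrapolation from the 3-session measurement. A small sketch makes the assumption explicit: it treats every session as costing the same and sharing nothing. That is reasonable for Lightpanda's independent processes, but probably pessimistic for playwright-CLI, which shares Chrome processes. The function name is hypothetical.

```python
def extrapolate_mb(total_mb, measured_sessions, target_sessions):
    """Linear extrapolation: assumes memory scales per session with no sharing."""
    per_session = total_mb / measured_sessions
    return per_session * target_sessions


# Total memory for 3 sessions, from the table above (MB).
measured = {
    "Lightpanda": 48,
    "playwright-CLI": 559,
    "agent-browser": 4165,
}

for tool, total in measured.items():
    estimate_gb = extrapolate_mb(total, 3, 100) / 1000
    print(f"{tool}: ~{estimate_gb:.1f} GB at 100 sessions (linear estimate)")
```

These are estimates, not measurements; only the 3-session totals were actually observed.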

Error Handling: agent-browser Gives the Clearest Messages

We tested behavior on a nonexistent URL, a 404 page, and a timeout scenario.

| Scenario | playwright-CLI | agent-browser | browser-use | Lightpanda | steel-browser |
| --- | --- | --- | --- | --- | --- |
| DNS failure | Opens page but content is empty | `✗ net::ERR_NAME_NOT_RESOLVED` | Shows URL only | HTML error (`NavigationFailed: CouldntResolveHost`) | JSON `{"message":"net::ERR_NAME_NOT_RESOLVED at ..."}` |
| 404 | Opens page but content is empty | `✗ net::ERR_HTTP_RESPONSE_CODE_FAILURE` | Shows URL only | Empty HTML | Structured JSON error |
| Timeout | — | — | — | Disconnects precisely at specified time (3.03s) | — |

agent-browser gives the most specific error messages — you can tell at a glance what went wrong, making it straightforward to build retry logic. steel-browser returns structured JSON errors, ideal for programmatic parsing. Lightpanda returns structured HTML errors, also easy to process programmatically. browser-use shows only the URL — detailed error info requires separate investigation. playwright-CLI reflects Chrome's native behavior of attempting to open the page even on errors.
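To illustrate why specific error strings make retry logic easy, here is a minimal classifier for the `net::ERR_*` codes shown in the table. The retry/fail split is a hypothetical policy for illustration, not something any of the tools prescribe, and `classify` is an assumed helper name.

```python
import re

# Hypothetical policy: which Chromium-style network errors are worth retrying.
RETRYABLE = {"ERR_TIMED_OUT", "ERR_CONNECTION_RESET", "ERR_NETWORK_CHANGED"}
PERMANENT = {"ERR_NAME_NOT_RESOLVED", "ERR_HTTP_RESPONSE_CODE_FAILURE"}

NET_ERROR = re.compile(r"net::(ERR_[A-Z_]+)")


def classify(output):
    """Map tool output to 'retry', 'fail', or 'unknown' based on its net:: code."""
    match = NET_ERROR.search(output)
    if match is None:
        return "unknown"
    code = match.group(1)
    if code in RETRYABLE:
        return "retry"
    if code in PERMANENT:
        return "fail"
    return "unknown"


# The same code appears in agent-browser's plain message and steel-browser's JSON.
print(classify("✗ net::ERR_NAME_NOT_RESOLVED"))
print(classify('{"message":"net::ERR_NAME_NOT_RESOLVED at ..."}'))
print(classify("https://example.invalid"))
```

Output that carries only a URL, as in the browser-use column, falls through to "unknown", which is exactly the case that needs separate investigation.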

JS-Heavy Sites: steel-browser Fastest on HackerNews and react.dev

We tested how each tool handles three real sites: HackerNews and GitHub, which are largely server-rendered, and react.dev, the SPA from the earlier test.

| Site | playwright-CLI | agent-browser | browser-use | Lightpanda | steel-browser |
| --- | --- | --- | --- | --- | --- |
| HackerNews | OK (4.3s) | OK (4.0s) | OK (4.24s) | OK (1.1s) | OK (0.92s) |
| GitHub | OK (2.9s) | OK (1.6s) | OK (title only) | OK (2.9s) | OK (3.45s) |
| react.dev | OK (6.2s) | OK (4.8s) | OK (12.8s) | OK (4.2s) | OK (1.5s) |

The Chrome-based tools (playwright-CLI, agent-browser, browser-use) handle all sites reliably. steel-browser's always-on Chromium inside Docker makes it the fastest on two of the three sites (0.92s for HackerNews, 1.5s for react.dev), although it was the slowest of the timed runs on GitHub at 3.45s. Lightpanda performs well on server-side-rendered sites (HackerNews, GitHub) and loaded react.dev in this test, but it has limitations with SPAs that rely heavily on client-side JS. This is by design: Lightpanda skips CSS rendering and focuses on DOM and JS execution. Broader SPA support is on the roadmap.

Token Efficiency: playwright-CLI's 317B Is the Smallest Snapshot

When an AI agent operates a browser, how much of the LLM's context window it consumes matters. We compared snapshot output sizes for github.com/microsoft/playwright.

| Tool / Format | Output size |
| --- | --- |
| browser-use `get text` | 112 bytes |
| playwright-CLI snapshot | 317 bytes |
| agent-browser snapshot -i (compact) | 16 KB |
| steel-browser markdown | 31 KB |
| Lightpanda markdown | 36 KB |
| agent-browser snapshot (full) | 70 KB |
| Lightpanda html | 409 KB |
| browser-use `get html` | 436 KB |
| steel-browser html | 445 KB |

browser-use's get text returns just 112 bytes (the page title) — the most token-efficient option for existence checks and quick verifications. However, structured data extraction requires get html (436KB) or the extract command in LLM agent mode.

playwright-CLI's 317 bytes comes from saving the snapshot payload to a file and passing only the file reference to the LLM. For coding agents handling large codebases (Claude Code, GitHub Copilot, etc.), conserving context on browser operations is essential — this design choice makes a lot of sense.

agent-browser's -i (inline compact) mode delivers 16KB while preserving the ref-numbered structure needed for interaction. steel-browser's Markdown output is 31KB, on par with Lightpanda's 36KB.
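To put the byte counts in context-window terms, a rough conversion helps. The sketch below uses the common rule of thumb of about 4 bytes per token for English-like text. That ratio is an assumption, not a real tokenizer, and markup-heavy HTML will often tokenize less efficiently.

```python
import math

BYTES_PER_TOKEN = 4  # rough rule of thumb; an assumption, not a measured value
KB = 1024

# Output sizes from the table above, in bytes.
output_sizes = {
    "browser-use get text": 112,
    "playwright-CLI snapshot": 317,
    "agent-browser snapshot -i": 16 * KB,
    "steel-browser markdown": 31 * KB,
    "Lightpanda markdown": 36 * KB,
    "agent-browser snapshot (full)": 70 * KB,
    "browser-use get html": 436 * KB,
}


def estimate_tokens(size_bytes):
    """Very rough token estimate for a payload of the given size."""
    return math.ceil(size_bytes / BYTES_PER_TOKEN)


for label, size in output_sizes.items():
    print(f"{label}: ~{estimate_tokens(size):,} tokens")
```

Even as an approximation, this shows the scale: a full HTML dump is on the order of a hundred thousand tokens, while a file reference costs well under a hundred.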

Output Quality: ref-tagged Tree vs Markdown vs JS Execution

We extracted top article titles from HackerNews to compare output quality.

agent-browser — Actionable structured data

- link "Flash-Moe: Running a 397B Parameter Model on a Mac with 48GB RAM" [ref=e111]
- link "Hormuz Minesweeper – Are you tired of winning?" [ref=e116]

Each element carries a ref ID — click e111 clicks that link. The output maps directly to agent commands.
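As a sketch of how an agent loop might consume this output, the snippet below parses lines of exactly the shape shown above and looks up a ref by link text. Real snapshots may include indentation, nesting, or extra attributes that this pattern does not handle, and the function names are hypothetical.

```python
import re

# Matches lines shaped like: - link "Some title" [ref=e111]
REF_LINE = re.compile(r'^- (?P<role>\w+) "(?P<name>.*)" \[ref=(?P<ref>\w+)\]$')

snapshot = """\
- link "Flash-Moe: Running a 397B Parameter Model on a Mac with 48GB RAM" [ref=e111]
- link "Hormuz Minesweeper – Are you tired of winning?" [ref=e116]
"""


def parse_snapshot(text):
    """Return a list of {'role', 'name', 'ref'} dicts for matching lines."""
    elements = []
    for line in text.splitlines():
        match = REF_LINE.match(line.strip())
        if match:
            elements.append(match.groupdict())
    return elements


def find_ref(elements, needle):
    """Return the ref of the first element whose name contains `needle`."""
    for element in elements:
        if needle.lower() in element["name"].lower():
            return element["ref"]
    return None


elements = parse_snapshot(snapshot)
print(find_ref(elements, "minesweeper"))
```

The returned ref is what the agent would pass to a click command in the next step of the loop.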

Lightpanda — Readable Markdown

| 1. | [Flash-Moe: Running a 397B Parameter Model...](https://github.com/danveloper/flash-moe) |
| 2. | [Hormuz Minesweeper – Are you tired of winning?](https://hormuz.pythonic.ninja/) |

Markdown table format, easy for both humans to read and LLMs to parse. Includes full link URLs.
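Because the links are ordinary Markdown, pulling out title/URL pairs takes one regular expression. This is a minimal sketch against the two rows shown above. It does not handle nested brackets or URLs containing parentheses, and `extract_links` is a hypothetical helper.

```python
import re

# Matches inline Markdown links: [title](url)
LINK = re.compile(r"\[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)")

markdown = """\
| 1. | [Flash-Moe: Running a 397B Parameter Model...](https://github.com/danveloper/flash-moe) |
| 2. | [Hormuz Minesweeper – Are you tired of winning?](https://hormuz.pythonic.ninja/) |
"""


def extract_links(text):
    """Return (title, url) tuples for every inline link in the text."""
    return [(m.group("title"), m.group("url")) for m in LINK.finditer(text)]


links = extract_links(markdown)
for title, url in links:
    print(f"{title} -> {url}")
```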

playwright-CLI — Flexible extraction via JS execution

# Single quotes keep the shell from expanding $$ (its own PID) inside the JS
playwright-cli run-code 'const titles = await page.$$eval(".titleline > a", els => els.map(a => a.textContent));'

You can run arbitrary JS, giving maximum extraction flexibility. The tradeoff is that you need to know the page structure beforehand.

browser-use — LLM decides autonomously

browser-use's extract command operates in LLM agent mode, where the LLM interprets the page and returns structured data. No selectors needed — you can simply say "get me a list of article titles" in natural language. In CLI-only mode (get text / get html), you get raw data.

steel-browser — Complete HTML structure

steel-browser's scrape API returns full HTML with link structures intact. Built-in Markdown conversion lets you pick the output format to suit your needs.

Who Should Use What

playwright-CLI — Engineers working in enterprise SAML/SSO environments

Ideal if you want an AI agent to operate internal tools like GitHub Enterprise or Jira behind SAML auth. Cookie/state-save/load is the most feature-complete of all five tools, ensuring reliable auth state persistence. Integration with Claude Code works instantly via SKILL.md. The 317-byte snapshot output stands out for token efficiency, preserving context in coding agents dealing with large codebases. It's also the only tool supporting Firefox and WebKit, making it useful for cross-browser E2E automation.

Best for: Internal tool automation, auth-heavy workflows, Claude Code/Copilot companion tool, cross-browser testing.

agent-browser — Full-stack engineers building custom AI agents

The ideal pick if you want to design your own agent loop while keeping browser operations simple. The ref-tagged accessibility tree ([ref=e111]) from snapshot can be used directly as action instructions, so "snapshot → decide → click ref" loops write themselves naturally. Error messages are the most helpful for debugging during development. The Rust-built daemon is just 5MB. Features like the Auth vault for encrypted credential management and the diff command for verifying operations make it a rich toolkit for agent development. You can also switch to Lightpanda as a backend for a lighter footprint.

Best for: Custom AI agent development, browser automation bot prototypes, action-verification loop implementation.

browser-use — Python engineers who want the fastest path to a browser automation prototype

The only tool where "find the cheapest price on this e-commerce site" or "apply to this job posting" is achievable in a few lines of Python. The LLM looks at the page and decides what to do, so you skip the selector research and page structure analysis. DOM + screenshot dual input adapts automatically to layout changes. Supports 15+ LLM providers (Claude, GPT-4o, Gemini, Ollama, etc.), including local models. Tasks are defined in natural language, which makes it easier to discuss "what the agent should do" with non-engineering team members.

Best for: Business process automation POCs, natural-language task definitions, ad-hoc web automation, RPA-style usage of internal tools.

Lightpanda — Data engineers processing high volumes of web pages

Three concurrent sessions fit in about 48MB of memory, while agent-browser's Chrome instances used over 4GB for the same three. At roughly 16MB per process, 100 simultaneous pages would need an estimated 1.6GB. That gap translates directly to infrastructure cost. Lightpanda excels at extracting data from structured sites (HackerNews, GitHub), collecting web content for RAG pipelines, and parsing large batches of URLs. It ships as a single 12MB binary with zero dependencies and has built-in MCP support for direct AI agent integration. Rich Markdown and semantic tree output makes the data easy for LLMs to digest. SPA support is expanding in upcoming releases. For static sites and structured data, it already delivers a substantial performance advantage.

The AGPL license requires source code disclosure when providing the software as a network service. Check the license terms before commercial use.

Best for: RAG web crawling, bulk URL content extraction, lightweight browser processing in CI/CD pipelines, resource-constrained environments.

steel-browser — Infra/SRE teams running browser automation in production

Built for teams that want to take Playwright or Puppeteer scripts straight to production. Fingerprint spoofing, automatic proxy rotation, CAPTCHA handling — steel-browser tackles the "walls you always hit in production" at the infrastructure layer. REST API session management fits naturally into microservice architectures. Deploy with a single Docker command, or use one-click deployment to Railway or Render. Apache-2.0 license means the self-hosted version is free to use.

Always check the target site's terms of service and robots.txt before use.

Best for: Production web scraping, data collection requiring bot detection evasion, infrastructure hardening for existing Playwright scripts, shared browser automation platforms for teams.

Decision Flowchart

[Figure: decision flowchart]

Overall Summary

| Aspect | playwright-CLI | agent-browser | browser-use | Lightpanda | steel-browser |
| --- | --- | --- | --- | --- | --- |
| Setup | Good | Good | Fair | Excellent | Fair |
| Speed | Good | Good | Fair | Excellent | Excellent |
| Memory | Fair | Fair | Fair | Excellent | Good |
| SPA support | Excellent | Excellent | Excellent | Good | Excellent |
| Auth persistence | Excellent | Excellent | Good | Fair | Good |
| Parallel execution | Good | Fair | — | Excellent | — |
| Error handling | Fair | Excellent | Fair | Good | Excellent |
| JS-heavy sites | Excellent | Excellent | Excellent | Fair | Excellent |
| Token efficiency | Excellent | Good | Good | Good | Good |
| Output quality | Good | Excellent | Good | Good | Good |

Excellent = stands out in this aspect. Good = practical. Fair = has limitations. — = not tested.

Closing Thoughts

I personally use playwright-CLI with Claude Code in my daily workflow. The solid auth management and token efficiency fit well into my development routine.

Each of the five tools has a clearly defined sweet spot. For auth management, playwright-CLI. For agent development, agent-browser. For autonomous natural-language operation, browser-use. For high-volume crawling, Lightpanda. For production infrastructure, steel-browser. I encourage you to try the one that matches your use case.

The data in this article was measured in March 2026. This space moves fast — new tools appear monthly and existing ones evolve rapidly. It's always worth re-running benchmarks on the latest versions.
