DEV Community

zmy
zmy

Posted on

How to Give Your AI Agent Browser Superpowers with Browser-CLI

The Problem: AI Agents Can't Use Browsers

Modern AI coding assistants like Claude Code, Cursor, and Codex are incredibly powerful — but they have a blind spot: they can't interact with web browsers.

Want your AI agent to:

  • Log into a website and scrape data? ❌ Can't do it
  • Fill out and submit forms automatically? ❌ Can't do it
  • Take screenshots of web pages? ❌ Can't do it
  • Test your web application end-to-end? ❌ Can't do it

Puppeteer and Playwright require writing JavaScript/Python scripts — something AI agents struggle to do reliably in real-world scenarios.

Introducing Browser-CLI

Browser-CLI is a command-line tool that wraps browser automation into simple CLI commands, designed specifically for AI agents to call directly.

# Navigate to a page
browser-cli navigate https://example.com

# Fill a form field
browser-cli fill "#search" "browser automation"

# Click a button
browser-cli click "button[type=submit]"

# Extract page content as JSON
browser-cli text

# Take a screenshot
browser-cli screenshot result.png

# Execute JavaScript
browser-cli eval "document.title"
Enter fullscreen mode Exit fullscreen mode

Every command returns structured JSON output, making it easy for AI agents to parse and act on the results.

Key Features

AI-First Design

  • JSON output for all commands — easy for agents to parse
  • Clear, semantic command names — agents know exactly what to call
  • Session-based isolation — multiple agents can run in parallel without conflicts

Login Persistence

Manual login once, reuse forever:

# Login manually (opens browser for you)
browser-cli login https://github.com

# Later, reuse the saved state
browser-cli --state ./github-state.json navigate https://github.com/settings
Enter fullscreen mode Exit fullscreen mode

Web Components Support

Smart-click automatically detects internal methods on custom elements:

browser-cli click "my-custom-button"
# Automatically finds and calls the internal click handler
Enter fullscreen mode Exit fullscreen mode

Zero-Code Automation

No scripts needed. Just CLI commands that any AI agent can execute:

browser-cli navigate https://news.ycombinator.com
browser-cli text
# Returns: { "title": "Hacker News", "content": "...", "url": "..." }
Enter fullscreen mode Exit fullscreen mode

Integration with AI Coding Tools

Claude Code

Just tell Claude to use browser-cli commands:

> Use browser-cli to check the latest posts on Hacker News
> browser-cli navigate https://news.ycombinator.com
> browser-cli text
Enter fullscreen mode Exit fullscreen mode

Cursor / Codex

Same approach — AI agents call browser-cli as a shell command and parse the JSON response.

Comparison with Alternatives

Feature Browser-CLI Puppeteer Playwright browser-use
AI-friendly output JSON JS API JS/Python Natural language
Zero-code usage CLI Scripts Scripts LLM-driven
Session isolation Built-in Manual Manual -
Login persistence Built-in Manual Manual -
Web Components Smart-click - - -
Language Go (single binary) Node.js Node/Python Python

Quick Start

git clone https://github.com/zmysysz/browser-cli
cd browser-cli
make build
make setup-browsers
Enter fullscreen mode Exit fullscreen mode

Then start automating:

browser-cli navigate https://example.com
browser-cli screenshot page.png
Enter fullscreen mode Exit fullscreen mode

GitHub

https://github.com/zmysysz/browser-cli

Built with Go + Playwright. Single binary, no CGO required. Open source and free.

Would love your feedback and feature requests!

Top comments (0)