We just shipped browser-act CLI — browser automation without writing code

#opensource #webdev #cli #automation

We built BrowserAct because we kept running into the same wall: every time we needed to automate something in a browser, we had to start a whole project.

npm init -y
npm install playwright
npx playwright install chromium
# ... now write 25 lines of async/await just to load a page

That's fine when you're building a test suite. Most of the time, you just want to grab a page's content, click something, or take a screenshot — from the terminal, in 30 seconds.

So we built browser-act CLI. Browser automation as terminal commands. No code, no project setup, no framework.

What it looks like

This is a real run. Three commands against Hacker News:

browser-act --session s1 navigate "https://news.ycombinator.com"
browser-act --session s1 wait stable
browser-act --session s1 get markdown

Full page extracted as clean structured markdown — 3 commands, no code written.

Output: 15,547 characters of clean markdown from 78,320 characters of raw HTML, about a fifth of the original size. browser-act automatically strips ads, nav bars, and irrelevant noise.

Why not just use Playwright?

Playwright's getting-started flow is npm init, install, then a download of roughly 400MB of browser binaries, all before you write a single line of automation.

| | Playwright / Puppeteer | browser-act CLI |
|---|---|---|
| First-time setup | npm init + install + ~400MB browser download | `npx skills add browser-act/skills --skill browser-act` (once, global) |
| Navigate + extract content | ~25 lines of async/await boilerplate | 3 commands |
| Session state | Manual context management in every script | `--session` persists automatically between commands |
| Shell integration | Requires Node.js or Python runtime | Pipe output directly to `grep` / `jq` / anything |

Playwright is still the right choice for full E2E test suites with parallel workers, trace viewers, and CI pipelines. browser-act CLI is for everything else.
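
As a small illustration of the shell-integration point, here is a minimal sketch of a filter that pulls link URLs out of markdown arriving on stdin. The helper name `md_links` is hypothetical and not part of browser-act, and the sketch assumes the extracted markdown uses plain inline `[text](url)` links without titles.

```shell
# md_links: hypothetical helper, not part of browser-act.
# Reads markdown on stdin and prints the URL of every inline
# [text](url) link, one per line.
md_links() {
    # Match the "](url)" tail of each link, then strip the "](" and ")".
    grep -oE '\]\([^)[:space:]]+\)' | sed -e 's/^](//' -e 's/)$//'
}

# Intended use, assuming session s1 has already navigated to a page:
#   browser-act --session s1 get markdown | md_links | sort -u
```

Because the helper only reads stdin, it can be tried on any saved markdown file before being wired to a live session.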

Get started

Install once:

npx skills add browser-act/skills --skill browser-act

Core commands:

# Open a page and extract its content
browser-act --session s1 navigate "https://example.com"
browser-act --session s1 wait stable
browser-act --session s1 get markdown      # clean text output
browser-act --session s1 get html          # raw HTML

# Interact
browser-act --session s1 click 3           # click element by index
browser-act --session s1 input 2 "query"   # fill a field
browser-act --session s1 keys "Enter"

# Capture
browser-act --session s1 screenshot ./out.png

# Stealth mode (bypasses bot detection)
browser-act --session s1 browser list      # pick a stealth profile

Sessions persist between commands — build multi-step automations in shell scripts without managing state yourself.
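
One way to package that in a script is a small wrapper around the navigate, wait, and extract commands. This is a sketch, not an official pattern: the function name `fetch_markdown` and the `BROWSER_ACT_BIN` override are hypothetical, and it assumes browser-act exits non-zero when a step fails and that any status output from the first two steps can safely be discarded.

```shell
# fetch_markdown: hypothetical wrapper around navigate / wait stable / get markdown.
# Usage: fetch_markdown URL [SESSION]
# BROWSER_ACT_BIN is an assumed override (not a browser-act feature) that lets
# the binary be swapped out, for example with a stub while testing a script.
fetch_markdown() {
    url=$1
    session=${2:-s1}
    bin=${BROWSER_ACT_BIN:-browser-act}

    # Chain with && so extraction is skipped if an earlier step fails
    # (assumes a non-zero exit status on failure). Output from the first
    # two steps is dropped so only the markdown reaches stdout.
    "$bin" --session "$session" navigate "$url" >/dev/null &&
        "$bin" --session "$session" wait stable >/dev/null &&
        "$bin" --session "$session" get markdown
}

# Example: fetch_markdown "https://example.com" s1 > page.md
```

Passing the same session name to later commands, such as `screenshot`, continues from the page the wrapper left open.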

What people are using it for

Web scraping — no boilerplate, just commands and output
Shell pipelines — get markdown | grep | jq — works with every Unix tool you already use
AI agents — give an LLM direct browser access via CLI commands
Deployment verification — navigate → get markdown → assert expected content
n8n / Make / Zapier integrations — use as a step in no-code workflows
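
The deployment verification case can be sketched as a tiny assertion helper that reads the extracted page text on stdin. The name `assert_contains` is hypothetical and not part of browser-act; the check is a plain fixed-string match, so it is only as strong as the string you choose.

```shell
# assert_contains: hypothetical helper, not part of browser-act.
# Reads page text on stdin and succeeds only if it contains the
# expected string verbatim (fixed-string match, not a regex).
assert_contains() {
    expected=$1
    if grep -qF -- "$expected"; then
        printf 'OK: found "%s"\n' "$expected"
    else
        printf 'FAIL: "%s" not found\n' "$expected" >&2
        return 1
    fi
}

# Intended use in a deploy script, assuming session s1 already points
# at the deployed page:
#   browser-act --session s1 get markdown | assert_contains "Welcome back" || exit 1
```

The non-zero return on a miss is what lets a CI job or shell script stop the rollout.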
