Introduction
"AI consumes intelligence at generation time. At execution time, it consumes nothing."
This is article No.74 in the "One Open Source Project a Day" series. Today's project is OpenCLI (GitHub).
The dominant approach to having AI Agents interact with browsers today is to let the LLM analyze the DOM or a screenshot in real time, then decide what to click. This has two fundamental problems: every execution burns a large number of tokens, and the results are unstable — the same operation might succeed today and fail tomorrow because the page changed slightly.
OpenCLI takes a completely different approach: generate first, execute later. Use AI once to generate a deterministic CLI Adapter for a website — then every subsequent execution of that Adapter requires zero LLM calls. Runtime cost is zero, and behavior is deterministic.
Even more importantly, OpenCLI connects to the user's already-logged-in Chrome session via a Browser Bridge Extension. Account credentials never leave the browser.
With 15,600+ stars and built by Apache Arrow/DataFusion PMC member jackwener, it is one of the most notable open-source projects in the AI Agent infrastructure space right now.
What You'll Learn
- OpenCLI's core philosophy: the "compile-time intelligence vs runtime intelligence" system design principle
- How CDP (Chrome DevTools Protocol) drives a real Chrome session
- The Adapter lifecycle: explore → synthesize → generate → cascade validation
- Self-Repair Protocol: automatic Adapter repair mechanism
- Using 91 built-in Adapters + generating new ones for any website
Prerequisites
- Basic CLI tool usage
- Basic TypeScript / Node.js knowledge (optional, for reading source code)
- Familiarity with AI Agent concepts
Project Background
What Is It?
OpenCLI positions itself as a tool that converts websites, browser sessions, Electron apps, and local tools into deterministic CLI interfaces, serving both human users and AI Agents.
Three key words in that sentence:
- Deterministic: Adapter execution results are predictable — no LLM randomness
- CLI interface: Any AI Agent can invoke it via standard shell commands
- Human + AI dual scenario: Works as a daily command-line utility and as a tool layer for AI Agents
The project's philosophy comes from a core concept in database engineering — query optimization: consume computational resources at compile time (query planning) to make optimal decisions, then execute efficiently at runtime following the plan, without making expensive decisions during execution. OpenCLI transplants this thinking into Web automation.
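The analogy can be made concrete with a small sketch (hypothetical code, not from the OpenCLI source): an expensive "planning" step runs once and produces a cheap, deterministic function that is reused on every subsequent call.

```javascript
// Hypothetical illustration of "compile once, execute many times".
// expensivePlan() stands in for a one-time LLM call; the function it
// returns is deterministic and costs nothing to run.
function expensivePlan(schema) {
  // Imagine this step costs tokens: decide once which field to extract.
  const field = schema.includes('title') ? 'title' : 'name';
  return (row) => row[field]; // the "compiled" deterministic plan
}

const extract = expensivePlan(['rank', 'title', 'url']); // paid once
const rows = [{ title: 'A' }, { title: 'B' }];
console.log(rows.map(extract)); // every call after this is free
```

The same shape appears in database engines: the planner may be slow, but the plan it emits is a fixed artifact that executes identically every time.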
About the Author
- Author: jackwener (real name: jakevin)
- Location: Hangzhou, China
- GitHub Followers: 2,200+
- PMC Member & Committer: Apache Arrow, Apache DataFusion, Apache Doris
- Work History: MegaETH, SelectDB (the commercial company behind Apache Doris), ByteDance RDS, NebulaGraph
- Technical Expertise: Database query engines, Rust, Java, Go, Python, C++
- Related Projects: opencode-ios (iOS AI coding assistant)
jackwener is a senior database / infrastructure engineer. OpenCLI is his crossover into AI Agent toolchain territory, and the system design thinking is clearly shaped by database engineering — "deterministic execution" and "compile once, run many times" are classic database optimization concepts.
Project Stats
- ⭐ GitHub Stars: 15,600+
- 🍴 Forks: 1,500+
- 🐛 Open Issues: 39
- 🔀 Open PRs: 49
- 📝 Total Commits: 845
- 📦 Latest Version: v1.7.0 (April 11, 2026)
- 📄 License: Apache 2.0
- 🔌 Built-in Adapters: 91
Key Features
Core Purpose
AI Agent interactions with the Web face two fundamental problems:
Problem 1: Runtime cost
Traditional approach (Browser Use / Stagehand / etc.):
Execute task → LLM analyzes DOM → LLM decides click target →
LLM validates result → LLM decides next step...
Every execution burns a large number of tokens: 100 executions = 100 LLM calls.
OpenCLI approach:
[Once] Generate Adapter (LLM consumed) → written to .js file
[Many times] Execute Adapter (zero LLM, pure deterministic JS)
100 executions = 1 LLM call
Problem 2: Account security
Traditional scraping / automation:
Browser controller needs Cookie / password
→ credentials exposed to code → security risk
OpenCLI approach:
Browser Bridge Extension connects to the user's running Chrome
→ reuses already-logged-in sessions
→ credentials never leave the browser
→ looks identical to normal user activity
Use Cases
- Stable tool layer for AI Agents: Provide Claude Code, Codex, and other AI tools with a reliable Web operation interface — replacing the unstable "screenshot and ask the LLM" pattern
- Daily command-line productivity:
  - `opencli bilibili trending | head -10` — real-time Bilibili trending
  - `opencli twitter search "AI agent" --format csv > output.csv` — export search results
- Private website automation: Generate CLI interfaces for internal company tools or personal frequently-used sites; automate data extraction and operations
- Electron desktop app control: Drive Cursor, Notion, Discord, ChatGPT Desktop, and other Electron apps via CDP for automation
- CI/CD data collection: Standard Unix exit codes make OpenCLI easy to integrate into CI/CD pipelines for automated competitor monitoring, metrics collection, etc.
Quick Start
# 1. Install OpenCLI globally
npm install -g @jackwener/opencli
# 2. Clone and load the Browser Bridge Chrome Extension
git clone https://github.com/jackwener/OpenCLI.git
# Open chrome://extensions
# Enable "Developer mode"
# Click "Load unpacked" → select the extension/ directory
# 3. Verify connection
opencli doctor
# 4. Use a built-in Adapter immediately
opencli bilibili trending --format json
opencli bilibili search "Rust tutorial" --limit 20
opencli browser screenshot --url https://github.com
# 5. Install Skills for AI Agents
npx skills add jackwener/opencli
Built-in Adapter Coverage
91 built-in Adapters covering major platforms:
| Category | Platforms |
|---|---|
| Video | Bilibili, YouTube |
| Social | Twitter/X, GitHub |
| Search | Bing, DuckDuckGo |
| News | Hacker News, Product Hunt |
| Shopping | Multiple platforms |
| Local tools | Obsidian, Docker (via CLI Hub) |
Three Operating Modes
Mode 1: Direct execution with built-in Adapters
# Get Bilibili trending, JSON output
opencli bilibili trending --format json
# Search GitHub repositories
opencli github search "react hooks" --sort stars --limit 20
# Output formats: json / csv / yaml / markdown / table
opencli hacker-news top --format table
Mode 2: Real-time browser control
# Screenshot
opencli browser screenshot --url https://example.com --output ./screenshot.png
# Click an element
opencli browser click --selector "#submit-button"
# Type text
opencli browser type --selector "input[name=search]" --text "OpenCLI"
# Execute JavaScript
opencli browser eval --script "document.title"
# Network request capture
opencli browser network --url https://api.example.com
Mode 3: Adapter generation (the core feature)
# Explore mode: record browsing behavior, analyze page structure
opencli explore https://some-new-website.com
# Synthesize: generate an Adapter draft from recorded behavior
opencli synthesize
# Generate and verify: AI-assisted generation + automatic test validation
opencli generate --url https://some-new-website.com --action "get article list"
# Cascade validation: auto-detect auth strategy (OAuth/Cookie/2FA)
opencli cascade
How It Compares
| Dimension | OpenCLI | Browser Use | Stagehand | Playwright |
|---|---|---|---|---|
| Runtime LLM cost | ✅ Zero | ❌ Every call | ❌ Partial | ✅ Zero |
| Account security | ✅ Reuses logged-in Chrome | ❌ Needs credential injection | ❌ Needs credential injection | ❌ Manual cookie management |
| AI Agent friendly | ✅ Native Skills | ✅ Direct use | ✅ Direct use | Needs wrapping |
| Execution stability | ✅ Deterministic | ❌ LLM randomness | Medium | ✅ High |
| Self-repair | ✅ Self-Repair Protocol | Relies on LLM retry | None | None |
| Learning curve | Low (direct CLI) | Medium (Python API) | Medium (TS API) | High (write scripts) |
Why choose OpenCLI?
- Cost optimal: Only consumes LLM when generating an Adapter; all subsequent executions are free
- Security optimal: Credentials never leave the local browser
- Stability optimal: Deterministic execution, unaffected by model version updates
- Database engineer's design rigor: System-level thinking, Self-Repair Protocol and other designs show careful engineering discipline
Deep Dive
Architecture Overview
OpenCLI's architecture has four layers, with the Browser Bridge Extension and CDP protocol at the core:
┌─────────────────────────────────────────┐
│ AI Agent / Human User │
└──────────────┬──────────────────────────┘
│ opencli <command> [--format json]
┌──────────────▼──────────────────────────┐
│ OpenCLI CLI Layer │
│ Commander.js + Plugin Registry │
│ execution.ts / commanderAdapter.ts │
└──────────────┬──────────────────────────┘
│ CDP Protocol (WebSocket)
┌──────────────▼──────────────────────────┐
│ Browser Bridge Extension │
│ (Chrome Extension, local WS server) │
└──────────────┬──────────────────────────┘
│ Reuses real user session
┌──────────────▼──────────────────────────┐
│ Chrome / Chromium │
│ (The user's actual running browser) │
└─────────────────────────────────────────┘
Key design decision: OpenCLI does not launch a separate headless browser (the Puppeteer/Playwright approach). Instead, it uses Chrome Extension native messaging APIs to connect to the user's currently running Chrome instance. The critical benefits:
- Reuses all existing logged-in sessions (Twitter, GitHub, company intranet...)
- Triggers normal human browser fingerprints (harder for anti-bot systems to detect)
- Credentials never touch the Node.js process
Project Directory Structure
OpenCLI/
├── src/
│ ├── cli.ts # CLI entry point
│ ├── main.ts # Main program
│ ├── commanderAdapter.ts # Commander.js command parsing
│ ├── execution.ts # Command execution engine
│ ├── plugin.ts # Plugin system
│ ├── registry.ts # Adapter registry
│ ├── generate.ts # Adapter generator
│ ├── generate-verified.ts # Generator with verification
│ ├── browser/ # CDP browser control layer
│ └── pipeline/ # Execution pipeline
├── clis/ # 91 built-in Adapters (.js files)
│ ├── bilibili.js
│ ├── github.js
│ ├── hackernews.js
│ └── ...
├── extension/ # Browser Bridge Chrome Extension
├── skills/ # AI Agent Skill definitions
│ ├── opencli-explorer/ # Explore and generate Adapters
│ ├── opencli-browser/ # Low-level browser control
│ ├── opencli-usage/ # Help discovery
│ └── opencli-oneshot/ # Single execution
└── package.json
Anatomy of an Adapter
Each Adapter is a standard JS module describing "how to extract structured data from a specific website":
// clis/hackernews.js (simplified)
module.exports = {
name: 'hacker-news',
description: 'Fetch Hacker News stories',
commands: {
top: {
description: 'Get top stories',
options: [
{ flag: '--limit <n>', description: 'Number of stories', default: 30 }
],
execute: async (options, browser) => {
// 1. Navigate to the target page
await browser.goto('https://news.ycombinator.com/');
// 2. Extract data with deterministic CSS selectors
const stories = await browser.eval(`
Array.from(document.querySelectorAll('.athing'))
.slice(0, ${options.limit})
.map(el => ({
rank: el.querySelector('.rank')?.innerText,
title: el.querySelector('.titleline > a')?.innerText,
url: el.querySelector('.titleline > a')?.href,
points: el.nextElementSibling
?.querySelector('.score')?.innerText
}))
`);
// 3. Return structured data (CLI layer handles formatting)
return stories;
}
}
}
};
The key: the execute function is purely deterministic — given the same page, it always returns the same structure. No LLM involvement at runtime. If a CSS selector breaks due to a page update, the Self-Repair Protocol kicks in.
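Because `execute` is a plain async function, that deterministic contract can be exercised without a live browser. A minimal unit-test sketch (illustrative only, not from the repo), assuming a mock object that satisfies the `goto`/`eval` interface used above:

```javascript
// Unit-test sketch: run an Adapter-style execute() against a mock
// browser instead of a real CDP session (illustrative, not repo code).
const adapter = {
  execute: async (options, browser) => {
    await browser.goto('https://news.ycombinator.com/');
    return browser.eval('/* selector script */');
  },
};

const mockBrowser = {
  goto: async (url) => { /* no-op: pretend navigation succeeded */ },
  eval: async (script) => [
    { rank: '1.', title: 'Example story', url: 'https://example.com' },
  ],
};

adapter.execute({ limit: 30 }, mockBrowser).then((stories) => {
  // Same input, same output: the deterministic contract in action.
  console.log(stories[0].title);
});
```

This is also why Adapters survive model version updates untouched: nothing in the execution path depends on an LLM.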
Self-Repair Protocol (New in v1.7.0)
This is one of OpenCLI's most technically interesting designs. When an Adapter fails, the system automatically initiates a repair cycle — no human intervention required:
Adapter executes
↓
Failure (selector mismatch / page structure changed)
↓
[Step 1] Enable structured diagnostics (OPENCLI_DIAGNOSTIC=1)
→ Capture error type, DOM snapshot, execution context
↓
[Step 2] Send diagnostics to LLM
→ Analyze: which selector broke? How did the page structure change?
↓
[Step 3] LLM generates repaired Adapter code
↓
[Step 4] Auto-replace and retry (up to 3 times)
↓
Success → save repaired Adapter
Failure (after 3 attempts) → escalate to user for manual intervention
This design constrains LLM usage to necessary moments (repair events) rather than the normal execution path — a precise balance between cost and reliability.
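The retry-with-repair loop described above can be sketched generically. This is a hypothetical reconstruction, not OpenCLI source; `repairWithLLM` is a placeholder for the diagnostics → LLM → new-code round trip.

```javascript
// Sketch of a self-repair loop: the normal path costs zero LLM calls;
// the LLM is invoked only when an attempt fails (repairWithLLM is a
// placeholder for the diagnostic round trip).
async function runWithSelfRepair(adapterFn, repairWithLLM, maxAttempts = 3) {
  let current = adapterFn;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await current(); // normal path: pure deterministic execution
    } catch (err) {
      if (attempt === maxAttempts) throw err; // escalate to the user
      current = await repairWithLLM(current, err); // the only paid step
    }
  }
}

// Demo with stubs: the first attempt fails, the "repair" yields a
// working function, and the second attempt succeeds.
const broken = async () => { throw new Error('selector mismatch'); };
const fakeRepair = async () => async () => 'repaired result';
runWithSelfRepair(broken, fakeRepair).then(console.log);
```

The design choice worth noting: the success path never touches the LLM, so a stable Adapter stays free forever, and cost is proportional to how often the target site actually changes.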
Adapter Generation: The Four-Step Cascade
Generating an Adapter for a new website is OpenCLI's most complex and valuable feature, structured as four commands:
# Step 1: explore
# Record interactions in the real browser; analyze page structure and auth strategy
opencli explore https://target-site.com
# Step 2: synthesize
# Transform recorded interactions into an Adapter structure draft
opencli synthesize
# Step 3: generate
# AI-assisted conversion of the draft to complete, executable Adapter code
# with automatic test validation
opencli generate --url https://target-site.com --action "extract article list"
# Step 4: cascade
# Validate the auth strategy (OAuth, Cookie, 2FA)
# Ensure the Adapter works in the real production environment
opencli cascade
The four commands can run step-by-step (with debugging at each stage) or as a single end-to-end pipeline.
AI Agent Skills Integration
OpenCLI ships 4 native Skills for direct use in Claude Code and other AI tools:
# Install all Skills
npx skills add jackwener/opencli
| Skill | Purpose |
|---|---|
| `opencli-explorer` | Guide AI through the complete explore + generate Adapter workflow |
| `opencli-browser` | Low-level browser control (click, type, screenshot, eval, network capture) |
| `opencli-usage` | Help AI discover which Adapters are available |
| `opencli-oneshot` | Single execution of a specific operation for rapid experimentation |
Real usage scenario in Claude Code:
User: "Grab today's top 20 from Hacker News and summarize them as a report"
Claude Code:
1. Invoke opencli-usage → discover hackernews Adapter is available
2. Invoke opencli-oneshot → opencli hacker-news top --limit 20 --format json
3. Parse JSON data, generate Markdown report
4. Zero additional LLM calls during step 2 (pure CLI execution)
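Step 3 of that workflow is plain data transformation. A sketch (with stub data whose shape mirrors the `hackernews` Adapter example earlier) of turning the JSON output into a Markdown report:

```javascript
// Sketch of "parse JSON, generate Markdown report" using stub data in
// the same shape the hackernews Adapter example returns.
const stories = [
  { rank: '1.', title: 'Example story one', points: '512 points' },
  { rank: '2.', title: 'Example story two', points: '301 points' },
];

const report = [
  '# Hacker News Top Stories',
  ...stories.map((s) => `- ${s.rank} ${s.title} (${s.points})`),
].join('\n');

console.log(report);
```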
Resources
Official
- 🌟 GitHub: https://github.com/jackwener/OpenCLI
- 📦 npm: @jackwener/opencli
- 🐛 Issues: https://github.com/jackwener/OpenCLI/issues
- 📦 Releases: https://github.com/jackwener/OpenCLI/releases
Related Projects
- Browser Use — Python AI browser automation agent
- Stagehand — TypeScript AI browser automation
- Playwright — Low-level browser automation framework
Summary
Key Takeaways
- Compile-time vs runtime intelligence: OpenCLI's foundational design philosophy — AI only participates in Adapter generation; execution requires zero LLM calls. This is a cross-domain transplant of database query optimization thinking
- Account security model: Browser Bridge Extension reuses real Chrome sessions — credentials never exposed. This is the most fundamental difference from all other browser automation tools
- Self-Repair Protocol: Automatic Adapter repair on failure, constraining LLM usage to "necessary moments" — an elegant balance of stability and cost
- Four-step Adapter generation: explore → synthesize → generate → cascade; a systematic Adapter creation process, not a one-shot prompt
- Database engineer's rigor: As an Apache Arrow/DataFusion PMC member, jackwener brings infrastructure engineering's determinism to the AI toolchain space
Who Should Use This
- AI Agent developers: Engineers who need stable, zero-cost Web operation interfaces for their agents
- Automation enthusiasts: Developers who want to turn frequently-used websites into command-line tools integrated into their own workflows
- Security-conscious teams: Organizations that don't want account credentials exposed to AI tools
- Claude Code / Codex power users: Developers who want to extend their AI tools' web operation capabilities
A Design Insight
OpenCLI's success reveals a generalizable engineering principle: in AI systems, the cost-optimal solution is usually to front-load AI decision-making to design time or first-run time, rather than every runtime. This applies beyond web automation — code template generation, workflow planning, system configuration optimization all have similar optimization opportunities.
As long as AI tool usage remains non-trivial in cost, the "generate once, execute many times" pattern will emerge in more and more domains.
Visit my personal site for more useful knowledge and interesting products