Introduction
"AI consumes intelligence at generation time. At execution time, it consumes nothing."
This is article No.74 in the "One Open Source Project a Day" series. Today's project is OpenCLI (GitHub).
The dominant approach to having AI Agents interact with browsers today is to let the LLM analyze the DOM or a screenshot in real time, then decide what to click. This has two fundamental problems: every execution burns a large number of tokens, and the results are unstable — the same operation might succeed today and fail tomorrow because the page changed slightly.
OpenCLI takes a completely different approach: generate first, execute later. Use AI once to generate a deterministic CLI Adapter for a website — then every subsequent execution of that Adapter requires zero LLM calls. Runtime cost is zero, and behavior is deterministic.
Even more importantly, OpenCLI connects to the user's already-logged-in Chrome session via a Browser Bridge Extension. Account credentials never leave the browser.
With 15,600+ stars and built by Apache Arrow/DataFusion PMC member jackwener, it is one of the most notable open-source projects in the AI Agent infrastructure space right now.
What You'll Learn
- OpenCLI's core philosophy: the "compile-time intelligence vs runtime intelligence" system design principle
- How CDP (Chrome DevTools Protocol) drives a real Chrome session
- The Adapter lifecycle: explore → synthesize → generate → cascade validation
- Self-Repair Protocol: automatic Adapter repair mechanism
- Using 91 built-in Adapters + generating new ones for any website
Prerequisites
- Basic CLI tool usage
- Basic TypeScript / Node.js knowledge (optional, for reading source code)
- Familiarity with AI Agent concepts
Project Background
What Is It?
OpenCLI positions itself as a tool that converts websites, browser sessions, Electron apps, and local tools into deterministic CLI interfaces, serving both human users and AI Agents.
Three key words in that sentence:
- Deterministic: Adapter execution results are predictable — no LLM randomness
- CLI interface: Any AI Agent can invoke it via standard shell commands
- Human + AI dual scenario: Works as a daily command-line utility and as a tool layer for AI Agents
The project's philosophy comes from a core concept in database engineering — query optimization: consume computational resources at compile time (query planning) to make optimal decisions, then execute efficiently at runtime following the plan, without making expensive decisions during execution. OpenCLI transplants this thinking into Web automation.
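The analogy can be made concrete with a small sketch (hypothetical code, not from the OpenCLI source): an expensive "planning" step runs once and produces a cheap, deterministic function that is reused on every subsequent call.

```javascript
// Hypothetical illustration of "compile once, execute many times".
// expensivePlan() stands in for a one-time LLM call; the function it
// returns is deterministic and costs nothing to run.
function expensivePlan(schema) {
  // Imagine this step costs tokens: decide once which field to extract.
  const field = schema.includes('title') ? 'title' : 'name';
  return (row) => row[field]; // the "compiled" deterministic plan
}

const extract = expensivePlan(['rank', 'title', 'url']); // paid once
const rows = [{ title: 'A' }, { title: 'B' }];
console.log(rows.map(extract)); // every call after this is free
```

The same shape appears in database engines: the planner may be slow, but the plan it emits is a fixed artifact that executes identically every time.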
About the Author
- Author: jackwener (real name: jakevin)
- Location: Hangzhou, China
- GitHub Followers: 2,200+
- PMC Member & Committer: Apache Arrow, Apache DataFusion, Apache Doris
- Work History: MegaETH, SelectDB (the commercial company behind Apache Doris), ByteDance RDS, NebulaGraph
- Technical Expertise: Database query engines, Rust, Java, Go, Python, C++
- Related Projects: opencode-ios (iOS AI coding assistant)
jackwener is a senior database / infrastructure engineer. OpenCLI is his crossover into AI Agent toolchain territory, and the system design thinking is clearly shaped by database engineering — "deterministic execution" and "compile once, run many times" are classic database optimization concepts.
Project Stats
- ⭐ GitHub Stars: 15,600+
- 🍴 Forks: 1,500+
- 🐛 Open Issues: 39
- 🔀 Open PRs: 49
- 📝 Total Commits: 845
- 📦 Latest Version: v1.7.0 (April 11, 2026)
- 📄 License: Apache 2.0
- 🔌 Built-in Adapters: 91
Key Features
Core Purpose
AI Agent interactions with the Web face two fundamental problems:
Problem 1: Runtime cost
Traditional approach (Browser Use / Stagehand / etc.):
Execute task → LLM analyzes DOM → LLM decides click target →
LLM validates result → LLM decides next step...
Every execution burns a large number of tokens: 100 executions = 100 LLM calls.
OpenCLI approach:
[Once] Generate Adapter (LLM consumed) → written to .js file
[Many times] Execute Adapter (zero LLM, pure deterministic JS)
100 executions = 1 LLM call
Problem 2: Account security
Traditional scraping / automation:
Browser controller needs Cookie / password
→ credentials exposed to code → security risk
OpenCLI approach:
Browser Bridge Extension connects to the user's running Chrome
→ reuses already-logged-in sessions
→ credentials never leave the browser
→ looks identical to normal user activity
Use Cases
- Stable tool layer for AI Agents: Provide Claude Code, Codex, and other AI tools with a reliable Web operation interface — replacing the unstable "screenshot and ask the LLM" pattern
- Daily command-line productivity:
  - `opencli bilibili trending | head -10` — real-time Bilibili trending
  - `opencli twitter search "AI agent" --format csv > output.csv` — export search results
- Private website automation: Generate CLI interfaces for internal company tools or personal frequently-used sites; automate data extraction and operations
- Electron desktop app control: Drive Cursor, Notion, Discord, ChatGPT Desktop, and other Electron apps via CDP for automation
- CI/CD data collection: Standard Unix exit codes make OpenCLI easy to integrate into CI/CD pipelines for automated competitor monitoring, metrics collection, etc.
Quick Start
# 1. Install OpenCLI globally
npm install -g @jackwener/opencli
# 2. Clone and load the Browser Bridge Chrome Extension
git clone https://github.com/jackwener/OpenCLI.git
# Open chrome://extensions
# Enable "Developer mode"
# Click "Load unpacked" → select the extension/ directory
# 3. Verify connection
opencli doctor
# 4. Use a built-in Adapter immediately
opencli bilibili trending --format json
opencli bilibili search "Rust tutorial" --limit 20
opencli browser screenshot --url https://github.com
# 5. Install Skills for AI Agents
npx skills add jackwener/opencli
Built-in Adapter Coverage
91 built-in Adapters covering major platforms:
| Category | Platforms |
|---|---|
| Video | Bilibili, YouTube |
| Social | Twitter/X, GitHub |
| Search | Bing, DuckDuckGo |
| News | Hacker News, Product Hunt |
| Shopping | Multiple platforms |
| Local tools | Obsidian, Docker (via CLI Hub) |
Three Operating Modes
Mode 1: Direct execution with built-in Adapters
# Get Bilibili trending, JSON output
opencli bilibili trending --format json
# Search GitHub repositories
opencli github search "react hooks" --sort stars --limit 20
# Output formats: json / csv / yaml / markdown / table
opencli hacker-news top --format table
Mode 2: Real-time browser control
# Screenshot
opencli browser screenshot --url https://example.com --output ./screenshot.png
# Click an element
opencli browser click --selector "#submit-button"
# Type text
opencli browser type --selector "input[name=search]" --text "OpenCLI"
# Execute JavaScript
opencli browser eval --script "document.title"
# Network request capture
opencli browser network --url https://api.example.com
Mode 3: Adapter generation (the core feature)
# Explore mode: record browsing behavior, analyze page structure
opencli explore https://some-new-website.com
# Synthesize: generate an Adapter draft from recorded behavior
opencli synthesize
# Generate and verify: AI-assisted generation + automatic test validation
opencli generate --url https://some-new-website.com --action "get article list"
# Cascade validation: auto-detect auth strategy (OAuth/Cookie/2FA)
opencli cascade
How It Compares
| Dimension | OpenCLI | Browser Use | Stagehand | Playwright |
|---|---|---|---|---|
| Runtime LLM cost | ✅ Zero | ❌ Every call | ❌ Partial | ✅ Zero |
| Account security | ✅ Reuses logged-in Chrome | ❌ Needs credential injection | ❌ Needs credential injection | ❌ Manual cookie management |
| AI Agent friendly | ✅ Native Skills | ✅ Direct use | ✅ Direct use | Needs wrapping |
| Execution stability | ✅ Deterministic | ❌ LLM randomness | Medium | ✅ High |
| Self-repair | ✅ Self-Repair Protocol | Relies on LLM retry | None | None |
| Learning curve | Low (direct CLI) | Medium (Python API) | Medium (TS API) | High (write scripts) |
Why choose OpenCLI?
- Cost optimal: Only consumes LLM when generating an Adapter; all subsequent executions are free
- Security optimal: Credentials never leave the local browser
- Stability optimal: Deterministic execution, unaffected by model version updates
- Database engineer's design rigor: System-level thinking, Self-Repair Protocol and other designs show careful engineering discipline
Deep Dive
Architecture Overview
OpenCLI's architecture has four layers, with the Browser Bridge Extension and CDP protocol at the core:
┌─────────────────────────────────────────┐
│ AI Agent / Human User │
└──────────────┬──────────────────────────┘
│ opencli <command> [--format json]
┌──────────────▼──────────────────────────┐
│ OpenCLI CLI Layer │
│ Commander.js + Plugin Registry │
│ execution.ts / commanderAdapter.ts │
└──────────────┬──────────────────────────┘
│ CDP Protocol (WebSocket)
┌──────────────▼──────────────────────────┐
│ Browser Bridge Extension │
│ (Chrome Extension, local WS server) │
└──────────────┬──────────────────────────┘
│ Reuses real user session
┌──────────────▼──────────────────────────┐
│ Chrome / Chromium │
│ (The user's actual running browser) │
└─────────────────────────────────────────┘
Key design decision: OpenCLI does not launch a separate headless browser (the Puppeteer/Playwright approach). Instead, it uses Chrome Extension native messaging APIs to connect to the user's currently running Chrome instance. The critical benefits:
- Reuses all existing logged-in sessions (Twitter, GitHub, company intranet...)
- Triggers normal human browser fingerprints (harder for anti-bot systems to detect)
- Credentials never touch the Node.js process
Project Directory Structure
OpenCLI/
├── src/
│ ├── cli.ts # CLI entry point
│ ├── main.ts # Main program
│ ├── commanderAdapter.ts # Commander.js command parsing
│ ├── execution.ts # Command execution engine
│ ├── plugin.ts # Plugin system
│ ├── registry.ts # Adapter registry
│ ├── generate.ts # Adapter generator
│ ├── generate-verified.ts # Generator with verification
│ ├── browser/ # CDP browser control layer
│ └── pipeline/ # Execution pipeline
├── clis/ # 91 built-in Adapters (.js files)
│ ├── bilibili.js
│ ├── github.js
│ ├── hackernews.js
│ └── ...
├── extension/ # Browser Bridge Chrome Extension
├── skills/ # AI Agent Skill definitions
│ ├── opencli-explorer/ # Explore and generate Adapters
│ ├── opencli-browser/ # Low-level browser control
│ ├── opencli-usage/ # Help discovery
│ └── opencli-oneshot/ # Single execution
└── package.json
Anatomy of an Adapter
Each Adapter is a standard JS module describing "how to extract structured data from a specific website":
// clis/hackernews.js (simplified)
module.exports = {
name: 'hacker-news',
description: 'Fetch Hacker News stories',
commands: {
top: {
description: 'Get top stories',
options: [
{ flag: '--limit <n>', description: 'Number of stories', default: 30 }
],
execute: async (options, browser) => {
// 1. Navigate to the target page
await browser.goto('https://news.ycombinator.com/');
// 2. Extract data with deterministic CSS selectors
const stories = await browser.eval(`
Array.from(document.querySelectorAll('.athing'))
.slice(0, ${options.limit})
.map(el => ({
rank: el.querySelector('.rank')?.innerText,
title: el.querySelector('.titleline > a')?.innerText,
url: el.querySelector('.titleline > a')?.href,
points: el.nextElementSibling
?.querySelector('.score')?.innerText
}))
`);
// 3. Return structured data (CLI layer handles formatting)
return stories;
}
}
}
};
The key: the execute function is purely deterministic — given the same page, it always returns the same structure. No LLM involvement at runtime. If a CSS selector breaks due to a page update, the Self-Repair Protocol kicks in.
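Because `execute` is a plain async function, that deterministic contract can be exercised without a live browser. A minimal unit-test sketch (illustrative only, not from the repo), assuming a mock object that satisfies the `goto`/`eval` interface used above:

```javascript
// Unit-test sketch: run an Adapter-style execute() against a mock
// browser instead of a real CDP session (illustrative, not repo code).
const adapter = {
  execute: async (options, browser) => {
    await browser.goto('https://news.ycombinator.com/');
    return browser.eval('/* selector script */');
  },
};

const mockBrowser = {
  goto: async (url) => { /* no-op: pretend navigation succeeded */ },
  eval: async (script) => [
    { rank: '1.', title: 'Example story', url: 'https://example.com' },
  ],
};

adapter.execute({ limit: 30 }, mockBrowser).then((stories) => {
  // Same input, same output: the deterministic contract in action.
  console.log(stories[0].title);
});
```

This is also why Adapters survive model version updates untouched: nothing in the execution path depends on an LLM.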
Self-Repair Protocol (New in v1.7.0)
This is one of OpenCLI's most technically interesting designs. When an Adapter fails, the system automatically initiates a repair cycle — no human intervention required:
Adapter executes
↓
Failure (selector mismatch / page structure changed)
↓
[Step 1] Enable structured diagnostics (OPENCLI_DIAGNOSTIC=1)
→ Capture error type, DOM snapshot, execution context
↓
[Step 2] Send diagnostics to LLM
→ Analyze: which selector broke? How did the page structure change?
↓
[Step 3] LLM generates repaired Adapter code
↓
[Step 4] Auto-replace and retry (up to 3 times)
↓
Success → save repaired Adapter
Failure (after 3 attempts) → escalate to user for manual intervention
This design constrains LLM usage to necessary moments (repair events) rather than the normal execution path — a precise balance between cost and reliability.
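The retry-with-repair loop described above can be sketched generically. This is a hypothetical reconstruction, not OpenCLI source; `repairWithLLM` is a placeholder for the diagnostics → LLM → new-code round trip.

```javascript
// Sketch of a self-repair loop: the normal path costs zero LLM calls;
// the LLM is invoked only when an attempt fails (repairWithLLM is a
// placeholder for the diagnostic round trip).
async function runWithSelfRepair(adapterFn, repairWithLLM, maxAttempts = 3) {
  let current = adapterFn;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await current(); // normal path: pure deterministic execution
    } catch (err) {
      if (attempt === maxAttempts) throw err; // escalate to the user
      current = await repairWithLLM(current, err); // the only paid step
    }
  }
}

// Demo with stubs: the first attempt fails, the "repair" yields a
// working function, and the second attempt succeeds.
const broken = async () => { throw new Error('selector mismatch'); };
const fakeRepair = async () => async () => 'repaired result';
runWithSelfRepair(broken, fakeRepair).then(console.log);
```

The design choice worth noting: the success path never touches the LLM, so a stable Adapter stays free forever, and cost is proportional to how often the target site actually changes.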
Adapter Generation: The Four-Step Cascade
Generating an Adapter for a new website is OpenCLI's most complex and valuable feature, structured as four commands:
# Step 1: explore
# Record interactions in the real browser; analyze page structure and auth strategy
opencli explore https://target-site.com
# Step 2: synthesize
# Transform recorded interactions into an Adapter structure draft
opencli synthesize
# Step 3: generate
# AI-assisted conversion of the draft to complete, executable Adapter code
# with automatic test validation
opencli generate --url https://target-site.com --action "extract article list"
# Step 4: cascade
# Validate the auth strategy (OAuth, Cookie, 2FA)
# Ensure the Adapter works in the real production environment
opencli cascade
The four commands can run step-by-step (with debugging at each stage) or as a single end-to-end pipeline.
AI Agent Skills Integration
OpenCLI ships 4 native Skills for direct use in Claude Code and other AI tools:
# Install all Skills
npx skills add jackwener/opencli
| Skill | Purpose |
|---|---|
| `opencli-explorer` | Guide AI through the complete explore + generate Adapter workflow |
| `opencli-browser` | Low-level browser control (click, type, screenshot, eval, network capture) |
| `opencli-usage` | Help AI discover which Adapters are available |
| `opencli-oneshot` | Single execution of a specific operation for rapid experimentation |
Real usage scenario in Claude Code:
User: "Grab today's top 20 from Hacker News and summarize them as a report"
Claude Code:
1. Invoke opencli-usage → discover hackernews Adapter is available
2. Invoke opencli-oneshot → opencli hacker-news top --limit 20 --format json
3. Parse JSON data, generate Markdown report
4. Zero additional LLM calls during step 2 (pure CLI execution)
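Step 3 of that workflow is plain data transformation. A sketch (with stub data whose shape mirrors the `hackernews` Adapter example earlier) of turning the JSON output into a Markdown report:

```javascript
// Sketch of "parse JSON, generate Markdown report" using stub data in
// the same shape the hackernews Adapter example returns.
const stories = [
  { rank: '1.', title: 'Example story one', points: '512 points' },
  { rank: '2.', title: 'Example story two', points: '301 points' },
];

const report = [
  '# Hacker News Top Stories',
  ...stories.map((s) => `- ${s.rank} ${s.title} (${s.points})`),
].join('\n');

console.log(report);
```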
Resources
Official
- 🌟 GitHub: https://github.com/jackwener/OpenCLI
- 📦 npm: @jackwener/opencli
- 🐛 Issues: https://github.com/jackwener/OpenCLI/issues
- 📦 Releases: https://github.com/jackwener/OpenCLI/releases
Related Projects
- Browser Use — Python AI browser automation agent
- Stagehand — TypeScript AI browser automation
- Playwright — Low-level browser automation framework
Summary
Key Takeaways
- Compile-time vs runtime intelligence: OpenCLI's foundational design philosophy — AI only participates in Adapter generation; execution requires zero LLM calls. This is a cross-domain transplant of database query optimization thinking
- Account security model: Browser Bridge Extension reuses real Chrome sessions — credentials never exposed. This is the most fundamental difference from all other browser automation tools
- Self-Repair Protocol: Automatic Adapter repair on failure, constraining LLM usage to "necessary moments" — an elegant balance of stability and cost
- Four-step Adapter generation: explore → synthesize → generate → cascade; a systematic Adapter creation process, not a one-shot prompt
- Database engineer's rigor: As an Apache Arrow/DataFusion PMC member, jackwener brings infrastructure engineering's determinism to the AI toolchain space
Who Should Use This
- AI Agent developers: Engineers who need stable, zero-cost Web operation interfaces for their agents
- Automation enthusiasts: Developers who want to turn frequently-used websites into command-line tools integrated into their own workflows
- Security-conscious teams: Organizations that don't want account credentials exposed to AI tools
- Claude Code / Codex power users: Developers who want to extend their AI tools' web operation capabilities
A Design Insight
OpenCLI's success reveals a generalizable engineering principle: in AI systems, the cost-optimal solution is usually to front-load AI decision-making to design time or first-run time, rather than every runtime. This applies beyond web automation — code template generation, workflow planning, system configuration optimization all have similar optimization opportunities.
As long as AI tool usage remains non-trivial in cost, the "generate once, execute many times" pattern will emerge in more and more domains.
Visit my personal site for more useful knowledge and interesting products