CAPTCHAs remain one of the most significant hurdles in AI browser automation. When AI agents navigate protected pages or submit forms, these challenges often stall the workflow until a human intervenes.
Vibium has emerged as a powerful, next-generation automation tool designed specifically for AI agents. Built on the modern WebDriver BiDi protocol by the creators of Selenium and Appium, it offers a high-performance, standards-based way to control browsers. However, Vibium presents a unique challenge: it hardcodes the --disable-extensions flag, meaning traditional browser extension-based CAPTCHA bypassers won't work.
This is where CapSolver comes in. By utilizing the CapSolver REST API, you can bypass CAPTCHAs programmatically without needing any browser extensions. This guide will show you how to integrate CapSolver with Vibium to create seamless, automated workflows for your AI agents.
Understanding Vibium
Vibium is a streamlined browser automation platform. It is distributed as a single Go binary, making it incredibly easy to install and deploy. Unlike older tools that rely on the Chrome DevTools Protocol (CDP), Vibium leverages the WebDriver BiDi protocol for faster, bidirectional communication.
Core Advantages of Vibium
- WebDriver BiDi Support: Provides a standardized, high-speed connection to the browser.
- Native AI Integration: Includes a built-in MCP (Model Context Protocol) server, allowing AI agents to control the browser directly.
- Semantic Interaction: Agents can find elements based on their meaning (e.g., "the checkout button") rather than brittle CSS selectors.
- Cross-Language SDKs: Official support for Python, JavaScript/TypeScript, and Java.
- Zero-Config Setup: A single binary with no external dependencies.
For AI agents, Vibium acts as a bridge, allowing them to interact with the web using natural language commands while maintaining the precision of a programmatic API.
What is CapSolver?
CapSolver is an industry-leading CAPTCHA bypassing service powered by advanced AI. It provides automated solutions for a wide variety of anti-bot challenges, ensuring your automation scripts remain uninterrupted.
Supported CAPTCHA Solutions
- reCAPTCHA v2 & v3 (including Enterprise versions)
- Cloudflare Turnstile & 5-second Challenges
- AWS WAF CAPTCHA
- GeeTest v3/v4
- And many other anti-bot mechanisms.
Why the API-Based Approach is Superior for Vibium
With frameworks like Playwright or Puppeteer, a common way to bypass CAPTCHAs is to load a solver extension into Chrome. Because Vibium hardcodes the --disable-extensions flag, that route is closed, so this guide calls the CapSolver API directly. The API approach also gives the agent explicit control over when a challenge is solved and how the token is injected.
| Feature | Extension-Based (Playwright/Puppeteer) | API-Based (Vibium + CapSolver) |
|---|---|---|
| Mechanism | Automatic detection via extension | Explicit API calls and token injection |
| Extension Required | Yes | No (Pure HTTP) |
| Agent Control | Opaque/Automatic | Full programmatic control |
| Compatibility | Limited by browser flags | Works with any configuration |
| Flexibility | Fixed logic | Customizable retry and injection logic |
By using the API, you can precisely manage when a CAPTCHA is bypassed and how the resulting token is submitted, making it the ideal choice for restricted environments.
Prerequisites
To get started, ensure you have the following:
- Vibium Installed: Get it from the official GitHub repository.
- CapSolver Account: Sign up at CapSolver to get your API key.
- Development Environment: Node.js 18+, Python 3.8+, or Java 17+.
Installing Vibium
# Quick install for macOS / Linux
curl -fsSL https://vibium.dev/install.sh | bash
# Verify installation
vibium --version
Vibium manages its own browser instances, so you don't need to worry about installing specific versions of Chromium or Chrome for Testing.
Step-by-Step Integration Guide
1. Configure Your API Key
Sign up at CapSolver and retrieve your API key from the dashboard. Set it as an environment variable:
export CAPSOLVER_API_KEY="CAP-YOUR_ACTUAL_API_KEY"
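Before launching a browser, it can help to confirm the key is actually visible to your script. The following is a minimal sketch; `load_api_key` is a hypothetical helper name, not part of Vibium or CapSolver.

```python
import os


def load_api_key(env=os.environ) -> str:
    """Return the CapSolver API key, or raise a clear error if it is missing."""
    key = env.get("CAPSOLVER_API_KEY", "").strip()
    if not key:
        raise RuntimeError("CAPSOLVER_API_KEY is not set; export it before running")
    return key
```

Calling `load_api_key()` at startup fails fast with a readable message instead of a `KeyError` halfway through a run.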
2. Install Dependencies
For Node.js:
npm install vibium
For Python:
pip install vibium requests
3. Detect CAPTCHAs on the Page
Use Vibium's browser_evaluate (page.evaluate in the SDK examples below) to inspect the DOM and identify the CAPTCHA type and site key.
JavaScript Example:
const { browser } = require('vibium/sync')

function detectCaptcha(page) {
  return page.evaluate(`(() => {
    const v2 = document.querySelector('.g-recaptcha');
    if (v2) return { type: 'recaptcha-v2', siteKey: v2.getAttribute('data-sitekey') };
    for (const s of document.querySelectorAll('script[src*="recaptcha/api.js"]')) {
      const m = s.src.match(/render=([^&]+)/);
      if (m && m[1] !== 'explicit') return { type: 'recaptcha-v3', siteKey: m[1] };
    }
    const t = document.querySelector('.cf-turnstile');
    if (t) return { type: 'turnstile', siteKey: t.getAttribute('data-sitekey') };
    return { type: 'none', siteKey: null };
  })()`)
}
Python Example:
from vibium import browser


def detect_captcha(page) -> dict:
    return page.evaluate("""(() => {
        const v2 = document.querySelector('.g-recaptcha');
        if (v2) return { type: 'recaptcha-v2', siteKey: v2.getAttribute('data-sitekey') };
        for (const s of document.querySelectorAll('script[src*="recaptcha/api.js"]')) {
            const m = s.src.match(/render=([^&]+)/);
            if (m && m[1] !== 'explicit') return { type: 'recaptcha-v3', siteKey: m[1] };
        }
        const t = document.querySelector('.cf-turnstile');
        if (t) return { type: 'turnstile', siteKey: t.getAttribute('data-sitekey') };
        return { type: 'none', siteKey: null };
    })()""")
Java Example:
// Requires: import java.util.Map;
var result = page.evaluate("""
    (() => {
      const v2 = document.querySelector('.g-recaptcha');
      if (v2) return { type: 'recaptcha-v2', siteKey: v2.getAttribute('data-sitekey') };
      for (const s of document.querySelectorAll('script[src*="recaptcha/api.js"]')) {
        const m = s.src.match(/render=([^&]+)/);
        if (m && m[1] !== 'explicit') return { type: 'recaptcha-v3', siteKey: m[1] };
      }
      const t = document.querySelector('.cf-turnstile');
      if (t) return { type: 'turnstile', siteKey: t.getAttribute('data-sitekey') };
      return { type: 'none', siteKey: null };
    })()
    """);
// The script returns a plain object, which arrives as a Map on the Java side.
Map<?, ?> captcha = (Map<?, ?>) result;
String captchaType = (String) captcha.get("type");
String siteKey = (String) captcha.get("siteKey");
4. Bypass and Inject the Token
Once detected, call the CapSolver API to bypass the challenge and inject the resulting token back into the page.
JavaScript Implementation:
const CAPSOLVER_API = 'https://api.capsolver.com'
const API_KEY = process.env.CAPSOLVER_API_KEY

// Submit a task to CapSolver and return its task ID.
async function createTask(taskData) {
  const res = await fetch(`${CAPSOLVER_API}/createTask`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ clientKey: API_KEY, task: taskData }),
  })
  const data = await res.json()
  if (data.errorId !== 0) throw new Error(`CapSolver: ${data.errorDescription}`)
  return data.taskId
}

// Poll every 2 seconds until the task is ready, fails, or the attempts run out.
async function getTaskResult(taskId, maxAttempts = 60) {
  for (let i = 0; i < maxAttempts; i++) {
    await new Promise(r => setTimeout(r, 2000))
    const res = await fetch(`${CAPSOLVER_API}/getTaskResult`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ clientKey: API_KEY, taskId }),
    })
    const data = await res.json()
    if (data.status === 'ready') return data
    // Stop on an explicit failure or an API-level error instead of polling until timeout.
    if (data.status === 'failed' || data.errorId) {
      throw new Error(`Failed: ${data.errorDescription}`)
    }
  }
  throw new Error('Timeout')
}
Full Workflow (Python):
from vibium import browser
import json, os, time, requests

CAPSOLVER_API = "https://api.capsolver.com"
API_KEY = os.environ["CAPSOLVER_API_KEY"]


def main():
    bro = browser.start()
    page = bro.page()
    try:
        # 1. Navigate to the target page
        target_url = "https://example.com/protected-page"
        page.go(target_url)

        # 2. Detect the CAPTCHA
        info = page.evaluate("""(() => {
            const el = document.querySelector('.g-recaptcha');
            return el ? { type: 'recaptcha-v2', siteKey: el.getAttribute('data-sitekey') }
                      : { type: 'none', siteKey: null };
        })()""")
        if info["type"] == "none":
            print("No CAPTCHA found.")
            return
        print(f"Detected {info['type']}, key {info['siteKey']}")

        # 3. Bypass via CapSolver API
        # (Assuming solve_captcha helper is implemented)
        token = solve_captcha(info, target_url)
        print("Solved!")

        # 4. Inject the token and submit the form
        token_js = json.dumps(token)  # embed the token as a safe JavaScript string literal
        page.evaluate(f"""
            document.querySelector('textarea[name="g-recaptcha-response"]').value = {token_js};
            try {{
              const clients = ___grecaptcha_cfg.clients;
              // Walk each client object and fire the first callback function found.
              const fire = (o) => {{
                for (const k in o) {{
                  if (typeof o[k] === 'object' && o[k] !== null) {{
                    if (typeof o[k].callback === 'function') {{ o[k].callback({token_js}); return true; }}
                    if (fire(o[k])) return true;
                  }}
                }}
                return false;
              }};
              for (const id in clients) fire(clients[id]);
            }} catch (e) {{}}
        """)
        # The form selector is site-specific; adjust it for your target page.
        page.evaluate('document.querySelector("#recaptcha-demo-form").submit()')

        # 5. Verify success
        time.sleep(2)
        print("Result:", page.evaluate("document.body.innerText"))
    finally:
        # Always close the browser, including on the early return above.
        bro.stop()


if __name__ == "__main__":
    main()
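The workflow above calls a `solve_captcha` helper that it leaves unimplemented. One possible sketch follows, mirroring the createTask and getTaskResult calls from the JavaScript implementation. Several details are assumptions to verify against the CapSolver documentation: the `websiteURL` and `websiteKey` task fields, and the `gRecaptchaResponse` and `token` solution fields. The `post` and `sleep` parameters are hypothetical hooks added so the function can be exercised without network access, and the default transport uses only the standard library.

```python
import json
import os
import time
import urllib.request

CAPSOLVER_API = "https://api.capsolver.com"

# Task types taken from this guide; extend as needed.
TASK_TYPES = {
    "recaptcha-v2": "ReCaptchaV2TaskProxyLess",
    "recaptcha-v3": "ReCaptchaV3TaskProxyLess",
    "turnstile": "AntiTurnstileTaskProxyLess",
}


def _post_json(url, payload):
    """Default transport: POST a JSON body and return the decoded JSON reply."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def solve_captcha(info, page_url, api_key=None, post=_post_json,
                  sleep=time.sleep, max_attempts=60):
    """Create a CapSolver task for the detected CAPTCHA and poll for its token."""
    api_key = api_key or os.environ["CAPSOLVER_API_KEY"]
    task = {
        "type": TASK_TYPES[info["type"]],
        "websiteURL": page_url,         # assumed field name
        "websiteKey": info["siteKey"],  # assumed field name
    }
    created = post(f"{CAPSOLVER_API}/createTask",
                   {"clientKey": api_key, "task": task})
    if created.get("errorId", 0) != 0:
        raise RuntimeError(f"CapSolver: {created.get('errorDescription')}")

    for _ in range(max_attempts):
        sleep(2)  # same 2-second polling interval as the JavaScript version
        result = post(f"{CAPSOLVER_API}/getTaskResult",
                      {"clientKey": api_key, "taskId": created["taskId"]})
        if result.get("errorId", 0) != 0 or result.get("status") == "failed":
            raise RuntimeError(f"CapSolver: {result.get('errorDescription')}")
        if result.get("status") == "ready":
            solution = result["solution"]
            # Assumed: reCAPTCHA replies carry gRecaptchaResponse, Turnstile carries token.
            return solution.get("gRecaptchaResponse") or solution.get("token")
    raise TimeoutError("CapSolver task did not finish in time")
```

With this in the same module, the call `solve_captcha(info, target_url)` in the workflow reads the key from `CAPSOLVER_API_KEY` and returns the token string.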
Supported CAPTCHA Task Types
| CAPTCHA Type | CapSolver Task Type | Token Injection Field |
|---|---|---|
| reCAPTCHA v2 | ReCaptchaV2TaskProxyLess | textarea[name="g-recaptcha-response"] |
| reCAPTCHA v3 | ReCaptchaV3TaskProxyLess | input[name="g-recaptcha-response"] |
| Cloudflare Turnstile | AntiTurnstileTaskProxyLess | input[name="cf-turnstile-response"] |
| AWS WAF | AntiAwsWafTaskProxyLess | Site-specific |
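The injection fields in the table can be turned into a small script builder. This is a sketch only: `build_injection_script` and `RESPONSE_FIELDS` are hypothetical names, the type keys match those returned by the detection snippets earlier, and AWS WAF is left out because its field is site-specific.

```python
import json

# Selectors from the table above; AWS WAF is site-specific and omitted.
RESPONSE_FIELDS = {
    "recaptcha-v2": 'textarea[name="g-recaptcha-response"]',
    "recaptcha-v3": 'input[name="g-recaptcha-response"]',
    "turnstile": 'input[name="cf-turnstile-response"]',
}


def build_injection_script(captcha_type: str, token: str) -> str:
    """Return JavaScript that writes the token into the matching response field."""
    selector = RESPONSE_FIELDS[captcha_type]
    # json.dumps produces valid JavaScript string literals, so quotes or
    # backslashes in the token or selector cannot break out of the script.
    return (
        "(() => {"
        f" const el = document.querySelector({json.dumps(selector)});"
        " if (!el) return false;"
        f" el.value = {json.dumps(token)};"
        " return true;"
        " })()"
    )
```

The returned string can be passed to `page.evaluate(...)`. It evaluates to `false` when the field is missing, which is a cue to fall back to a site-specific injection.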
Troubleshooting & Best Practices
Common Issues
- Token Expiration: CAPTCHA tokens are short-lived; reCAPTCHA tokens, for example, expire about 2 minutes after they are issued. Inject the token and submit the form immediately after receiving it.
- CORS Errors: Never call the CapSolver API from within browser_evaluate. Always make API calls from your main script (Node/Python/Java) to avoid security and cross-origin issues.
- Callback Functions: Many sites use JavaScript callbacks to handle CAPTCHA submission. Use the injection script provided above to find and trigger these callbacks automatically.
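To guard against the token-expiration issue, a script can record when each token arrived and refuse to submit a stale one. The sketch below assumes the 2-minute lifetime mentioned above; `SolvedToken` and the 10-second safety margin are illustrative choices, not part of any library.

```python
import time
from dataclasses import dataclass, field


@dataclass
class SolvedToken:
    value: str
    ttl_seconds: float = 120.0  # assumed lifetime; adjust per CAPTCHA type
    received_at: float = field(default_factory=time.monotonic)

    def is_fresh(self, now=None, safety_margin=10.0) -> bool:
        """True while the token is younger than its TTL minus a safety margin."""
        now = time.monotonic() if now is None else now
        return (now - self.received_at) < (self.ttl_seconds - safety_margin)
```

Checking `token.is_fresh()` just before injection lets the script request a new token instead of submitting one the site will reject.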
Best Practices for High Reliability
- Polling Interval: Poll the CapSolver API every 2 seconds, as the examples above do. This keeps latency low without flooding the API with requests.
- Retry Logic: Implement exponential backoff for your API calls to handle transient network failures.
- Balance Monitoring: Check your CapSolver balance programmatically before starting large automation runs to avoid interruptions.
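The retry and balance practices can be sketched as two small helpers. Both names are hypothetical. `with_backoff` retries on `OSError`, which covers standard-library network errors and `requests` exceptions. `has_enough_balance` assumes the decoded reply of CapSolver's getBalance endpoint contains `errorId` and `balance` fields; check the CapSolver documentation for the exact shape.

```python
import random
import time


def with_backoff(call, retries=4, base_delay=1.0, max_delay=30.0,
                 sleep=time.sleep, retry_on=(OSError,)):
    """Run call(), retrying transient failures with exponential backoff and jitter."""
    for attempt in range(retries + 1):
        try:
            return call()
        except retry_on:
            if attempt == retries:
                raise  # out of retries: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter spreads out retries when many workers fail at once.
            sleep(delay + random.uniform(0, delay / 2))


def has_enough_balance(balance_response, minimum=1.0):
    """Check a decoded getBalance reply (assumed shape) against a minimum."""
    if balance_response.get("errorId", 0) != 0:
        raise RuntimeError(f"CapSolver: {balance_response.get('errorDescription')}")
    return float(balance_response.get("balance", 0)) >= minimum
```

Wrapping each HTTP call, for example `with_backoff(lambda: post(url, payload))` with your own `post` function, keeps retry policy in one place. Running the balance check once before a large batch avoids failing partway through.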
Conclusion
Integrating Vibium with the CapSolver API lets AI-driven browser automation handle CAPTCHAs without any browser extension. Vibium's restriction on extensions might seem like a limitation, but the API-based approach gives you explicit control over when a challenge is solved and how the token is submitted.
By following this guide, you can ensure your AI agents navigate the web smoothly, overcoming security obstacles with ease. Ready to scale your automation? Sign up for CapSolver today and start bypassing!
