
luisgustvo


How to Bypass CAPTCHAs in Vibium: A Complete Guide for AI Agents


In AI browser automation, CAPTCHAs remain one of the most significant hurdles. When AI agents navigate protected pages or submit forms, these challenges often stall the workflow until a human steps in.

Vibium has emerged as a powerful, next-generation automation tool designed specifically for AI agents. Built on the modern WebDriver BiDi protocol by the creators of Selenium and Appium, it offers a high-performance, standards-based way to control browsers. However, Vibium presents a unique challenge: it hardcodes the --disable-extensions flag, meaning traditional browser extension-based CAPTCHA bypassers won't work.

This is where CapSolver comes in. By utilizing the CapSolver REST API, you can bypass CAPTCHAs programmatically without needing any browser extensions. This guide will show you how to integrate CapSolver with Vibium to create seamless, automated workflows for your AI agents.


Understanding Vibium

Vibium is a streamlined browser automation platform. It is distributed as a single Go binary, making it incredibly easy to install and deploy. Unlike older tools that rely on the Chrome DevTools Protocol (CDP), Vibium leverages the WebDriver BiDi protocol for faster, bidirectional communication.

Core Advantages of Vibium

  • WebDriver BiDi Support: Provides a standardized, high-speed connection to the browser.
  • Native AI Integration: Includes a built-in MCP (Model Context Protocol) server, allowing AI agents to control the browser directly.
  • Semantic Interaction: Agents can find elements based on their meaning (e.g., "the checkout button") rather than brittle CSS selectors.
  • Cross-Language SDKs: Official support for Python, JavaScript/TypeScript, and Java.
  • Zero-Config Setup: A single binary with no external dependencies.

For AI agents, Vibium acts as a bridge, allowing them to interact with the web using natural language commands while maintaining the precision of a programmatic API.


What is CapSolver?

CapSolver is an industry-leading CAPTCHA bypassing service powered by advanced AI. It provides automated solutions for a wide variety of anti-bot challenges, ensuring your automation scripts remain uninterrupted.

Supported CAPTCHA Solutions


Why the API-Based Approach is Superior for Vibium

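Automation built on frameworks like Playwright or Puppeteer commonly handles CAPTCHAs by loading a solver extension into Chrome. Because Vibium launches the browser with --disable-extensions, that route is closed, so we call the CapSolver API directly instead. This approach also gives the script explicit control over when a challenge is solved and how the token is submitted.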

Feature | Extension-Based (Playwright/Puppeteer) | API-Based (Vibium + CapSolver)
Mechanism | Automatic detection via extension | Explicit API calls and token injection
Extension Required | Yes | No (Pure HTTP)
Agent Control | Opaque/Automatic | Full programmatic control
Compatibility | Limited by browser flags | Works with any configuration
Flexibility | Fixed logic | Customizable retry and injection logic

By using the API, you can precisely manage when a CAPTCHA is bypassed and how the resulting token is submitted, making it the ideal choice for restricted environments.


Prerequisites

To get started, ensure you have the following:

  1. Vibium Installed: Get it from the official GitHub repository.
  2. CapSolver Account: Sign up at CapSolver to get your API key.
  3. Development Environment: Node.js 18+, Python 3.8+, or Java 17+.

Installing Vibium

# Quick install for macOS / Linux
curl -fsSL https://vibium.dev/install.sh | bash

# Verify installation
vibium --version

Vibium manages its own browser instances, so you don't need to worry about installing specific versions of Chromium or Chrome for Testing.


Step-by-Step Integration Guide

1. Configure Your API Key

Sign up at CapSolver and retrieve your API key from the dashboard. Set it as an environment variable:

export CAPSOLVER_API_KEY="CAP-YOUR_ACTUAL_API_KEY"

2. Install Dependencies

For Node.js:

npm install vibium

For Python:

pip install vibium requests

3. Detect CAPTCHAs on the Page

Use Vibium's browser_evaluate (page.evaluate in the SDK examples below) to inspect the DOM and identify the CAPTCHA type and site key.

JavaScript Example:

const { browser } = require('vibium/sync')

function detectCaptcha(page) {
  return page.evaluate(`(() => {
    const v2 = document.querySelector('.g-recaptcha');
    if (v2) return { type: 'recaptcha-v2', siteKey: v2.getAttribute('data-sitekey') };

    for (const s of document.querySelectorAll('script[src*="recaptcha/api.js"]')) {
      const m = s.src.match(/render=([^&]+)/);
      if (m && m[1] !== 'explicit') return { type: 'recaptcha-v3', siteKey: m[1] };
    }

    const t = document.querySelector('.cf-turnstile');
    if (t) return { type: 'turnstile', siteKey: t.getAttribute('data-sitekey') };

    return { type: 'none', siteKey: null };
  })()`)
}

Python Example:

from vibium import browser

def detect_captcha(page) -> dict:
    return page.evaluate("""(() => {
        const v2 = document.querySelector('.g-recaptcha');
        if (v2) return { type: 'recaptcha-v2', siteKey: v2.getAttribute('data-sitekey') };

        for (const s of document.querySelectorAll('script[src*="recaptcha/api.js"]')) {
            const m = s.src.match(/render=([^&]+)/);
            if (m && m[1] !== 'explicit') return { type: 'recaptcha-v3', siteKey: m[1] };
        }

        const t = document.querySelector('.cf-turnstile');
        if (t) return { type: 'turnstile', siteKey: t.getAttribute('data-sitekey') };

        return { type: 'none', siteKey: null };
    })()""")

Java Example:

var result = page.evaluate("""
    (() => {
        const v2 = document.querySelector('.g-recaptcha');
        if (v2) return { type: 'recaptcha-v2', siteKey: v2.getAttribute('data-sitekey') };

        for (const s of document.querySelectorAll('script[src*="recaptcha/api.js"]')) {
            const m = s.src.match(/render=([^&]+)/);
            if (m && m[1] !== 'explicit') return { type: 'recaptcha-v3', siteKey: m[1] };
        }

        const t = document.querySelector('.cf-turnstile');
        if (t) return { type: 'turnstile', siteKey: t.getAttribute('data-sitekey') };

        return { type: 'none', siteKey: null };
    })()
    """);
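// Cast once to a wildcard Map instead of repeating raw-type casts (requires java.util.Map)
Map<?, ?> info = (Map<?, ?>) result;
String captchaType = (String) info.get("type");
String siteKey = (String) info.get("siteKey");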

4. Bypass and Inject the Token

Once detected, call the CapSolver API to bypass the challenge and inject the resulting token back into the page.

JavaScript Implementation:

const CAPSOLVER_API = 'https://api.capsolver.com'
const API_KEY = process.env.CAPSOLVER_API_KEY

async function createTask(taskData) {
  const res = await fetch(`${CAPSOLVER_API}/createTask`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ clientKey: API_KEY, task: taskData }),
  })
  const data = await res.json()
  if (data.errorId !== 0) throw new Error(`CapSolver: ${data.errorDescription}`)
  return data.taskId
}

async function getTaskResult(taskId, maxAttempts = 60) {
  for (let i = 0; i < maxAttempts; i++) {
    await new Promise(r => setTimeout(r, 2000))
    const res = await fetch(`${CAPSOLVER_API}/getTaskResult`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ clientKey: API_KEY, taskId }),
    })
    const data = await res.json()
    // Fail fast on API errors instead of polling until the timeout
    if (data.errorId !== 0 || data.status === 'failed') {
      throw new Error(`Failed: ${data.errorDescription}`)
    }
    if (data.status === 'ready') return data
  }
  throw new Error('Timeout')
}

Full Workflow (Python):

from vibium import browser
import os, time, requests

CAPSOLVER_API = "https://api.capsolver.com"
API_KEY = os.environ["CAPSOLVER_API_KEY"]

def main():
    bro = browser.start()
    page = bro.page()

    # 1. Navigate to the target page
    target_url = "https://example.com/protected-page"
    page.go(target_url)

    # 2. Detect the CAPTCHA
    info = page.evaluate("""(() => {
        const el = document.querySelector('.g-recaptcha');
        return el ? { type: 'recaptcha-v2', siteKey: el.getAttribute('data-sitekey') }
                   : { type: 'none', siteKey: null };
    })()""")

    if info["type"] == "none":
        print("No CAPTCHA found.")
        return

    print(f"Detected {info['type']} — key {info['siteKey']}")

    # 3. Bypass via CapSolver API
    # (Assuming solve_captcha helper is implemented)
    token = solve_captcha(info, target_url)
    print("Solved!")

    # 4. Inject the token and submit the form
    page.evaluate(f"""
        document.querySelector('textarea[name="g-recaptcha-response"]').value = "{token}";
        try {{ const c = ___grecaptcha_cfg.clients; for (const id in c) {{
            const f = (o) => {{ for (const k in o) {{ if (typeof o[k]==='object'&&o[k]!==null) {{
                if (typeof o[k].callback==='function'){{o[k].callback("{token}");return true}}
                if(f(o[k]))return true}}}} return false}}; f(c[id]) }}}} catch(e){{}}
    """)
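    # The form selector is page-specific; replace "#recaptcha-demo-form" with your target form's selector
    page.evaluate('document.querySelector("#recaptcha-demo-form").submit()')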

    # 5. Verify success
    time.sleep(2)
    print("Result:", page.evaluate("document.body.innerText"))
    bro.stop()

main()
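The Python workflow above calls solve_captcha without defining it. The following is a minimal sketch of what such a helper might look like, mirroring the createTask / getTaskResult flow of the JavaScript implementation. It is not an official CapSolver client. The names post_json and TASK_TYPES, the injectable post parameter, and the use of the standard library's urllib in place of requests are assumptions made for illustration. The websiteURL and websiteKey task fields and the gRecaptchaResponse / token solution fields follow CapSolver's documented request format as best understood here, so check them against the current API reference.

```python
import json
import os
import time
import urllib.request

CAPSOLVER_API = "https://api.capsolver.com"

# Detected type -> CapSolver task type (taken from the task-type table in this guide).
TASK_TYPES = {
    "recaptcha-v2": "ReCaptchaV2TaskProxyLess",
    "recaptcha-v3": "ReCaptchaV3TaskProxyLess",
    "turnstile": "AntiTurnstileTaskProxyLess",
}


def post_json(path, payload):
    """POST a JSON payload to CapSolver and return the decoded JSON reply."""
    req = urllib.request.Request(
        f"{CAPSOLVER_API}/{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def solve_captcha(info, page_url, api_key=None, post=post_json,
                  interval=2.0, max_attempts=60):
    """Create a task for the detected CAPTCHA, then poll until a token is ready."""
    api_key = api_key or os.environ["CAPSOLVER_API_KEY"]
    task = {
        "type": TASK_TYPES[info["type"]],
        "websiteURL": page_url,
        "websiteKey": info["siteKey"],
    }
    created = post("createTask", {"clientKey": api_key, "task": task})
    if created.get("errorId", 0) != 0:
        raise RuntimeError(f"CapSolver: {created.get('errorDescription')}")
    task_id = created["taskId"]

    for _ in range(max_attempts):
        time.sleep(interval)
        result = post("getTaskResult", {"clientKey": api_key, "taskId": task_id})
        if result.get("errorId", 0) != 0 or result.get("status") == "failed":
            raise RuntimeError(f"CapSolver: {result.get('errorDescription')}")
        if result.get("status") == "ready":
            solution = result["solution"]
            # Assumption: reCAPTCHA replies carry gRecaptchaResponse, Turnstile replies carry token.
            return solution.get("gRecaptchaResponse") or solution.get("token")
    raise TimeoutError("CapSolver did not return a solution in time")
```

Passing the HTTP function in as post keeps the polling logic testable without network access. In real use the default post_json is enough: token = solve_captcha(info, target_url).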

Supported CAPTCHA Task Types

CAPTCHA Type | CapSolver Task Type | Token Injection Field
reCAPTCHA v2 | ReCaptchaV2TaskProxyLess | textarea[name="g-recaptcha-response"]
reCAPTCHA v3 | ReCaptchaV3TaskProxyLess | input[name="g-recaptcha-response"]
Cloudflare Turnstile | AntiTurnstileTaskProxyLess | input[name="cf-turnstile-response"]
AWS WAF | AntiAwsWafTaskProxyLess | Site-specific
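The injection-field column can be turned into a small lookup that builds the injection script for whichever CAPTCHA type was detected. The sketch below is illustrative only: RESPONSE_FIELDS and build_injection_script are hypothetical names, and the selectors are copied from the table, so they may not match every site. It serializes the token with json.dumps rather than interpolating it directly into the JavaScript, which keeps quotes or backslashes in the token from breaking the script.

```python
import json

# Detected type -> selector of the hidden response field (copied from the table).
RESPONSE_FIELDS = {
    "recaptcha-v2": 'textarea[name="g-recaptcha-response"]',
    "recaptcha-v3": 'input[name="g-recaptcha-response"]',
    "turnstile": 'input[name="cf-turnstile-response"]',
}


def build_injection_script(captcha_type, token):
    """Return a JS snippet that writes the token into the matching response field."""
    selector = RESPONSE_FIELDS[captcha_type]
    # json.dumps produces valid JavaScript string literals, so special
    # characters in the selector or token cannot escape the quoted string.
    return (
        "(() => {"
        f" const el = document.querySelector({json.dumps(selector)});"
        " if (!el) return false;"
        f" el.value = {json.dumps(token)};"
        " return true;"
        " })()"
    )
```

With the page object from the workflow, this would be used as page.evaluate(build_injection_script(info["type"], token)). The script evaluates to false when the field is missing, which is a cue to fall back to a site-specific callback.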

Troubleshooting & Best Practices

Common Issues

  • Token Expiration: CAPTCHA tokens are short-lived. reCAPTCHA tokens expire after about 2 minutes, and Turnstile tokens after about 5. Inject the token and submit the form immediately after receiving it.
  • CORS Errors: Never call the CapSolver API from within browser_evaluate. Always make API calls from your main script (Node/Python/Java) to avoid security and cross-origin issues.
  • Callback Functions: Many sites use JavaScript callbacks to handle CAPTCHA submission. Use the injection script provided above to find and trigger these callbacks automatically.
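One way to guard against the token-expiration issue is to record when a token was received and refuse to inject it once it is close to expiring. This is a minimal sketch, not part of Vibium or CapSolver: the SolvedToken class, the 120-second default lifetime (matching the roughly 2-minute figure mentioned for reCAPTCHA), and the 15-second safety margin are all assumptions to adjust per site.

```python
import time
from dataclasses import dataclass, field


@dataclass
class SolvedToken:
    """A CAPTCHA token together with the time it was received."""
    value: str
    ttl_seconds: float = 120.0  # assumed lifetime; tune per CAPTCHA type
    issued_at: float = field(default_factory=time.monotonic)

    def is_fresh(self, safety_margin=15.0, now=None):
        """True while the token's age is below its lifetime minus the margin."""
        now = time.monotonic() if now is None else now
        return (now - self.issued_at) < (self.ttl_seconds - safety_margin)
```

A workflow could wrap the solver result in SolvedToken(token) and request a new solution whenever is_fresh() returns False just before injection.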

Best Practices for High Reliability

  1. Polling Interval: Poll the CapSolver API about every 2 seconds, as the examples above do. This keeps latency low without flooding the API with requests.
  2. Retry Logic: Implement exponential backoff for your API calls to handle transient network failures.
  3. Balance Monitoring: Check your CapSolver balance programmatically before starting large automation runs to avoid interruptions.
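The retry and balance practices above can be sketched as two small helpers. Both names (with_backoff, has_enough_balance) are hypothetical, and the reply shape {"errorId": 0, "balance": ...} is an assumption based on CapSolver's getBalance endpoint, so verify it against the current API reference before relying on it. The backoff helper retries only on OSError by default, which covers the network errors raised by urllib and requests.

```python
import random
import time


def with_backoff(call, retries=4, base_delay=1.0, max_delay=30.0,
                 retry_on=(OSError,), sleep=time.sleep):
    """Run call(); on a retryable error wait base_delay * 2**attempt, then retry."""
    for attempt in range(retries + 1):
        try:
            return call()
        except retry_on:
            if attempt == retries:
                raise  # out of retries: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Up to 10% jitter so parallel agents do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))


def has_enough_balance(balance_reply, minimum=1.0):
    """Check a getBalance-style reply against a minimum balance threshold."""
    if balance_reply.get("errorId", 0) != 0:
        raise RuntimeError(f"CapSolver: {balance_reply.get('errorDescription')}")
    return float(balance_reply["balance"]) >= minimum
```

Any zero-argument callable that performs one API request can be passed to with_backoff. Errors outside retry_on, such as a RuntimeError for an invalid API key, propagate immediately rather than being retried.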

Conclusion

Integrating Vibium with the CapSolver API provides a robust, future-proof solution for bypassing CAPTCHAs in AI-driven browser automation. While Vibium's restriction on extensions might seem like a limitation, the API-based approach offers superior control and flexibility.

By following this guide, you can ensure your AI agents navigate the web smoothly, overcoming security obstacles with ease. Ready to scale your automation? Sign up for CapSolver today and start bypassing!
