DEV Community

Henry Knight

How I Use Claude + Playwright to Automate CAPTCHA-Heavy Signups (Real Code)

Most browser automation tutorials skip the hard part: what happens when the site fights back.

You write a clean Playwright script. It works locally. You push it to prod and within 10 minutes you're seeing ERR_ACCESS_DENIED, infinite redirects, or a CAPTCHA that defeats every solver you throw at it.

I've spent the last two months building an AI-powered browser agent that signs up for accounts and fills forms on CAPTCHA-heavy sites. Here's the actual architecture — with real code.

The Problem With Traditional Automation

Most CAPTCHA tutorials treat the challenge as a one-time thing: detect it, solve it, continue. But modern bot protection (PerimeterX, DataDome, Cloudflare) is dynamic. The CAPTCHA is often just the surface layer. The real fingerprinting happens before you ever see a challenge:

  • JavaScript canvas fingerprinting
  • TLS fingerprint mismatch
  • CDP Runtime.enable detection
  • Mouse movement pattern analysis
  • Request timing signatures

You can solve the CAPTCHA and still get blocked because your automation fingerprint is already flagged.

The Architecture: Claude Decides, Playwright Executes

The insight that changed everything: treat Claude as the reasoning layer, not the execution layer.

Instead of hardcoding "if CAPTCHA detected, call 2captcha", I give Claude a page snapshot and let it decide what to do next. This means the agent adapts to new blocking patterns without code changes.

Here's the core loop:

import asyncio
import json

import anthropic
from playwright.async_api import async_playwright

# Async client, so the API call does not block the event loop
client = anthropic.AsyncAnthropic()

async def agent_step(page, task: str, history: list) -> dict:
    """Let Claude decide the next browser action."""
    # Structured DOM summary: URL, title, visible text, first 20 form controls
    snapshot = await page.evaluate("""() => ({
        url: window.location.href,
        title: document.title,
        bodyText: (document.body?.innerText || '').slice(0, 3000),
        inputs: Array.from(document.querySelectorAll('input,button,select')).map(el => ({
            type: el.type,
            name: el.name,
            id: el.id,
            placeholder: el.placeholder,
            visible: el.offsetParent !== null
        })).slice(0, 20)
    })""")

    messages = history + [{
        "role": "user",
        "content": (
            f"Task: {task}\n\n"
            # Serialize as JSON rather than relying on the Python dict repr
            f"Current page state:\n{json.dumps(snapshot, indent=2)}\n\n"
            "What is the next single action? "
            "Reply with JSON: {action, selector, value, reasoning}"
        ),
    }]

    response = await client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=500,
        messages=messages,
    )

    # parse_action is defined separately; it turns the JSON reply into a dict
    return parse_action(response.content[0].text)

The key is the page snapshot — instead of screenshots (slow, expensive), I extract a structured DOM summary. Claude can reason about it in under a second.
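
The loop above calls parse_action but never shows it, and it stops at "Claude decides" without showing "Playwright executes". Here is a minimal sketch of both halves. The action vocabulary (click, fill, goto, wait, done), the fallback-to-wait behaviour, and the execute_action helper are my assumptions for illustration, not part of the original agent:

```python
import asyncio
import json
import re

# Assumed action vocabulary; the original post does not list one.
ALLOWED_ACTIONS = {"click", "fill", "goto", "wait", "done"}

def _fallback(reason: str) -> dict:
    """Harmless default used whenever the reply cannot be trusted."""
    return {"action": "wait", "selector": None, "value": None, "reasoning": reason}

def parse_action(reply: str) -> dict:
    """Pull the JSON object out of the model reply and validate the action name."""
    match = re.search(r"\{.*\}", reply, re.DOTALL)
    if not match:
        return _fallback("no JSON object in reply")
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError as exc:
        return _fallback(f"invalid JSON: {exc}")
    if data.get("action") not in ALLOWED_ACTIONS:
        return _fallback(f"unknown action: {data.get('action')!r}")
    return {
        "action": data["action"],
        "selector": data.get("selector"),
        "value": data.get("value"),
        "reasoning": data.get("reasoning", ""),
    }

async def execute_action(page, action: dict) -> bool:
    """Run one parsed action on a Playwright page. Returns False once the task is done."""
    kind = action["action"]
    if kind == "click":
        await page.click(action["selector"])
    elif kind == "fill":
        await page.fill(action["selector"], action["value"] or "")
    elif kind == "goto":
        await page.goto(action["value"])
    elif kind == "wait":
        await asyncio.sleep(1)
    return kind != "done"
```

Validating against a fixed set of actions means a malformed or unexpected reply degrades to a short wait instead of raising mid-run or driving the browser somewhere unintended.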

Patching the Browser Fingerprint

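PerimeterX and DataDome fingerprint your browser before the page finishes loading. Standard Playwright gets flagged because navigator.webdriver is true and Chrome-specific globals are missing. The init script below runs in every new document, before the page's own scripts. Note that it's written against Playwright's Node.js API (page.addInitScript); from the Python loop above, the equivalent call is page.add_init_script: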

// stealth-patches.js — inject via addInitScript
async function patchBrowser(page) {
    await page.addInitScript(() => {
        // Remove the webdriver flag
        Object.defineProperty(navigator, 'webdriver', {
            get: () => undefined
        });

        // Restore Chrome-specific properties PerimeterX checks for
        window.chrome = {
            runtime: {},
            loadTimes: () => {},
            csi: () => {},
            app: {}
        };

        // Fake a realistic plugin list
        Object.defineProperty(navigator, 'plugins', {
            get: () => [
                { name: 'Chrome PDF Plugin', filename: 'internal-pdf-viewer' },
                { name: 'Chrome PDF Viewer', filename: 'mhjfbmdgcfjbbpaeojofohoefgiehjai' },
                { name: 'Native Client', filename: 'internal-nacl-plugin' }
            ]
        });

        // Lock language to en-US to avoid locale fingerprinting
        Object.defineProperty(navigator, 'languages', {
            get: () => ['en-US', 'en']
        });
    });
}

This handles initial detection. Mouse movement analysis requires ghost-cursor or similar — random straight-line moves are an instant flag.

The CAPTCHA Decision Tree

When a challenge is detected, the agent runs strategies in priority order and logs every outcome to SQLite:

async def handle_captcha(page, captcha_type: str) -> bool:
    strategies = {
        'recaptcha_v2': [solve_2captcha, wait_and_retry, request_manual],
        'recaptcha_v3': [adjust_behavior_score, change_timing, request_manual],
        'hcaptcha':     [solve_2captcha, solve_anticaptcha, request_manual],
        'perimeterx':   [rotate_fingerprint, use_residential_proxy, request_manual],
        'cloudflare':   [wait_5min_retry, rotate_proxy, request_manual],
    }

    for strategy in strategies.get(captcha_type, [request_manual]):
        result = await strategy(page)
        if result.success:
            log_strategy_win(captcha_type, strategy.__name__)
            return True
        log_strategy_fail(captcha_type, strategy.__name__, result.error)

    return False

The log_strategy_win / log_strategy_fail calls write to a browser_memory table. Next time the agent runs on the same domain, it reads this history and skips strategies that are known to fail there (that filtering step isn't shown in handle_captcha above). The agent carries what it learned across sessions.
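
To make the "learns across sessions" part concrete, here is a minimal sketch of what the memory layer could look like. The column layout, the function names (open_memory, record_outcome, usable_strategies), and the three-failures threshold are all hypothetical; the post only tells us the table is called browser_memory and lives in SQLite:

```python
import sqlite3

# Hypothetical schema: one row per strategy attempt.
SCHEMA = """
CREATE TABLE IF NOT EXISTS browser_memory (
    domain       TEXT NOT NULL,
    captcha_type TEXT NOT NULL,
    strategy     TEXT NOT NULL,
    success      INTEGER NOT NULL,
    error        TEXT,
    created_at   TEXT DEFAULT CURRENT_TIMESTAMP
)
"""

def open_memory(path: str = "browser_memory.db") -> sqlite3.Connection:
    """Open (or create) the memory database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn

def record_outcome(conn, domain, captcha_type, strategy, success, error=None):
    """Append one attempt; plays the role of log_strategy_win / log_strategy_fail."""
    conn.execute(
        "INSERT INTO browser_memory (domain, captcha_type, strategy, success, error) "
        "VALUES (?, ?, ?, ?, ?)",
        (domain, captcha_type, strategy, int(success), error),
    )
    conn.commit()

def usable_strategies(conn, domain, captcha_type, names, max_failures=3):
    """Keep the given order, dropping strategies that never worked on this domain
    after at least max_failures attempts."""
    rows = conn.execute(
        "SELECT strategy, SUM(success), COUNT(*) FROM browser_memory "
        "WHERE domain = ? AND captcha_type = ? GROUP BY strategy",
        (domain, captcha_type),
    ).fetchall()
    stats = {name: (wins, total) for name, wins, total in rows}
    kept = []
    for name in names:
        wins, total = stats.get(name, (0, 0))
        if wins == 0 and total >= max_failures:
            continue  # known-failing here: skip it
        kept.append(name)
    return kept
```

Keying on domain as well as captcha_type matters because the same challenge type can behave very differently from one site to the next. A strategy with no history is always kept, so new domains start with the full priority list.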

Here's the 2captcha call for reCAPTCHA v2:

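import asyncio
import requests

# API_KEY and StrategyResult are defined elsewhere in the project
async def solve_2captcha(page) -> StrategyResult:
    # Read the reCAPTCHA sitekey from the widget container
    site_key = await page.evaluate("""
        () => document.querySelector('[data-sitekey]')?.dataset.sitekey
    """)
    if not site_key:
        return StrategyResult(success=False, error="no sitekey found")

    # requests is blocking, so run it in a worker thread
    resp = await asyncio.to_thread(
        requests.post, 'https://2captcha.com/in.php',
        data={
            'key': API_KEY,
            'method': 'userrecaptcha',
            'googlekey': site_key,
            'pageurl': page.url,
        },
        timeout=30,
    )
    # Success looks like "OK|<task id>"; anything else is an error code
    if not resp.text.startswith('OK|'):
        return StrategyResult(success=False, error=resp.text)
    task_id = resp.text.split('|', 1)[1]

    for _ in range(20):
        await asyncio.sleep(3)
        res = await asyncio.to_thread(
            requests.get, 'https://2captcha.com/res.php',
            params={'key': API_KEY, 'action': 'get', 'id': task_id},
            timeout=30,
        )
        if res.text.startswith('OK|'):
            token = res.text.split('|', 1)[1]
            # Pass the token as an argument instead of interpolating it into JS.
            # The callback path is minified and varies by site and release.
            await page.evaluate("""(token) => {
                document.querySelector('#g-recaptcha-response').value = token;
                ___grecaptcha_cfg.clients[0].aa.l.callback(token);
            }""", token)
            return StrategyResult(success=True)
        if res.text != 'CAPCHA_NOT_READY':
            # Any other reply is a hard error, so stop polling
            return StrategyResult(success=False, error=res.text)

    return StrategyResult(success=False, error="2captcha timeout")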
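
Every strategy above returns a StrategyResult, which the post uses but never defines. A minimal sketch, assuming it's a plain dataclass with just the two fields the code reads (success and error). The stub strategies and run_in_order are hypothetical helpers for dry-running the priority loop without a browser:

```python
import asyncio
from dataclasses import dataclass
from typing import Optional

@dataclass
class StrategyResult:
    success: bool
    error: Optional[str] = None

# Stub strategies: they accept a page argument but never touch a browser.
async def always_fails(page) -> StrategyResult:
    return StrategyResult(success=False, error="stub failure")

async def always_succeeds(page) -> StrategyResult:
    return StrategyResult(success=True)

async def run_in_order(page, strategies) -> list:
    """Same priority loop as handle_captcha, but returns what was tried."""
    tried = []
    for strategy in strategies:
        result = await strategy(page)
        tried.append((strategy.__name__, result.success))
        if result.success:
            break  # stop at the first strategy that works
    return tried
```

Running the loop against stubs like these is a cheap way to check ordering and logging before spending solver credits or proxy bandwidth on a real site.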

Results After ~40 Attempts

  • PerimeterX sites: 70% bypass rate (30% need residential proxy)
  • hCaptcha: 85% automated solve rate via 2captcha
  • Cloudflare Bot Management: 60% (IP-dependent)
  • DataDome: 40% — still actively debugging

The single biggest unlock: a residential proxy. In these runs, IP reputation alone seemed to account for roughly half of all CAPTCHA triggers. With a clean IP, most challenges never load in the first place.

What I Packaged Up

I packaged this into a reusable kit — stealth browser config, CAPTCHA decision tree, browser_memory SQLite schema, proxy rotation, session persistence, and the full Claude agent loop pre-wired together.

If you're building automation agents and want to skip two months of debugging PerimeterX, check out the Claude Browser Agent Starter Kit. The code above is the actual foundation — the kit just handles the plumbing so you can focus on your specific task.

Questions on the architecture or a specific CAPTCHA type you're stuck on? Drop them below.
