DEV Community

luisgustvo

DrissionPage and CapSolver: Building Stealthy and Efficient Web Automation Tools


1. Introduction: The Automation Challenge

Web automation is crucial for data scraping, automated testing, and various business operations. However, modern websites employ increasingly sophisticated anti-bot measures and CAPTCHAs that can bring even the most robust automation scripts to a halt.

Combining DrissionPage and CapSolver addresses both parts of this challenge:

  • DrissionPage: A Python-based web automation tool that controls Chromium browsers without relying on WebDriver, effectively bypassing common WebDriver detection. It seamlessly integrates browser automation with HTTP request capabilities.
  • CapSolver: An AI-powered CAPTCHA solving service that handles a wide range of complex CAPTCHAs, including Cloudflare Turnstile and reCAPTCHA.

Together, these tools enable a smooth web automation workflow that overcomes both WebDriver detection and CAPTCHA hurdles.

1.1. Integration Objectives

This guide focuses on achieving three core goals:

  1. Avoid WebDriver Detection: Utilize DrissionPage's native browser control to eliminate the tell-tale signs of Selenium/WebDriver.
  2. Automate CAPTCHA Solving: Integrate CapSolver's API to handle CAPTCHA challenges automatically, removing the need for manual intervention.
  3. Maintain Human-Like Behavior: Combine DrissionPage's Action Chains with intelligent CAPTCHA solving for highly realistic automation.

2. Introducing DrissionPage

DrissionPage is a Python web automation tool that merges browser control with HTTP request functionality. Unlike Selenium, it does not go through WebDriver: it controls the browser with its own self-developed kernel, which makes it harder for websites to detect.

2.1. Key Features

  • No WebDriver Required: Natively controls Chromium browsers without the need for chromedriver.
  • Dual Mode Operation: Supports both browser automation (d mode) and HTTP request capabilities (s mode).
  • Simplified Element Location: Intuitive syntax for finding and interacting with page elements.
  • Cross-iframe Navigation: Locate elements nested within iframes without explicit context switching.
  • Multi-tab Support: Operate multiple browser tabs simultaneously.
  • Action Chains: Supports chained mouse and keyboard actions to simulate natural user behavior.
  • Built-in Waits: Automatic retry mechanisms for handling network instability.

2.2. Installation

# Install DrissionPage
pip install DrissionPage

# Install requests for CapSolver API integration
pip install requests

2.3. Basic Usage Example

from DrissionPage import ChromiumPage

# Create browser instance
page = ChromiumPage()

# Navigate to URL
page.get('https://wikipedia.org')

# Find and interact with elements
# ('#search-input' and '#submit-btn' are illustrative IDs; use the IDs from the page you automate)
page('#search-input').input('Hello World')
page('#submit-btn').click()

3. Introducing CapSolver

CapSolver is an AI-powered automatic CAPTCHA solving service that supports a wide array of CAPTCHA types. It provides a straightforward API for submitting CAPTCHA challenges and receiving solutions within seconds.

3.1. Supported CAPTCHA Types

  • Cloudflare Turnstile: A widely deployed modern anti-bot challenge.
  • Cloudflare Challenge
  • reCAPTCHA v2: Both image-based and invisible variants.
  • reCAPTCHA v3: Score-based verification.
  • AWS WAF: Amazon Web Services CAPTCHA.
  • And many more...

3.2. Getting Started with CapSolver

  1. Sign up at capsolver.com.
  2. Add funds to your account.
  3. Retrieve your API key from the dashboard.

3.3. API Endpoints

  • Server A: https://api.capsolver.com
  • Server B: https://api-stable.capsolver.com
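
Having two hosts makes a simple failover possible. The sketch below is one way to use them, not something CapSolver prescribes: post_with_failover and its post parameter are hypothetical names, and the HTTP transport is passed in so the logic can be tried without network access.

```python
from typing import Callable, Sequence

# The two hosts listed above, tried in this order.
CAPSOLVER_ENDPOINTS = (
    "https://api.capsolver.com",
    "https://api-stable.capsolver.com",
)


def post_with_failover(
    path: str,
    payload: dict,
    post: Callable[[str, dict], dict],
    endpoints: Sequence[str] = CAPSOLVER_ENDPOINTS,
) -> dict:
    """Try each endpoint in order and return the first response received.

    `post` is any callable that sends `payload` to a URL and returns the
    decoded JSON body, raising an exception on network failure.
    """
    last_error = None
    for base in endpoints:
        try:
            return post(f"{base}/{path.lstrip('/')}", payload)
        except Exception as exc:  # this host failed; fall through to the next
            last_error = exc
    raise ConnectionError(f"All CapSolver endpoints failed: {last_error}")
```

With requests, the transport could be as small as lambda url, body: requests.post(url, json=body, timeout=30).json().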

4. Integration Methods

4.1. API Integration (Recommended)

The API integration method offers full control over the CAPTCHA solving process and works with all supported CAPTCHA types.

4.1.1. Core Integration Pattern

Here are the core Python functions for creating tasks and polling for results:

import time
import requests
from DrissionPage import ChromiumPage

CAPSOLVER_API_KEY = "YOUR_API_KEY"
CAPSOLVER_API = "https://api.capsolver.com"


def create_task(task_payload: dict) -> str:
    """Create a CAPTCHA solving task and return the task ID."""
    response = requests.post(
        f"{CAPSOLVER_API}/createTask",
        json={
            "clientKey": CAPSOLVER_API_KEY,
            "task": task_payload
        },
        timeout=30,  # fail fast instead of hanging on a stalled connection
    )
    result = response.json()
    if result.get("errorId") != 0:
        raise RuntimeError(f"CapSolver error: {result.get('errorDescription')}")
    return result["taskId"]


def get_task_result(task_id: str, max_attempts: int = 120) -> dict:
    """Poll once per second until the task is solved, fails, or max_attempts is reached."""
    for _ in range(max_attempts):
        response = requests.post(
            f"{CAPSOLVER_API}/getTaskResult",
            json={
                "clientKey": CAPSOLVER_API_KEY,
                "taskId": task_id
            },
            timeout=30,  # per-request timeout in seconds
        )
        result = response.json()

        # API-level errors (for example a bad key or unknown task ID) set errorId
        if result.get("errorId", 0) != 0 or result.get("status") == "failed":
            raise RuntimeError(f"Task failed: {result.get('errorDescription')}")
        if result.get("status") == "ready":
            return result["solution"]

        time.sleep(1)

    raise TimeoutError("CAPTCHA solving timed out")


def solve_captcha(task_payload: dict) -> dict:
    """Complete CAPTCHA solving workflow."""
    task_id = create_task(task_payload)
    return get_task_result(task_id)
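
The task_payload argument passed to these functions is a plain dictionary. As a minimal sketch, the helpers below build the two payloads used in section 5 and read the token back out of a solution. The helper names are hypothetical; the task types and field names are the ones that appear in this guide's own examples.

```python
def turnstile_task(site_key: str, page_url: str) -> dict:
    """Payload for a proxyless Cloudflare Turnstile task."""
    return {
        "type": "AntiTurnstileTaskProxyLess",
        "websiteURL": page_url,
        "websiteKey": site_key,
    }


def recaptcha_v2_task(site_key: str, page_url: str) -> dict:
    """Payload for a proxyless reCAPTCHA v2 task."""
    return {
        "type": "ReCaptchaV2TaskProxyLess",
        "websiteURL": page_url,
        "websiteKey": site_key,
    }


def extract_token(solution: dict) -> str:
    """Return the token from a solution, whichever field carries it."""
    for field in ("token", "gRecaptchaResponse"):
        if solution.get(field):
            return solution[field]
    raise KeyError("No token field found in solution")
```

A call would then read token = extract_token(solve_captcha(turnstile_task(site_key, page_url))).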

4.2. Browser Extension Integration

You can also use the CapSolver browser extension with DrissionPage for a more hands-off, automatic approach.

4.2.1. Installation Steps

  1. Download the CapSolver extension from capsolver.com/en/extension.
  2. Extract the extension files.
  3. Configure your API key: Edit the config.js file within the extension folder to include your API key.
// In the extension folder, edit: assets/config.js
var defined = {
    apiKey: "YOUR_CAPSOLVER_API_KEY",  // Replace with your actual API key
    enabledForBlacklistControl: false,
    blackUrlList: [],
    enabledForRecaptcha: true,
    enabledForRecaptchaV3: true,
    enabledForTurnstile: true,
    // ... other settings
}
  4. Load it into DrissionPage:
from DrissionPage import ChromiumPage, ChromiumOptions

co = ChromiumOptions()
co.add_extension('/path/to/capsolver-extension')

page = ChromiumPage(co)
# The extension will automatically detect and solve CAPTCHAs

Note: The extension must have a valid API key configured to solve CAPTCHAs automatically.

5. Practical Code Examples

5.1. Solving Cloudflare Turnstile

Cloudflare Turnstile is a prevalent CAPTCHA challenge. Here is how to solve it:

import time
import requests
from DrissionPage import ChromiumPage

CAPSOLVER_API_KEY = "YOUR_API_KEY"
CAPSOLVER_API = "https://api.capsolver.com"


def solve_turnstile(site_key: str, page_url: str) -> str:
    """Solve Cloudflare Turnstile and return the token."""
    # Create the task
    response = requests.post(
        f"{CAPSOLVER_API}/createTask",
        json={
            "clientKey": CAPSOLVER_API_KEY,
            "task": {
                "type": "AntiTurnstileTaskProxyLess",
                "websiteURL": page_url,
                "websiteKey": site_key,
            }
        }
    )
    result = response.json()

    if result.get("errorId") != 0:
        raise Exception(f"Error: {result.get('errorDescription')}")

    task_id = result["taskId"]

    # Poll for result, giving up after about two minutes
    for _ in range(120):
        result = requests.post(
            f"{CAPSOLVER_API}/getTaskResult",
            json={
                "clientKey": CAPSOLVER_API_KEY,
                "taskId": task_id
            }
        ).json()

        if result.get("status") == "ready":
            return result["solution"]["token"]
        elif result.get("status") == "failed":
            raise Exception(f"Failed: {result.get('errorDescription')}")

        time.sleep(1)

    raise TimeoutError("Turnstile solving timed out")


def main():
    target_url = "https://your-target-site.com"
    turnstile_site_key = "0x4XXXXXXXXXXXXXXXXX"  # Find this in page source

    # Create browser instance
    page = ChromiumPage()
    page.get(target_url)

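    # Wait for the Turnstile response field to be added to the DOM.
    # It is a hidden input, so wait for presence rather than visibility.
    # The 'css:' prefix marks the locator as a CSS selector.
    page.ele('css:input[name="cf-turnstile-response"]', timeout=10)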

    # Solve the CAPTCHA
    token = solve_turnstile(turnstile_site_key, target_url)
    print(f"Got Turnstile token: {token[:50]}...")

    # Inject the token using JavaScript
    page.run_js(f'''
        document.querySelector('input[name="cf-turnstile-response"]').value = "{token}";

        // Also trigger the callback if present
        const callback = document.querySelector('[data-callback]');
        if (callback) {{
            const callbackName = callback.getAttribute('data-callback');
            if (window[callbackName]) {{
                window[callbackName]('{token}');
            }}
        }}
    ''')

    # Submit the form ('css:' marks the locator as a CSS selector)
    page('css:button[type="submit"]').click()
    page.wait.load_start()

    print("Form submitted with the Turnstile token.")


if __name__ == "__main__":
    main()
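
The example above interpolates the token straight into the JavaScript source. Tokens are normally made of safe characters, but a more defensive variant is to let json.dumps produce the string literal so that quotes or backslashes cannot break the script. The sketch below only builds the script text; build_turnstile_injection_js is a hypothetical helper name, and the result would be passed to page.run_js().

```python
import json


def build_turnstile_injection_js(token: str) -> str:
    """Return JavaScript that fills the Turnstile field and fires its callback."""
    # json.dumps yields a quoted, escaped literal that is also valid JavaScript.
    token_js = json.dumps(token)
    return f"""
        const field = document.querySelector('input[name="cf-turnstile-response"]');
        if (field) {{ field.value = {token_js}; }}

        // Trigger the widget callback if the page declares one
        const holder = document.querySelector('[data-callback]');
        if (holder) {{
            const fn = window[holder.getAttribute('data-callback')];
            if (typeof fn === 'function') {{ fn({token_js}); }}
        }}
    """
```

Checking that the field exists before assigning also avoids a script error on pages where the widget has not rendered yet.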

5.2. Solving reCAPTCHA v2 (Auto-Detect Site Key)

This example demonstrates how to automatically detect the Site Key from the page, eliminating manual configuration:

import time
import requests
from DrissionPage import ChromiumPage, ChromiumOptions

CAPSOLVER_API_KEY = "YOUR_API_KEY"
CAPSOLVER_API = "https://api.capsolver.com"


def solve_recaptcha_v2(site_key: str, page_url: str) -> str:
    """Solve reCAPTCHA v2 and return the token."""
    # Create the task
    response = requests.post(
        f"{CAPSOLVER_API}/createTask",
        json={
            "clientKey": CAPSOLVER_API_KEY,
            "task": {
                "type": "ReCaptchaV2TaskProxyLess",
                "websiteURL": page_url,
                "websiteKey": site_key,
            }
        }
    )
    result = response.json()

    if result.get("errorId") != 0:
        raise Exception(f"Error: {result.get('errorDescription')}")

    task_id = result["taskId"]
    print(f"Task created: {task_id}")

    # Poll for result, giving up after about two minutes
    for _ in range(120):
        result = requests.post(
            f"{CAPSOLVER_API}/getTaskResult",
            json={
                "clientKey": CAPSOLVER_API_KEY,
                "taskId": task_id
            }
        ).json()

        if result.get("status") == "ready":
            return result["solution"]["gRecaptchaResponse"]
        elif result.get("status") == "failed":
            raise Exception(f"Failed: {result.get('errorDescription')}")

        time.sleep(1)

    raise TimeoutError("reCAPTCHA solving timed out")


def main():
    # Just provide the URL - Site Key will be auto-detected
    target_url = "https://www.google.com/recaptcha/api2/demo"

    # Configure browser
    co = ChromiumOptions()
    co.set_argument('--disable-blink-features=AutomationControlled')

    print("Starting browser...")
    page = ChromiumPage(co)

    try:
        page.get(target_url)
        time.sleep(2)

        # Auto-detect Site Key from page
        recaptcha_div = page('.g-recaptcha')
        if not recaptcha_div:
            print("No reCAPTCHA found on page!")
            return

        site_key = recaptcha_div.attr('data-sitekey')
        print(f"Auto-detected site key: {site_key}")

        # Solve the CAPTCHA
        print("Solving reCAPTCHA v2...")
        token = solve_recaptcha_v2(site_key, target_url)
        print(f"Got token: {token[:50]}...")

        # Inject the token
        page.run_js(f'''
            var responseField = document.getElementById('g-recaptcha-response');
            responseField.style.display = 'block';
            responseField.value = '{token}';
        ''')
        print("Token injected!")

        # Submit the form ('css:' marks a locator as a CSS selector)
        submit_btn = page('#recaptcha-demo-submit') or page('css:input[type="submit"]') or page('css:button[type="submit"]')
        if submit_btn:
            submit_btn.click()
            time.sleep(3)
            print("Form submitted!")

        print(f"Current URL: {page.url}")
        print("SUCCESS!")

    finally:
        page.quit()


if __name__ == "__main__":
    main()
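
The same detection can be done on raw HTML, for instance when the page was fetched over plain HTTP rather than rendered in a browser. This is a minimal sketch using only the standard library; find_site_key is a hypothetical helper, and it assumes the widget exposes its key in a data-sitekey attribute, as the .g-recaptcha element in the example above does.

```python
from html.parser import HTMLParser
from typing import Optional


class _SiteKeyParser(HTMLParser):
    """Remember the first data-sitekey attribute seen in the document."""

    def __init__(self) -> None:
        super().__init__()
        self.site_key: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if self.site_key is None:
            value = dict(attrs).get("data-sitekey")
            if value:
                self.site_key = value


def find_site_key(html: str) -> Optional[str]:
    """Return the first data-sitekey value in `html`, or None if absent."""
    parser = _SiteKeyParser()
    parser.feed(html)
    return parser.site_key


# The key below is a made-up placeholder, not a real site key.
sample = '<form><div class="g-recaptcha" data-sitekey="6LeExampleKey"></div></form>'
print(find_site_key(sample))
```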

5.3. Using Action Chains for Human-Like Behavior

DrissionPage's Action Chains provide natural mouse movements and keyboard input, further enhancing anti-detection capabilities:

import time
import random
from DrissionPage import ChromiumPage
from DrissionPage.common import Keys, Actions

def human_delay():
    """Random delay to mimic human behavior."""
    time.sleep(random.uniform(0.5, 1.5))

def main():
    page = ChromiumPage()
    page.get('https://your-target-site.com/form')

    # Use action chains for human-like interactions
    ac = Actions(page)

    # Locator strings without a prefix are matched against element text,
    # so CSS selectors need the 'css:' prefix.
    # Move to the input field, then click
    ac.move_to('css:input[name="email"]').click()
    human_delay()

    # Type slowly like a human
    for char in "user@email.com":
        ac.type(char)
        time.sleep(random.uniform(0.05, 0.15))

    human_delay()

    # Move to password field
    ac.move_to('css:input[name="password"]').click()
    human_delay()

    # Enter the password in one step
    page('css:input[name="password"]').input("mypassword123")

    # After solving CAPTCHA, click submit with natural movement
    ac.move_to('css:button[type="submit"]')
    human_delay()
    ac.click()

if __name__ == "__main__":
    main()
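
A fixed delay range per keystroke is still fairly uniform. One possible refinement, sketched here under the assumption that people pause slightly longer after separators such as "@" or ".", is to plan the delays up front. keystroke_plan and its parameters are hypothetical and not part of DrissionPage.

```python
import random
from typing import Iterator, Optional, Tuple


def keystroke_plan(
    text: str,
    min_delay: float = 0.05,
    max_delay: float = 0.15,
    pause_after: str = " @.",
    pause_extra: float = 0.2,
    rng: Optional[random.Random] = None,
) -> Iterator[Tuple[str, float]]:
    """Yield (character, delay_in_seconds) pairs for typing `text`."""
    rng = rng or random.Random()
    for char in text:
        delay = rng.uniform(min_delay, max_delay)
        if char in pause_after:
            delay += pause_extra  # linger a little after separators
        yield char, delay


# Print the plan for a short string; the delays differ on every run.
for char, delay in keystroke_plan("a@b.c"):
    print(repr(char), round(delay, 3))
```

In the example above, the typing loop would become: for char, delay in keystroke_plan("user@email.com"): ac.type(char); time.sleep(delay).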

6. Best Practices

6.1. Optimized Browser Configuration

Configure DrissionPage to appear more like a regular browser:

from DrissionPage import ChromiumPage, ChromiumOptions

co = ChromiumOptions()
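# Hide the navigator.webdriver automation flag
co.set_argument('--disable-blink-features=AutomationControlled')
# Only needed when running as root or inside a container; it weakens browser isolation
co.set_argument('--no-sandbox')
# Set a complete User-Agent; the Chrome version here is an example and should match the installed browser
co.set_user_agent('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36')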

# Set a common window resolution
co.set_argument('--window-size=1920,1080')

page = ChromiumPage(co)

6.2. Random Delays and Rate Limiting

Avoid triggering rate limits by adding random delays:

import random
import time

def human_delay(min_sec=1.0, max_sec=3.0):
    """Random delay to mimic human behavior."""
    time.sleep(random.uniform(min_sec, max_sec))

# Use between actions (page is an existing ChromiumPage instance)
page('#button1').click()
human_delay()
page('#input1').input('text')
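
Random delays vary the rhythm, but they do not guarantee a floor between requests. A small limiter can enforce one. This is a sketch under the assumption that a minimum gap between actions is what the target site's rate limit cares about; MinIntervalLimiter is a hypothetical class, and the clock and sleep functions are injectable so it can be exercised without real waiting.

```python
import time
from typing import Callable, Optional


class MinIntervalLimiter:
    """Ensure at least `min_interval` seconds pass between successive calls."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None

    def wait(self) -> float:
        """Block until the interval has elapsed; return the seconds slept."""
        slept = 0.0
        if self._last is not None:
            remaining = self.min_interval - (self._clock() - self._last)
            if remaining > 0:
                self._sleep(remaining)
                slept = remaining
        self._last = self._clock()
        return slept
```

Calling limiter.wait() before each page.get() or click keeps the pace bounded, and human_delay() can still be layered on top for variation.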

6.3. Error Handling and Retry Mechanism

Always implement proper error handling and retry logic for CAPTCHA solving:

import time

# Reuses solve_captcha() from section 4.1.1
def solve_with_retry(task_payload: dict, max_retries: int = 3) -> dict:
    """Solve CAPTCHA with retry logic."""
    for attempt in range(max_retries):
        try:
            return solve_captcha(task_payload)
        except TimeoutError:
            if attempt < max_retries - 1:
                print(f"Timeout, retrying... ({attempt + 1}/{max_retries})")
                time.sleep(5)
            else:
                raise
        except Exception as e:
            if "balance" in str(e).lower():
                raise  # Do not retry on balance errors
            if attempt < max_retries - 1:
                time.sleep(2)
            else:
                raise
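
The fixed 2 and 5 second pauses above could be replaced by exponential backoff with jitter, a common retry pattern that spaces attempts further apart each time. The helper below is a sketch with hypothetical names and default values, not something taken from the CapSolver documentation.

```python
import random
from typing import List, Optional


def backoff_delays(
    retries: int,
    base: float = 2.0,
    cap: float = 30.0,
    jitter: float = 0.5,
    rng: Optional[random.Random] = None,
) -> List[float]:
    """Return one delay per retry: base * 2**attempt, capped, plus random jitter."""
    rng = rng or random.Random()
    delays = []
    for attempt in range(retries):
        delay = min(cap, base * (2 ** attempt))
        delays.append(delay + rng.uniform(0, jitter))
    return delays


# With jitter disabled the schedule is deterministic.
print(backoff_delays(5, base=2.0, cap=10.0, jitter=0.0))  # → [2.0, 4.0, 8.0, 10.0, 10.0]
```

Inside a retry loop, each failed attempt would sleep for the next value in the list before trying again.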

7. Conclusion

The integration of DrissionPage and CapSolver creates a powerful toolkit for web automation:

  • DrissionPage handles browser automation while avoiding WebDriver detection signatures.
  • CapSolver manages CAPTCHAs with its AI-powered solving capabilities.
  • Together they enable automation that is much harder to distinguish from a human user.

Whether you are building web scrapers, automated testing systems, or data collection pipelines, this combination provides the reliability and stealth you need.

Bonus: Use code DRISSION when signing up at CapSolver to receive bonus credits!

8. Frequently Asked Questions (FAQ)

8.1. Why choose DrissionPage over Selenium?

DrissionPage does not use WebDriver, which means:

  • No need to download/update chromedriver.
  • Avoids common WebDriver detection signatures.
  • Simpler API with built-in waits.
  • Better performance and resource usage.
  • Native support for cross-iframe element location.

8.2. Which CAPTCHA types work best with this integration?

CapSolver supports all major CAPTCHA types. Cloudflare Turnstile and reCAPTCHA v2/v3 have the highest success rates. The integration works seamlessly with any CAPTCHA that CapSolver supports.

8.3. Can DrissionPage handle Shadow DOM?

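Yes! DrissionPage has built-in support for Shadow DOM: an element's shadow root is available through its shadow_root property, and elements inside it can be located from there. The class that represents the shadow root is named differently across DrissionPage versions.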
