DEV Community

Markus
Automatically Solving and Bypassing reCAPTCHA in Puppeteer

Puppeteer is a Node.js library for browser automation. It can run the browser with or without a visible window and drive pages much as a regular user would. This article shows how to solve reCAPTCHA in Puppeteer by loading the 2captcha solver extension, with the puppeteer-extra-plugin-stealth plugin hiding automation traces, so that CAPTCHAs no longer interrupt your automation tasks.

Note: This guide is based on the manual "How to solve reCAPTCHA in Puppeteer using extension".
Prerequisites:
To get started, you'll need:

An API key for the 2captcha.com CAPTCHA-solving service.
Puppeteer and puppeteer-extra.
puppeteer-extra-plugin-stealth for hiding automation traces.
Installation:
Begin by installing the required packages. Open your console and enter:

npm i puppeteer puppeteer-extra puppeteer-extra-plugin-stealth

Setting Up the Extension:
Download the extension from the provided link and extract it to the ./2captcha-solver directory in your project root. The extension has various settings, such as automatic solving for specific CAPTCHA types and proxy support. Edit them in the extension's ./common/config.js file (that is, ./2captcha-solver/common/config.js). To solve reCAPTCHA V2, set autoSolveRecaptchaV2 to true and enter your 2captcha.com API key as a quoted string; leaving out the quotes causes errors.
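
As a rough illustration, the two settings described above might look like the fragment below. The name autoSolveRecaptchaV2 comes from the text; apiKey as the field name, the exampleSettings variable, and the placeholder value are assumptions made for this sketch, and the real config.js keeps these entries inside a larger settings object that is not shown here.

```javascript
// Illustrative fragment only: in the real config.js these entries sit inside
// a larger settings object alongside many other options.
// `apiKey` as the field name is an assumption; check your copy of the extension.
const exampleSettings = {
  // Keep the key in quotes so that it is read as a string.
  apiKey: "YOUR_2CAPTCHA_API_KEY", // placeholder, replace with your own key
  // Enable automatic solving for reCAPTCHA V2, as described in the text.
  autoSolveRecaptchaV2: true,
};
```

Only these two values need to change for the reCAPTCHA V2 example in this guide; the other settings can stay as shipped.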

Browser Automation:
Launch Puppeteer (here with headless: false) with the stealth plugin and the extension path:

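const path = require('path');
const puppeteer = require('puppeteer-extra');
const StealthPlugin = require('puppeteer-extra-plugin-stealth');
const { executablePath } = require('puppeteer');

// Register the stealth plugin before launching so it can hide automation traces.
puppeteer.use(StealthPlugin());

(async () => {
  // Folder with the unpacked 2captcha solver extension.
  const pathToExtension = path.join(__dirname, '2captcha-solver');
  const browser = await puppeteer.launch({
    headless: false,
    args: [
      // Load only the solver extension.
      `--disable-extensions-except=${pathToExtension}`,
      `--load-extension=${pathToExtension}`,
    ],
    // Use the browser that the puppeteer package downloaded.
    executablePath: executablePath(),
  });
  // Reuse the tab the browser opened at startup.
  const [page] = await browser.pages();
  // The snippets in the following sections go here, inside this async function.
})();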
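
If the extension folder is missing or was extracted one level too deep, the browser starts without the solver and the later steps time out. The sketch below is one way to catch that before launch; extensionArgs is a hypothetical helper name, not part of Puppeteer or the extension, and it relies only on the fact that an unpacked Chrome extension has a manifest.json at its root.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: returns the two Chrome flags used in the launch call,
// or throws if the folder does not look like an unpacked extension.
function extensionArgs(extensionDir) {
  const manifest = path.join(extensionDir, 'manifest.json');
  if (!fs.existsSync(manifest)) {
    throw new Error(
      `No manifest.json found in ${extensionDir}; was the extension extracted there?`
    );
  }
  return [
    `--disable-extensions-except=${extensionDir}`,
    `--load-extension=${extensionDir}`,
  ];
}
```

It could replace the inline array in the launch options, for example args: extensionArgs(path.join(__dirname, '2captcha-solver')).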

Solving the CAPTCHA:
Navigate to the target page with page.goto(), then wait for the CAPTCHA solver button (CSS selector .captcha-solver) to appear. Click the button to send the CAPTCHA for solving:

await page.goto('https://2captcha.com/demo/recaptcha-v2');
await page.waitForSelector('.captcha-solver');
await page.click('.captcha-solver');

Monitoring CAPTCHA Resolution:
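Track the solving process through the data-state attribute of the solver button. Wait until the attribute changes to "solved", which indicates that the CAPTCHA was solved. The 180000 ms (3 minute) timeout replaces Puppeteer's 30-second default, since solving can take longer than that: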

await page.waitForSelector(`.captcha-solver[data-state="solved"]`, { timeout: 180000 });
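
When that wait times out, Puppeteer's error does not say what state the button was in. The sketch below wraps the same wait and adds the button's last data-state value to the error message. waitForSolved is a hypothetical helper name, and the code makes no assumption about which state values exist other than "solved".

```javascript
// Hypothetical helper: waits for the solver button to report "solved" and,
// if the wait fails, includes the button's last data-state in the error.
async function waitForSolved(page, timeoutMs = 180000) {
  const button = '.captcha-solver';
  try {
    await page.waitForSelector(`${button}[data-state="solved"]`, { timeout: timeoutMs });
  } catch (err) {
    // Read the current state for diagnostics; fall back if the button is gone.
    const state = await page
      .$eval(button, (el) => el.getAttribute('data-state'))
      .catch(() => 'unavailable');
    throw new Error(
      `CAPTCHA not solved within ${timeoutMs} ms (last data-state: ${state})`,
      { cause: err }
    );
  }
}
```

Inside the async function from the launch snippet, await waitForSolved(page); would take the place of the plain waitForSelector call.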

Final Steps and Verification:
After solving the CAPTCHA, perform the necessary page actions. In this example, click the "Check" button to verify the CAPTCHA solution:

await page.click("button[type='submit']");
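
The individual steps use await, so they have to run inside an async function with access to page. As a minimal sketch, they can be collected into one function that takes the page from the launch snippet; solveDemoCaptcha is a hypothetical name chosen here, while the URL, selectors, and timeout are the ones used in this guide.

```javascript
// Hypothetical helper that runs the steps from this guide in order.
// `page` is the Puppeteer page obtained from browser.pages().
async function solveDemoCaptcha(page) {
  await page.goto('https://2captcha.com/demo/recaptcha-v2');

  // Wait for the solver button, then send the CAPTCHA for solving.
  await page.waitForSelector('.captcha-solver');
  await page.click('.captcha-solver');

  // Wait up to 3 minutes for the button to be marked as solved.
  await page.waitForSelector('.captcha-solver[data-state="solved"]', { timeout: 180000 });

  // Submit the form to verify the solution.
  await page.click("button[type='submit']");
}
```

In the launch snippet, a call to await solveDemoCaptcha(page); would go right after the line that obtains page from browser.pages().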

This guide provides a straightforward method to automate CAPTCHA solving in Puppeteer using puppeteer-extra-plugin-stealth and the 2captcha.com service.

Ready-to-Use File Download:
For convenience, a ready-to-use project with all the necessary configuration is available at the link below. After downloading and unzipping it, add the solver extension folder (described earlier) to it so that all components are in place and the setup is ready to run.

https://github.com/2captcha/2captcha-solver-in-puppeteer

Conclusion: This guide demonstrates how to effectively automate reCAPTCHA solving in Puppeteer, providing a significant advantage in web scraping and automated testing scenarios. It’s important to use these techniques responsibly and ethically.
