<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: luisgustvo</title>
    <description>The latest articles on DEV Community by luisgustvo (@luisgustvo).</description>
    <link>https://dev.to/luisgustvo</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1894995%2F6ea13137-c292-4572-8f1e-b5a30594b3e5.jpeg</url>
      <title>DEV Community: luisgustvo</title>
      <link>https://dev.to/luisgustvo</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/luisgustvo"/>
    <language>en</language>
    <item>
      <title>How Agentic Browsers Bypass CAPTCHAs: AI CAPTCHA Solving Infrastructure</title>
      <dc:creator>luisgustvo</dc:creator>
      <pubDate>Tue, 26 May 2026 09:56:22 +0000</pubDate>
      <link>https://dev.to/luisgustvo/how-agentic-browsers-bypass-captchas-ai-captcha-solving-infrastructure-2imm</link>
      <guid>https://dev.to/luisgustvo/how-agentic-browsers-bypass-captchas-ai-captcha-solving-infrastructure-2imm</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo4qb554ool32g5uholxm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo4qb554ool32g5uholxm.png" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In our &lt;a href="https://www.capsolver.com/blog/ai/agentic-browser" rel="noopener noreferrer"&gt;preceding discussion&lt;/a&gt;, we explored the evolution of the Agentic Browser from a passive "display interface" to an active "operational entity" and examined its fundamental architecture: intent comprehension, environmental perception, and action execution. On the real-world web, however, these agents soon run into a persistent gatekeeper: the CAPTCHA. This article focuses on the "unseen mechanism," the CAPTCHA resolution infrastructure, that lets agents keep working autonomously and without interruption. We will look at why CAPTCHAs are a primary obstacle for AI and how specialized services, such as &lt;a href="https://www.capsolver.com/?utm_source=official&amp;amp;utm_medium=blog&amp;amp;utm_campaign=agentic-browser-capsolver" rel="noopener noreferrer"&gt;CapSolver&lt;/a&gt;, provide the infrastructure that the next generation of web automation requires.&lt;/p&gt;

&lt;h2&gt;
  
  
  Chapter 1: The "Unseen Mechanism" — CAPTCHA Resolution Infrastructure
&lt;/h2&gt;

&lt;p&gt;Consider this scenario: you task an Agentic Browser with securing tickets for a highly anticipated concert. It proficiently accesses the website, identifies the purchase button, and just as it prepares to click "Buy Now," a sliding puzzle or a grid of indistinct traffic-light images abruptly appears. Your digital assistant is instantly immobilized. CAPTCHA, a "Turing Test" conceived in the nascent stages of the Internet, has now emerged as the most direct—and most challenging—adversary for AI agents.&lt;/p&gt;

&lt;h3&gt;
  
  
  1.1 Why CAPTCHA Poses the Foremost Challenge for AI Agents
&lt;/h3&gt;

&lt;p&gt;CAPTCHA, an acronym for "Completely Automated Public Turing Test to Tell Computers and Humans Apart," was originally designed with a straightforward objective: to deter bots while permitting human access. Yet, as AI capabilities have advanced, CAPTCHAs have continuously evolved in response—from basic distorted characters to intricate sliders, image-selection tasks, and sophisticated behavioral analysis systems. They are no longer merely a problem of character recognition.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4e6ptufg62fn4jtz13su.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4e6ptufg62fn4jtz13su.png" alt="Figure 1-1 Contemporary Mainstream CAPTCHA Types and Their Complexity Levels" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For conventional automation scripts, CAPTCHAs often signify an insurmountable barrier. For Agentic Browsers, they present an equally severe challenge due to three principal factors:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;A significant escalation in perception difficulty:&lt;/strong&gt; Even the most advanced multimodal models struggle to reliably identify heavily distorted text, obscure image objects, or subtle slider gaps embedded within complex backgrounds. AI can easily misinterpret visual cues, and a single error can disrupt the entire workflow.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Layered anti-bot defense mechanisms:&lt;/strong&gt; Modern CAPTCHAs extend beyond simple front-end challenges. Websites actively monitor mouse trajectories, typing rhythms, page dwell time, and even browser fingerprints. If the system detects behavior inconsistent with human interaction, the CAPTCHA difficulty can escalate immediately, for example from a simple checkbox verification to a series of consecutive image-recognition tasks.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Time sensitivity and contextual disruption:&lt;/strong&gt; CAPTCHAs typically come with strict expiration limits. If an Agentic Browser becomes stalled on a CAPTCHA for an extended period during a multi-step operation, login sessions may expire, products might sell out, and the entire task chain can collapse. This is akin to a sudden bridge collapse on a highway, bringing the entire automation pipeline to a standstill.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In essence, without the capacity to overcome CAPTCHAs, Agentic Browsers are confined to navigating the "unprotected byways" of the web, rather than fully traversing the comprehensive network of real-world websites. This fundamental need is precisely why CAPTCHA-solving infrastructures, such as CapSolver, are indispensable.&lt;/p&gt;

&lt;h3&gt;
  
  
  1.2 How CapSolver Facilitates AI Agent Operations
&lt;/h3&gt;

&lt;p&gt;CapSolver is not a tool intended for general users; rather, it functions as a specialized "CAPTCHA engine" deeply embedded within developers’ toolkits. Fundamentally, it is an intelligent CAPTCHA-solving platform that offers API interfaces specifically engineered to assist automation programs and AI agents in managing diverse CAPTCHA types.&lt;/p&gt;

&lt;p&gt;We can conceptualize it as a perpetually available CAPTCHA-solving team that operates tirelessly and with exceptional speed—its "team members" comprising not only sophisticated AI models but also highly optimized strategic algorithms.&lt;/p&gt;

&lt;p&gt;To better comprehend its capabilities, the following comparison highlights the distinctions between traditional approaches and CapSolver when confronted with identical CAPTCHA challenges:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Comparison Dimension&lt;/th&gt;
&lt;th&gt;Local OCR / Simple Models&lt;/th&gt;
&lt;th&gt;Human CAPTCHA-Solving Platforms&lt;/th&gt;
&lt;th&gt;CapSolver&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Supported CAPTCHA Types&lt;/td&gt;
&lt;td&gt;Limited to simple text CAPTCHAs; largely ineffective for image selection&lt;/td&gt;
&lt;td&gt;Theoretically supports all types, but characterized by slowness and high cost&lt;/td&gt;
&lt;td&gt;Encompasses mainstream &lt;a href="https://www.capsolver.com/blog/The-other-captcha/what-are-captchas" rel="noopener noreferrer"&gt;CAPTCHA types&lt;/a&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Recognition Speed&lt;/td&gt;
&lt;td&gt;Milliseconds, but with low success rates&lt;/td&gt;
&lt;td&gt;5–15 seconds per attempt&lt;/td&gt;
&lt;td&gt;1–3 seconds per attempt&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Success Rate&lt;/td&gt;
&lt;td&gt;Low (diminishes with complex CAPTCHAs)&lt;/td&gt;
&lt;td&gt;Relatively high, yet susceptible to worker fatigue and network latency&lt;/td&gt;
&lt;td&gt;Consistently high and stable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cost Structure&lt;/td&gt;
&lt;td&gt;One-time development expenditure&lt;/td&gt;
&lt;td&gt;Pay-per-task with substantial labor costs&lt;/td&gt;
&lt;td&gt;Pay-per-task with competitive pricing and low marginal costs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Anti-Detection Capability&lt;/td&gt;
&lt;td&gt;Virtually nonexistent&lt;/td&gt;
&lt;td&gt;Incapable of handling behavioral analysis systems&lt;/td&gt;
&lt;td&gt;&lt;a href="https://www.capsolver.com/blog/web-scraping/integrate-ai-scraping-workflow" rel="noopener noreferrer"&gt;Integrates with browser environments to provide risk-compliant tokens or instructions&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;em&gt;Table 1-1: Comparison of Traditional CAPTCHA-Solving Methods and CapSolver Capabilities&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The core operational principle of CapSolver is essentially "AI versus AI, strategy versus strategy." For distinct CAPTCHA categories, it employs specialized resolution pipelines:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.capsolver.com/blog/The-other-captcha/how-to-solve-captcha-images-quickly" rel="noopener noreferrer"&gt;Image and text recognition CAPTCHAs:&lt;/a&gt;&lt;/strong&gt; Utilizing proprietary vision models combined with extensive training datasets, CapSolver can accurately decipher heavily distorted, overlapping, or noisy text.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Slider and puzzle CAPTCHAs:&lt;/strong&gt; Instead of merely outputting gap coordinates, it generates fluid movement trajectories based on environmental analysis, simulating the subtle hand tremors, acceleration, and deceleration patterns characteristic of human touch interactions. These behavioral parameters enable automation programs to drag sliders naturally through the verification process.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Token-based verification systems (&lt;a href="https://www.capsolver.com/blog/reCAPTCHA/recaptcha-v2-vs-recaptcha-v3" rel="noopener noreferrer"&gt;reCAPTCHA v2/v3&lt;/a&gt;, Cloudflare, etc.):&lt;/strong&gt; These systems often require little or no explicit user input. Instead, they evaluate browser behavior in the background and, when satisfied, issue a one-time token. CapSolver combines browser fingerprints, IP reputation, mouse trajectories, and other contextual data to obtain valid verification tokens via dedicated solving interfaces. The Agentic Browser then injects the token into the webpage to complete verification.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So, how do CapSolver and Agentic Browsers collaborate in practice? The following diagram illustrates the complete process:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnseytlv2eapbfrazpwhe.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnseytlv2eapbfrazpwhe.png" alt="Figure 1-2 CapSolver Architecture Diagram" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The workflow runs end to end: the browser dispatches a request to a website, encounters a CAPTCHA, captures the challenge, invokes the CapSolver API, receives a token or behavioral trajectory, submits the verification, and resumes the original task. The whole sequence typically completes within a few seconds.&lt;/p&gt;

&lt;p&gt;This implies that for Agentic Browsers, CAPTCHAs are no longer problems that AI itself must "discern" and "deduce." Instead, they become standardized tasks outsourced to specialized infrastructure providers. The browser merely needs to capture the challenge, package the context, transmit it, await the "solution," and continue its journey.&lt;/p&gt;
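&lt;p&gt;The "transmit it, await the solution" step can be sketched as a submit-and-poll loop with a time budget. This is a minimal illustration, not CapSolver's actual API: the endpoint names &lt;code&gt;create_task&lt;/code&gt; and &lt;code&gt;get_result&lt;/code&gt;, the response fields, and the &lt;code&gt;fake_send&lt;/code&gt; transport are all hypothetical stand-ins, so consult the provider's documentation for the real interface.&lt;/p&gt;

```python
import time
from typing import Any, Callable, Dict

# Hypothetical transport: takes an endpoint name and a JSON-like payload and
# returns a JSON-like response. A real agent would back this with an HTTP client.
Transport = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def solve_challenge(send: Transport, task: Dict[str, Any],
                    timeout_s: float = 30.0, poll_s: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Submit a challenge, then poll until a solution arrives or time runs out."""
    created = send("create_task", {"task": task})
    task_id = created["task_id"]
    deadline = time.monotonic() + timeout_s
    while True:
        result = send("get_result", {"task_id": task_id})
        status = result.get("status")
        if status == "ready":
            return result["solution"]
        if status == "failed":
            raise RuntimeError(f"solver reported failure for task {task_id}")
        # Give up before sleeping past the deadline, so sessions do not expire silently.
        if time.monotonic() + poll_s > deadline:
            raise TimeoutError(f"no solution for task {task_id} within {timeout_s}s")
        sleep(poll_s)


def fake_send(endpoint: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in solver for demonstration: answers 'ready' on the second poll."""
    if endpoint == "create_task":
        fake_send.polls = 0
        return {"task_id": "demo-1"}
    fake_send.polls += 1
    if fake_send.polls == 1:
        return {"status": "processing"}
    return {"status": "ready", "solution": {"token": "demo-token"}}
```

&lt;p&gt;Passing the transport and the sleep function as parameters keeps the loop testable without network access, and the explicit deadline reflects the time sensitivity described in section 1.1.&lt;/p&gt;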

&lt;h3&gt;
  
  
  1.3 The Collaborative Workflow Between Agentic Browsers and CapSolver
&lt;/h3&gt;

&lt;p&gt;Let us now connect the dynamic adaptation module of an Agentic Browser with CapSolver and examine their seamless collaboration in overcoming obstacles.&lt;/p&gt;

&lt;p&gt;While the Agentic Browser is executing tasks, its environmental perception layer continuously monitors the webpage. Upon detecting a CAPTCHA element (for instance, a popup containing a reCAPTCHA iframe), the agent immediately pauses action execution and starts a dedicated CAPTCHA-handling sub-process.&lt;/p&gt;

&lt;p&gt;This process is highly sophisticated and generally involves the following steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Context Collection:&lt;/strong&gt; The Agentic Browser captures screenshots of the CAPTCHA region and gathers pertinent contextual information, such as the current URL, sitekey, browser viewport dimensions, and User-Agent.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Task Submission:&lt;/strong&gt; The screenshots and parameters are bundled and transmitted to CapSolver via API, specifying the CAPTCHA type.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Background Resolution:&lt;/strong&gt; Upon receiving the task, CapSolver routes it through the appropriate solving pipeline. For example, when encountering reCAPTCHA v2, it activates a specialized solver to return a valid &lt;code&gt;g-recaptcha-response&lt;/code&gt; token. The entire resolution process typically completes within 1–2 seconds.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Instruction Return:&lt;/strong&gt; The Agentic Browser receives the generated result—which may be a token string or a set of mouse trajectory coordinates.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;On-Site Execution:&lt;/strong&gt; The Agentic Browser inserts the token into hidden form fields and submits the form, or simulates human-like slider movement according to the returned trajectory data. The CAPTCHA layer then vanishes, and the original task flow resumes seamlessly.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;State Verification:&lt;/strong&gt; The browser confirms whether the page has successfully passed validation and whether the target elements have reappeared before proceeding with the interrupted workflow.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
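&lt;p&gt;The six steps above can be sketched as a single handler. This is an illustrative outline under stated assumptions: &lt;code&gt;ChallengeContext&lt;/code&gt;, &lt;code&gt;Solution&lt;/code&gt;, &lt;code&gt;FakePage&lt;/code&gt;, and &lt;code&gt;handle_captcha&lt;/code&gt; are hypothetical names, the solver is an injected function rather than a real service call, and only the &lt;code&gt;g-recaptcha-response&lt;/code&gt; field name comes from the text.&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class ChallengeContext:
    """Step 1: context gathered from the page (field names are illustrative)."""
    captcha_type: str
    page_url: str
    site_key: Optional[str]
    user_agent: str


@dataclass
class Solution:
    """Step 4: a token string or a list of trajectory points."""
    token: Optional[str] = None
    trajectory: Optional[List[tuple]] = None


class FakePage:
    """Tiny stand-in for a browser page so the flow runs without a browser."""

    def __init__(self) -> None:
        self.fields: Dict[str, str] = {}
        self.captcha_visible = True
        self.drags = 0

    def inject_token(self, token: str) -> None:
        self.fields["g-recaptcha-response"] = token
        self.captcha_visible = False

    def drag(self, trajectory: List[tuple]) -> None:
        self.drags += 1
        self.captcha_visible = False


def handle_captcha(page: FakePage, ctx: ChallengeContext,
                   solver: Callable[[ChallengeContext], Solution]) -> bool:
    """Run steps 2-6 and report whether the page passed verification."""
    solution = solver(ctx)                  # steps 2-4: submit, resolve, return
    if solution.token is not None:          # step 5: token path
        page.inject_token(solution.token)
    elif solution.trajectory:               # step 5: trajectory path
        page.drag(solution.trajectory)
    else:
        return False                        # nothing usable came back
    return not page.captcha_visible         # step 6: state verification
```

&lt;p&gt;Returning a boolean from the final check lets the calling workflow decide whether to resume the interrupted task or retry.&lt;/p&gt;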

&lt;p&gt;Modern CAPTCHAs come in many forms with varying degrees of complexity, so each type is routed to its own solving pipeline. The following diagram illustrates this multi-pipeline solving engine:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4mo7nvdsulhboumota4p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4mo7nvdsulhboumota4p.png" alt="Figure 3-3 Multi-Pipeline CAPTCHA Solving Engine Illustration" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For end users, this entire process happens out of sight. Within the Agentic Browser’s task log, users might only observe a concise message such as:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“reCAPTCHA v2 detected. Automatically resolved in 1.2 seconds.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;An impediment that would have previously halted the entire automation workflow is now silently overcome in the background.&lt;/p&gt;

&lt;p&gt;This also signifies a pivotal advancement in AI-agent capabilities: the agent is no longer deterred by defensive systems specifically engineered to obstruct automation. With CAPTCHA-solving infrastructure functioning as an "unseen mechanism," Agentic Browsers finally acquire the operational autonomy required to execute tasks across the open Internet.&lt;/p&gt;

&lt;p&gt;Without this essential mechanism, all promises surrounding intelligent agents could easily falter at the very first CAPTCHA encounter.&lt;/p&gt;

&lt;h2&gt;
  
  
  Chapter 2: Contemporary Applications of Agentic Browsers
&lt;/h2&gt;

&lt;p&gt;If the preceding chapter made this technology seem somewhat abstract, the following examples should make it concrete. Agentic Browsers are not merely theoretical concepts; they are being deployed across three primary domains: personal productivity, enterprise automation, and data collection. In each area, they address practical problems.&lt;/p&gt;

&lt;p&gt;The following diagram summarizes the core application scenarios of Agentic Browsers:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbej6apywq3kstz4iziwy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbej6apywq3kstz4iziwy.png" alt="Figure 4-1 Overview of the Three Major Application Scenarios for Agentic Browsers" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The utility of Agentic Browsers extends from individual users to large enterprises, and from routine daily tasks to specialized research workflows. In the realm of personal productivity, they assist users with travel bookings, repetitive form filling, and monitoring product price fluctuations. Within enterprise automation, they manage financial reconciliation, employee onboarding, and competitor tracking. For data collection and research, they serve as tireless crawlers and intelligent analysis assistants.&lt;/p&gt;

&lt;p&gt;Next, we will explore these three scenarios in detail to understand how Agentic Browsers effectively "get work done."&lt;/p&gt;

&lt;h3&gt;
  
  
  2.1 Personal Productivity: Intelligent Delegation of Everyday Tasks
&lt;/h3&gt;

&lt;p&gt;For the average user, the most immediate benefit of an Agentic Browser is straightforward: &lt;strong&gt;time savings&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Daily, individuals perform countless repetitive and multi-step online tasks within browsers. These tasks typically share three characteristics:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  The objective is unambiguous.&lt;/li&gt;
&lt;li&gt;  The rules are consistent.&lt;/li&gt;
&lt;li&gt;  The operations are tedious.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Agentic Browsers excel at undertaking precisely these types of tasks—situations where users know what they want accomplished but prefer not to execute the operations manually.&lt;/p&gt;

&lt;p&gt;In personal productivity contexts, Agentic Browsers can provide assistance with the following typical tasks:&lt;/p&gt;

&lt;h4&gt;
  
  
  Automated Booking and Purchasing
&lt;/h4&gt;

&lt;p&gt;This includes tasks such as booking flights, hotels, or acquiring limited-release products. Users simply need to articulate their requirements in natural language—such as time, preferences, or budget—and the Agentic Browser can autonomously compare prices across various websites, filter options, populate information, and present the most favorable outcome.&lt;/p&gt;

&lt;h4&gt;
  
  
  Cross-Website Information Integration and Form Completion
&lt;/h4&gt;

&lt;p&gt;Tasks like visa applications, academic admissions, or expense reimbursements frequently demand that users repeatedly input identical information across multiple forms.&lt;/p&gt;

&lt;p&gt;An Agentic Browser functions as an "information manager" by securely retaining user data, automatically identifying form fields, and intelligently mapping them. For instance, it can automatically segment a full name into "First Name" and "Last Name."&lt;/p&gt;
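&lt;p&gt;As a rough illustration of field mapping, the sketch below splits a full name and matches form labels against a small alias table. It is a deliberately naive stand-in for the AI-driven mapping described above: &lt;code&gt;FIELD_ALIASES&lt;/code&gt;, &lt;code&gt;split_full_name&lt;/code&gt;, and &lt;code&gt;map_form_fields&lt;/code&gt; are hypothetical names, and the split rule (first word versus the rest) will be wrong for many real names.&lt;/p&gt;

```python
from typing import Dict, Iterable

# Hypothetical alias table: profile key to the form labels it may appear under.
FIELD_ALIASES = {
    "first_name": {"first name", "given name", "forename"},
    "last_name": {"last name", "surname", "family name"},
    "email": {"email", "e-mail", "email address"},
}


def split_full_name(full_name: str) -> Dict[str, str]:
    """Naive split: the first word is the first name, the remainder the last name."""
    parts = full_name.split()
    first = parts[0] if parts else ""
    last = " ".join(parts[1:])
    return {"first_name": first, "last_name": last}


def map_form_fields(labels: Iterable[str], profile: Dict[str, str]) -> Dict[str, str]:
    """Return a value for every form label that matches a known alias."""
    filled = {}
    for label in labels:
        key = label.strip().lower()
        for field_name, aliases in FIELD_ALIASES.items():
            if key in aliases and field_name in profile:
                filled[label] = profile[field_name]
    return filled
```

&lt;p&gt;Labels with no known alias, such as a phone field in this table, are left unfilled so the user can review them.&lt;/p&gt;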

&lt;h4&gt;
  
  
  Daily Information Monitoring
&lt;/h4&gt;

&lt;p&gt;Agentic Browsers can monitor product inventory, price changes, or new product announcements in the background. Once predefined conditions are met—such as a price reduction or a restock event—the browser promptly notifies the user or can even proceed to place an order automatically.&lt;/p&gt;
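&lt;p&gt;The "predefined conditions" idea can be sketched as a comparison between two page snapshots. The names &lt;code&gt;Snapshot&lt;/code&gt; and &lt;code&gt;check_conditions&lt;/code&gt; and the event labels are assumptions made for illustration; how the agent obtains each snapshot is out of scope here.&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Snapshot:
    """What the agent observed on one visit to the product page."""
    price: float
    in_stock: bool


def check_conditions(previous: Optional[Snapshot], current: Snapshot,
                     target_price: float) -> List[str]:
    """Return the names of the monitoring conditions the latest snapshot satisfies."""
    events = []
    # Price condition: only meaningful while the item can actually be bought.
    if current.in_stock and target_price >= current.price:
        events.append("price_at_or_below_target")
    # Restock condition: needs an earlier observation to compare against.
    if previous is not None and current.in_stock and not previous.in_stock:
        events.append("restocked")
    return events
```

&lt;p&gt;An empty list means nothing changed that the user asked about, so no notification is sent.&lt;/p&gt;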

&lt;p&gt;To illustrate the change in user experience, the following table contrasts traditional workflows with Agentic Browser workflows:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Task&lt;/th&gt;
&lt;th&gt;Traditional Workflow&lt;/th&gt;
&lt;th&gt;Agentic Browser Workflow&lt;/th&gt;
&lt;th&gt;Change in User Role&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Comparing and booking a flight&lt;/td&gt;
&lt;td&gt;15–30 minutes of manual browsing across multiple websites&lt;/td&gt;
&lt;td&gt;About 1 minute: describe requirements and confirm recommendations&lt;/td&gt;
&lt;td&gt;Executor to decision-maker&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Filling out complex online forms&lt;/td&gt;
&lt;td&gt;20–40 minutes of repetitive data entry&lt;/td&gt;
&lt;td&gt;About 2 minutes: review autofill results&lt;/td&gt;
&lt;td&gt;Data-entry operator to reviewer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Monitoring product restocks or price drops&lt;/td&gt;
&lt;td&gt;Extremely time-consuming manual checking&lt;/td&gt;
&lt;td&gt;No active user time: a background task with automatic notifications&lt;/td&gt;
&lt;td&gt;Monitor to receiver&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cross-platform data organization&lt;/td&gt;
&lt;td&gt;1–2 hours of manual copy-pasting and formatting&lt;/td&gt;
&lt;td&gt;About 5 minutes: automatic extraction and formatting&lt;/td&gt;
&lt;td&gt;Manual operator to analyst&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;As demonstrated, the Agentic Browser effectively serves as a personal assistant. It liberates users from the role of "workflow operators" and transforms them into "goal setters" and "outcome reviewers."&lt;/p&gt;

&lt;h3&gt;
  
  
  2.2 Enterprise Automation: Intelligent Coordination Across Systems
&lt;/h3&gt;

&lt;p&gt;If enhancements in personal productivity are about "reducing individual effort," then the value of Agentic Browsers in enterprise environments lies in &lt;strong&gt;connectivity&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Large organizations frequently depend on numerous disparate legacy systems, SaaS platforms, and supplier portals that resist straightforward integration via APIs. Employees are often compelled to act as "human bridges," manually transferring information between systems repeatedly.&lt;/p&gt;

&lt;p&gt;This is precisely where Agentic Browsers exhibit their most significant advantages.&lt;/p&gt;

&lt;h4&gt;
  
  
  Typical Enterprise Applications
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Financial and Supply Chain Reconciliation&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;An Agentic Browser can autonomously log into banking portals, download statements, reconcile them against ERP systems, generate discrepancy reports, and even compose notification emails.&lt;/p&gt;
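&lt;p&gt;The reconciliation step itself is simple once both sides are downloaded. The sketch below assumes, purely for illustration, that bank and ERP records have already been reduced to &lt;code&gt;(reference, amount)&lt;/code&gt; pairs; &lt;code&gt;reconcile&lt;/code&gt; and the discrepancy labels are hypothetical names.&lt;/p&gt;

```python
from decimal import Decimal
from typing import Iterable, List, Tuple


def reconcile(bank_rows: Iterable[Tuple[str, str]],
              erp_rows: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Compare (reference, amount) pairs from both systems and list discrepancies."""
    # Decimal avoids the rounding surprises of binary floats in money comparisons.
    bank = {ref: Decimal(amount) for ref, amount in bank_rows}
    erp = {ref: Decimal(amount) for ref, amount in erp_rows}
    report = []
    for ref in sorted(set(bank) | set(erp)):
        if ref not in erp:
            report.append((ref, "missing_in_erp"))
        elif ref not in bank:
            report.append((ref, "missing_in_bank"))
        elif bank[ref] != erp[ref]:
            report.append((ref, f"amount_mismatch: bank={bank[ref]} erp={erp[ref]}"))
    return report
```

&lt;p&gt;The resulting list is the raw material for the discrepancy report and notification email mentioned above.&lt;/p&gt;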

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Comprehensive Employee Onboarding Workflows&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Organizations can predefine onboarding task packages. The Agentic Browser automatically creates accounts across HR systems, IT systems, mailing lists, and access-control systems, ensuring complete coverage and timely execution.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Competitor Monitoring and Market Intelligence&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Agentic Browsers can function as "market surveillance" systems by automatically visiting competitor websites, e-commerce platforms, and social-media pages, identifying critical information changes, and storing them in structured databases.&lt;/p&gt;

&lt;p&gt;To illustrate where Agentic Browsers sit in enterprise automation, the following table compares them with manual operations and traditional API integrations:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Comparison Dimension&lt;/th&gt;
&lt;th&gt;Manual Operations&lt;/th&gt;
&lt;th&gt;API Integration&lt;/th&gt;
&lt;th&gt;Agentic Browsers&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Applicable Systems&lt;/td&gt;
&lt;td&gt;Any system&lt;/td&gt;
&lt;td&gt;Limited to systems with open APIs&lt;/td&gt;
&lt;td&gt;Any web-based system, including legacy internal systems&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Deployment Cycle&lt;/td&gt;
&lt;td&gt;No development required, but time-consuming to perform&lt;/td&gt;
&lt;td&gt;Weeks to months&lt;/td&gt;
&lt;td&gt;Hours to days of configuration&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Flexibility&lt;/td&gt;
&lt;td&gt;High (humans adapt)&lt;/td&gt;
&lt;td&gt;Low (changes require rewrites)&lt;/td&gt;
&lt;td&gt;High (AI adapts dynamically)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CAPTCHA/Login Handling&lt;/td&gt;
&lt;td&gt;Manual&lt;/td&gt;
&lt;td&gt;Difficult&lt;/td&gt;
&lt;td&gt;Automatic&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Scalability&lt;/td&gt;
&lt;td&gt;Poor&lt;/td&gt;
&lt;td&gt;Extremely strong&lt;/td&gt;
&lt;td&gt;Strong (parallel execution)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Typical Failure Scenarios&lt;/td&gt;
&lt;td&gt;Human fatigue&lt;/td&gt;
&lt;td&gt;API rate limits&lt;/td&gt;
&lt;td&gt;May need human confirmation when page conditions are extremely chaotic&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;As indicated, Agentic Browsers are not intended to supersede APIs. Instead, they offer a lightweight integration layer in scenarios where APIs are unavailable or prohibitively expensive to implement.&lt;/p&gt;

&lt;p&gt;By harnessing the flexibility and adaptability of AI, Agentic Browsers bridge the gaps left by conventional automation approaches, enabling enterprises to achieve intelligent cross-system coordination without undertaking extensive re-engineering of legacy infrastructure.&lt;/p&gt;

&lt;h3&gt;
  
  
  2.3 Data Collection and Research: From Manual Gathering to Intelligent Extraction
&lt;/h3&gt;

&lt;p&gt;Data is frequently described as the lifeblood of the digital era, yet the efficient collection of clean public web data has consistently presented challenges.&lt;/p&gt;

&lt;p&gt;Traditional web crawlers rely on fixed parsing rules. Should target websites undergo layout redesigns or implement anti-scraping measures, these crawlers often become entirely ineffective. Academic researchers, market research firms, and investigative journalism teams frequently require the extraction of specific information from vast quantities of heterogeneous webpages, rendering traditional methods costly and time-intensive.&lt;/p&gt;

&lt;p&gt;Agentic Browsers introduce an entirely novel paradigm for data collection:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A transition from extraction based on "code rules" to extraction based on "semantic objectives."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Their workflow generally operates as follows:&lt;/p&gt;

&lt;p&gt;Researchers articulate the required data dimensions and sample ranges using natural language. For example:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;“Extract product titles, prices, ratings, and review counts from the top 100 e-commerce product pages while excluding sponsored products.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The Agentic Browser autonomously navigates webpages, identifies relevant information blocks through environmental perception, intelligently extracts and structures the data, and manages complex interactions such as pagination, infinite scrolling, and popups.&lt;/p&gt;

&lt;p&gt;When target websites redesign their layouts, traditional crawlers often fail immediately. In contrast, Agentic Browsers attempt to visually relocate information and continue execution.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F42n1aqx1nnwtjcpe0o7t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F42n1aqx1nnwtjcpe0o7t.png" alt="Figure 4-2 Intelligent Data Collection Workflow" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This methodology introduces several fundamental enhancements:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Elimination of Parsing Rule Maintenance&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;AI comprehends the semantic meaning of a "price" rather than depending on fixed HTML class names.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Enhanced Robustness Against Website Redesigns&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Minor layout modifications no longer immediately disrupt extraction pipelines.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Capability to Handle Complex Interactions&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For websites necessitating login, infinite scrolling, or tab switching, Agentic Browsers can interact with the interface akin to real users before extracting information.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Reproducible Research Workflows&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Task configurations can be saved and shared, thereby standardizing and ensuring the reproducibility of data collection.&lt;/p&gt;
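&lt;p&gt;One way to make a task configuration saveable and shareable is to serialize it to JSON. The schema below is an assumption made for illustration, not a format defined by any particular Agentic Browser; &lt;code&gt;ExtractionTask&lt;/code&gt;, &lt;code&gt;save_task&lt;/code&gt;, and &lt;code&gt;load_task&lt;/code&gt; are hypothetical names, and the example values mirror the natural-language request quoted earlier.&lt;/p&gt;

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class ExtractionTask:
    """A shareable description of what to collect (illustrative schema)."""
    goal: str
    fields: List[str]
    max_pages: int = 100
    exclude: List[str] = field(default_factory=list)


def save_task(task: ExtractionTask) -> str:
    """Serialize with sorted keys so the same task always yields the same text."""
    return json.dumps(asdict(task), indent=2, sort_keys=True)


def load_task(text: str) -> ExtractionTask:
    """Rebuild the task from its JSON form."""
    return ExtractionTask(**json.loads(text))


example_task = ExtractionTask(
    goal="Extract product data from e-commerce product pages",
    fields=["title", "price", "rating", "review_count"],
    max_pages=100,
    exclude=["sponsored products"],
)
```

&lt;p&gt;Because the serialized text is deterministic, two researchers can compare or version-control their configurations directly.&lt;/p&gt;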

&lt;p&gt;To further illustrate the resilience advantages of Agentic Browsers in data collection tasks, the following figure compares traditional crawlers and Agentic Browsers after multiple website redesigns:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fassets.capsolver.com%2Fprod%2Fposts%2Fagentic-browser-capsolver%2FAxpRq748Bivq-c460e567dc7ccffe00db16d97c1413a1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fassets.capsolver.com%2Fprod%2Fposts%2Fagentic-browser-capsolver%2FAxpRq748Bivq-c460e567dc7ccffe00db16d97c1413a1.png" alt="Figure 4-3 Traditional Crawlers vs. Agentic Browser Data Collection Resilience Comparison" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Traditional crawlers experience a dramatic decline in success rates after the initial website redesign, whereas Agentic Browsers maintain relatively high extraction success rates even after multiple redesigns, owing to their visual localization and semantic understanding capabilities.&lt;/p&gt;

&lt;p&gt;This inherent resilience makes them exceptionally suitable for long-term, large-scale data collection projects.&lt;/p&gt;

&lt;p&gt;For example, envision a social-science research team requiring a comparison of specific policy clauses across 200 policy websites spanning 30 countries. Traditionally, this would necessitate research assistants spending months manually copying and organizing information.&lt;/p&gt;

&lt;p&gt;Now, researchers can configure an Agentic Browser task that autonomously traverses these websites, locates policy pages containing target keywords, extracts the relevant clauses, and categorizes them automatically.&lt;/p&gt;

&lt;p&gt;Researchers then only need to review and analyze the compiled results, allowing valuable human effort to be directed towards actual "research" rather than repetitive "manual data transfer."&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The Agentic Browser represents not merely a new product, but an entirely novel philosophy for engaging with the online world. Its fundamental premise is that the browser should transcend its role as a mere interface awaiting user clicks, evolving instead into an intelligent agent that comprehends your intentions and assists in task completion. From a technical implementation standpoint, it leverages the reasoning prowess of large language models for task planning, multi-modal perception for webpage comprehension, a real browser environment for operation execution, and infrastructure like &lt;strong&gt;&lt;a href="https://www.capsolver.com/?utm_source=official&amp;amp;utm_medium=blog&amp;amp;utm_campaign=agentic-browser-capsolver" rel="noopener noreferrer"&gt;CapSolver&lt;/a&gt;&lt;/strong&gt; to overcome automation hurdles. The convergence of these technologies is transforming the "information window" we have utilized for three decades into a genuine "action platform."&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Q1: Why can't general AI models independently resolve CAPTCHAs?&lt;/strong&gt;&lt;br&gt;
A1: General AI models are powerful, but CAPTCHAs are designed to be adversarial and change constantly. Resolving them reliably and quickly calls for specialized infrastructure, such as CapSolver, that is dedicated to this one task.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Q2: How does CapSolver support Agentic Browsers?&lt;/strong&gt;&lt;br&gt;
A2: CapSolver functions as an "unseen mechanism" that manages CAPTCHA challenges via a straightforward API. This enables the Agentic Browser to seamlessly bypass security obstacles and continue its tasks without human intervention.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Q3: Will Agentic Browsers displace human employment?&lt;/strong&gt;&lt;br&gt;
A3: They are engineered to automate "tasks," not to eliminate "jobs." By undertaking repetitive digital labor, they liberate humans to concentrate on higher-level creativity and strategic decision-making.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Q4: How can I begin utilizing an Agentic Browser today?&lt;/strong&gt;&lt;br&gt;
A4: Numerous experimental browsers and extensions are currently available. However, for an optimal experience, ensure that you integrate a dependable CAPTCHA-solving service like CapSolver to effectively navigate the web's security challenges.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>agentskills</category>
    </item>
    <item>
      <title>What Is an Agentic Browser? How AI Browsers Work Proactively for Users</title>
      <dc:creator>luisgustvo</dc:creator>
      <pubDate>Tue, 26 May 2026 09:49:09 +0000</pubDate>
      <link>https://dev.to/luisgustvo/what-is-an-agentic-browser-how-ai-browsers-work-proactively-for-users-20hf</link>
      <guid>https://dev.to/luisgustvo/what-is-an-agentic-browser-how-ai-browsers-work-proactively-for-users-20hf</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frysrpw2in1moi002rv5p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frysrpw2in1moi002rv5p.png" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Consider this scenario: you spend an hour meticulously booking a flight, constantly comparing prices and filling out numerous forms. In stark contrast, an Agentic Browser can accomplish this task in mere minutes with a simple command: "Book me a window seat for a flight from Beijing to Shanghai this Friday afternoon." It transcends its traditional role as a mere display tool, evolving into an intelligent agent capable of comprehending user intent and autonomously executing complex tasks. Over the past two years, this concept has progressed significantly towards commercialization, with Google Chrome introducing Auto Browse and Opera launching Opera Neon. This article aims to provide an accessible overview of how Agentic Browsers function and highlight the crucial role played by foundational infrastructure, such as &lt;a href="https://www.capsolver.com/?utm_source=official&amp;amp;utm_medium=blog&amp;amp;utm_campaign=agentic-browser" rel="noopener noreferrer"&gt;CapSolver&lt;/a&gt;, within this evolving ecosystem.&lt;/p&gt;

&lt;h2&gt;
  
  
  Chapter 1: Reimagining the Browser—From a 'Display Tool' to an 'Action Agent'
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1.1 The Role and Limitations of Conventional Browsers
&lt;/h3&gt;

&lt;p&gt;Since its inception in the 1990s, the fundamental purpose of web browsers has consistently revolved around the "presentation and interaction of information." Essentially, a browser operates as a passive rendering engine: users provide instructions, and the browser interprets the &lt;a href="https://www.capsolver.com/glossary/dom" rel="noopener noreferrer"&gt;DOM&lt;/a&gt; to deliver visual feedback. In this unidirectional, "human-operates-machine" model, the browser faithfully serves as a "window" into the digital realm.&lt;/p&gt;

&lt;p&gt;However, as the complexity of web applications has expanded exponentially, the inherent limitations of conventional browsers have become increasingly apparent:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Excessive Cognitive Burden&lt;/strong&gt;: Users are often compelled to manually locate desired elements amidst a deluge of tabs, pop-ups, and intricate menus, expending considerable mental effort on "finding controls" rather than "achieving objectives."&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Inability to Automate Repetitive Processes&lt;/strong&gt;: High-frequency operations, such as cross-platform data transfers, bulk form submissions, and multi-stage approvals, largely continue to depend on manual copy-pasting or laborious script configurations.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Contextual Disconnect&lt;/strong&gt;: The browser lacks awareness of your immediate past actions or your future intentions. Each interaction is treated as an isolated event, devoid of continuous task-level memory.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;The Conundrum of Security Versus User Experience&lt;/strong&gt;: To combat bot activity, websites frequently implement extensive CAPTCHAs, bot detection mechanisms, and dynamic loading, which inadvertently escalate operational friction for human users.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;To more clearly delineate the deficiencies of traditional browsers, we can categorize them across dimensions such as interaction modality, task comprehension, and process continuity, as illustrated in the table below:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;Traditional Browser&lt;/th&gt;
&lt;th&gt;Key Challenges / Constraints&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Interaction Mode&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Driven by mouse/keyboard, step-by-step operations&lt;/td&gt;
&lt;td&gt;Fragmented actions, reduced efficiency&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Task Understanding&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Interprets only URLs and DOM structure, lacks intent recognition&lt;/td&gt;
&lt;td&gt;Incapable of processing natural language commands&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Process Continuity&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Stateless; cross-page/site navigation requires manual linking&lt;/td&gt;
&lt;td&gt;Loss of context, multi-step tasks prone to interruption&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Automation Capability&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Relies on extensions or external scripts (e.g., &lt;a href="https://www.capsolver.com/glossary/selenium" rel="noopener noreferrer"&gt;Selenium&lt;/a&gt;)&lt;/td&gt;
&lt;td&gt;High setup complexity, vulnerable to interference&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Environmental Awareness&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Static rendering, cannot interpret visual semantics&lt;/td&gt;
&lt;td&gt;Ineffective against dynamic content, CAPTCHAs, and anti-scraping measures&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Table 1-1: Performance and Limitations of Traditional Browsers Across Dimensions&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In essence, conventional browsers excel at "displaying content based on instructions" but fall short in "understanding tasks and offering proactive assistance." This passive, fragmented, and stateless characteristic represents the core challenge that Agentic Browsers are designed to address.&lt;/p&gt;

&lt;h3&gt;
  
  
  1.2 Defining the Agentic Browser: A Browser That Can 'Act' on Your Behalf
&lt;/h3&gt;

&lt;p&gt;An Agentic Browser is not merely an enhanced version of a traditional browser; it represents a next-generation interaction platform that profoundly integrates &lt;a href="https://www.capsolver.com/glossary/llm" rel="noopener noreferrer"&gt;LLM&lt;/a&gt; capabilities with the browser's core engine. Its fundamental definition can be summarized as: a digital action agent endowed with the ability to understand intent, perceive its environment, plan autonomously, and execute tasks.&lt;/p&gt;

&lt;p&gt;If a conventional browser is the "screen you observe," an Agentic Browser is akin to a "digital assistant working for you." It no longer awaits step-by-step user clicks but directly accepts natural language directives (e.g., "Transcribe last week's meeting recording, summarize it, and email it to the project team"). Subsequently, it autonomously performs a sequence of operations within the browser environment, such as launching applications, locating files, invoking AI tools, editing documents, and dispatching emails.&lt;/p&gt;

&lt;p&gt;Its operational foundation rests upon a comprehensive agent architecture. Figure 1-1 graphically depicts the primary modules and data flow within this architecture:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffs5pxtz9snfeyy266xsj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffs5pxtz9snfeyy266xsj.png" alt="Figure 1-1: Agentic Browser Technical Architecture Diagram" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The architecture comprises four layers, which operate in sequence from top to bottom:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;AI Intent &amp;amp; Task Planner&lt;/strong&gt;: This component dissects ambiguous natural language inputs into actionable, atomic operation sequences and anticipates potential decision branches.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;DOM/Environment Perception&lt;/strong&gt;: It continuously "reads" the structure of the webpage in real-time, combining this with multi-modal visual recognition to discern button functionalities, form semantics, and changes in page state.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Action Executor&lt;/strong&gt;: This module precisely emulates human interactions (such as clicking, typing, scrolling, file uploading) via underlying &lt;a href="https://www.capsolver.com/faq/ai-and-automation/how-to-combine-llms-with-browser-automation" rel="noopener noreferrer"&gt;browser automation&lt;/a&gt; protocols and securely interfaces with external APIs.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Result Verification &amp;amp; Feedback Loop&lt;/strong&gt;: It automatically confirms whether the outcome of each step aligns with expectations. Should an error or page alteration occur, it dynamically adjusts its strategy and attempts a retry, thereby achieving "self-correction."&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Through this architectural framework, the Agentic Browser translates the user's overarching intent into granular browser operations, truly embodying the principle of "you articulate the goal, and it handles the execution."&lt;/p&gt;
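
&lt;p&gt;The four layers can be pictured as a small control loop. The sketch below is only an illustration under stated assumptions: the &lt;code&gt;Step&lt;/code&gt; class, the &lt;code&gt;run_agent&lt;/code&gt; function, and the dictionary standing in for the page are hypothetical names invented for this example, not part of any real Agentic Browser.&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

PageState = Dict[str, object]


@dataclass
class Step:
    """One atomic operation emitted by the planner (hypothetical shape)."""
    name: str
    action: Callable[[PageState], PageState]   # executor: returns the new page state
    expected: Callable[[PageState], bool]      # verifier: did the step succeed?


def run_agent(steps: List[Step], page: PageState, max_retries: int = 2) -> List[str]:
    """Run each step as perceive -> act -> verify, retrying on a mismatch."""
    log: List[str] = []
    for step in steps:
        attempts = 0
        while True:
            snapshot = dict(page)            # perception: read the current state
            page = step.action(snapshot)     # action execution
            if step.expected(page):          # result verification
                log.append(f"{step.name}: ok")
                break
            attempts += 1
            if attempts > max_retries:       # give up and surface the failure
                log.append(f"{step.name}: failed")
                return log
            log.append(f"{step.name}: retry")
    return log


# Toy usage: the "page" is just a dict, and each action returns an updated copy.
steps = [
    Step("search", lambda p: {**p, "results": True}, lambda p: p.get("results") is True),
    Step("add_to_cart", lambda p: {**p, "cart": 1}, lambda p: p.get("cart") == 1),
]
print(run_agent(steps, {}))
```

&lt;p&gt;A real system would replace the lambdas with planner output and browser calls; the point here is only the shape of the verify-and-retry loop.&lt;/p&gt;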

&lt;h3&gt;
  
  
  1.3 From Passive to Proactive: A Fundamental Transformation in Browser Paradigm
&lt;/h3&gt;

&lt;p&gt;The advent of the Agentic Browser signifies a profound shift in the human-computer interaction paradigm. This transformation extends beyond mere efficiency gains; it represents a re-evaluation of control mechanisms and interaction logic.&lt;/p&gt;

&lt;p&gt;In the conventional model, humans are required to conform to machine logic: mastering intricate menu hierarchies, memorizing shortcuts, and manually addressing unexpected pop-ups. In the &lt;strong&gt;Agentic mode&lt;/strong&gt;, the machine begins to adapt to human logic: understanding conversational instructions, anticipating user intentions, and proactively coordinating tasks across various applications.&lt;/p&gt;

&lt;p&gt;To more clearly illustrate the distinction between these two modes, the figure below presents a comparative analysis of interaction roles between traditional passive browsers and agentic proactive browsers:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9s2a2cx6wpm0g71lbscg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9s2a2cx6wpm0g71lbscg.png" alt="Figure 1-2: Traditional vs. Agentic Browser — Interaction Paradigm Comparison" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This paradigm shift is evident across three critical dimensions:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;From "Instruction-Driven" to "Goal-Driven"&lt;/strong&gt;: Users no longer need to specify how an action is performed; they only define what needs to be accomplished. The browser then assumes responsibility for deconstructing high-level objectives into a sequence of low-level operations.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;From "Static Interface" to "Dynamic Collaboration"&lt;/strong&gt;: Webpages are no longer fixed UI layouts but rather "data streams" that can be parsed, reconfigured, and manipulated by AI in real-time. Agentic Browsers can seamlessly navigate diverse websites and systems, effectively dismantling data silos.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;From "Manual Fallback" to "Intelligent Fault Tolerance"&lt;/strong&gt;: When confronted with webpage redesigns, loading delays, or CAPTCHA obstructions, traditional scripts would typically fail. In contrast, Agentic Browsers possess contextual reasoning capabilities, enabling them to "explore alternative approaches" much like a human, thereby substantially reducing the maintenance overhead of automated processes.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;For the average user, this implies that the browser will evolve from a "time-consuming tool" into a "time-saving enabler." When the browser proactively undertakes tasks on your behalf, the focus of digital life can genuinely revert to creation, decision-making, and intellectual pursuits themselves.&lt;/p&gt;




&lt;h2&gt;
  
  
  Chapter 2: How Does an Agentic Browser Work?
&lt;/h2&gt;

&lt;p&gt;Take a moment to envision a scenario: You instruct an Agentic Browser, "Locate Sony WH-1000XM5 headphones on E-commerce Site A, select the black variant, identify the official store offering the lowest price, proceed with an order for next-day delivery, and opt for cash on delivery." This single directive encompasses a complex series of underlying events. The Agentic Browser must "comprehend" your requirements, break them down into executable steps, "perceive" the content on the webpage, "act" upon it, and manage unforeseen circumstances such as page modifications.&lt;/p&gt;

&lt;p&gt;The following diagram encapsulates the entire operational flow:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdjbsy0l6aw3vagwkesyl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdjbsy0l6aw3vagwkesyl.png" alt="Figure 2-1: The Four Stages of Agentic Browser Operation" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The complete process commences with the user's natural language instruction, progresses through intent understanding and task planning, and then transitions into the core phase of "environment perception and action execution." Significantly, a bidirectional loop exists between environment perception and action execution: the Agentic Browser monitors the page state while it acts, and then re-perceives the page to pick up any changes its actions caused. Concurrently, "dynamic adaptation" runs through the entire process as a feedback mechanism, allowing the browser to adjust its strategy when it encounters pop-ups, CAPTCHAs, or alterations in page structure. Next, we examine each stage to show how the Agentic Browser "understands, perceives, acts, and adapts."&lt;/p&gt;

&lt;h3&gt;
  
  
  2.1 Intent Understanding: From Natural Language to Task Planning
&lt;/h3&gt;

&lt;p&gt;When the user gives the browser a casual instruction, the browser must first convert that instruction into a clearly structured "task list." This constitutes the intent understanding stage.&lt;/p&gt;

&lt;p&gt;If you were to instruct a traditional browser to "buy headphones," it would likely only open a default search engine and input those exact words. An Agentic Browser, however, leverages Large Language Models (LLMs) for in-depth analysis. Its primary objective is not merely to search, but to decompose the task.&lt;/p&gt;

&lt;p&gt;Referring to the previous example, the AI needs to identify:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Target Product&lt;/strong&gt;: "Sony WH-1000XM5 headphones"&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Constraints&lt;/strong&gt;: "Black," "Lowest price," "Official store"&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Action Sequence&lt;/strong&gt;: Search for product → Filter for black → Sort by price → Locate official store → Add to cart → Input shipping address → Select delivery method (next-day) → Choose payment method (cash on delivery) → Confirm order&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Implicit Dependencies&lt;/strong&gt;: The user must be logged in, a valid address must be present in the address book, the payment method must support cash on delivery, etc.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This decomposition process is not a simplistic application of a template but necessitates contextual reasoning. For instance, it must ascertain which logistics option corresponds to "next-day delivery" and verify if the product is eligible for it. Ultimately, a task planning map is generated. The figure below illustrates the complete structure of this task in the form of a decision tree:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnn4uo6nm7wfcf5lx25vp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnn4uo6nm7wfcf5lx25vp.png" alt="Figure 2-2: Task Planning Schematic" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This decision tree transforms the user's natural language instruction into an executable operational tree. Commencing from the root node "Buy headphones," it progressively refines the task along the "Yes" branches, with each step incorporating conditional judgments (e.g., official store verification, credit score comparison) and atomic actions (e.g., search, filter, input). This structured task planning ensures the browser clearly comprehends "what to do first, what to do next, and how to make choices when encountering divergent paths." From this juncture, the browser ceases to be a mere search box and becomes an executor venturing into the web with a defined objective.&lt;/p&gt;
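
&lt;p&gt;As a rough illustration, the decomposition described in this section could be held in a small data structure. The &lt;code&gt;TaskPlan&lt;/code&gt; schema and the action identifiers below are assumptions made for this sketch; the plan is written by hand here, whereas an actual Agentic Browser would have an LLM generate it.&lt;/p&gt;

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaskPlan:
    """Structured form of a natural-language instruction (illustrative schema)."""
    target: str
    constraints: List[str]
    actions: List[str]
    preconditions: List[str] = field(default_factory=list)

    def next_action(self, completed: int) -> str:
        """Return the next pending action, or "done" once the plan is finished."""
        if completed >= len(self.actions):
            return "done"
        return self.actions[completed]


# Hand-written here; in practice an LLM would be prompted to emit this structure.
plan = TaskPlan(
    target="Sony WH-1000XM5 headphones",
    constraints=["black", "lowest price", "official store"],
    actions=[
        "search_product", "filter_black", "sort_by_price", "locate_official_store",
        "add_to_cart", "input_shipping_address", "select_next_day_delivery",
        "choose_cash_on_delivery", "confirm_order",
    ],
    preconditions=["user_logged_in", "valid_address_on_file", "cod_supported"],
)
print(plan.next_action(0))
```

&lt;p&gt;Keeping preconditions separate from actions lets the executor check them up front, for example confirming the login state before the first search.&lt;/p&gt;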

&lt;h3&gt;
  
  
  2.2 Environment Perception: How AI 'Views' the Web
&lt;/h3&gt;

&lt;p&gt;With a plan established, the next step is enabling the AI to "perceive" the live webpage much as a human would. This is technically termed environment perception. Conventional automation scripts depend on element locators (CSS selectors, XPath), which are inherently fragile: a change to a single class name on the page can break them. Agentic Browsers instead employ a multi-perception fusion approach that combines structural, visual, and state-based signals.&lt;/p&gt;

&lt;p&gt;The three levels of perception are summarized in the table below:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Level&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;th&gt;Technical Implementation&lt;/th&gt;
&lt;th&gt;Example&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;DOM Structure &amp;amp; Semantic Analysis&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Interprets the webpage's Document Object Model, extracting tags, roles, and text, augmented by ARIA accessibility labels to understand element functions.&lt;/td&gt;
&lt;td&gt;HTML parsing, semantic labeling&lt;/td&gt;
&lt;td&gt;Can distinguish "this is a button" from "that is an input field," recognizing which div element actually facilitates the "Add to Cart" action.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Visual Screenshot Interpretation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Captures a screenshot of the current viewport and utilizes multi-modal models to analyze pixels, thereby understanding layout and visual relationships in a human-like manner.&lt;/td&gt;
&lt;td&gt;Computer vision, image segmentation&lt;/td&gt;
&lt;td&gt;Even if a button's HTML tag is unconventional, as long as its appearance suggests a button (e.g., rounded corners, distinct color block, text), it can be identified.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Interaction State Inference&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Ascertains the current condition of components through CSS styles, focus states, disabled attributes, and similar indicators.&lt;/td&gt;
&lt;td&gt;Style analysis, state detection&lt;/td&gt;
&lt;td&gt;Can determine if a button is grayed out and inactive or highlighted and ready for interaction; whether a dropdown menu is collapsed or expanded.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Table 2-1: The Three Levels of Environment Perception&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;These three perceptual modalities do not operate in isolation but function concurrently and cross-validate each other. Figure 2-3 visually illustrates this fusion process:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd0dyp6uhb1tbz0nf3kgi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd0dyp6uhb1tbz0nf3kgi.png" alt="Figure 2-3: How AI Understands Webpages" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;At any given moment, the Agentic Browser reads the DOM tree (structure), analyzes the heatmap (visual representation), and delineates interaction boxes (interactive elements). These three aspects converge to form a "holistic understanding" of the webpage. This redundancy, in which the browser falls back on vision when the code alone cannot be interpreted, is what gives Agentic Browsers their robustness. When a webpage changes "Buy Now" to "Grab Now," or turns a button into an elaborate image link, the browser can usually still locate the target and execute the intended operation.&lt;/p&gt;
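
&lt;p&gt;One simple way to picture this fallback is a scoring function over candidate elements. The sketch below assumes that upstream models have already produced a DOM-match score and a visual score for each candidate; the &lt;code&gt;Candidate&lt;/code&gt; fields, the &lt;code&gt;pick_target&lt;/code&gt; function, and the 0.5 threshold are all hypothetical choices for illustration.&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    label: str
    dom_score: float      # 0..1: how well the DOM role and text match the goal
    visual_score: float   # 0..1: how strongly the screenshot suggests a match
    enabled: bool         # interaction state: can it be clicked right now?


def pick_target(candidates: List[Candidate], threshold: float = 0.5) -> Optional[Candidate]:
    """Pick the enabled candidate whose strongest signal is highest."""
    usable = [c for c in candidates if c.enabled]
    if not usable:
        return None
    # Taking the max of both scores lets vision compensate when the DOM is unhelpful.
    best = max(usable, key=lambda c: max(c.dom_score, c.visual_score))
    if max(best.dom_score, best.visual_score) >= threshold:
        return best
    return None


candidates = [
    Candidate("image link reading 'Grab Now'", dom_score=0.1, visual_score=0.9, enabled=True),
    Candidate("disabled 'Buy Now' button", dom_score=0.8, visual_score=0.7, enabled=False),
]
chosen = pick_target(candidates)
print(chosen.label if chosen else "no target")
```

&lt;p&gt;Production systems would likely weight and cross-check the signals rather than take a plain maximum; the maximum is used here only because it makes the fallback behavior easy to see.&lt;/p&gt;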

&lt;h3&gt;
  
  
  2.3 Action Execution: Performing Operations in a Live Browser
&lt;/h3&gt;

&lt;p&gt;With the task plan and environmental comprehension in place, the moment for action arrives. The action execution phase is responsible for translating abstract "steps" into atomic operations within a live browser: clicking, typing, scrolling, hovering, managing pop-ups, and so forth.&lt;/p&gt;

&lt;p&gt;Agentic Browsers typically operate within a controlled, real browser instance (such as headful or headless Chromium), simulating human actions through browser automation protocols such as the Chrome DevTools Protocol (CDP). What sets them apart from conventional automation is &lt;strong&gt;biomimetic execution&lt;/strong&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Rhythm Management&lt;/strong&gt;: Introducing randomized delays between clicks and typing character by character, instead of pasting text instantaneously, makes the interaction rhythm resemble a human's and reduces the chance of being flagged by website anti-automation mechanisms.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Mouse Trajectory Simulation&lt;/strong&gt;: Instead of instantaneous linear movement, it generates a Bézier curve path with subtle jitter, mirroring the natural motion of a human hand.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Intelligent Waiting&lt;/strong&gt;: Rather than employing a crude fixed &lt;code&gt;sleep&lt;/code&gt; duration, it monitors for events such as DOM changes and network activity.&lt;/li&gt;
&lt;/ol&gt;
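
&lt;p&gt;The first two techniques can be sketched with nothing but the standard library. The code below is a minimal illustration, not how any particular browser implements them: the function names, the control-point offsets, the jitter size, and the 0.05 to 0.20 second delay range are all assumptions chosen for the example.&lt;/p&gt;

```python
import random
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def bezier_path(start: Point, end: Point, steps: int = 20, jitter: float = 1.5,
                rng: Optional[random.Random] = None) -> List[Point]:
    """Sample a cubic Bezier curve from start to end, with jitter on interior points."""
    rng = rng or random.Random()
    (x0, y0), (x3, y3) = start, end
    # Two randomly offset control points bend the path away from a straight line.
    x1 = x0 + (x3 - x0) * 0.3 + rng.uniform(-50, 50)
    y1 = y0 + (y3 - y0) * 0.3 + rng.uniform(-50, 50)
    x2 = x0 + (x3 - x0) * 0.7 + rng.uniform(-50, 50)
    y2 = y0 + (y3 - y0) * 0.7 + rng.uniform(-50, 50)
    path: List[Point] = []
    for i in range(steps + 1):
        t = i / steps
        u = 1 - t
        x = u**3 * x0 + 3 * u**2 * t * x1 + 3 * u * t**2 * x2 + t**3 * x3
        y = u**3 * y0 + 3 * u**2 * t * y1 + 3 * u * t**2 * y2 + t**3 * y3
        if i not in (0, steps):              # keep the endpoints exact
            x += rng.uniform(-jitter, jitter)
            y += rng.uniform(-jitter, jitter)
        path.append((x, y))
    return path


def typing_delays(text: str, rng: Optional[random.Random] = None) -> List[float]:
    """One randomized pause (in seconds) per character typed."""
    rng = rng or random.Random()
    return [rng.uniform(0.05, 0.20) for _ in text]


rng = random.Random(42)
path = bezier_path((100, 200), (640, 380), rng=rng)
print(len(path), path[0], path[-1])
```

&lt;p&gt;An executor would feed each point of &lt;code&gt;path&lt;/code&gt; to its mouse-move call and sleep for each value from &lt;code&gt;typing_delays&lt;/code&gt; between keystrokes.&lt;/p&gt;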

&lt;p&gt;To more clearly illustrate the complete action sequence of a typical interaction, Figure 2-4 uses "Click Add to Cart" as an example to delineate the detailed steps of action execution:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F48bzni8d47fiiv2e8y4q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F48bzni8d47fiiv2e8y4q.png" alt="Figure 2-4: Action Execution Sequence Diagram" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As depicted in Figure 2-4, each step aligns with the operational habits of a real user: from hovering to trigger visual feedback, to awaiting the backend response post-click, and finally verifying the frontend state change. This granular sequence design enables the Agentic Browser not only to "perform the correct action" but also to "act in a human-like manner."&lt;/p&gt;
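
&lt;p&gt;The "wait for the response, then verify the state change" part of this sequence is where intelligent waiting replaces a fixed &lt;code&gt;sleep&lt;/code&gt;. A minimal sketch follows, assuming the caller can supply a cheap check of the page state; &lt;code&gt;wait_until&lt;/code&gt; and the simulated cart badge are hypothetical names used only for this example.&lt;/p&gt;

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool], timeout: float = 5.0,
               interval: float = 0.05) -> bool:
    """Poll `condition` until it is true or `timeout` seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True                      # state change observed: proceed at once
        if time.monotonic() >= deadline:
            return False                     # give up so the caller can adapt or retry
        time.sleep(interval)


# Simulated page: the cart badge only updates on the third check.
checks = {"count": 0}


def cart_badge_updated() -> bool:
    checks["count"] += 1
    return checks["count"] >= 3


print(wait_until(cart_badge_updated, timeout=1.0, interval=0.01))
```

&lt;p&gt;Polling is the simplest form of this idea; an event-driven variant would subscribe to DOM mutation or network-idle events from the automation protocol instead.&lt;/p&gt;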

&lt;p&gt;Furthermore, the entire process generates a real-time action log, so users can pause, ask about progress, or correct errors at any point. The Agentic Browser is not a one-off, run-to-completion tool; it operates in a collaborative, "semi-automatic" mode that allows intervention at crucial decision points, such as instructing the browser to halt and await confirmation before final payment. "Biomimetic Execution: Simulating Real Human Operational Rhythm" sums up the philosophy behind this series of actions: every machine operation should carry a touch of human nuance.&lt;/p&gt;

&lt;h3&gt;
  
  
  2.4 Dynamic Adaptation: When Webpages Evolve
&lt;/h3&gt;

&lt;p&gt;Real-world webpages are dynamic entities: A/B tests might present a blue button one instance and a red one the next; page layouts can undergo significant alterations during promotional periods; "Claim Coupon" modals or CAPTCHA challenges may unexpectedly appear. This is precisely where Agentic Browsers diverge from conventional &lt;a href="https://www.capsolver.com/faq/ai-and-automation/what-is-the-difference-between-ai-agents-and-rpa" rel="noopener noreferrer"&gt;RPA&lt;/a&gt;—through their &lt;strong&gt;dynamic adaptation capability&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Dynamic adaptation encompasses three levels of response:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Anomaly Detection &amp;amp; Recovery&lt;/strong&gt;: Should an anticipated element fail to appear (e.g., altered button text, failed selector), the system promptly switches to a visual positioning mode or expands its search scope to locate the semantically closest alternative target. Persistent failure triggers an error report and prompts user intervention.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Pop-up and Interruption Management&lt;/strong&gt;: The AI intelligently determines "whether this sudden occurrence should be dismissed," much like a human. For promotional pop-ups, it typically initiates a close action; for login expiration alerts, it triggers a re-login subtask.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;CAPTCHA Resolution (Pre-integration)&lt;/strong&gt;: Upon detecting a CAPTCHA (e.g., graphic slider, reCAPTCHA) on the page, the Agentic Browser pauses the current task and delegates the challenge to a specialized "invisible engine." This is the problem CapSolver addresses, and it is the focus of the second part of this series. Once the challenge is resolved, the browser resumes the original task flow.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;We can conceptualize the entire adaptation process as a continuous self-correcting loop:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F76ad397h7i6ca87eu5iq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F76ad397h7i6ca87eu5iq.png" alt="Figure 2-5: Dynamic Adaptation Closed Loop" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The entire closed loop centers on "task execution": when a CAPTCHA appears, the system invokes external solving resources, awaits the outcome, and then resumes; when a pop-up appears, it identifies and handles it before returning to the main task flow. This loop complements the "Intelligent Fault Tolerance" described earlier, so the Agentic Browser can complete, without human oversight, complex webpage processes on which conventional scripts were "guaranteed to fail." It is this closed loop that allows the Agentic Browser to cope with change much as a human would.&lt;/p&gt;
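
&lt;p&gt;The three levels of response can be sketched as a fallback chain plus a routing table. Everything named below is an assumption for illustration: the strategy names, the route labels, and the dictionary standing in for a page are invented for this example and do not describe a real product's internals.&lt;/p&gt;

```python
from typing import Callable, Dict, List, Optional, Tuple

Locator = Callable[[Dict[str, str]], Optional[str]]


def locate_with_fallback(page: Dict[str, str],
                         strategies: List[Tuple[str, Locator]]) -> Tuple[str, Optional[str]]:
    """Try each locator in order; escalate to the user if all of them fail."""
    for name, strategy in strategies:
        element = strategy(page)
        if element is not None:
            return name, element
    return "escalate_to_user", None


def route_interruption(kind: str) -> str:
    """Map a detected interruption to the subtask that should handle it."""
    routes = {
        "promo_popup": "close_popup",
        "login_expired": "relogin_subtask",
        "captcha": "delegate_to_solver",   # e.g. an external CAPTCHA-solving service
    }
    return routes.get(kind, "ask_user")


# Toy page where the selector-based lookup fails but the visible text still matches.
page = {"button_text": "Grab Now"}
strategies = [
    ("css_selector", lambda p: p.get("#buy-now")),
    ("visual_match", lambda p: "button" if "Now" in p.get("button_text", "") else None),
]
print(locate_with_fallback(page, strategies))
print(route_interruption("captcha"))
```

&lt;p&gt;Unknown interruptions fall through to &lt;code&gt;ask_user&lt;/code&gt;, which mirrors the rule above that persistent failure prompts user intervention.&lt;/p&gt;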

&lt;h2&gt;
  
  
  Authoritative External Sources
&lt;/h2&gt;

&lt;p&gt;For further insights into the evolution and technical landscape of Agentic Browsers and web automation, please consult the following authoritative resources:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;a href="https://www.anthropic.com/news/3-5-models-and-computer-use" rel="nofollow noopener noreferrer"&gt;&lt;strong&gt;Anthropic: Introducing Computer Use for Claude 3.5 Sonnet&lt;/strong&gt;&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;a href="https://blogs.opera.com/news/2025/05/opera-neon-first-ai-agentic-browser/" rel="nofollow noopener noreferrer"&gt;&lt;strong&gt;Opera: Meet Opera Neon, the First AI Agentic Browser&lt;/strong&gt;&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;a href="https://snowplow.io/blog/what-is-an-agentic-browser" rel="nofollow noopener noreferrer"&gt;&lt;strong&gt;Snowplow: What Is an Agentic Browser?&lt;/strong&gt;&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The progression from conventional browsers to Agentic Browsers signifies a monumental transformation in our interaction with the digital realm. By integrating Large Language Models (LLMs), multimodal perception, and biomimetic execution, Agentic Browsers transcend their role as passive interfaces, becoming active, intelligent assistants capable of comprehending intricate intentions and navigating dynamic web environments. They undertake monotonous, repetitive tasks, thereby liberating human users to concentrate on higher-order decision-making and creative endeavors. Nevertheless, as these agents grow in sophistication, they inevitably encounter the ultimate gatekeepers of the web: CAPTCHAs. To fully realize the potential of Agentic Browsers, robust infrastructure is indispensable for seamlessly overcoming these obstacles.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Recommendation:&lt;/strong&gt; To ensure the uninterrupted operation of your Agentic Browser or automation scripts, free from the impediments of complex CAPTCHAs, we strongly advocate for the integration of &lt;strong&gt;CapSolver&lt;/strong&gt;. CapSolver offers a dependable, AI-driven infrastructure designed to effortlessly circumvent various CAPTCHA challenges, serving as the ideal "invisible engine" for your automated workflows.&lt;/p&gt;

&lt;blockquote&gt;
&lt;h3&gt;
  
  
  Bonus Code
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Redeem Your CapSolver Bonus Code&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Boost your automation budget instantly!&lt;br&gt;
Use bonus code &lt;strong&gt;CAP26&lt;/strong&gt; when topping up your CapSolver account to get an extra &lt;strong&gt;5% bonus&lt;/strong&gt; on every recharge — with no limits.&lt;br&gt;
Redeem it now in your &lt;a href="https://dashboard.capsolver.com/dashboard/overview/?utm_source=offcial&amp;amp;utm_medium=blog&amp;amp;utm_campaign=web-scraping-captcha-handling-2026" rel="noopener noreferrer"&gt;CapSolver Dashboard&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq5yewvdqlwdtfpgpgh5s.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq5yewvdqlwdtfpgpgh5s.png" alt="bonus code" width="472" height="140"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Read the second part of this series:&lt;/strong&gt; &lt;a href="https://www.capsolver.com/blog/ai/agentic-browser-capsolver" rel="noopener noreferrer"&gt;Agentic Browser's Invisible Engine: Overcoming CAPTCHAs with Specialized Infrastructure&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Q1: What is the primary distinction between a conventional browser and an Agentic Browser?&lt;/strong&gt;&lt;br&gt;
A1: A conventional browser functions as a passive instrument that necessitates sequential manual input (clicks, typing) for navigation and task execution. An Agentic Browser, conversely, is an active digital agent that interprets natural language commands, independently plans tasks, and carries them out on your behalf.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Q2: How does an Agentic Browser interpret actions on a web page?&lt;/strong&gt;&lt;br&gt;
A2: It employs a combination of DOM structure analysis, visual screenshot interpretation (utilizing computer vision), and interaction state inference to "perceive" and comprehend the web page in a manner similar to a human, thereby exhibiting high resilience to UI alterations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Q3: Is an Agentic Browser capable of managing unexpected pop-ups or website changes?&lt;/strong&gt;&lt;br&gt;
A3: Yes, it incorporates dynamic adaptation capabilities. It can detect anomalies, intelligently handle unforeseen pop-ups, and adjust its execution strategy in real-time without crashing, unlike traditional automation scripts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Q4: What occurs when an Agentic Browser encounters a CAPTCHA?&lt;/strong&gt;&lt;br&gt;
A4: Upon CAPTCHA detection, the Agentic Browser temporarily suspends its current task and delegates the resolution process to specialized infrastructure, such as CapSolver. Once resolved, it seamlessly resumes the task.&lt;/p&gt;

</description>
      <category>agents</category>
      <category>ai</category>
    </item>
    <item>
      <title>How to Integrate Hermes Agent with CapSolver for Seamless CAPTCHA Solving</title>
      <dc:creator>luisgustvo</dc:creator>
      <pubDate>Mon, 18 May 2026 08:40:39 +0000</pubDate>
      <link>https://dev.to/luisgustvo/how-to-integrate-hermes-agent-with-capsolver-for-seamless-captcha-solving-55np</link>
      <guid>https://dev.to/luisgustvo/how-to-integrate-hermes-agent-with-capsolver-for-seamless-captcha-solving-55np</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2mz39hxt2vcnxgms1hv7.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2mz39hxt2vcnxgms1hv7.jpeg" alt="Hermes Agent browser automation workflow integrated with CapSolver for automatic CAPTCHA solving" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When using AI agents for web browsing, &lt;strong&gt;CAPTCHAs&lt;/strong&gt; often stand as the most significant hurdle. These security measures can block agents, prevent form submissions, and halt automated tasks until a human steps in.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hermes Agent&lt;/strong&gt;, developed by Nous Research, is a versatile, self-improving AI agent capable of running on everything from a basic $5 VPS to a powerful GPU cluster. It connects with you through familiar platforms like Telegram, Discord, Slack, WhatsApp, Signal, and email. While it can navigate websites, interact with buttons, and extract data, it still faces the common challenge of getting stuck on CAPTCHAs.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.capsolver.com/?utm_source=dev.to&amp;amp;utm_medium=blog&amp;amp;utm_campaign=hermes"&gt;CapSolver&lt;/a&gt; provides a seamless solution to this problem. By integrating the CapSolver Chrome extension into the browser used by Hermes, CAPTCHAs are resolved &lt;strong&gt;automatically and silently&lt;/strong&gt; in the background. This setup requires no extra code, no manual API calls, and no complex prompt engineering.&lt;/p&gt;

&lt;p&gt;The best part? &lt;strong&gt;You don't even have to mention CAPTCHAs to your agent.&lt;/strong&gt; Simply instruct it to pause for a moment before submitting a form—by the time it proceeds, the CAPTCHA is already handled.&lt;/p&gt;




&lt;h2&gt;
  
  
  What is Hermes Agent?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/NousResearch/hermes-agent" rel="noopener noreferrer"&gt;&lt;strong&gt;Hermes Agent&lt;/strong&gt;&lt;/a&gt; is an open-source autonomous tool from &lt;a href="https://nousresearch.com/" rel="noopener noreferrer"&gt;Nous Research&lt;/a&gt;. It operates on three core pillars: &lt;strong&gt;persistent memory&lt;/strong&gt; (retaining project details across sessions), &lt;strong&gt;autonomous skill development&lt;/strong&gt; (learning and repeating procedures from experience), and &lt;strong&gt;infrastructure flexibility&lt;/strong&gt; (deployable via VPS, Docker, serverless sandboxes, or local GPU setups).&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsxvtk28oaee0k6p78s9v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsxvtk28oaee0k6p78s9v.png" width="800" height="380"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Features
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Unified Gateway&lt;/strong&gt;: Access your agent through Telegram, Discord, Slack, WhatsApp, Signal, email, or a terminal interface.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Flexible Model Support&lt;/strong&gt;: Use &lt;code&gt;hermes model&lt;/code&gt; to switch between 200+ models via OpenRouter, Nous Portal, NVIDIA NIM, or your own endpoints.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Long-term Memory&lt;/strong&gt;: Utilizes FTS5 session search and LLM summarization to remember past interactions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Skill Repository&lt;/strong&gt;: An evolving procedural memory system that follows the agentskills.io standard.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Diverse Backends&lt;/strong&gt;: Supports seven terminal environments, including Local, Docker, SSH, and Vercel Sandbox.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Integrated Browser&lt;/strong&gt;: Controls Chromium through Playwright and the Chrome DevTools Protocol.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  The Browser Tool
&lt;/h3&gt;

&lt;p&gt;Hermes uses a Chromium browser for tasks like navigation, DOM reading, and data scraping. Its browser tool supports &lt;strong&gt;five interchangeable providers&lt;/strong&gt;:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Provider&lt;/th&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;Extension Support?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Browserbase&lt;/td&gt;
&lt;td&gt;Cloud&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Browser Use&lt;/td&gt;
&lt;td&gt;Cloud&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Firecrawl&lt;/td&gt;
&lt;td&gt;Cloud&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Camoufox&lt;/td&gt;
&lt;td&gt;Local (Stealth Firefox)&lt;/td&gt;
&lt;td&gt;✗&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;CDP attach&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Local (Any Chromium)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;✓&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Cloud-based providers typically don't allow for custom extensions, and Camoufox is built on Firefox, making it incompatible with Chrome extensions. The ideal solution is the &lt;strong&gt;CDP attach&lt;/strong&gt; method, where Hermes connects to a Chromium instance you've already launched. This is where CapSolver excels.&lt;/p&gt;

&lt;p&gt;Unlike tools such as &lt;a href="https://www.capsolver.com/blog/web-scraping/openclaw-capsolver" rel="noopener noreferrer"&gt;OpenClaw&lt;/a&gt; or &lt;a href="https://www.capsolver.com/blog/web-scraping/crawlee-capsolver" rel="noopener noreferrer"&gt;Crawlee&lt;/a&gt;, which manage their own browser launches, Hermes allows you to &lt;strong&gt;provide your own Chrome instance with the extension already active&lt;/strong&gt; and connects to it via the DevTools protocol.&lt;/p&gt;




&lt;h2&gt;
  
  
  What is CapSolver?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.capsolver.com/?utm_source=dev.to&amp;amp;utm_medium=blog&amp;amp;utm_campaign=hermes"&gt;CapSolver&lt;/a&gt; is a premier CAPTCHA-solving platform that uses AI to bypass modern security challenges. It supports all major CAPTCHA types and offers rapid response times, making it easy to integrate into automated systems—whether through direct API calls or by &lt;strong&gt;running its Chrome extension within an agent's browser session.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Integration is Different
&lt;/h2&gt;

&lt;p&gt;Most CAPTCHA solutions involve writing code to handle API requests and token injections. This is the standard approach for tools like &lt;a href="https://www.capsolver.com/blog/All/how-to-integrate-puppeteer" rel="noopener noreferrer"&gt;Puppeteer&lt;/a&gt; or &lt;a href="https://www.capsolver.com/blog/All/how-to-integrate-playwright" rel="noopener noreferrer"&gt;Playwright&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Hermes + CapSolver approach is a paradigm shift:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Traditional Method (Code-Heavy)&lt;/th&gt;
&lt;th&gt;Hermes Method (Natural Language)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Create a &lt;code&gt;CapSolverService&lt;/code&gt; class&lt;/td&gt;
&lt;td&gt;Start Chrome with &lt;code&gt;--load-extension=...&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Manage &lt;code&gt;createTask()&lt;/code&gt; and &lt;code&gt;getTaskResult()&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Simply chat with your agent&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Manually inject tokens via script&lt;/td&gt;
&lt;td&gt;The extension automates the process&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Write logic for errors and retries&lt;/td&gt;
&lt;td&gt;Tell the agent to "wait a minute, then submit"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Specific code needed for each CAPTCHA&lt;/td&gt;
&lt;td&gt;Works for every CAPTCHA type the extension supports&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;The Core Advantage&lt;/strong&gt;: The CapSolver extension operates within the browser Hermes is controlling. When the agent reaches a CAPTCHA, the extension detects it, contacts the CapSolver API, and solves it in the background. By the time the agent is ready to submit the form, the token is already there.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;All you need to do is provide time.&lt;/strong&gt; Instead of explaining CAPTCHAs to the agent, just say:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Navigate to the page, &lt;strong&gt;wait 60 seconds&lt;/strong&gt;, and then click Submit."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The agent remains completely unaware of the technical process happening behind the scenes.&lt;/p&gt;
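
&lt;p&gt;A fixed wait is the simplest option when you are talking to the agent. If you script your own check alongside it, the same idea can be expressed as a bounded poll. The sketch below is illustrative only: &lt;code&gt;wait_until&lt;/code&gt; is a hypothetical helper, and the predicate you pass in (for example, a function that reports whether the form's token field has been filled) is something you would supply yourself.&lt;/p&gt;

```python
import time


def wait_until(predicate, timeout=60.0, interval=2.0, sleep=time.sleep):
    """Poll predicate every `interval` seconds, giving up after `timeout`."""
    attempts = int(timeout // interval) + 1
    for attempt in range(attempts):
        if predicate():
            return True
        # Skip the final sleep so the function returns promptly on failure.
        if attempt != attempts - 1:
            sleep(interval)
    return False
```

&lt;p&gt;Compared with a fixed 60-second pause, polling returns as soon as the condition holds while still enforcing an upper bound.&lt;/p&gt;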




&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;To set up this integration, ensure you have:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Hermes Agent&lt;/strong&gt; installed with the gateway active (&lt;a href="https://github.com/NousResearch/hermes-agent#install" rel="noopener noreferrer"&gt;see installation guide&lt;/a&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A CapSolver account&lt;/strong&gt; and an API key (&lt;a href="https://www.capsolver.com/?utm_source=dev.to&amp;amp;utm_medium=blog&amp;amp;utm_campaign=hermes"&gt;register here&lt;/a&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Chromium or Chrome for Testing&lt;/strong&gt; (see the note below regarding standard Chrome).&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Important: Use Chromium, Not Branded Google Chrome
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;As of mid-2025, Google Chrome 137+ has disabled the &lt;code&gt;--load-extension&lt;/code&gt; flag in branded versions.&lt;/strong&gt; This means extensions cannot be loaded during automated sessions in standard Chrome or Edge.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You &lt;strong&gt;must&lt;/strong&gt; use one of the following instead:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Browser Choice&lt;/th&gt;
&lt;th&gt;Extension Support&lt;/th&gt;
&lt;th&gt;Recommended?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Google Chrome 137+&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Microsoft Edge&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Chrome for Testing&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Chromium (standalone)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Playwright Chromium&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Yes&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;How to install a compatible browser build:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Recommended: Install via Playwright&lt;/span&gt;
npx playwright &lt;span class="nb"&gt;install &lt;/span&gt;chromium

&lt;span class="c"&gt;# Note the path to the binary:&lt;/span&gt;
&lt;span class="c"&gt;# Linux: ~/.cache/ms-playwright/chromium-XXXX/chrome-linux64/chrome&lt;/span&gt;
&lt;span class="c"&gt;# macOS: ~/Library/Caches/ms-playwright/chromium-XXXX/chrome-mac/Chromium.app/Contents/MacOS/Chromium&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Alternatively, download it directly from the &lt;a href="https://googlechromelabs.github.io/chrome-for-testing/" rel="noopener noreferrer"&gt;Chrome for Testing portal&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step-by-Step Setup
&lt;/h2&gt;

&lt;p&gt;This setup involves two main parts:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Running a Chrome process&lt;/strong&gt; with the CapSolver extension and CDP enabled (on port &lt;code&gt;9222&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Updating Hermes' &lt;code&gt;config.yaml&lt;/code&gt;&lt;/strong&gt; to connect to this existing browser.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Step 1: Download the CapSolver Extension
&lt;/h3&gt;

&lt;p&gt;Get the extension and extract it to a known directory:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Visit the &lt;a href="https://github.com/capsolver/capsolver-browser-extension/releases" rel="noopener noreferrer"&gt;CapSolver GitHub releases&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Download the latest Chrome extension zip file.&lt;/li&gt;
&lt;li&gt;Extract it:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; ~/.hermes/capsolver-extension
unzip CapSolver.Browser.Extension-chrome-v&lt;span class="k"&gt;*&lt;/span&gt;.zip &lt;span class="nt"&gt;-d&lt;/span&gt; ~/.hermes/capsolver-extension/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Confirm the &lt;code&gt;manifest.json&lt;/code&gt; file is present in that folder.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note on Paths&lt;/strong&gt;: Always use absolute paths for the &lt;code&gt;--load-extension&lt;/code&gt; flag to avoid issues with service worker registration in some Chromium builds.&lt;/p&gt;
&lt;/blockquote&gt;
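
&lt;p&gt;Both checks (the manifest is present, and the path is absolute) can be automated. The following is a minimal sketch; &lt;code&gt;resolve_extension_dir&lt;/code&gt; is a hypothetical helper name, not part of Hermes or CapSolver.&lt;/p&gt;

```python
from pathlib import Path


def resolve_extension_dir(path):
    """Return the absolute extension directory, or raise if it looks wrong."""
    # expanduser() handles "~"; resolve() turns the result into an absolute path.
    ext_dir = Path(path).expanduser().resolve()
    manifest = ext_dir / "manifest.json"
    if not manifest.is_file():
        # A common cause is a zip that unpacked into a nested subfolder.
        raise FileNotFoundError(f"manifest.json not found in {ext_dir}")
    return str(ext_dir)
```

&lt;p&gt;Calling &lt;code&gt;resolve_extension_dir("~/.hermes/capsolver-extension")&lt;/code&gt; returns a string you can pass to &lt;code&gt;--load-extension&lt;/code&gt;.&lt;/p&gt;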

&lt;h3&gt;
  
  
  Step 2: Configure Your API Key
&lt;/h3&gt;

&lt;p&gt;Update the extension's configuration file at &lt;code&gt;~/.hermes/capsolver-extension/assets/config.js&lt;/code&gt; with your key:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;defaultConfig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;YOUR_CAPSOLVER_API_KEY&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// Insert your key here&lt;/span&gt;
  &lt;span class="na"&gt;useCapsolver&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;enabledForRecaptcha&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;enabledForRecaptchaV3&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="c1"&gt;// ... other settings&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your key is available on your &lt;a href="https://www.capsolver.com/?utm_source=dev.to&amp;amp;utm_medium=blog&amp;amp;utm_campaign=hermes"&gt;CapSolver dashboard&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Launch Chrome with Extension and CDP
&lt;/h3&gt;

&lt;p&gt;Start Chrome separately with these essential flags:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;--remote-debugging-port=9222&lt;/code&gt;: Enables Hermes to connect.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--load-extension=...&lt;/code&gt;: Loads the CapSolver tool.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--user-data-dir=...&lt;/code&gt;: Keeps the agent's profile separate.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Option A: Manual Launch (for testing)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;/path/to/chrome-for-testing/chrome &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--remote-debugging-port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;9222 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--remote-debugging-address&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;127.0.0.1 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--user-data-dir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$HOME&lt;/span&gt;&lt;span class="s2"&gt;/.hermes/chrome-debug"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--load-extension&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$HOME&lt;/span&gt;&lt;span class="s2"&gt;/.hermes/capsolver-extension"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--disable-extensions-except&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$HOME&lt;/span&gt;&lt;span class="s2"&gt;/.hermes/capsolver-extension"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--no-first-run&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--no-default-browser-check&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--no-sandbox&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Option B: Background Script (for continuous use)
&lt;/h4&gt;

&lt;p&gt;Create a script at &lt;code&gt;~/.hermes/chrome-debug.sh&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/usr/bin/env bash&lt;/span&gt;
&lt;span class="nv"&gt;CHROME_BIN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$HOME&lt;/span&gt;&lt;span class="s2"&gt;/.cache/ms-playwright/chromium-1200/chrome-linux64/chrome"&lt;/span&gt;
&lt;span class="nv"&gt;EXT_DIR&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$HOME&lt;/span&gt;&lt;span class="s2"&gt;/.hermes/capsolver-extension"&lt;/span&gt;
&lt;span class="nv"&gt;USER_DATA_DIR&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$HOME&lt;/span&gt;&lt;span class="s2"&gt;/.hermes/chrome-debug"&lt;/span&gt;

&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;DISPLAY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;:99   &lt;span class="c"&gt;# X display for servers without a GUI; assumes Xvfb is already running on :99&lt;/span&gt;

&lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$CHROME_BIN&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--remote-debugging-port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;9222 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--remote-debugging-address&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;127.0.0.1 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--user-data-dir&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$USER_DATA_DIR&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--load-extension&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$EXT_DIR&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--disable-extensions-except&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$EXT_DIR&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--no-first-run&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--no-default-browser-check&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--no-sandbox&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--disable-dev-shm-usage&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--disable-features&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;Translate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



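&lt;p&gt;Make the script executable with &lt;code&gt;chmod +x ~/.hermes/chrome-debug.sh&lt;/code&gt;, then run it in the background using &lt;code&gt;nohup&lt;/code&gt; or manage it with a service manager such as &lt;strong&gt;systemd&lt;/strong&gt;.&lt;/p&gt;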

&lt;h3&gt;
  
  
  Step 4: Configure Hermes to Use CDP
&lt;/h3&gt;

&lt;p&gt;Modify &lt;code&gt;~/.hermes/config.yaml&lt;/code&gt; to include the &lt;code&gt;cdp_url&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;browser&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;inactivity_timeout&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;120&lt;/span&gt;
  &lt;span class="na"&gt;cdp_url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;http://127.0.0.1:9222&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This tells Hermes to route all browser actions through your pre-configured Chrome instance.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 5: Restart the Hermes Gateway
&lt;/h3&gt;

&lt;p&gt;Apply the changes by restarting Hermes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;hermes gateway run
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 6: Verify the Integration
&lt;/h3&gt;

&lt;p&gt;Run the diagnostic tool:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;hermes doctor
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Look for &lt;code&gt;browser-cdp&lt;/code&gt; under &lt;strong&gt;Tool Availability&lt;/strong&gt;. If it's there, your setup is active. You can also verify the CDP endpoint directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-s&lt;/span&gt; http://127.0.0.1:9222/json/version
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
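
&lt;p&gt;The same probe can be done from Python if you want to script the check. This is a sketch using only the standard library; the function names are illustrative. It relies on the &lt;code&gt;/json/version&lt;/code&gt; response being a JSON object that includes &lt;code&gt;Browser&lt;/code&gt; and &lt;code&gt;webSocketDebuggerUrl&lt;/code&gt; fields.&lt;/p&gt;

```python
import json
from urllib.request import urlopen


def parse_cdp_version(payload):
    """Extract the browser name and WebSocket URL from a /json/version body."""
    info = json.loads(payload)
    ws_url = info.get("webSocketDebuggerUrl", "")
    if not ws_url.startswith("ws://"):
        raise ValueError("response does not look like a CDP version payload")
    return info.get("Browser", "unknown"), ws_url


def probe_cdp(base_url="http://127.0.0.1:9222", timeout=5):
    """Fetch /json/version from the debugging port and parse it."""
    with urlopen(f"{base_url}/json/version", timeout=timeout) as resp:
        return parse_cdp_version(resp.read().decode("utf-8"))
```

&lt;p&gt;If &lt;code&gt;probe_cdp()&lt;/code&gt; raises a connection error, Chrome is not listening on port &lt;code&gt;9222&lt;/code&gt;; if it returns a tuple, the endpoint Hermes needs is reachable.&lt;/p&gt;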






&lt;h2&gt;
  
  
  Troubleshooting
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;browser-cdp&lt;/code&gt; is missing in &lt;code&gt;hermes doctor&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;This usually indicates a configuration error in &lt;code&gt;config.yaml&lt;/code&gt;. Ensure &lt;code&gt;cdp_url&lt;/code&gt; is correctly nested under the &lt;code&gt;browser:&lt;/code&gt; section.&lt;/p&gt;

&lt;h3&gt;
  
  
  Extension fails to solve CAPTCHAs
&lt;/h3&gt;

&lt;p&gt;Check whether you are using branded Google Chrome 137+, which ignores the &lt;code&gt;--load-extension&lt;/code&gt; flag. If so, switch to Chrome for Testing or Chromium. Also confirm that your CapSolver balance is sufficient.&lt;/p&gt;

&lt;h3&gt;
  
  
  Browser timeouts on startup
&lt;/h3&gt;

&lt;p&gt;The first connection might take longer. If it fails, try the command again or increase the &lt;code&gt;inactivity_timeout&lt;/code&gt; in your configuration.&lt;/p&gt;

&lt;h3&gt;
  
  
  Chrome crashes after version updates
&lt;/h3&gt;

&lt;p&gt;If you change Chrome versions, the existing user data directory might be incompatible. Delete &lt;code&gt;~/.hermes/chrome-debug&lt;/code&gt; and restart Chrome to generate a fresh profile.&lt;/p&gt;




&lt;h2&gt;
  
  
  Best Practices
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Allow Ample Time&lt;/strong&gt;: Set a wait time of &lt;strong&gt;30–60 seconds&lt;/strong&gt; to ensure the CAPTCHA has time to be solved and the token injected.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use Natural Language&lt;/strong&gt;: Instruct the agent to "wait a minute before submitting" rather than using technical terms about CAPTCHAs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monitor Credits&lt;/strong&gt;: Regularly check your &lt;a href="https://www.capsolver.com/?utm_source=dev.to&amp;amp;utm_medium=blog&amp;amp;utm_campaign=hermes"&gt;CapSolver dashboard&lt;/a&gt; to keep your balance topped up.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Isolate Browser Data&lt;/strong&gt;: Always use a dedicated &lt;code&gt;--user-data-dir&lt;/code&gt; to keep the agent's environment separate from your personal data.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security First&lt;/strong&gt;: Ensure &lt;code&gt;--remote-debugging-address&lt;/code&gt; is set to &lt;code&gt;127.0.0.1&lt;/code&gt; to prevent unauthorized remote access to your browser.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Headless Servers&lt;/strong&gt;: Use &lt;code&gt;Xvfb&lt;/code&gt; on Linux servers without a GUI to provide the necessary display context for extensions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost Efficiency&lt;/strong&gt;: Since the extension handles the CAPTCHA solving, you can use more affordable models (for example, ones available through OpenRouter) for navigation and interaction tasks.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The combination of Hermes Agent and CapSolver offers a revolutionary, zero-code approach to handling CAPTCHAs. By following this guide, you can:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Launch a customized Chrome instance&lt;/strong&gt; with the CapSolver extension.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Connect Hermes via CDP&lt;/strong&gt; with a simple configuration change.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Interact with your agent naturally&lt;/strong&gt;, letting the background processes handle security hurdles.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This setup transforms CAPTCHA solving into an &lt;strong&gt;invisible, automated process&lt;/strong&gt;, allowing your AI agent to operate without interruption.&lt;/p&gt;




&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Ready to enhance your agent?&lt;/strong&gt; &lt;a href="https://www.capsolver.com/?utm_source=dev.to&amp;amp;utm_medium=blog&amp;amp;utm_campaign=hermes"&gt;Sign up for CapSolver&lt;/a&gt; today and use the code &lt;strong&gt;&lt;code&gt;herme&lt;/code&gt;&lt;/strong&gt; for a special bonus on your first deposit!&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdv5tktya5mgqrj80bh6z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdv5tktya5mgqrj80bh6z.png" width="549" height="222"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Do I need to explain CapSolver to the agent?
&lt;/h3&gt;

&lt;p&gt;No. The extension works independently. Just give the agent enough time (e.g., "wait 60 seconds") to allow the solve to complete.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why is branded Chrome not working?
&lt;/h3&gt;

&lt;p&gt;Branded Google Chrome builds from version 137 onward no longer honor the &lt;code&gt;--load-extension&lt;/code&gt; command-line flag, so the extension never loads in automated sessions. Use Chrome for Testing or Chromium instead.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can I use cloud-based browsers?
&lt;/h3&gt;

&lt;p&gt;No, cloud providers like Browserbase don't allow for the custom extension loading required for this specific integration.&lt;/p&gt;

&lt;h3&gt;
  
  
  What CAPTCHA types are supported?
&lt;/h3&gt;

&lt;p&gt;The extension handles reCAPTCHA (v2/v3), hCaptcha, FunCaptcha, and AWS WAF CAPTCHA automatically. Note that Cloudflare Turnstile requires a different approach via the CapSolver API.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is Hermes Agent free?
&lt;/h3&gt;

&lt;p&gt;Yes, it is open-source. You only pay for the AI model usage (via providers like OpenRouter) and the CAPTCHA solving credits from CapSolver.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>automation</category>
      <category>agents</category>
    </item>
    <item>
      <title>AI-Driven Data Extraction: A Paradigm Shift from Rule-Based Parsing to Semantic Understanding</title>
      <dc:creator>luisgustvo</dc:creator>
      <pubDate>Wed, 13 May 2026 08:45:06 +0000</pubDate>
      <link>https://dev.to/luisgustvo/ai-driven-data-extraction-a-paradigm-shift-from-rule-based-parsing-to-semantic-understanding-2l33</link>
      <guid>https://dev.to/luisgustvo/ai-driven-data-extraction-a-paradigm-shift-from-rule-based-parsing-to-semantic-understanding-2l33</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm3n412r6hebbfi3zohj8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm3n412r6hebbfi3zohj8.png" alt="cover" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction: Beyond Parsing, It's About Acquisition
&lt;/h2&gt;

&lt;p&gt;Traditional web data extraction methods, relying on mechanical matching techniques such as CSS selectors, XPath, and regular expressions, are inherently tied to fixed positions within the Document Object Model (DOM) tree to retrieve specific values. This approach has proven vulnerable to the dynamic nature of modern web development, frequently encountering issues with page redesigns, the widespread adoption of dynamic rendering, and sophisticated anti-scraping measures. Such vulnerabilities lead to significant maintenance overheads and an inability to process asynchronously loaded content.&lt;/p&gt;

&lt;p&gt;The advent of large language models (LLMs) marks a pivotal moment, transforming data extraction from a query of "where is the data located within the tags?" to an understanding of "what question does the page content answer?" This shift ushers in a new era driven by natural language comprehension. This is not merely a theoretical advancement; frameworks like AXE demonstrate practical superiority. By intelligently pruning irrelevant DOM nodes and integrating with smaller models for structured output generation, AXE has achieved an F1 score of 88.1% on the SWDE dataset, outperforming larger models. This validates the efficacy and efficiency of semantic extraction. This article will deconstruct the technical principles and critical trade-offs across the data flow sequence, from the data acquisition layer (addressing anti-crawling and CAPTCHAs) to the content processing layer (involving cleaning and LLM semantic extraction), culminating in the storage and consumption of structured data.&lt;/p&gt;

&lt;h2&gt;
  
  
  I. Paradigm Shift: From Rule-Based Parsing to Natural Language Processing
&lt;/h2&gt;

&lt;p&gt;Before delving into the technical intricacies of AI-powered data extraction, it is crucial to comprehend the limitations that the preceding paradigm faced and the dimensions in which the new paradigm offers significant breakthroughs.&lt;/p&gt;

&lt;h3&gt;
  
  
  1.1 Three Dilemmas of the Rule-Based Parsing Era
&lt;/h3&gt;

&lt;p&gt;The cornerstone of conventional web data extraction has been "path positioning." Developers manually inspect the DOM node containing the target data using browser developer tools and then craft CSS selectors or XPath expressions to precisely locate that node. While this paradigm has served the majority of web data collection needs over the past decade, it suffers from three fundamental flaws that have been exacerbated by the evolution of web technology.&lt;/p&gt;

&lt;h4&gt;
  
  
  1.1.1 Fragile Anchors: Static Rules Struggle in a Dynamic Environment
&lt;/h4&gt;

&lt;p&gt;Modern websites typically undergo substantial DOM structure alterations every three to six months. Each redesign renders existing crawler rules, based on static paths, obsolete. For teams managing hundreds of target nodes concurrently, this translates into a relentless cycle of "whack-a-mole" maintenance. Figure 1-1 illustrates the comprehensive workflow of traditional crawlers when interacting with contemporary websites, highlighting the stages from request initiation to data extraction and the associated challenges:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faus377phgmi2uu30tclf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faus377phgmi2uu30tclf.png" alt="Figure 1-1: Traditional Web Crawler Workflow and Dilemmas" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This process underscores the core issue of the first dilemma: the incompatibility between static parsing capabilities and dynamically rendered content. According to W3Techs statistics, an estimated X% of global websites used anti-scraping services such as Cloudflare by the end of 2025. Measured against Netcraft’s count of total websites for the same period, this affects over 290 million sites. At the same time, the median JavaScript payload of a web page exceeds 500KB, so traditional crawlers often retrieve only the unrendered skeleton and fail to "see" the data. Furthermore, a website redesign immediately invalidates meticulously written selectors. This combination of "technical incapacitation" and "maintenance fragility" continuously narrows the applicability of rule-based parsing.&lt;/p&gt;

&lt;h4&gt;
  
  
  1.1.2 Semantic Blindness: Syntactic Matching Fails to Grasp Meaning
&lt;/h4&gt;

&lt;p&gt;Traditional methods can only ascertain "the data is at this position," not "what does the data at this position represent?" On a single product listing page, there might be promotional prices, recommended prices, and actual product prices, all potentially sharing identical DOM tags, making differentiation impossible for traditional rules. When confronted with diverse date formats like “2026-04-28,” “April 28, 2026,” and “28/04/2026,” traditional parsers necessitate distinct regular expressions for each, struggling to adapt to dynamic format variations. Figure 1-2 employs a radar chart to visually compare traditional rule-based parsing with AI semantic extraction across six key dimensions:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffbm0zci4vpsmlwtcbu07.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffbm0zci4vpsmlwtcbu07.png" alt="Figure 1-2: Six-Dimensional Capability Comparison of Traditional Rule-Based Parsing and AI Semantic Extraction" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The radar chart distinctly illustrates that traditional rule-based parsing's "working logic" dimension is solely dependent on precise DOM path positioning. However, its performance is severely constrained across the other five dimensions: its adaptability to structural changes is minimal, dynamic rendering processing relies entirely on external tools, data standardization requires manual regular expression crafting, maintenance costs escalate linearly with the number of sites, and its coverage is limited to one rule set per site. Five of the six axes are significantly underdeveloped, resulting in a "compressed" irregular polygon.&lt;/p&gt;

&lt;p&gt;Conversely, the radar chart for AI semantic extraction exhibits a more balanced and expansive profile. It automatically adapts to structural changes through semantic understanding, fully processes dynamic rendering using browser capabilities, achieves zero-rule standardization via LLM’s inherent format conversion abilities, experiences reduced maintenance costs as model capabilities improve, and allows a single Schema to cover similar pages across an entire site.&lt;/p&gt;

&lt;p&gt;None of these capability deficiencies is an isolated technical hurdle; each is a direct consequence of the underlying "mechanical matching" logic. As long as data extraction operates at the syntactic level, no matter how ingeniously designed the rules, this structural limitation remains insurmountable. Therefore, a fundamental paradigm shift, rather than mere rule patching, is required to address these issues comprehensively.&lt;/p&gt;
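
&lt;p&gt;To make the date-format example from this section concrete, the following Python sketch shows what rule-based normalization tends to look like. The names &lt;code&gt;KNOWN_FORMATS&lt;/code&gt; and &lt;code&gt;normalize_date&lt;/code&gt; are illustrative assumptions, not part of any framework discussed in this article. Every input variant needs its own hand-written pattern, and anything outside the list is simply not recognized.&lt;/p&gt;

```python
from datetime import datetime

# One explicit pattern per known input variant (the rule-based approach).
# These three patterns cover only the formats named in the text above.
KNOWN_FORMATS = ["%Y-%m-%d", "%B %d, %Y", "%d/%m/%Y"]


def normalize_date(raw):
    """Return an ISO 8601 date string, or None if no known pattern matches."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            # This pattern did not match; try the next rule.
            continue
    return None


if __name__ == "__main__":
    for sample in ["2026-04-28", "April 28, 2026", "28/04/2026", "28 Apr 2026"]:
        print(sample, "->", normalize_date(sample))
```

&lt;p&gt;The last sample is not covered by any rule and yields &lt;code&gt;None&lt;/code&gt;, which is the maintenance burden the section describes: each new variant means another manual pattern.&lt;/p&gt;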

&lt;h4&gt;
  
  
  1.1.3 The Inherent Ceiling: Why This Paradigm is Destined for Replacement
&lt;/h4&gt;

&lt;p&gt;All the challenges inherent in the rule-based parsing paradigm originate from its reliance on "mechanical matching" at the "syntactic level." This operational logic enables "precise positioning" (accurately identifying the DOM path of data), but at the cost of "passively adapting" to every page structure modification. A site redesign invalidates rules; heterogeneous data types necessitate new, manually written regular expressions. This reactive mode, dictated by the target website, constitutes an insurmountable "structural ceiling" for rule-based parsing. Figure 1-3 contrasts the two paradigms side by side and previews the direction of this shift.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnxd0d7i5aw2btydm5rbc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnxd0d7i5aw2btydm5rbc.png" alt="Figure 1-3: Paradigm Shift from Syntactic Matching to Semantic Understanding" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As depicted, this represents not an incremental technical improvement but two fundamentally divergent approaches. The rule-based parsing paradigm, shown on the left, operates at the "syntactic level," aiming for "precise positioning." It passively adapts to structural changes and quickly encounters a "structural ceiling"—akin to knowing a passage is on page 3, line 5 of a book, without understanding its content. The semantic extraction paradigm, on the right, fundamentally alters the operational level: transitioning from "syntax" to "semantics," and from "mechanical matching" to "intelligent understanding." Its objective is no longer to locate node coordinates but to directly comprehend the page content itself, with its capabilities no longer dictated by DOM changes.&lt;/p&gt;

&lt;p&gt;This also clarifies why the three dilemmas of the rule-based parsing era are interconnected, representing different manifestations of the underlying "syntactic matching" logic. As long as data extraction technology remains at the syntactic level, no matter how elaborate the rule design, it cannot overcome the inherent paradox of coexisting "precise positioning" and "semantic blind spots." Consequently, the emergence of the AI semantic extraction paradigm is not an acceleration along an existing path but a cognitive revolution, moving from "finding positions" to "understanding content." The specific mechanisms and advantages of this paradigm shift will be further elaborated in Section 1.2.&lt;/p&gt;

&lt;h3&gt;
  
  
  1.2 AI Paradigm: From Syntactic Matching to Semantic Understanding
&lt;/h3&gt;

&lt;p&gt;AI-driven methodologies fundamentally redefine problem-solving approaches. Figure 1-4 contrasts the core differences between rule-based parsing and AI semantic paradigms across four dimensions: core problem, dependent factors, adaptation to changes, and expansion mode:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8j3me0sf2ir40z49q6ox.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8j3me0sf2ir40z49q6ox.png" alt="Figure 1-4: Core Comparison of Rule-Based Parsing Paradigm and AI Semantic Paradigm" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Traditional methods inquire "where is the data within the DOM node?" whereas AI methods ask "what content on the page constitutes the user's primary interest?" This divergence in questioning dictates all subsequent technical trajectories. The former relies on the precision of DOM paths, rendering rules invalid and necessitating manual repair upon page redesigns or node shifts. The latter, however, depends on the consistency of page semantics. While DOM structures and data positions may change, the model can still accurately identify and extract content as long as the semantic meaning remains constant. In terms of scalability, rule-based parsing demands a new set of rules for each new site, whereas the AI semantic paradigm can apply a single Schema to cover similar pages across an entire site.&lt;/p&gt;

&lt;p&gt;This transition from "precise syntactic positioning" to "fuzzy semantic understanding" imbues AI methods with a robustness that traditional rules lack. The AXE framework, a notable academic contribution, provides a clear engineering illustration of this paradigm shift. Figure 1-5 summarizes its core processing flow:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcf98xigl212tmnmox3wv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcf98xigl212tmnmox3wv.png" alt="Figure 1-5: AXE Framework Core Processing Flow" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Figure 1-5 outlines a complete pipeline from raw HTML to structured output. AXE initially treats the HTML DOM as a tree requiring pruning, systematically removing irrelevant nodes such as navigation bars, footers, and boilerplate code through a specialized mechanism. The DOM is then compressed into high-density semantic blocks containing essential information. Finally, a lightweight, compact model processes these semantic blocks to generate structured JSON output. This entire process bypasses the DOM path positioning that traditional methods rely on, operating directly on the page’s semantic content.&lt;/p&gt;
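
&lt;p&gt;The pruning idea can be illustrated with a minimal Python sketch that uses only the standard library. This is not AXE's actual pruning mechanism, which the article does not detail; the tag list &lt;code&gt;NOISE_TAGS&lt;/code&gt; and the names &lt;code&gt;PruningParser&lt;/code&gt; and &lt;code&gt;prune_html&lt;/code&gt; are assumptions chosen for illustration. The sketch drops text inside typical boilerplate containers and keeps the remaining text blocks.&lt;/p&gt;

```python
from html.parser import HTMLParser

# Assumed set of boilerplate containers; a real system would learn or tune this.
NOISE_TAGS = {"nav", "footer", "header", "aside", "script", "style"}


class PruningParser(HTMLParser):
    """Collect text blocks that are not nested inside a noise container."""

    def __init__(self):
        super().__init__()
        self._noise_depth = 0  # how many noise containers are currently open
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag in NOISE_TAGS:
            self._noise_depth += 1

    def handle_endtag(self, tag):
        if tag in NOISE_TAGS and self._noise_depth:
            self._noise_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        # Keep only non-empty text found outside every noise container.
        if text and not self._noise_depth:
            self.blocks.append(text)


def prune_html(html):
    """Return the list of retained text blocks for an HTML string."""
    parser = PruningParser()
    parser.feed(html)
    parser.close()
    return parser.blocks
```

&lt;p&gt;The retained blocks would then be handed to a model for structured generation. A tag-name heuristic like this is far cruder than a trained pruner, but it shows why the model never needs to see DOM paths.&lt;/p&gt;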

&lt;p&gt;On the SWDE dataset, which encompasses 8 vertical domains and 80 real websites, AXE achieved an F1 score of 88.1%, surpassing numerous larger models. This outcome highlights a counter-intuitive yet critical insight: semantic extraction capability is not solely dependent on massive models; a meticulously designed and specifically trained small model can achieve production-level accuracy. This serves as key evidence for the cost-effectiveness and engineering viability of the AI semantic paradigm.&lt;/p&gt;

&lt;p&gt;Another significant work, Dripper, adopts an alternative technical approach, reframing main content extraction as a "semantic block sequence classification" task. Figure 1-6 uses a card comparison to juxtapose the methodological differences between AXE and Dripper, alongside the resulting evolution of operational and maintenance modes from the rule-based era to the AI era:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc2z7jq3n94a8pez9nsh3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc2z7jq3n94a8pez9nsh3.png" alt="Figure 1-6: Comparison of AXE and Dripper Frameworks, and Evolution of Operation and Maintenance Modes in Rule Era vs. AI Era" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;AXE employs the "DOM pruning + structured generation" pathway, condensing HTML DOM into high-density semantic blocks before directly outputting JSON via a compact model. Dripper, conversely, utilizes the "semantic block binary classification" route, transforming main content extraction into a classification task that determines whether each semantic block belongs to the main text. Both models, with a similar scale of 0.6B parameters, have demonstrated production-ready accuracy on their respective benchmarks. AXE achieved an F1 score of 88.1% on the SWDE dataset, while Dripper compressed input tokens to 22% of the original HTML and attained an 81.58% ROUGE-N F1 score on WebMainBench. These distinct approaches converge on the same conclusion: AI data extraction is competitive in accuracy and does not necessitate colossal models; a well-engineered miniature model can also be highly effective.&lt;/p&gt;

&lt;p&gt;The right side of the comparison reveals a deeper implication of this paradigm shift: it not only alters the technical approach but also reconfigures the daily operational practices of data teams. The primary activities in the rule-based era involved writing, fixing, and managing rules, essentially manual labor. The bottleneck for expansion was human capacity; adding a new target site invariably required engineers to create new rules. This is where the AI era fundamentally differs.&lt;/p&gt;

&lt;h2&gt;
  
  
  II. Core Process of AI Data Structured Extraction
&lt;/h2&gt;

&lt;p&gt;The complete AI data extraction pipeline comprises seven stages, logically grouped into three functional layers:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Data Acquisition Layer&lt;/strong&gt; (URL Queue → Web Scraping → Anti-Scraping Detection): This layer is responsible for successfully retrieving the HTML of the target page within complex network environments. It is the highest-risk zone of the entire pipeline: the core bottleneck (14%) shown in Figure 2-2 sits in its anti-scraping detection stage.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Content Processing Layer&lt;/strong&gt; (Content Cleaning → LLM Parsing → Schema Validation): This layer transforms noisy raw HTML into high-quality structured data. The accuracy bottleneck (18%) is predominantly concentrated within the content cleaning stage of this layer.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Data Storage Layer&lt;/strong&gt; (Data Storage): This final layer handles the output for downstream consumption, accounting for approximately 5% of the overall pipeline’s load.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This chapter will primarily focus on the technical details of Layer 2, the content processing layer, demonstrating how AI semantic extraction fundamentally surpasses traditional rule engines. Layer 1, which is a critical prerequisite for data to flow into the processing layer, will be thoroughly discussed with practical solutions in Chapter 3.&lt;/p&gt;

&lt;h3&gt;
  
  
  2.1 AI Data Extraction Pipeline Overview
&lt;/h3&gt;

&lt;p&gt;Before delving into the specifics of the processing layer, it is beneficial to gain a comprehensive understanding of the entire pipeline through Figure 2-1. This overview illustrates the complete journey from URL queuing to data storage and the actual traffic distribution at each stage, serving as a foundational context for this chapter and for addressing bottlenecks in Chapter 3.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftdh4cvvczc8iskxs39iw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftdh4cvvczc8iskxs39iw.png" alt="Figure 2-1: AI Data Extraction Pipeline" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The URL queue acts as the entry point of the pipeline, managing the list of URLs to be crawled and regulating the request rhythm. As shown in Figure 2-1, approximately 32% of requests at the URL scheduling stage are pre-identified with CAPTCHA risks, while 68% can proceed directly with normal requests. The web scraping stage is responsible for initiating HTTP requests or orchestrating browser rendering to obtain the raw page content. At this juncture, 12% of requests are immediately intercepted by CAPTCHAs, while 80% successfully advance to subsequent stages.&lt;/p&gt;

&lt;p&gt;Following initial scraping, requests proceed to the anti-scraping detection stage. Modern anti-scraping systems concurrently analyze signals from four dimensions—IP reputation, TLS fingerprint, browser characteristics, and behavior patterns—performing multi-layered cross-validation. Figure 2-1 indicates that approximately 10% of traffic in the anti-scraping detection stage will be identified as automated requests and blocked, and 20% necessitates reliance on IP proxy pools and TLS fingerprint spoofing to bypass detection. This represents the most uncertain node in the entire pipeline. If a CAPTCHA is triggered and not effectively managed, the computing resources of all subsequent stages will remain idle.&lt;/p&gt;

&lt;p&gt;Upon successfully passing anti-scraping detection, raw HTML content is obtained. A typical news page’s raw HTML can exceed 2MB, translating to 300,000 to 500,000 tokens after processing with OpenAI’s tiktoken tokenizer. This content is often replete with navigation menus, embedded CSS, Base64 encoded tracking pixels, and compressed JavaScript. Consequently, content cleaning becomes an indispensable step. Figure 2-1 illustrates that HTML to Markdown conversion accounts for 50% of the effort in this stage, with DOM simplification and noise removal contributing another 30%. These two processes collectively compress the raw HTML into high-density semantic text, ensuring that the LLM’s computational power is focused on meaningful information rather than extraneous noise.&lt;/p&gt;

&lt;p&gt;The cleaned text then proceeds to the LLM parsing stage, where the model extracts structured fields from the text according to a predefined Schema. Figure 2-1 combines this stage with the subsequent Schema validation, showing an accuracy rate of 94.7%. This implies that approximately 1 in 20 extractions will fail to meet field completeness or format consistency checks. Successful outputs are transformed into structured JSON data, which is ultimately stored in systems like PostgreSQL or MongoDB for downstream business consumption.&lt;/p&gt;

&lt;p&gt;To provide a clearer breakdown of the technical enablers, performance indicators, and engineering bottlenecks at each stage, Figure 2-2 presents a panoramic view in the form of a dashboard:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbju1rxihrpcrh19qd4jc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbju1rxihrpcrh19qd4jc.png" alt="Figure 2-2: Breakdown of AI Data Extraction Pipeline Stages" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The performance indicators on the right side of the figure reveal the operational baselines for each stage: the priority scheduling achievement rate of the URL queue is 85%, indicating that about 15% of tasks experience delays or degradation due to scheduling conflicts. Web scraping achieves a 90% success rate under an 800ms latency constraint, clearly defining the limits of network and rendering resources. The anti-scraping mechanism boasts an accuracy rate of 94.7%, meaning approximately 5 out of every 100 requests are intercepted or trigger verification. After content cleaning, the Schema compliance rate is 88% and field completeness is 95%. These two metrics collectively establish the data quality baseline, with approximately 12% of pages exhibiting deviations in main content identification and 5% missing required fields.&lt;/p&gt;

&lt;p&gt;The bottom of Figure 2-2 directly pinpoints the bottleneck distribution: the core bottleneck lies in the anti-scraping mechanism (14%), the accuracy bottleneck in content cleaning (18%), capacity bottlenecks in URL scheduling and web scraping, and the cost bottleneck in the quality inspection overhead of Schema validation. These data strongly corroborate the preceding analysis. Anti-scraping detection acts as the “chokepoint” of the entire chain; if an anti-scraping strategy is triggered and cannot be effectively bypassed, the accuracy of subsequent stages becomes irrelevant due to a lack of input data. This mirrors the fundamental problem faced by traditional rule-based crawlers: in the era of AI semantic extraction, while the accuracy ceiling has significantly risen, the “entry qualification” for data acquisition remains the primary hurdle for engineering implementation. Consequently, Chapter 3 will specifically address the evolution of anti-scraping confrontation technology and countermeasures.&lt;/p&gt;

&lt;h3&gt;
  
  
  2.2 Content Cleaning: From Noisy HTML to LLM-Readable Text
&lt;/h3&gt;

&lt;p&gt;Directly feeding raw HTML to LLMs for structured extraction is highly inefficient from an engineering perspective. The LLM’s attention mechanism can be easily distracted by DOM boilerplate code, such as deeply nested &lt;code&gt;&amp;lt;div&amp;gt;&lt;/code&gt; tags, embedded CSS styles, tracking scripts, navigation menus, and footer links. These elements not only provide zero semantic value but also drastically inflate token consumption. In large-scale scenarios processing thousands of pages daily, this waste quickly becomes financially unsustainable. The composition of a typical news page’s HTML intuitively demonstrates the severity of this problem. Figure 2-3 presents a circular chart illustrating the proportion of effective information relative to various noise elements in raw HTML:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkya20iyx4npmexstyyw6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkya20iyx4npmexstyyw6.png" alt="Figure 2-3: Composition of Raw HTML Content of a Typical News Page" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The circular chart delineates the raw HTML into four distinct areas. The green segment (45%) represents effective body content, including text and images, which is the signal the LLM actually requires. The yellow segment (20%) comprises structural and style noise, specifically &lt;code&gt;&amp;lt;script&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;style&amp;gt;&lt;/code&gt;, and &lt;code&gt;&amp;lt;svg&amp;gt;&lt;/code&gt; tags. The blue segment (20%) consists of navigation and sidebars, while the red segment (15%) denotes advertisements and trackers. Collectively, the three noise components account for 55%, implying that more than half of the tokens sent to the LLM are billed without contributing any semantic value.&lt;/p&gt;

&lt;p&gt;This reality of “signal drowned in noise” has necessitated a three-layered progressive cleaning strategy. Figure 2-4 illustrates the complete processing chain from raw HTML to LLM-readable text:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqi09r0sfxpoo9svwmigx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqi09r0sfxpoo9svwmigx.png" alt="Figure 2-4: Layered Compression Effect of Cloudflare Official Documentation Page" width="800" height="343"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;From this perspective, it is evident that the three layers of cleaning compress tokens from 9,541 to 1,678, representing only 18% of the original HTML. This compression ratio translates to a reduction in API call costs to less than one-fifth of the original in large-scale processing. Furthermore, the 10–100 times context reduction achieved by semantic context filtering ensures that the LLM’s attention is focused on relevant signals rather than noise. This constitutes an indispensable component of the engineering implementation of AI data extraction.&lt;/p&gt;
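
&lt;p&gt;The arithmetic behind these figures is simple enough to check directly. The sketch below uses the token counts quoted above; the function name &lt;code&gt;compression_summary&lt;/code&gt; and the per-million-token price are hypothetical values for illustration, not real pricing.&lt;/p&gt;

```python
def compression_summary(original_tokens, cleaned_tokens, price_per_million_tokens):
    """Summarize token savings from cleaning; the price is a hypothetical input."""
    ratio = cleaned_tokens / original_tokens
    cost_before = original_tokens / 1_000_000 * price_per_million_tokens
    cost_after = cleaned_tokens / 1_000_000 * price_per_million_tokens
    return {
        "ratio": round(ratio, 3),          # share of the original tokens kept
        "cost_before": cost_before,        # input cost per page without cleaning
        "cost_after": cost_after,          # input cost per page after cleaning
    }


if __name__ == "__main__":
    # 9,541 and 1,678 are the token counts quoted in the text; 2.5 is made up.
    summary = compression_summary(9541, 1678, price_per_million_tokens=2.5)
    print(summary["ratio"])  # 0.176
```

&lt;p&gt;Because input cost scales linearly with token count, the cost ratio equals the token ratio regardless of the actual price, which is why the saving holds at any scale.&lt;/p&gt;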

&lt;h3&gt;
  
  
  2.3 LLM Parsing and Schema Validation: From Text to Structured Data
&lt;/h3&gt;

&lt;p&gt;The Markdown text produced by the content cleaning process then enters the LLM parsing stage. The objective here is to generate structured JSON that strictly adheres to a predefined Schema. Depending on the specific scenario, three mainstream technical paths are currently available:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;General large models&lt;/strong&gt; such as GPT-4o: with a 128K context window, this path offers the fastest inference speed and the highest quality score, but at a moderate cost. It is suitable for rapid prototype verification with a limited number of fields and simple formats.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Schema-first specialized models&lt;/strong&gt; such as Schematron-3B, deployed in a compact server-side environment: these models offer medium-high speed and a quality score only marginally behind general large models (by 0.12 points), while reducing costs to the lowest tier. This makes them an optimal choice for large-scale production scenarios.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Multimodal language models&lt;/strong&gt; in hybrid architectures that parse screenshots and HTML simultaneously: this path can handle highly dynamic interactive pages, including infinite scrolling and modal pop-ups, but it comes with medium speed, the highest cost, and a relatively lower quality score. Despite these trade-offs, it is almost the only viable route for complex interactive scenarios.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Regardless of the chosen path, the initially generated structured JSON must undergo three layers of Schema validation (field completeness, type compliance, and format consistency) before being output as the final data. Figure 2-5 illustrates the complete relationship between these three paths and Schema validation from both a process chain and core metrics perspective.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm96xrm2325c9jygvjemf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm96xrm2325c9jygvjemf.png" alt="Figure 2-5: Three Technical Paths of LLM Parsing and Schema Validation Process" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The matrix clearly reveals a counter-intuitive yet crucial engineering reality: the largest model is not always the optimal solution. Schematron-3B, with merely 3 billion parameters, achieves a quality score comparable to that of large models like GPT-4o while substantially reducing costs. When processing scales to one million pages per day, its inference cost is approximately 1/80th of that of large general models, marking a critical transition from “technically feasible” to “commercially profitable.” Although Webscraper+MLLM incurs the highest cost and has a relatively lower quality score, it remains almost the sole feasible option for highly dynamic interactive scenarios. This precisely confirms a fundamental principle: the correctness of technology selection is dictated by scenario constraints, not by absolute metric values.&lt;/p&gt;

&lt;p&gt;Schema validation serves as the final checkpoint to ensure data usability. Among these checks, format consistency is particularly vital for fields such as dates, currencies, and phone numbers. Traditional regular expression solutions demand manual rule creation for each input variant, whereas the LLM’s internalized format conversion capabilities enable standardization with zero rules. In terms of accuracy, the AXE framework has achieved an F1 score of 88.1% on the SWDE dataset. Experience in actual production environments suggests that pursuing 90% automated extraction accuracy combined with a rapid manual review path is a more pragmatic engineering strategy than rigidly aiming for 100% theoretical accuracy at dozens of times the cost. The optimal balance depends on each team’s assessment of “data continuity” and “budget ceiling,” but in most production settings a slightly lower accuracy target backed by manual review is the more commercially viable choice.&lt;/p&gt;
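&lt;p&gt;The three validation layers described in this section can be sketched in a few lines of Python. This is a minimal illustration rather than any framework’s actual implementation: the &lt;code&gt;SCHEMA&lt;/code&gt; fields, the accepted date formats, and the problem labels are all assumptions made for the example.&lt;/p&gt;

```python
import re
from datetime import datetime

# Hypothetical schema for a product record: field name mapped to expected type.
SCHEMA = {"name": str, "price": float, "listed_on": str}
ISO_DATE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def normalize_date(value):
    """Try a few assumed input formats and return an ISO 8601 date, or None."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y", "%B %d, %Y"):
        try:
            return datetime.strptime(value, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    return None


def validate(record):
    """Return a list of problems; an empty list means the record passed."""
    problems = []
    # Layer 1: field completeness.
    for field in SCHEMA:
        if record.get(field) in (None, ""):
            problems.append("missing:" + field)
    # Layer 2: type compliance (only for fields that are present).
    for field, expected in SCHEMA.items():
        value = record.get(field)
        if value not in (None, "") and not isinstance(value, expected):
            problems.append("type:" + field)
    # Layer 3: format consistency (dates must already be ISO 8601).
    date = record.get("listed_on")
    if isinstance(date, str) and date and not ISO_DATE.match(date):
        problems.append("format:listed_on")
    return problems
```

&lt;p&gt;A record that fails only the format layer can first be passed through &lt;code&gt;normalize_date&lt;/code&gt;; values that still cannot be converted are the ones worth routing to manual review.&lt;/p&gt;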

&lt;h2&gt;
  
  
  III. The Triple Gates of AI Data Extraction: Anti-Scraping, CAPTCHA Breakthrough, and Cost Control
&lt;/h2&gt;

&lt;p&gt;In Chapter 2, we thoroughly explored the technical chain of the content processing layer—from HTML cleaning to Schema validation—demonstrating how AI semantic extraction significantly raises the accuracy ceiling. However, as revealed in Figure 2-2 of Section 2.1, the core bottleneck (14%) of the entire pipeline is not within the processing layer, but in the preceding data acquisition layer. If the HTML cannot be obtained, all subsequent intelligent parsing is rendered moot. This chapter will directly address this critical stage that determines “entry qualification.”&lt;/p&gt;

&lt;h3&gt;
  
  
  3.1 Data Acquisition Layer: The Primary Bottleneck of the Pipeline
&lt;/h3&gt;

&lt;p&gt;If content cleaning and LLM parsing address the question of “how to process data,” the data acquisition layer tackles a more fundamental and challenging issue: “can the data be obtained?” In the journey from the URL queue to normal access, the anti-scraping system represents the most unpredictable variable in the entire pipeline.&lt;/p&gt;

&lt;p&gt;Modern anti-scraping systems have evolved into a four-layered defense-in-depth architecture, simultaneously analyzing each request across network, transport, browser, and behavior layers. Figure 3-1 visually expands this layered detection architecture.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5tnqjnlv60e8wrgz3naf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5tnqjnlv60e8wrgz3naf.png" alt="Figure 3-1: Four-Layer Defense-in-Depth Architecture of Modern Anti-Scraping Systems" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Requests sequentially pass through four layers of filtering. The network layer scrutinizes static signals such as IP location, data center affiliation, and missing reverse DNS. The transport layer compares TLS fingerprints. The browser layer captures automation indicators like the &lt;code&gt;navigator.webdriver&lt;/code&gt; property in headless mode, Canvas fingerprints, and WebGL renderer information. The behavior layer analyzes human behavioral characteristics that are difficult to precisely simulate, including mouse trajectories, scrolling patterns, and click intervals. These four layers of signals are cross-validated to form a weighted score, making it challenging to bypass detection.&lt;/p&gt;
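&lt;p&gt;To make the idea of a cross-layer weighted score concrete, here is a small sketch. The weights and thresholds are invented for illustration, since real anti-scraping vendors do not publish theirs; the middle band stands for traffic the passive signals cannot classify with confidence.&lt;/p&gt;

```python
from bisect import bisect_right

# Hypothetical weights per detection layer; real systems tune these privately.
WEIGHTS = {"network": 0.2, "transport": 0.2, "browser": 0.3, "behavior": 0.3}
# Assumed score boundaries: below 0.4 allow, 0.4 up to 0.7 challenge, else block.
THRESHOLDS = [0.4, 0.7]
VERDICTS = ("allow", "captcha", "block")


def risk_score(signals):
    """Combine per-layer suspicion values (0.0 to 1.0) into one weighted score."""
    return sum(WEIGHTS[layer] * signals.get(layer, 0.0) for layer in WEIGHTS)


def decide(signals):
    """Map the weighted score onto a verdict using the threshold bands."""
    return VERDICTS[bisect_right(THRESHOLDS, risk_score(signals))]
```

&lt;p&gt;Because every layer contributes to the same score, a clean result on one layer does not compensate for strong signals on the others, which is why the layers are hard to defeat one at a time.&lt;/p&gt;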

&lt;p&gt;When passive detection alone cannot determine the nature of the traffic, the system deploys a CAPTCHA as its final line of defense. Modern CAPTCHAs are no longer simple distorted-character recognition tasks but intelligent challenge systems driven by risk scores. Table 3-1 compares four mainstream CAPTCHA systems.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;CAPTCHA System&lt;/th&gt;
&lt;th&gt;Interaction Form&lt;/th&gt;
&lt;th&gt;Judgment Mechanism&lt;/th&gt;
&lt;th&gt;AI Decoding Capability/Features&lt;/th&gt;
&lt;th&gt;Threat to Crawlers&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;reCAPTCHA v2&lt;/td&gt;
&lt;td&gt;Click checkbox / Image recognition&lt;/td&gt;
&lt;td&gt;User interaction + AI behavior scoring&lt;/td&gt;
&lt;td&gt;Accuracy 85%–100%&lt;/td&gt;
&lt;td&gt;High, but breakable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;reCAPTCHA v3&lt;/td&gt;
&lt;td&gt;Completely invisible, no visible challenge&lt;/td&gt;
&lt;td&gt;Background continuous behavior scoring&lt;/td&gt;
&lt;td&gt;Cannot be directly “broken,” relies on behavior simulation&lt;/td&gt;
&lt;td&gt;Extremely high, invisible scoring&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cloudflare Turnstile&lt;/td&gt;
&lt;td&gt;Non-interactive verification&lt;/td&gt;
&lt;td&gt;Browser environment consistency check&lt;/td&gt;
&lt;td&gt;Verifies browser integrity&lt;/td&gt;
&lt;td&gt;High, alternative to reCAPTCHA&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AWS WAF CAPTCHA&lt;/td&gt;
&lt;td&gt;Risk-based, configurable challenges&lt;/td&gt;
&lt;td&gt;AWS integrated environment judgment&lt;/td&gt;
&lt;td&gt;Cloud environment specific&lt;/td&gt;
&lt;td&gt;Medium, specific ecosystem&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;CAPTCHA is positioned at the very end of the entire defense chain. Once triggered and left unhandled, all subsequent content cleaning and LLM parsing stages become completely ineffective. This is the fundamental reason why the data acquisition layer is termed the “primary bottleneck of the pipeline”: the anti-scraping mechanism dictates whether data can flow into the system, and it is a variable profoundly influenced by the target website. In an era where AI semantic extraction has significantly enhanced data processing efficiency, the offensive and defensive dynamics on the acquisition side remain the critical factor for engineering success.&lt;/p&gt;

&lt;h3&gt;
  
  
  3.2 Completing the Puzzle: Technical Paths for Modern CAPTCHA Breakthrough
&lt;/h3&gt;

&lt;p&gt;Within the four-layered anti-scraping defense-in-depth system, CAPTCHA presents the final and most formidable obstacle to automated resolution. CAPTCHA recognition solutions, exemplified by CapSolver, play a crucial “fuse-like” role in the entire pipeline. They are strategically embedded between “anti-scraping detection” and “normal access.” When a crawler encounters challenges such as reCAPTCHA v2/v3, Cloudflare Turnstile, or AWS WAF CAPTCHA, the recognition service swiftly processes the challenge and returns a valid Token within seconds, thereby restoring the data flow. Figure 3-2 uses CapSolver as an example to illustrate the intervention point and processing logic of such solutions:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkxygmy6wu482lylcihl1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkxygmy6wu482lylcihl1.png" alt="Figure 3-2: CapSolver Intervention Process in the Pipeline" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Figure 3-2 depicts the operational mechanism of these solutions. If the four-layered defense system does not trigger a CAPTCHA for a scraping request, the request proceeds directly to normal access. If a CAPTCHA challenge is triggered, the crawler submits the CAPTCHA type and parameters to the recognition service, which completes recognition in seconds and returns a valid Token, re-establishing the data flow at the point of interruption. This approach does not replace existing components but functions as a protective fuse, preventing the entire system from failing when an anomaly occurs.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://capsolver.com/?utm_source=dev.to&amp;amp;utm_medium=blog&amp;amp;utm_campaign=structure-data-ai"&gt;CapSolver&lt;/a&gt; is a leading solution in this domain. Similar services, such as 2Captcha and Anti-Captcha, offer comparable capabilities, allowing developers to select the most suitable vendor based on latency requirements, supported CAPTCHA types, and pricing models. This integration fundamentally alters the reliability model of the data acquisition layer. Figure 3-3 uses CapSolver as a case study to quantify the changes in key indicators before and after introducing CAPTCHA recognition:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm72ut2c44de2ka14f6i5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm72ut2c44de2ka14f6i5.png" alt="Figure 3-3: Comparison of Data Acquisition Reliability Before and After Introducing CapSolver" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Without a CAPTCHA handling mechanism, the overall success rate typically fluctuates between 70% and 90%: if the target site deploys CAPTCHA, there is a 10%–30% probability of data flow blockage. In an e-commerce price monitoring system scraping 5,000 product pages per hour, even a 90% success rate means approximately 500 pages of data are lost every hour. Such losses are sufficient to introduce significant biases in price trend analysis and create systemic blind spots in competitor strategies. With a CAPTCHA recognition solution in place, the success rate rises to 95%–99%, which cuts missing pages to between 250 and 50 per hour; the figure of roughly 50 corresponds to the 99% end of that range. The recognition success rate for reCAPTCHA v2/v3 exceeds 99% when parameters are correctly configured. The summary at the bottom of the card in Figure 3-3 highlights these improvements: a 5–29 percentage point increase in success rate and, at the 99% level, a reduction of about 90% in missing pages. In large-scale scenarios, “continuity is business value” is not merely a slogan but an engineering reality validated by these metrics.&lt;/p&gt;
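&lt;p&gt;The page-loss arithmetic in this paragraph is easy to reproduce. The helper names below are made up for the example, and success rates are passed as whole percentages so the integer arithmetic stays exact.&lt;/p&gt;

```python
def missing_pages(pages_per_hour, success_rate_percent):
    """Pages lost per hour at a given success rate (whole-number percent)."""
    return pages_per_hour * (100 - success_rate_percent) // 100


def reduction_percent(before, after):
    """Relative drop in missing pages, as a whole-number percent."""
    return 100 * (before - after) // before


baseline = missing_pages(5000, 90)      # 500 pages lost per hour
with_solver = missing_pages(5000, 99)   # 50 pages lost per hour
print(baseline, with_solver, reduction_percent(baseline, with_solver))
```

&lt;p&gt;At a 95% success rate the same workload still loses 250 pages per hour, so the size of the improvement depends heavily on which end of the 95%–99% range a deployment actually reaches.&lt;/p&gt;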

&lt;p&gt;AI benchmark testing platforms and LLM training data collection scenarios also confront this challenge. Researchers require continuous acquisition of diverse data, and websites hosting this data frequently employ reCAPTCHA to prevent automated access, creating a paradox where “AI research teams are hindered by the very technology they study.” CAPTCHA recognition services provide a programmatic means to address these challenges, ensuring uninterrupted data collection and comprehensive benchmark testing results.&lt;/p&gt;

&lt;p&gt;At the integration level, such solutions can seamlessly collaborate with browser automation frameworks, proxy network services, and low-code automation platforms. Developers simply submit the CAPTCHA type and parameters to the API, and the system returns a Token within seconds. Platforms like n8n offer dedicated nodes, enabling business personnel to configure CAPTCHA recognition directly within workflows without writing code. This allows developers to concentrate on business logic and Schema design, delegating anti-scraping confrontation to specialized tools.&lt;/p&gt;
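&lt;p&gt;The “submit the type and parameters, receive a Token” interaction is typically a submit-then-poll loop. The sketch below shows only that control flow. &lt;code&gt;submit_task&lt;/code&gt; and &lt;code&gt;fetch_result&lt;/code&gt; are hypothetical callables standing in for whichever vendor client is used, and the &lt;code&gt;status&lt;/code&gt; and &lt;code&gt;token&lt;/code&gt; keys are assumed names, not a documented response format.&lt;/p&gt;

```python
import time


def solve_captcha(submit_task, fetch_result, task, poll_interval=1.0, max_polls=30):
    """Submit a task, then poll until a token is ready or the polls run out.

    submit_task and fetch_result are hypothetical wrappers around a vendor API;
    they are injected so this sketch makes no network calls itself.
    """
    task_id = submit_task(task)
    for _ in range(max_polls):
        result = fetch_result(task_id)
        status = result.get("status")
        if status == "ready":
            return result["token"]
        if status == "failed":
            raise RuntimeError("solver reported failure for task " + str(task_id))
        time.sleep(poll_interval)
    raise TimeoutError("no token after " + str(max_polls) + " polls")
```

&lt;p&gt;Bounding the number of polls matters in a pipeline: a task that never resolves should surface as an error the scheduler can retry, not as a worker that hangs indefinitely.&lt;/p&gt;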

&lt;p&gt;From an architectural standpoint, CAPTCHA recognition solutions do not replace any existing components but provide a crucial layer of “availability guarantee” for the entry point of the entire pipeline. When CAPTCHA recognition can be automatically completed in seconds, data acquisition transitions from “intermittent blind spots” to “continuous data supply,” which is a prerequisite for the stable operation of the entire AI data structured extraction chain.&lt;/p&gt;

&lt;h3&gt;
  
  
  3.3 Accuracy and Cost: The Ultimate Trade-off in Engineering Implementation
&lt;/h3&gt;

&lt;p&gt;When deploying AI data structured extraction into a production environment, the ultimate decision variable is often not merely “is the accuracy sufficient?” but rather “can the cost be sustained?” Token consumption lies at the heart of this challenge. A moderately complex product page, even after cleaning, may consume between 8,000 and 15,000 tokens. Based on current mainstream model API pricing, the cost per extraction typically ranges from $0.001 to $0.01. That is almost negligible during the prototype stage, but at one million pages per day it amounts to roughly $30,000 per month at the low end of the range and $300,000 at the high end. At this point, cost control transitions from an optimization goal to a fundamental requirement. Currently, the industry employs three parallel strategies to reduce costs. Figure 3-4 illustrates their positioning and synergistic relationship within the overall parsing chain:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0t210angyn5lcf5zppns.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0t210angyn5lcf5zppns.png" alt="Figure 3-4: Three Cost Control Paths and Tiered Processing Flow" width="800" height="446"&gt;&lt;/a&gt;&lt;/p&gt;
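&lt;p&gt;As a quick check on the scale of the problem, the per-extraction price range quoted above can be projected to a monthly bill. The function name and the 30-day month are assumptions made for this example.&lt;/p&gt;

```python
def monthly_cost_usd(pages_per_day, cost_per_page_usd, days=30):
    """Projected monthly spend for a fixed daily volume and per-page cost."""
    return pages_per_day * cost_per_page_usd * days


# One million pages per day at both ends of the quoted price range.
low = monthly_cost_usd(1_000_000, 0.001)
high = monthly_cost_usd(1_000_000, 0.01)
print(round(low), round(high))
```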

&lt;p&gt;Before the cleaned Markdown enters the parsing stage, path one reduces tokens by 85%–90% through front-end DOM elimination and main content detection. Services like Firecrawl and Jina Reader encapsulate this functionality into an API, obviating the need for developers to build their own cleaning pipelines. Path two replaces general large models with task-specific models, such as Schematron-3B and AXE 0.6B, at the model layer. This approach maintains accuracy while compressing inference costs by 98% and accelerating processing by more than 10 times. Path three utilizes rules or lightweight models for structurally simple pages at the scheduling layer, reserving the full large model for parsing only complex pages. This strategy is particularly effective in scenarios like e-commerce category monitoring, where most pages within the same site exhibit highly consistent structures, and only a few anomalous pages necessitate full model intervention. These three paths are not mutually exclusive but can be synergistically combined: first, compress tokens; then, classify by complexity; and finally, process with a task-matching model. Figure 3-5 further quantifies these three strategies based on core principles, token reduction, representative solutions, and cost reduction magnitude, also incorporating three data quality checks:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fui5pzx6iw3ge56ktpm7f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fui5pzx6iw3ge56ktpm7f.png" alt="Figure 3-5: Comparison of Three Cost Reduction Strategies and Three Data Quality Checks" width="800" height="453"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Preprocessing compression directly reduces input volume by stripping DOM noise, achieving a token reduction of 85%–90%, which corresponds to an 80%–90% cost saving. Specialized small models decrease the cost of single inference by reducing model size, with parameters shrinking from tens of billions to the 0.6B–3B range, resulting in approximately 98% savings in inference costs. Tiered processing optimizes overall efficiency by allocating computing resources differentially, with savings dependent on the proportion of simple pages. These three approaches—“sending less,” “computing less,” and “computing cleverly”—form a comprehensive cost reduction system spanning the input layer, model layer, and scheduling layer.&lt;/p&gt;
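&lt;p&gt;Tiered processing is essentially a routing decision followed by a weighted sum. The sketch below uses invented page attributes (&lt;code&gt;template_known&lt;/code&gt;, &lt;code&gt;dynamic&lt;/code&gt;) and invented per-page prices in micro-dollars; only the ratio between the small and large model, set to 2% to mirror the roughly 98% saving mentioned above, comes from the text.&lt;/p&gt;

```python
from collections import Counter

# Hypothetical per-page costs in micro-dollars (integers keep the sums exact).
COST_MICRO_USD = {"rules": 0, "small_model": 200, "large_model": 10_000}


def route(page):
    """Pick the cheapest tier that can handle the page."""
    if page.get("template_known"):
        return "rules"          # consistent structure: plain rules suffice
    if page.get("dynamic"):
        return "large_model"    # complex or anomalous page: full model
    return "small_model"        # everything else: task-specific small model


def batch_cost(pages):
    """Total cost of a batch once every page is routed to its tier."""
    tiers = Counter(route(page) for page in pages)
    return sum(COST_MICRO_USD[tier] * count for tier, count in tiers.items())
```

&lt;p&gt;With 800 templated pages, 150 ordinary pages, and 50 dynamic pages, this batch costs 530,000 micro-dollars, compared with 10,000,000 if all 1,000 pages went to the large model. The saving therefore tracks the share of simple pages, as the paragraph above notes.&lt;/p&gt;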

&lt;p&gt;The remainder of this section turns to quality assurance. Data quality inspection, often overlooked, is an equally critical aspect of cost control: the expense of rectifying low-quality data that propagates into downstream business processes frequently far exceeds the investment in performing checks at the extraction stage. In a production environment, at least three automated checks should be implemented. Field fill rate checks ensure that required fields in the Schema are not empty, flagging abnormal records for manual review rather than discarding them outright. Numerical range checks validate business rules, such as prices not being negative and inventory remaining within a reasonable range, and reject entries that exceed predefined thresholds. Format consistency checks standardize fields like dates, currencies, and phone numbers; regular expressions and the LLM’s internalized format conversion capabilities complement each other here, automatically processing convertible formats and marking non-convertible ones for manual intervention. Together, these three checks balance cost against quality by diverting abnormal records rather than discarding them, which preserves completeness while preventing data blind spots.&lt;/p&gt;
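&lt;p&gt;The first two checks, together with the “divert rather than discard” policy, might look like the following. The required fields and the stock ceiling are hypothetical values chosen for the example, and format normalization is left out to keep the sketch short.&lt;/p&gt;

```python
REQUIRED = ("name", "price", "stock")  # hypothetical required Schema fields
MAX_STOCK = 100_000                    # hypothetical upper bound for inventory


def quality_gate(record):
    """Return accept, review, or reject for one extracted record."""
    # Fill check: incomplete records are diverted to manual review.
    if any(record.get(field) in (None, "") for field in REQUIRED):
        return "review"
    # Range check: a negative price differs from its absolute value, and
    # stock must be a whole number inside the allowed range.
    price, stock = record["price"], record["stock"]
    if price != abs(price) or stock not in range(0, MAX_STOCK + 1):
        return "reject"
    return "accept"


def fill_rate(records, field):
    """Share of records in which the given field is populated."""
    filled = sum(1 for record in records if record.get(field) not in (None, ""))
    return filled / len(records)
```

&lt;p&gt;Tracking &lt;code&gt;fill_rate&lt;/code&gt; per field over time is a cheap early warning: a sudden drop usually means the target page layout changed, not that the data disappeared.&lt;/p&gt;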

&lt;p&gt;This balanced strategy is also applicable on a broader scale. In practical engineering, pursuing 90% automated extraction accuracy combined with a formalized manual review process is often more commercially viable than striving for 100% theoretical accuracy at a significantly higher implementation cost. The selection of target data storage also depends on downstream usage: for real-time API queries and front-end display, PostgreSQL or MongoDB are suitable choices; for full-text search and log analysis, Elasticsearch is a better match; and for use as an LLM training corpus, structured JSON typically needs to be re-serialized into the format required by the training framework and stored in object storage. The objective is not to pursue a “one-size-fits-all” storage solution but to align the most appropriate engine with data consumption methods and query patterns. This principle underpins all engineering decisions, from token cost to storage selection.&lt;/p&gt;


&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;From raw HTML to structured JSON, the complete chain of AI data extraction can be summarized into five sequential stages: acquisition, cleaning, parsing, validation, and storage. Each stage addresses a specific problem, and the effectiveness of each stage is contingent upon the successful completion of the preceding one.&lt;/p&gt;

&lt;p&gt;Within this chain, the data acquisition layer functions as the “entry point,” determining whether the entire pipeline operates normally or remains completely idle. The four-layered defense-in-depth of modern anti-scraping systems and continuously upgraded CAPTCHA mechanisms render data acquisition the most uncontrollable and highest-risk stage in the entire chain. While content cleaning can compress HTML by over 80%, specialized small models can perform accurate structured extraction in seconds, and Schema validation can ensure the compliance of output formats, the question of “whether data can be stably obtained” becomes the primary determinant of project success.&lt;/p&gt;

&lt;p&gt;This is precisely where &lt;a href="https://www.capsolver.com/blog/about-capsolver" rel="noopener noreferrer"&gt;CapSolver’s infrastructure-level value&lt;/a&gt; lies within the AI data extraction technology stack. It does not replace any stage in cleaning, parsing, or validation but provides a layer of continuous availability guarantee at the pipeline’s entry point. When CAPTCHA recognition can be automatically completed in seconds, with a success rate consistently above 99%, data acquisition transitions from intermittent interruptions to continuous output. This ensures that the computing resources and engineering investment of all subsequent stages yield meaningful returns. For businesses reliant on a stable data supply, the continuity of the pipeline itself represents business value, and ensuring this continuity is the final hurdle that AI data extraction must overcome in its journey from experimental concept to large-scale deployment.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>data</category>
    </item>
    <item>
      <title>Efficient Price Monitoring on AWS WAF-Protected Sites with n8n and CapSolver</title>
      <dc:creator>luisgustvo</dc:creator>
      <pubDate>Thu, 30 Apr 2026 07:53:14 +0000</pubDate>
      <link>https://dev.to/luisgustvo/efficient-price-monitoring-on-aws-waf-protected-sites-with-n8n-and-capsolver-3m3l</link>
      <guid>https://dev.to/luisgustvo/efficient-price-monitoring-on-aws-waf-protected-sites-with-n8n-and-capsolver-3m3l</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fixjbe9ydzf8f83rmqob8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fixjbe9ydzf8f83rmqob8.png" alt="n8n CapSolver AWS WAF price monitoring tutorial cover" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;In today's data-driven landscape, monitoring product prices is crucial for various business intelligence activities, including market research, competitive analysis, and identifying lucrative deals. However, a significant hurdle arises when target websites employ advanced security measures like AWS Web Application Firewall (WAF) to prevent automated access. AWS WAF, as detailed in its official documentation, acts as a protective layer, filtering HTTP and HTTPS requests to safeguard web applications [1]. This often means that standard HTTP requests from automation tools are blocked before they can even access the desired product information.&lt;/p&gt;

&lt;p&gt;CapSolver offers an elegant solution to this challenge with its n8n workflow template: "Monitor AWS WAF-protected product prices with CapSolver, schedule, and webhook." This template builds upon the foundation of solving AWS WAF challenges, as previously outlined in "How to Solve AWS WAF in n8n with CapSolver" [2], and extends it into a practical, reusable monitoring system. The workflow is designed to automatically solve AWS WAF, retrieve the protected product page, extract relevant product details, compare the latest price against historical data, and issue alerts only when a change is detected.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The template streamlines the monitoring process: it triggers, bypasses AWS WAF, fetches the product page, extracts data, compares it with previous results, and alerts exclusively upon detecting a change.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://n8n.io/workflows/14516-monitor-aws-waf-protected-product-prices-with-capsolver-schedule-and-webhook/" rel="noopener noreferrer"&gt;Access the n8n Workflow Template Here&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuuwkdktvf8pn36yy5kvb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuuwkdktvf8pn36yy5kvb.png" alt="AWS WAF monitor price n8n template " width="800" height="413"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Challenge of AWS WAF in Price Monitoring
&lt;/h2&gt;

&lt;p&gt;AWS WAF often presents a more complex barrier than traditional CAPTCHA systems. Instead of visible challenges like checkboxes or image puzzles, it frequently relies on invisible, cookie-based verification. This means that an automated workflow must first acquire a valid &lt;code&gt;aws-waf-token&lt;/code&gt; cookie and then include this cookie in the &lt;code&gt;Cookie&lt;/code&gt; HTTP header when making subsequent requests to the protected page. For those new to this integration pattern, the CapSolver n8n CAPTCHA solver integration provides valuable context on how CapSolver integrates with n8n workflows [3].&lt;/p&gt;

&lt;p&gt;For effective price monitoring, understanding this mechanism is critical. A simple GET request to a product page will likely result in a WAF challenge page rather than the actual product HTML. To reliably extract pricing information, the automation must first successfully navigate the AWS WAF challenge and then utilize the obtained cookie for the target page request.&lt;/p&gt;
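&lt;p&gt;In code, the second half of that flow is simply a request that carries the solved cookie. The sketch below builds such a request with the Python standard library without sending it; the function name and the example URL are placeholders, and obtaining the token value is assumed to have happened in an earlier step.&lt;/p&gt;

```python
from urllib.request import Request


def build_protected_request(url, waf_token):
    """Attach the solved aws-waf-token cookie to a GET request (not sent here)."""
    return Request(url, headers={"Cookie": "aws-waf-token=" + waf_token})


# Placeholder URL and token, for illustration only.
request = build_protected_request("https://example.com/product/1", "abc123")
print(request.get_method(), request.get_header("Cookie"))
```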

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Challenge&lt;/th&gt;
&lt;th&gt;Impact on Price Monitoring&lt;/th&gt;
&lt;th&gt;CapSolver + n8n Solution&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Invisible AWS WAF challenge&lt;/td&gt;
&lt;td&gt;Direct HTTP requests may not return the product page.&lt;/td&gt;
&lt;td&gt;The CapSolver AWS WAF node resolves the challenge before fetching the page.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cookie-based access&lt;/td&gt;
&lt;td&gt;AWS WAF uses an &lt;code&gt;aws-waf-token&lt;/code&gt; cookie, not a form token.&lt;/td&gt;
&lt;td&gt;The workflow transmits the solved cookie via the &lt;code&gt;Cookie&lt;/code&gt; HTTP header.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Need for repeated checks&lt;/td&gt;
&lt;td&gt;Price tracking requires continuous, scheduled monitoring.&lt;/td&gt;
&lt;td&gt;The template incorporates a scheduled trigger for regular checks (e.g., every six hours).&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;On-demand monitoring&lt;/td&gt;
&lt;td&gt;Teams may need to initiate price checks from other applications.&lt;/td&gt;
&lt;td&gt;The template also supports webhook-based execution for immediate checks.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Change detection&lt;/td&gt;
&lt;td&gt;Raw scraping data is insufficient; users need to know what has changed.&lt;/td&gt;
&lt;td&gt;The workflow compares current and previous values to generate alerts only when changes occur.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Deconstructing the CapSolver n8n Template
&lt;/h2&gt;

&lt;p&gt;The CapSolver template, available in the n8n workflow library under the Market Research category, is a comprehensive solution developed by CapSolver. It seamlessly integrates scheduling, webhook execution, AWS WAF solving, HTML data extraction, stateful comparison, and conditional alert generation into a single, customizable workflow. This design aligns perfectly with n8n's philosophy of connecting nodes to automate processes, as described in the official n8n workflows documentation [4].&lt;/p&gt;

&lt;p&gt;At its core, the workflow initiates either at predefined intervals or in response to a webhook request. It then leverages CapSolver to overcome the AWS WAF challenge, proceeds to retrieve the protected product page, extracts the product price and name from the HTML content, compares these new values against data from the previous execution, and finally, logs or returns the result based on the trigger mechanism. For broader web scraping applications utilizing a no-code automation approach, "How to Build Scrapers for Web Scraping in n8n with CapSolver" offers further insights [5].&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Workflow Stage&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;th&gt;Key n8n Nodes or Concepts&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Trigger&lt;/td&gt;
&lt;td&gt;Initiates monitoring automatically or on demand.&lt;/td&gt;
&lt;td&gt;Schedule Trigger and Webhook&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Solve AWS WAF&lt;/td&gt;
&lt;td&gt;Obtains the necessary AWS WAF cookie for page access.&lt;/td&gt;
&lt;td&gt;CapSolver AWS WAF node&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fetch Product Page&lt;/td&gt;
&lt;td&gt;Requests the protected page using the acquired cookie.&lt;/td&gt;
&lt;td&gt;HTTP Request&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Extract Product Data&lt;/td&gt;
&lt;td&gt;Parses price and product name from the HTML.&lt;/td&gt;
&lt;td&gt;HTML extraction with CSS selectors&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Compare Data&lt;/td&gt;
&lt;td&gt;Determines if the latest price differs from the stored previous value.&lt;/td&gt;
&lt;td&gt;Code and workflow static data&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Route Result&lt;/td&gt;
&lt;td&gt;Decides whether to generate an alert or log no change.&lt;/td&gt;
&lt;td&gt;If and Edit Fields / Set&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Respond&lt;/td&gt;
&lt;td&gt;Provides structured results for webhook-triggered executions.&lt;/td&gt;
&lt;td&gt;Respond to Webhook&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Flexible Execution: Schedule and Webhook Triggers
&lt;/h2&gt;

&lt;p&gt;The template supports both scheduled monitoring and on-demand, webhook-based execution. The scheduled path suits continuous price tracking, since it runs regular checks without manual intervention. The template's setup instructions, for instance, walk through configuring an "Every 6 Hours" node.&lt;/p&gt;

&lt;p&gt;Conversely, the webhook path proves invaluable when an internal tool, dashboard, bot, or backend system needs to trigger an immediate price check. As explained in the official n8n Webhook node documentation, webhooks can receive data from various applications, initiate a workflow, and return the generated output, making them perfect for API-like price verification [6].&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Trigger Type&lt;/th&gt;
&lt;th&gt;Primary Use Case&lt;/th&gt;
&lt;th&gt;Illustrative Example&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Scheduled trigger&lt;/td&gt;
&lt;td&gt;Continuous market research and deal monitoring.&lt;/td&gt;
&lt;td&gt;Automatically check a competitor's product page every six hours and send an alert if the price changes.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Webhook trigger&lt;/td&gt;
&lt;td&gt;On-demand automation and system integrations.&lt;/td&gt;
&lt;td&gt;Allow an internal dashboard to fetch the latest protected product price when a user clicks a "Refresh" button.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
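
&lt;p&gt;As a rough illustration of the webhook path, the sketch below builds the kind of POST request an internal dashboard might send when a user clicks "Refresh". The webhook URL and the &lt;code&gt;url&lt;/code&gt; payload field are hypothetical placeholders, not values taken from the template; copy the real URL from the Webhook node and match the field names to what your workflow reads.&lt;/p&gt;

```python
import json
import urllib.request

# Hypothetical webhook URL; copy the real one from the Webhook node in n8n.
WEBHOOK_URL = "https://n8n.example.com/webhook/price-monitor"


def build_monitor_request(product_url, webhook_url=WEBHOOK_URL):
    """Build (but do not send) a POST request asking the workflow to check one product."""
    # The "url" field name is an assumption; match it to what your workflow expects.
    body = json.dumps({"url": product_url}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_monitor_request("https://shop.example.com/products/123")
print(request.get_method(), request.full_url)

# To actually trigger the workflow, send the request:
# with urllib.request.urlopen(request, timeout=60) as response:
#     result = json.load(response)
```

&lt;p&gt;Building the request separately from sending it keeps the payload easy to inspect and test before the workflow is live.&lt;/p&gt;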

&lt;h2&gt;
  
  
  Understanding the AWS WAF Solving Process
&lt;/h2&gt;

&lt;p&gt;In most AWS WAF workflows, the only required input is the &lt;code&gt;websiteURL&lt;/code&gt;. Unlike reCAPTCHA or Turnstile, AWS WAF typically does not expose a &lt;code&gt;websiteKey&lt;/code&gt; (site key). CapSolver handles the underlying challenge and returns a solution that is then used to request the protected page. For a detailed guide on setting up credentials before using the template, refer to "How to Setup CapSolver on n8n" [7].&lt;/p&gt;

&lt;p&gt;The crucial implementation detail lies in how the solution is submitted. For AWS WAF, the solution is generally not placed into a form field. Instead, it is transmitted as an &lt;code&gt;aws-waf-token&lt;/code&gt; cookie within the &lt;code&gt;Cookie&lt;/code&gt; request header. The fundamental pattern is straightforward: solve the challenge, submit the cookie to the target website, validate the response, and then process the protected data.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Parameter or Output&lt;/th&gt;
&lt;th&gt;Role in the Workflow&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;websiteURL&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The URL of the target page protected by AWS WAF.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;solution.cookie&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The resolved AWS WAF cookie provided by CapSolver.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;Cookie&lt;/code&gt; header&lt;/td&gt;
&lt;td&gt;The appropriate HTTP header for submitting the solved AWS WAF token.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Optional AWS WAF parameters&lt;/td&gt;
&lt;td&gt;Values such as &lt;code&gt;awsKey&lt;/code&gt;, &lt;code&gt;awsIv&lt;/code&gt;, &lt;code&gt;awsContext&lt;/code&gt;, or &lt;code&gt;awsChallengeJS&lt;/code&gt; can enhance solve reliability for specific sites.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
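
&lt;p&gt;Outside n8n, the same solve-then-send pattern can be sketched in a few lines of Python. This is a minimal illustration, not the template's implementation: the task type &lt;code&gt;AntiAwsWafTaskProxyLess&lt;/code&gt; is assumed from CapSolver's public task naming, &lt;code&gt;solution.cookie&lt;/code&gt; is assumed to hold the bare token value, and the example URL and token are placeholders.&lt;/p&gt;

```python
def build_waf_task(website_url):
    """Task payload for an AWS WAF solve; websiteURL is usually the only required input."""
    # "AntiAwsWafTaskProxyLess" is an assumed task type name; check the CapSolver docs.
    return {"type": "AntiAwsWafTaskProxyLess", "websiteURL": website_url}


def build_cookie_header(solution):
    """Turn the solver output into the Cookie header for the follow-up page request."""
    # Assumes solution["cookie"] is the bare token, sent as the aws-waf-token cookie.
    return {"Cookie": "aws-waf-token=" + solution["cookie"]}


task = build_waf_task("https://shop.example.com/products/123")

# With the capsolver SDK installed, the solve step would be roughly:
#   import capsolver
#   capsolver.api_key = "YOUR_API_KEY"
#   solution = capsolver.solve(task)
solution = {"cookie": "EXAMPLE_TOKEN"}  # placeholder standing in for a real solve

headers = build_cookie_header(solution)
print(headers)
```

&lt;p&gt;The resulting &lt;code&gt;headers&lt;/code&gt; dictionary is what the page request would carry, which corresponds to the "Fetch Product Page" step in the workflow.&lt;/p&gt;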

&lt;h2&gt;
  
  
  Extracting Product Prices from Protected Pages
&lt;/h2&gt;

&lt;p&gt;Once the workflow successfully retrieves the protected page, the next step involves extracting specific product information from its HTML content. The reference implementation of this workflow is configured to look for common price and title selectors, such as &lt;code&gt;.product-price&lt;/code&gt;, &lt;code&gt;[data-price]&lt;/code&gt;, &lt;code&gt;.price&lt;/code&gt;, &lt;code&gt;h1&lt;/code&gt;, and &lt;code&gt;.product-title&lt;/code&gt;. This approach is consistent with the official n8n HTML node documentation, which explains its capability to extract content using keys, CSS selectors, and return value settings [8].&lt;/p&gt;

&lt;p&gt;This design makes the workflow highly adaptable. If your target website utilizes a different HTML structure, you can easily update the CSS selectors within the extraction node. For example, one e-commerce site might use &lt;code&gt;.sale-price&lt;/code&gt; for prices, while another might employ &lt;code&gt;[data-testid="price"]&lt;/code&gt;. The MDN CSS selectors guide provides comprehensive information on how selectors target HTML elements by type, attributes, state, and DOM position, underscoring the importance of choosing stable selectors for reliable data extraction [9].&lt;/p&gt;
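
&lt;p&gt;One way to think about multiple candidate selectors is as an ordered fallback list: the first selector that yields a non-empty value wins. The sketch below assumes the extraction step has already produced a dictionary keyed by selector, which is a hypothetical shape chosen for illustration; the ordering of the lists is likewise an assumption.&lt;/p&gt;

```python
# Candidate selectors named in the article, ordered from most to least specific.
PRICE_SELECTORS = [".product-price", "[data-price]", ".price"]
TITLE_SELECTORS = [".product-title", "h1"]


def first_match(extracted, selectors):
    """Return the first non-empty value among the selectors, or None if all are empty."""
    for selector in selectors:
        value = (extracted.get(selector) or "").strip()
        if value:
            return value
    return None


# Hypothetical output of the extraction step, keyed by selector.
extracted = {".product-price": "", ".price": " $19.99 ", "h1": "Example Widget"}

print(first_match(extracted, PRICE_SELECTORS))
print(first_match(extracted, TITLE_SELECTORS))
```

&lt;p&gt;Returning &lt;code&gt;None&lt;/code&gt; when nothing matches makes a broken selector visible to later steps instead of silently passing an empty price along.&lt;/p&gt;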

&lt;h2&gt;
  
  
  Detecting Price Changes with Persistent Workflow Data
&lt;/h2&gt;

&lt;p&gt;A price tracker must retain the previous reading to compare it against the current one. This workflow uses n8n's persistent workflow state for that purpose. In the reference workflow, &lt;code&gt;$workflow.staticData.lastPrice&lt;/code&gt; preserves the previous value across executions, so the workflow can tell whether the price has changed. Note that n8n saves static data only for active workflows started by a trigger or webhook, not during manual test executions, so the baseline will not persist while you are testing in the editor.&lt;/p&gt;

&lt;p&gt;This mechanism allows the workflow to differentiate between a first check (no prior data), an unchanged price, a price drop, and a price increase. A significant price drop can be flagged with a higher "deal" severity, while an increase might be categorized as informational for market tracking purposes.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Result&lt;/th&gt;
&lt;th&gt;Interpretation&lt;/th&gt;
&lt;th&gt;Potential Action&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;First check&lt;/td&gt;
&lt;td&gt;No historical price data available.&lt;/td&gt;
&lt;td&gt;Store the current price and establish a baseline.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Unchanged&lt;/td&gt;
&lt;td&gt;Current and previous prices are identical.&lt;/td&gt;
&lt;td&gt;Log "no change" to prevent unnecessary alerts.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Price dropped&lt;/td&gt;
&lt;td&gt;Current price is lower than the previous price.&lt;/td&gt;
&lt;td&gt;Trigger a high-priority deal alert.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Price increased&lt;/td&gt;
&lt;td&gt;Current price is higher than the previous price.&lt;/td&gt;
&lt;td&gt;Send an informational alert for market analysis.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
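
&lt;p&gt;The four outcomes in the table can be sketched as a small comparison function. This is an illustrative Python version, not the template's Code node: a plain dictionary stands in for the workflow's static data, and the result labels are hypothetical names.&lt;/p&gt;

```python
from decimal import Decimal

# Maps the comparison result (-1, 0, or 1) to a hypothetical result label.
LABELS = {-1: "price_dropped", 0: "unchanged", 1: "price_increased"}


def classify_price(static_data, current):
    """Compare current with the stored price, then store current for the next run."""
    previous = static_data.get("lastPrice")
    static_data["lastPrice"] = current
    if previous is None:
        return "first_check"
    # Decimal.compare returns -1, 0, or 1 depending on how current relates to previous.
    return LABELS[int(current.compare(previous))]


state = {}  # plays the role of the workflow's persistent static data
for price in ["24.99", "24.99", "19.99", "21.50"]:
    print(price, classify_price(state, Decimal(price)))
```

&lt;p&gt;Using &lt;code&gt;Decimal&lt;/code&gt; rather than floating point avoids rounding noise being reported as a price change.&lt;/p&gt;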

&lt;h2&gt;
  
  
  Setup Checklist
&lt;/h2&gt;

&lt;p&gt;Before deploying this template, you will need an active n8n instance and a CapSolver account. CapSolver is available as an n8n integration, allowing users to create and reuse a CapSolver API credential across multiple workflows.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Exclusive Offer: Use code &lt;code&gt;DEVTO24&lt;/code&gt; when signing up at &lt;a href="https://dashboard.capsolver.com/dashboard/overview/?utm_source=devto&amp;amp;utm_medium=article&amp;amp;utm_campaign=n8n-waf-monitor" rel="noopener noreferrer"&gt;CapSolver&lt;/a&gt; to receive bonus credits!&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiwceli7yd16lhnijxdb6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiwceli7yd16lhnijxdb6.png" alt="Bonus Code" width="472" height="140"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Step&lt;/th&gt;
&lt;th&gt;Configuration Detail&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Add CapSolver credentials in n8n&lt;/td&gt;
&lt;td&gt;Create a CapSolver API credential and input your API key.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Configure the schedule&lt;/td&gt;
&lt;td&gt;Adjust the "Every 6 Hours" node to your desired monitoring interval.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Set the target product URL&lt;/td&gt;
&lt;td&gt;Replace the placeholder product page URL in the "Fetch Product Page" nodes.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Verify extraction selectors&lt;/td&gt;
&lt;td&gt;Update CSS selectors for price and product name based on the target page's HTML structure.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Configure the webhook&lt;/td&gt;
&lt;td&gt;Set up the "Receive Monitor Request" node if on-demand checks are required.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Test the workflow&lt;/td&gt;
&lt;td&gt;Confirm that the AWS WAF cookie is accepted and extracted prices are accurate.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Customization and Expansion Opportunities
&lt;/h2&gt;

&lt;p&gt;The default workflow focuses on extracting product price and name, but its underlying pattern is highly extensible for broader market research needs. You can easily expand its capabilities to extract additional data points such as availability, discount labels, stock status, shipping information, seller names, review counts, or promotional badges. After extraction, n8n's versatility allows you to route the results to various destinations, including spreadsheets, databases, Slack channels, Telegram bots, email notifications, or internal dashboards. For scenarios involving AI-assisted scraping on protected sites, "How to Scrape CAPTCHA-Protected Sites with n8n, CapSolver, and OpenClaw" serves as a valuable follow-up read [10].&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Customization&lt;/th&gt;
&lt;th&gt;Implementation Approach&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Track multiple fields&lt;/td&gt;
&lt;td&gt;Add more CSS selectors within the HTML extraction step.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Monitor multiple products&lt;/td&gt;
&lt;td&gt;Duplicate the workflow path, utilize a list of URLs, or trigger the workflow with diverse webhook payloads.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Send alerts to team tools&lt;/td&gt;
&lt;td&gt;Integrate Slack, Telegram, Discord, email, or database nodes after the change-detection branch.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Store historical data&lt;/td&gt;
&lt;td&gt;Save each check to Google Sheets, Airtable, Postgres, MySQL, or other storage nodes.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Use optional AWS WAF parameters&lt;/td&gt;
&lt;td&gt;Incorporate parameters like &lt;code&gt;awsContext&lt;/code&gt; or &lt;code&gt;awsChallengeJS&lt;/code&gt; if the target site demands more specific context.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Best Practices for Robust AWS WAF Price Monitoring
&lt;/h2&gt;

&lt;p&gt;To ensure reliable monitoring, begin by testing with a single product page to confirm that the workflow can successfully retrieve the actual product HTML after bypassing AWS WAF. If a challenge page is still returned, verify that the solved cookie is correctly sent in the &lt;code&gt;Cookie&lt;/code&gt; header and that it is used immediately after solving, as challenge cookies can have short expiration times.&lt;/p&gt;

&lt;p&gt;Furthermore, choose CSS selectors that are specific enough to accurately target data but not so fragile that minor page layout changes break the extraction. A general selector like &lt;code&gt;.price&lt;/code&gt; might work on many pages, but a more precise selector can reduce false positives if the page contains multiple price-like elements. For critical product monitoring, it's advisable to store both the raw extracted value and its parsed numeric equivalent, enabling thorough auditing of price changes over time.&lt;/p&gt;
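
&lt;p&gt;Keeping the raw text next to the parsed number can be sketched as follows. The parser is a simplified assumption: it takes the first number in the string and treats commas as thousands separators, so locales that use a decimal comma would need different handling.&lt;/p&gt;

```python
import re
from decimal import Decimal


def parse_price(raw):
    """Return the raw text together with a Decimal parsed from it (None if no number)."""
    # First run of digits, optionally with comma separators and a decimal part.
    match = re.search(r"\d[\d,]*(?:\.\d+)?", raw)
    if match is None:
        return {"raw": raw, "amount": None}
    # Assumes "," is a thousands separator, as in "$1,299.00".
    amount = Decimal(match.group().replace(",", ""))
    return {"raw": raw, "amount": amount}


for text in ["$1,299.00", " 19.99 USD ", "Sold out"]:
    print(parse_price(text))
```

&lt;p&gt;Storing &lt;code&gt;raw&lt;/code&gt; alongside &lt;code&gt;amount&lt;/code&gt; means a suspicious price change can later be traced back to exactly what the page displayed.&lt;/p&gt;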

&lt;p&gt;Finally, always treat this workflow as part of a compliant market research process. Only monitor pages you are authorized to access, and adhere to all relevant terms of service and legal guidelines.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The "Monitor AWS WAF-protected product prices with CapSolver, schedule, and webhook" n8n template offers a robust starting point for e-commerce price monitoring and market research on websites secured by AWS WAF. It effectively combines CapSolver's advanced AWS WAF solving capabilities with n8n's intuitive visual automation features. This synergy empowers teams to fetch protected product pages, extract critical pricing data, track changes over time, and trigger timely alerts, all without the need to develop a complex scraper from scratch.&lt;/p&gt;

&lt;p&gt;For workflows requiring the monitoring of protected product pages, this template provides all the essential components: scheduled checks, webhook execution, AWS WAF resolution, cookie-based page retrieval, HTML data extraction, persistent data comparison, and structured alerting.&lt;/p&gt;

&lt;h2&gt;
  
  
  Frequently Asked Questions
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What is the CapSolver n8n price monitoring template?
&lt;/h3&gt;

&lt;p&gt;This is an n8n workflow template developed by CapSolver designed to monitor product prices on websites protected by AWS WAF. It automates the process of solving AWS WAF challenges, fetching product pages, extracting data, comparing current values against previous ones, and sending alerts when changes are detected.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can this workflow operate autonomously?
&lt;/h3&gt;

&lt;p&gt;Yes, the template is configured for automatic operation. It includes a scheduled trigger, with initial instructions suggesting an "Every 6 Hours" interval, which can be customized to suit specific monitoring frequencies.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is it possible to trigger the workflow on demand?
&lt;/h3&gt;

&lt;p&gt;Yes. The template supports webhook execution, so external applications, dashboards, or services can start a product price check and receive the result in the webhook response.&lt;/p&gt;

&lt;h3&gt;
  
  
  Does AWS WAF typically require a site key?
&lt;/h3&gt;

&lt;p&gt;In most instances, AWS WAF does not require a public site key. The &lt;code&gt;websiteURL&lt;/code&gt; is generally the primary parameter, though optional parameters may be used for specific or complex implementations.&lt;/p&gt;

&lt;h3&gt;
  
  
  How should the AWS WAF token be submitted?
&lt;/h3&gt;

&lt;p&gt;The resolved AWS WAF token should be submitted as a cookie within the &lt;code&gt;Cookie&lt;/code&gt; HTTP header, rather than as a field in a form submission.&lt;/p&gt;

&lt;h3&gt;
  
  
  What are the essential customizations before using the template?
&lt;/h3&gt;

&lt;p&gt;Key customizations include configuring your CapSolver API credentials, adjusting the monitoring schedule, updating the target product URL, refining the CSS selectors for price and product name extraction, and setting up the webhook if on-demand checks are desired.&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt; &lt;a href="https://docs.aws.amazon.com/waf/latest/developerguide/waf-chapter.html" rel="noopener noreferrer"&gt;AWS WAF Documentation&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt; &lt;a href="https://www.capsolver.com/blog/n8n/how-to-solve-aws-waf-captcha-n8n" rel="noopener noreferrer"&gt;How to Solve AWS WAF in n8n with CapSolver&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt; &lt;a href="https://www.capsolver.com/integration/n8n-captcha-solver" rel="noopener noreferrer"&gt;CapSolver n8n CAPTCHA solver integration&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt; &lt;a href="https://docs.n8n.io/workflows" rel="noopener noreferrer"&gt;n8n Workflows Documentation&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt; &lt;a href="https://www.capsolver.com/blog/n8n/how-to-build-scrapers-for-in-n8n-with-capsolver" rel="noopener noreferrer"&gt;How to Build Scrapers for Web Scraping in n8n with CapSolver&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt; &lt;a href="https://docs.n8n.io/integrations/builtin/core-nodes/n8n-nodes-base.html#webhook" rel="noopener noreferrer"&gt;n8n Webhook Node Documentation&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt; &lt;a href="https://www.capsolver.com/blog/n8n/how-to-setup-capsolver-on-n8n" rel="noopener noreferrer"&gt;How to Setup CapSolver on n8n&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt; &lt;a href="https://docs.n8n.io/integrations/builtin/core-nodes/n8n-nodes-base.html" rel="noopener noreferrer"&gt;n8n HTML Node Documentation&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt; &lt;a href="https://developer.mozilla.org/en-US/docs/Web/CSS/CSS_selectors" rel="noopener noreferrer"&gt;MDN CSS Selectors Guide&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.capsolver.com/blog/AI/how-to-scrape-captcha-protected-sites-n8n-capsolver-openclaw" rel="noopener noreferrer"&gt;How to Scrape CAPTCHA-Protected Sites with n8n, CapSolver, and OpenClaw&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>n8n</category>
      <category>ai</category>
    </item>
    <item>
      <title>Best AI for Solving Image Puzzles: Top Tools and Strategies for 2026</title>
      <dc:creator>luisgustvo</dc:creator>
      <pubDate>Wed, 22 Apr 2026 08:34:56 +0000</pubDate>
      <link>https://dev.to/luisgustvo/best-ai-for-solving-image-puzzles-top-tools-and-strategies-for-2026-3k21</link>
      <guid>https://dev.to/luisgustvo/best-ai-for-solving-image-puzzles-top-tools-and-strategies-for-2026-3k21</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbnrvmxzob6e97lze32xo.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbnrvmxzob6e97lze32xo.jpeg" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Executive Summary
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;  The most effective AI solutions for image puzzles integrate advanced computer vision with machine learning to automate complex visual challenges, including sliders, rotations, and object identification.&lt;/li&gt;
&lt;li&gt;  CapSolver emerges as a leading platform, providing specialized APIs such as the Vision Engine and ImageToTextTask, which offer immediate resolution of visual puzzles without the need for continuous polling.&lt;/li&gt;
&lt;li&gt;  The global computer vision market is experiencing significant expansion, with projections indicating a valuation of $58.29 billion by 2030, highlighting the increasing reliance on AI for sophisticated image recognition tasks.&lt;/li&gt;
&lt;li&gt;  Seamless integration of advanced AI for image puzzle solving with automation platforms like n8n enhances workflow efficiency and optimizes data extraction processes.&lt;/li&gt;
&lt;li&gt;  Adherence to ethical guidelines and compliance in the deployment of AI tools is crucial for ensuring sustainable and secure automated operations.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Choosing the &lt;strong&gt;best AI for solving image puzzles&lt;/strong&gt; matters to developers, data analysts, and automation enthusiasts who regularly run into complex visual challenges online. Traditional automation techniques often fall short on slider puzzles, image rotation challenges, and object selection grids. A purpose-built AI solver can cut processing time and improve the accuracy and reliability of automated workflows. This article reviews the tools currently available, with a particular emphasis on &lt;a href="https://dashboard.capsolver.com/dashboard/overview/?utm_source=dev.to&amp;amp;utm_medium=blog&amp;amp;utm_campaign=best-ai-image-puzzels"&gt;CapSolver&lt;/a&gt;'s capabilities. Whether you are automating data collection or building web scrapers, the &lt;strong&gt;best AI for solving image puzzles&lt;/strong&gt; can make those projects more reliable and efficient.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Evolution of Visual Puzzles and AI Solutions
&lt;/h2&gt;

&lt;p&gt;Visual puzzles have undergone a significant transformation, evolving from rudimentary distorted text challenges to highly sophisticated interactive tasks. Contemporary online environments frequently present users with slider puzzles, image rotation assignments, and object selection grids that demand precise spatial awareness and advanced pattern recognition capabilities. As these visual challenges grow in complexity, the technological solutions designed to address them must similarly advance.&lt;/p&gt;

&lt;p&gt;The most effective AI systems for solving image puzzles harness the power of Convolutional Neural Networks (CNNs) and sophisticated machine learning algorithms. These advanced systems meticulously analyze pixel data within images, discerning critical features such as edges, shapes, and spatial relationships. Industry analyses indicate that the &lt;a href="https://www.grandviewresearch.com/industry-analysis/computer-vision-market" rel="noopener noreferrer"&gt;computer vision market is projected to expand at a Compound Annual Growth Rate (CAGR) of 19.8%, reaching an estimated $58.29 billion by 2030&lt;/a&gt; [1]. This substantial growth underscores the increasing demand for robust AI solutions capable of processing and interpreting complex visual data.&lt;/p&gt;

&lt;p&gt;In contrast to generic Optical Character Recognition (OCR) tools, which primarily focus on text extraction, advanced AI for image puzzle solving demonstrates a profound understanding of contextual information. For instance, such AI can accurately compute the exact distance a puzzle piece needs to traverse or the precise rotational angle required to align an image correctly. This level of granular precision distinguishes basic automation from the sophisticated, AI-driven solutions that define the cutting edge of visual puzzle resolution.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why CapSolver Excels in Image Puzzle Resolution
&lt;/h2&gt;

&lt;p&gt;Among AI solutions for image puzzle resolution, CapSolver stands out for its specialized APIs, which are built specifically for visual recognition tasks and designed for speed and accuracy.&lt;/p&gt;

&lt;h3&gt;
  
  
  Vision Engine: A Comprehensive Visual Puzzle Solver
&lt;/h3&gt;

&lt;p&gt;The &lt;a href="https://docs.capsolver.com/en/guide/recognition/VisionEngine/" rel="noopener noreferrer"&gt;Vision Engine&lt;/a&gt; represents CapSolver's flagship offering for addressing interactive visual challenges. It incorporates diverse modules, each specifically designed to tackle distinct puzzle categories:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;slider_1&lt;/strong&gt;: Accurately computes the necessary distance to align a slider puzzle piece with its corresponding background.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;rotate_1 &amp;amp; rotate_2&lt;/strong&gt;: Determines the precise angle required for rotating single or concentric images to their correct orientation.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;shein&lt;/strong&gt;: Identifies bounding boxes for object selection tasks based on specific query parameters.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;ocr_gif&lt;/strong&gt;: Facilitates text extraction from animated GIFs, a capability where conventional OCR methods typically falter.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Because the Vision Engine is a Recognition operation, it returns its result in the response to a single API call. There is no task to poll and no token to wait for, which suits real-time automation scenarios.&lt;/p&gt;

&lt;h3&gt;
  
  
  ImageToTextTask: Advanced Optical Character Recognition
&lt;/h3&gt;

&lt;p&gt;For visual puzzles necessitating text extraction from static images, CapSolver offers the &lt;a href="https://docs.capsolver.com/en/guide/recognition/ImageToTextTask/" rel="noopener noreferrer"&gt;ImageToTextTask&lt;/a&gt; API. This API supports a variety of specialized modules, including a dedicated &lt;code&gt;number&lt;/code&gt; module that achieves over 90% accuracy for numeric captchas. Furthermore, it can concurrently process up to nine images, making it an ideal solution for large-scale data extraction requirements.&lt;/p&gt;

&lt;h3&gt;
  
  
  Comparative Analysis: CapSolver vs. General AI Tools
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;CapSolver Vision Engine&lt;/th&gt;
&lt;th&gt;Generic AI Solvers&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Response Time&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Instant (Single API Call)&lt;/td&gt;
&lt;td&gt;Delayed (Requires Polling)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Specialized Modules&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Yes (Slider, Rotate, Object Selection)&lt;/td&gt;
&lt;td&gt;Limited (Primarily basic OCR)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Integration&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Seamless (REST API, SDKs, n8n)&lt;/td&gt;
&lt;td&gt;Often Complex&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Accuracy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;High (Custom-trained models)&lt;/td&gt;
&lt;td&gt;Variable (Dependent on prompt)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;By leveraging these purpose-built tools, developers can confidently rely on CapSolver as the premier AI solution for integrating image puzzle-solving capabilities into their automation workflows.&lt;/p&gt;

&lt;h2&gt;
  
  
  Integrating Advanced AI for Image Puzzle Solving with n8n
&lt;/h2&gt;

&lt;p&gt;Automation platforms such as n8n offer considerable power and flexibility; however, they frequently encounter limitations when confronted with visual puzzles. The integration of CapSolver with n8n fundamentally transforms these workflows, enabling them to proceed autonomously without requiring manual intervention.&lt;/p&gt;

&lt;p&gt;To use the &lt;strong&gt;best AI for solving image puzzles&lt;/strong&gt; in n8n, add the dedicated CapSolver community node and configure it for the Vision Engine operation. Provide the base64-encoded image and, if applicable, the background image. The node sends this data to CapSolver and receives the solution in the same call, for example the pixel distance for a slider puzzle.&lt;/p&gt;

&lt;p&gt;This integration is comprehensively detailed in CapSolver's guide on &lt;a href="https://www.capsolver.com/blog/n8n/how-to-use-vision-engine-n8n" rel="noopener noreferrer"&gt;how to use Vision Engine in n8n&lt;/a&gt;. By synergizing n8n's intuitive visual workflow builder with CapSolver's advanced AI capabilities, developers can construct resilient scrapers and automated systems that adeptly manage visual interruptions.&lt;/p&gt;
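
&lt;p&gt;The base64 step is easy to get wrong when preparing inputs by hand, so here is a small Python sketch of it. The field names (&lt;code&gt;type&lt;/code&gt;, &lt;code&gt;module&lt;/code&gt;, &lt;code&gt;image&lt;/code&gt;, &lt;code&gt;imageBackground&lt;/code&gt;) follow the SDK example in this article; the helper name and the placeholder bytes are hypothetical.&lt;/p&gt;

```python
import base64


def build_slider_task(piece_bytes, background_bytes):
    """Build a Vision Engine slider task from raw image bytes."""
    def encode(data):
        # The API expects base64 text, so decode the encoded bytes to a str.
        return base64.b64encode(data).decode("ascii")

    return {
        "type": "VisionEngine",
        "module": "slider_1",
        "image": encode(piece_bytes),
        "imageBackground": encode(background_bytes),
    }


# Placeholder bytes; in practice read them from the downloaded image files.
task = build_slider_task(b"piece-bytes", b"background-bytes")
print(task["image"])
```

&lt;p&gt;The same dictionary shape applies whether the values are entered in the n8n node or passed to the Python SDK.&lt;/p&gt;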

&lt;h2&gt;
  
  
  Practical Implementation: Solving Puzzles with CapSolver
&lt;/h2&gt;

&lt;p&gt;Implementing the &lt;strong&gt;best AI for solving image puzzles&lt;/strong&gt; is streamlined through CapSolver's Python SDK. The following reference implementation, based on official CapSolver documentation, illustrates its ease of use:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# pip install --upgrade capsolver
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;capsolver&lt;/span&gt;

&lt;span class="n"&gt;capsolver&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YOUR_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="c1"&gt;# Example: Solving a slider puzzle using Vision Engine
&lt;/span&gt;&lt;span class="n"&gt;solution&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;capsolver&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;solve&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;VisionEngine&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;module&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;slider_1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;base64_encoded_puzzle_piece...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;imageBackground&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;base64_encoded_background...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Slider distance: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;solution&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;distance&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; pixels&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This code snippet demonstrates the straightforward integration of advanced AI for image puzzle solving into Python scripts. The API efficiently handles complex computations, delivering precise, actionable data.&lt;/p&gt;

&lt;blockquote&gt;
&lt;h3&gt;
  
  
  &lt;strong&gt;Unlock Your CapSolver Bonus&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Maximize your automation budget instantly!&lt;br&gt;
Utilize bonus code &lt;strong&gt;CAP26&lt;/strong&gt; during your CapSolver account top-up to receive an additional &lt;strong&gt;5% bonus&lt;/strong&gt; on every recharge—with no limitations.&lt;br&gt;
Redeem your bonus now via your &lt;a href="https://dashboard.capsolver.com/dashboard/overview/?utm_source=dev.to&amp;amp;utm_medium=blog&amp;amp;utm_campaign=best-ai-image-puzzels"&gt;CapSolver Dashboard&lt;/a&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn76e6pzgold776mms5wf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn76e6pzgold776mms5wf.png" width="472" height="140"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Ensuring Compliance and Ethical Automation
&lt;/h2&gt;

&lt;p&gt;When deploying the &lt;strong&gt;best AI for solving image puzzles&lt;/strong&gt;, prioritize regulatory compliance and ethical best practices. Automation should augment productivity, support responsible collection of public data, and streamline legitimate business operations. Developers are responsible for ensuring that their automated systems respect website terms of service and do not unduly burden server resources. CapSolver advocates responsible use of its technology and offers tools intended for efficient, ethical data acquisition. Organizations that uphold these principles can use AI capabilities sustainably. For further background, see this overview of &lt;a href="https://www.capsolver.com/blog/All/ai-powered-image-recognition" rel="noopener noreferrer"&gt;AI-powered image recognition&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Future of AI in Visual Recognition
&lt;/h2&gt;

&lt;p&gt;The technological advancements underpinning the &lt;strong&gt;best AI for solving image puzzles&lt;/strong&gt; are continuously evolving. With the &lt;a href="https://finance.yahoo.com/news/image-recognition-market-forecasts-report-090100922.html" rel="noopener noreferrer"&gt;global AI image recognition market projected to surge from USD 57.36 billion in 2025 to USD 109.23 billion by 2030&lt;/a&gt; [2], the industry anticipates the emergence of even more sophisticated models. Future iterations are expected to deliver enhanced accuracy, accelerated processing speeds, and the capacity to resolve increasingly intricate visual logic puzzles.&lt;/p&gt;
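
&lt;p&gt;For readers who want to translate the projection above into a yearly growth rate, the short sketch below derives the compound annual growth rate implied by the two quoted figures. The function name is illustrative, and the calculation assumes smooth compounding over the five years from 2025 to 2030.&lt;/p&gt;

```python
def implied_cagr(start_value, end_value, years):
    """Return the compound annual growth rate as a fraction."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted above, in USD billions.
rate = implied_cagr(57.36, 109.23, 2030 - 2025)
print(f"Implied CAGR: {rate:.1%}")  # prints "Implied CAGR: 13.7%"
```

&lt;p&gt;The projection therefore corresponds to growth of roughly 13.7% per year.&lt;/p&gt;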

&lt;p&gt;As AI models mature, the disparity between human and machine visual comprehension is poised to diminish further. Platforms like CapSolver are at the vanguard of this evolution, consistently updating their modules to address novel challenges. &lt;a href="https://www.statista.com/outlook/tmo/artificial-intelligence/computer-vision/worldwide" rel="noopener noreferrer"&gt;According to Statista, the computer vision market is forecast to experience substantial growth with a CAGR of 12.6%&lt;/a&gt; [3], underscoring the critical importance of staying abreast of these developments for anyone reliant on automated visual recognition solutions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Identifying the &lt;strong&gt;best AI for solving image puzzles&lt;/strong&gt; is indispensable for contemporary automation and data extraction endeavors. &lt;a href="https://dashboard.capsolver.com/dashboard/overview/?utm_source=dev.to&amp;amp;utm_medium=blog&amp;amp;utm_campaign=best-ai-image-puzzels"&gt;CapSolver&lt;/a&gt; offers the most robust and efficient solutions through its Vision Engine and ImageToTextTask APIs. By providing specialized modules for slider puzzles, rotations, and text recognition, it consistently outperforms generic AI tools in both operational speed and accuracy.&lt;/p&gt;

&lt;p&gt;Integrating these advanced capabilities into platforms like n8n further empowers developers to construct seamless and uninterrupted workflows. As automation projects scale, prioritizing ethical practices and leveraging CapSolver's sophisticated features will be crucial for achieving optimal and sustainable results.&lt;/p&gt;

&lt;h2&gt;
  
  
  Frequently Asked Questions
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What distinguishes CapSolver as the leading AI for solving image puzzles?&lt;/strong&gt;&lt;br&gt;
CapSolver provides dedicated, specialized models, such as the Vision Engine, which instantly compute precise solutions for visual challenges like sliders and rotations. This capability sets it apart from generic OCR tools that are primarily designed for text recognition.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How can image puzzle-solving be integrated into n8n workflows?&lt;/strong&gt;&lt;br&gt;
Integration is achieved by utilizing the CapSolver community node within n8n. This node is configured for the Vision Engine operation, allowing users to send base64-encoded images and receive immediate puzzle solutions, such as pixel distances.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Is the implementation of the CapSolver API in Python complex?&lt;/strong&gt;&lt;br&gt;
No, implementation is straightforward. The official CapSolver Python SDK enables users to solve visual puzzles with minimal lines of code, requiring only the necessary image data and module type.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What types of visual puzzles are solvable by the Vision Engine?&lt;/strong&gt;&lt;br&gt;
The Vision Engine supports a range of modules, including &lt;code&gt;slider_1&lt;/code&gt; for slider puzzles, &lt;code&gt;rotate_1&lt;/code&gt; and &lt;code&gt;rotate_2&lt;/code&gt; for image alignment, &lt;code&gt;shein&lt;/code&gt; for object selection, and &lt;code&gt;ocr_gif&lt;/code&gt; for recognizing text within animated GIFs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is the functional difference between ImageToTextTask and Vision Engine?&lt;/strong&gt;&lt;br&gt;
The ImageToTextTask is specifically engineered for extracting text and numerical data from static images (OCR), whereas the Vision Engine is designed to analyze spatial relationships and logical patterns for interactive visual puzzles.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>image</category>
      <category>challenge</category>
    </item>
    <item>
      <title>How to Bypass Cloudflare Turnstile in Vehicle Data Automation</title>
      <dc:creator>luisgustvo</dc:creator>
      <pubDate>Thu, 16 Apr 2026 06:33:51 +0000</pubDate>
      <link>https://dev.to/luisgustvo/how-to-bypass-cloudflare-turnstile-in-vehicle-data-automation-5a4p</link>
      <guid>https://dev.to/luisgustvo/how-to-bypass-cloudflare-turnstile-in-vehicle-data-automation-5a4p</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwcwn45qfaazgddz6tbyh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwcwn45qfaazgddz6tbyh.png" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Cloudflare Turnstile presents a significant hurdle for automated access to government and vehicle data portals.&lt;/li&gt;
&lt;li&gt;CapSolver offers an AI-powered service to generate valid tokens, bypassing these challenges without manual intervention.&lt;/li&gt;
&lt;li&gt;Seamless integration with automation platforms like n8n facilitates multi-step data scraping and legal data retrieval.&lt;/li&gt;
&lt;li&gt;Utilizing the AntiTurnstileTaskProxyLess task type optimizes cost-efficiency and simplifies technical infrastructure.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.capsolver.com/?utm_source=dev.to&amp;amp;utm_medium=blog&amp;amp;utm_campaign=public-record-data-turnstile"&gt;CapSolver&lt;/a&gt; provides an enterprise-grade solution for stable and compliant high-volume data collection.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Vehicle data and public records portals increasingly sit behind security measures designed to distinguish human users from automated systems. Cloudflare Turnstile, adopted by many websites, runs a non-interactive challenge in the background of the page. For data engineers and legal technology analysts, knowing how to bypass Cloudflare Turnstile within vehicle data and public records automation workflows is crucial for keeping data streams uninterrupted.&lt;/p&gt;

&lt;p&gt;CapSolver delivers a specialized, AI-driven service that automates bypassing these challenges, thereby enabling scripts to execute without interruption. The CapSolver API, complemented by its official n8n integration, stands out as an exceptionally efficient tool for managing extensive public records retrieval while upholding technical stability. This guide aims to elucidate the integration of these solutions into existing workflows, maximizing reliability and cost-effectiveness.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Proliferation of Cloudflare Turnstile in Public Data Portals
&lt;/h2&gt;

&lt;p&gt;Government entities and providers of vehicle history data are increasingly implementing Cloudflare Turnstile as a fundamental component of their security and verification frameworks for public-facing data access. Turnstile employs a combination of browser signals and user interaction patterns to evaluate the legitimacy of requests, offering a more streamlined alternative to conventional CAPTCHA methods that typically rely on visual puzzles.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Challenge Type&lt;/th&gt;
&lt;th&gt;User Interaction&lt;/th&gt;
&lt;th&gt;Detection Method&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Managed&lt;/td&gt;
&lt;td&gt;Checkbox prompt only when Cloudflare deems it necessary&lt;/td&gt;
&lt;td&gt;Browser fingerprinting signals&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Non-Interactive&lt;/td&gt;
&lt;td&gt;Visible widget, no interaction required&lt;/td&gt;
&lt;td&gt;Behavioral and risk-based analysis&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Invisible&lt;/td&gt;
&lt;td&gt;Fully background verification&lt;/td&gt;
&lt;td&gt;Continuous session-based evaluation&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;These operational modes are engineered to function with minimal disruption to end-users, while simultaneously applying varying degrees of risk assessment contingent on the context of the request.&lt;/p&gt;

&lt;p&gt;For a broader understanding of the evolution of automated traffic detection and bot mitigation strategies across diverse industries, refer to &lt;strong&gt;Cybersecurity and Automation Trends – Statista&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;For teams engaged in determining how to manage Turnstile within vehicle data and public records workflows, comprehending these verification modes constitutes a foundational step in developing more dependable and resilient automation systems.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Limitations of Conventional Scraping Against Turnstile
&lt;/h2&gt;

&lt;p&gt;Traditional web scraping techniques often fail against Cloudflare Turnstile because they cannot answer the cryptographic challenges Cloudflare issues. Even advanced headless browsers can be identified and blocked if their signals do not match expected browser behavior. The result is blocked requests, prematurely terminated sessions, and incomplete datasets from vehicle history or court record databases.&lt;/p&gt;

&lt;p&gt;Turnstile is specifically designed to detect indicators of automation, such as the absence of typical browser features, anomalous request headers, or inconsistent timing patterns. Without a specialized bypassing mechanism, automated processes are highly likely to become ensnared in an unending cycle of verification attempts. This underscores the necessity of a professional service to bridge the gap between rudimentary automation efforts and successful data acquisition. More information on overcoming such challenges can be found in this article: &lt;a href="https://www.capsolver.com/blog/Cloudflare/solve-cloudflare-in-2026" rel="noopener noreferrer"&gt;Solving Cloudflare Challenges in 2026&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Automating Solutions with CapSolver API
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.capsolver.com/?utm_source=dev.to&amp;amp;utm_medium=blog&amp;amp;utm_campaign=public-record-data-turnstile"&gt;CapSolver&lt;/a&gt; provides a streamlined API that manages the complexities of bypassing Turnstile. The primary method involves the &lt;code&gt;AntiTurnstileTaskProxyLess&lt;/code&gt; task type, which is both cost-effective and straightforward to implement. By supplying the target &lt;code&gt;websiteURL&lt;/code&gt; and the site's unique &lt;code&gt;websiteKey&lt;/code&gt;, a valid token can be obtained, allowing your scraper to proceed unimpeded.&lt;/p&gt;

&lt;p&gt;This process is designed for speed and reliability. Below is a comprehensive Python example utilizing the &lt;code&gt;requests&lt;/code&gt; library to initiate and monitor a bypassing task:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;

&lt;span class="c1"&gt;# Configuration
&lt;/span&gt;&lt;span class="n"&gt;API_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YOUR_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;WEBSITE_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;0x4XXXXXXXXXXXXXXXXX&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;WEBSITE_URL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://www.yourwebsite.com&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_turnstile_task&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;clientKey&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;task&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AntiTurnstileTaskProxyLess&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;websiteKey&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;WEBSITE_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;websiteURL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;WEBSITE_URL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;metadata&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;login&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;  &lt;span class="c1"&gt;# Optional action parameter
&lt;/span&gt;            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.capsolver.com/createTask&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;raise_for_status&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;taskId&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error creating task: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_task_result&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;clientKey&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;taskId&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;task_id&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.capsolver.com/getTaskResult&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;raise_for_status&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ready&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Task solved successfully!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;solution&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{}).&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;token&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;failed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Task failed to solve.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;

            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Task still processing, waiting 2 seconds...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error getting task result: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;

&lt;span class="c1"&gt;# Main execution
&lt;/span&gt;&lt;span class="n"&gt;task_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_turnstile_task&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;token&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_task_result&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Generated Token: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This script suits developers who prefer custom code for handling Cloudflare Turnstile in vehicle data and public records automation. For JavaScript environments, the following Node.js example implements the same asynchronous workflow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;const axios = require('axios');

const API_KEY = "YOUR_API_KEY";
const WEBSITE_KEY = "0x4XXXXXXXXXXXXXXXXX";
const WEBSITE_URL = "https://www.yourwebsite.com";

async function solveTurnstile() {
    try {
        // Create task
        const taskResponse = await axios.post('https://api.capsolver.com/createTask', {
            clientKey: API_KEY,
            task: {
                type: 'AntiTurnstileTaskProxyLess',
                websiteKey: WEBSITE_KEY,
                websiteURL: WEBSITE_URL
            }
        });

        const taskId = taskResponse.data.taskId;
        console.log(`Task created: ${taskId}`);

        // Poll for result every 2 seconds until the task is ready or has failed
        while (true) {
            const resultResponse = await axios.post('https://api.capsolver.com/getTaskResult', {
                clientKey: API_KEY,
                taskId: taskId
            });

            if (resultResponse.data.status === 'ready') {
                return resultResponse.data.solution.token;
            } else if (resultResponse.data.status === 'failed') {
                throw new Error('Task failed');
            }

            console.log('Waiting for solution...');
            await new Promise(resolve =&amp;gt; setTimeout(resolve, 2000));
        }
    } catch (error) {
        // On any error the function logs it and resolves to undefined
        console.error('Error solving Turnstile:', error.message);
    }
}

solveTurnstile().then(token =&amp;gt; {
    if (token) console.log(`Token: ${token}`);
});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
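
&lt;p&gt;Both examples above poll in an unbounded loop, so a task that never reaches a final status would keep the script waiting indefinitely. The sketch below shows one way to bound the wait. It is an illustration rather than part of the CapSolver API: the function name, the 120-second default timeout, and the injectable &lt;code&gt;clock&lt;/code&gt; and &lt;code&gt;sleep&lt;/code&gt; parameters are assumptions, while the 2-second interval and the &lt;code&gt;status&lt;/code&gt;, &lt;code&gt;solution&lt;/code&gt;, and &lt;code&gt;token&lt;/code&gt; fields mirror the examples.&lt;/p&gt;

```python
import time


def poll_until_ready(fetch_result, timeout=120.0, interval=2.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Call fetch_result() until it reports a final status or time runs out.

    fetch_result is any zero-argument callable returning a dict shaped like
    the getTaskResult responses used above ("status", "solution").
    Returns the token string, or None on failure or timeout.
    """
    deadline = clock() + timeout
    while True:
        data = fetch_result()
        status = data.get("status")
        if status == "ready":
            return data.get("solution", {}).get("token")
        if status == "failed":
            return None
        if clock() + interval > deadline:
            return None  # give up instead of looping forever
        sleep(interval)
```

&lt;p&gt;With the &lt;code&gt;requests&lt;/code&gt; calls from the Python example, &lt;code&gt;fetch_result&lt;/code&gt; could be a small function that posts the &lt;code&gt;getTaskResult&lt;/code&gt; payload and returns &lt;code&gt;response.json()&lt;/code&gt;. Passing the clock and sleep functions as parameters keeps the loop testable without real delays.&lt;/p&gt;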



&lt;h2&gt;
  
  
  CapSolver: An Enterprise-Grade Solution
&lt;/h2&gt;

&lt;p&gt;For large-scale data operations, the consistency and reliability of solutions are paramount. CapSolver functions as an enterprise-level platform, guaranteeing that high-volume data collection remains both stable and technically compliant. In contrast to smaller, less robust services, CapSolver furnishes the necessary infrastructure to manage millions of requests without any degradation in performance. This makes it the preferred option for legal technology firms and insurance providers who cannot tolerate downtime or data loss.&lt;/p&gt;

&lt;p&gt;The platform's AI models undergo continuous updates to effectively address new variations of Turnstile challenges, thereby establishing a future-proof foundation for automation projects. By delegating the complexities of CAPTCHA bypassing to an enterprise-grade service, teams can redirect their focus towards extracting valuable insights from data, rather than expending resources on debugging technical obstacles.&lt;/p&gt;

&lt;h2&gt;
  
  
  Constructing Workflows with n8n and CapSolver
&lt;/h2&gt;

&lt;p&gt;For teams that favor a visual methodology for automation, n8n presents a potent alternative to developing custom scripts. CapSolver is integrated as an official component within n8n, enabling users to effortlessly incorporate a bypasser node directly into their vehicle data scraping workflows. This feature proves particularly advantageous for intricate multi-step processes, such as authenticating into a government portal prior to searching for public records.&lt;/p&gt;

&lt;p&gt;By consulting the guide on &lt;a href="https://www.capsolver.com/blog/n8n/how-to-solve-cloudflare-turnstile-n8n" rel="noopener noreferrer"&gt;how to bypass Cloudflare Turnstile using CapSolver and n8n&lt;/a&gt;, users can construct a reusable bypasser API or embed the bypasser directly into their data collection pipelines. This approach minimizes maintenance time and allows non-technical team members to comprehend and manage the underlying automation logic.&lt;/p&gt;

&lt;h2&gt;
  
  
  Case Study: Automating Accident Report Retrieval
&lt;/h2&gt;

&lt;p&gt;Within the legal and insurance sectors, the retrieval of accident reports constitutes a high-volume operation frequently impeded by Turnstile challenges. These reports are indispensable for processing claims and constructing legal arguments. When these portals deploy Turnstile, manual retrieval processes become a significant bottleneck. By integrating an automated bypasser, legal technology firms can acquire these reports at scale, ensuring that crucial information is accessible promptly upon its publication.&lt;/p&gt;

&lt;p&gt;This automation substantially diminishes the manual workload and enhances the precision of data entry. Furthermore, it guarantees that firms can manage thousands of queries daily without encountering obstructions from security protocols. This serves as a practical illustration of how to effectively manage Cloudflare Turnstile in vehicle data and public records automation to generate tangible business value.&lt;/p&gt;

&lt;h2&gt;
  
  
  Comparative Analysis: CapSolver vs. Traditional Verification Methods
&lt;/h2&gt;

&lt;p&gt;When formulating a strategy for public records automation, it is imperative to evaluate the efficacy of automated bypassers against manual approaches or rudimentary scripting solutions.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;CapSolver AI&lt;/th&gt;
&lt;th&gt;Manual Entry&lt;/th&gt;
&lt;th&gt;Basic Scripting&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Speed&lt;/td&gt;
&lt;td&gt;1–10 Seconds&lt;/td&gt;
&lt;td&gt;1–2 Minutes&lt;/td&gt;
&lt;td&gt;Unpredictable (High Failure Rate)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cost&lt;/td&gt;
&lt;td&gt;Low (Per 1k)&lt;/td&gt;
&lt;td&gt;High (Labor)&lt;/td&gt;
&lt;td&gt;Variable (Maintenance)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Scalability&lt;/td&gt;
&lt;td&gt;Unlimited&lt;/td&gt;
&lt;td&gt;Limited by Staff&lt;/td&gt;
&lt;td&gt;Difficult to Scale&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Accuracy&lt;/td&gt;
&lt;td&gt;99%+&lt;/td&gt;
&lt;td&gt;Human Error Prone&lt;/td&gt;
&lt;td&gt;Low Reliability&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;As illustrated in the table, CapSolver offers an optimal balance of speed and cost-efficiency, rendering it the preferred choice for tasks involving high volumes of data. Further details regarding performance metrics can be found in the &lt;a href="https://www.capsolver.com/blog/All/captcha-solving-api-performance-comparion" rel="noopener noreferrer"&gt;CAPTCHA bypassing API performance comparison&lt;/a&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Utilize code &lt;code&gt;CAP26&lt;/code&gt; upon registration at &lt;a href="https://dashboard.capsolver.com/dashboard/overview/?utm_source=dev.to&amp;amp;utm_medium=blog&amp;amp;utm_campaign="&gt;CapSolver&lt;/a&gt; to receive supplementary credits!&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5hqu2dxuqfjka6qcbky2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5hqu2dxuqfjka6qcbky2.png" width="472" height="140"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Compliance and Ethical Automation in Public Records
&lt;/h2&gt;

&lt;p&gt;Sustaining an effective automation strategy necessitates a strong emphasis on compliance and ethical data collection practices. While CapSolver assists in navigating technical barriers, the responsibility for ensuring that scraping activities adhere to relevant data protection laws rests with the user. This is particularly pertinent when dealing with sensitive legal and vehicle data.&lt;/p&gt;

&lt;p&gt;Employing high-quality proxies and maintaining judicious request rates are considered essential best practices. Such measures mitigate the load on target servers and diminish the probability of an IP address being flagged as suspicious.&lt;/p&gt;
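&lt;p&gt;One way to keep request rates judicious is a small throttle that enforces a minimum gap between consecutive requests. The sketch below is illustrative only: the class name &lt;code&gt;Throttle&lt;/code&gt; and its parameters are assumptions made for this example and are not part of CapSolver or n8n.&lt;/p&gt;

```python
import time
from typing import Callable, Optional


class Throttle:
    """Enforces a minimum interval between consecutive requests."""

    def __init__(
        self,
        min_interval_s: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval_s = min_interval_s
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time of the previous request

    def wait(self) -> float:
        """Blocks until the next request is allowed; returns seconds slept."""
        delay = 0.0
        if self._last is not None:
            remaining = self.min_interval_s - (self._clock() - self._last)
            if remaining > 0:
                delay = remaining
                self._sleep(delay)
        self._last = self._clock()
        return delay
```

&lt;p&gt;Calling &lt;code&gt;wait()&lt;/code&gt; before each request spaces traffic evenly. The clock and sleep functions are injectable so the behavior can be checked without real delays.&lt;/p&gt;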

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Proficiency in managing Cloudflare Turnstile within vehicle data and public records automation is an indispensable capability for any organization driven by data. By strategically utilizing CapSolver’s AI-powered API and its seamless integration with n8n, organizations can effortlessly surmount security obstacles and ensure a consistent influx of high-quality data. This professional methodology guarantees that automation efforts are both efficient and robust.&lt;/p&gt;

&lt;h2&gt;
  
  
  Frequently Asked Questions
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Does bypassing Turnstile necessitate a proxy?
&lt;/h3&gt;

&lt;p&gt;No, the &lt;code&gt;AntiTurnstileTaskProxyLess&lt;/code&gt; task type used by CapSolver for bypassing does not require you to provide your own proxy. This design simplifies the setup process and contributes to reduced infrastructure expenditures.&lt;/p&gt;
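&lt;p&gt;As a rough illustration, a proxyless Turnstile task could be submitted and polled as follows. This is a minimal sketch, not official client code: the endpoint paths and JSON field names reflect our reading of CapSolver's public API documentation and should be verified against the current docs, and the function names, polling interval, and retry limit are assumptions chosen for the example.&lt;/p&gt;

```python
import json
import time
import urllib.request

API_BASE = "https://api.capsolver.com"  # assumed; confirm in the official docs


def http_post(url: str, payload: dict) -> dict:
    """Default transport: POST a JSON body using only the standard library."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def solve_turnstile(client_key, website_url, website_key,
                    post=http_post, sleep=time.sleep,
                    poll_interval_s=2.0, max_polls=30) -> str:
    """Creates a proxyless Turnstile task and polls until a token is ready."""
    created = post(f"{API_BASE}/createTask", {
        "clientKey": client_key,
        "task": {
            "type": "AntiTurnstileTaskProxyLess",  # no proxy field required
            "websiteURL": website_url,
            "websiteKey": website_key,
        },
    })
    if created.get("errorId"):
        raise RuntimeError(f"createTask failed: {created}")

    query = {"clientKey": client_key, "taskId": created["taskId"]}
    for _ in range(max_polls):
        result = post(f"{API_BASE}/getTaskResult", query)
        if result.get("errorId") or result.get("status") == "failed":
            raise RuntimeError(f"task failed: {result}")
        if result.get("status") == "ready":
            return result["solution"]["token"]
        sleep(poll_interval_s)  # still processing; wait before polling again
    raise TimeoutError("no token within the polling budget")
```

&lt;p&gt;The transport is passed in as a parameter so the flow can be exercised with a stub before any real API key is used.&lt;/p&gt;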

&lt;h3&gt;
  
  
  Can CapSolver be integrated with Python-based scrapers?
&lt;/h3&gt;

&lt;p&gt;Absolutely. CapSolver offers a comprehensive SDK and a REST API, facilitating straightforward integration with popular programming languages such as Python, Node.js, and Go.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is n8n better than custom code for bypassing Turnstile in vehicle data automation?
&lt;/h3&gt;

&lt;p&gt;The optimal choice largely depends on the specific skill set of your team. n8n excels in visual workflow management and rapid integration, whereas custom code provides greater flexibility for implementing complex logic.&lt;/p&gt;

&lt;h3&gt;
  
  
  How do I find the Turnstile &lt;code&gt;websiteKey&lt;/code&gt; to bypass it?
&lt;/h3&gt;

&lt;p&gt;You can find the &lt;code&gt;websiteKey&lt;/code&gt; by inspecting the target page’s HTML and looking for the Turnstile widget element, which usually contains a &lt;code&gt;data-sitekey&lt;/code&gt; attribute. Alternatively, the CapSolver browser extension can identify it for you automatically.&lt;/p&gt;
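&lt;p&gt;The manual inspection described above can also be scripted. The following sketch uses Python's standard &lt;code&gt;html.parser&lt;/code&gt; module to collect every &lt;code&gt;data-sitekey&lt;/code&gt; attribute in a page. The helper names are hypothetical, and the sketch assumes the widget is present in the static HTML rather than injected later by JavaScript.&lt;/p&gt;

```python
from html.parser import HTMLParser


class SiteKeyFinder(HTMLParser):
    """Collects every data-sitekey attribute value seen in the markup."""

    def __init__(self) -> None:
        super().__init__()
        self.site_keys: list[str] = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        for name, value in attrs:
            if name == "data-sitekey" and value:
                self.site_keys.append(value)


def find_site_keys(html: str) -> list[str]:
    """Returns all data-sitekey values in document order."""
    finder = SiteKeyFinder()
    finder.feed(html)
    return finder.site_keys
```

&lt;p&gt;Passing the page source to &lt;code&gt;find_site_keys&lt;/code&gt; returns a list of keys, which is empty when no widget element is found in the markup.&lt;/p&gt;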

&lt;h3&gt;
  
  
  What is the success rate for bypassing Turnstile on public record portals?
&lt;/h3&gt;

&lt;p&gt;CapSolver maintains a very high success rate for bypassing Turnstile challenges, often exceeding 99%. This ensures the sustained reliability of your automation, even when targeting highly secure government portals.&lt;/p&gt;

</description>
      <category>turnstile</category>
      <category>data</category>
      <category>ai</category>
    </item>
    <item>
      <title>Agentic RAG: From Smart Q&amp;A to Self-Governing AI Decisions</title>
      <dc:creator>luisgustvo</dc:creator>
      <pubDate>Thu, 09 Apr 2026 07:57:09 +0000</pubDate>
      <link>https://dev.to/luisgustvo/agentic-rag-from-smart-qa-to-self-governing-ai-decisions-181k</link>
      <guid>https://dev.to/luisgustvo/agentic-rag-from-smart-qa-to-self-governing-ai-decisions-181k</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzvzln9bgyutkvrj6jbws.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzvzln9bgyutkvrj6jbws.jpg" alt="What is Agentic RAG?" width="800" height="447"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Consider yourself the chief executive of a major corporation. Your organization possesses a wealth of knowledge—documents, reports, customer insights, and market analyses spanning decades. However, these invaluable assets are often fragmented across disparate systems, leading employees to spend considerable time daily just searching for information. Furthermore, when you query an AI assistant, asking, for instance, "What was our customer satisfaction like in a specific region last quarter?" you might receive either an unhelpful response or fabricated data.&lt;/p&gt;

&lt;p&gt;This fundamental challenge is precisely what Retrieval-Augmented Generation (RAG) technology seeks to address. This piece will explore the three evolutionary stages of RAG—Basic RAG, Graph RAG, and Agentic RAG—illustrating how each functions as a distinct tier of enterprise consultant, progressively elevating AI's intelligence and its contribution to business value.&lt;/p&gt;




&lt;h2&gt;
  
  
  Chapter 1: A Comprehensive Overview of the Three Primary RAG Architectures
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1.1 Basic RAG: The Enterprise's "Intelligent Information Specialist"
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Architectural Diagram:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feon12oggverv6u2w2pkl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feon12oggverv6u2w2pkl.png" alt="Basic RAG Architecture" width="800" height="475"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fundamental Mechanism:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Phase 1:&lt;/strong&gt; You submit a question (Query).&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Phase 2:&lt;/strong&gt; The system retrieves pertinent information from its knowledge repository (Search Relevant Information).&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Phase 3:&lt;/strong&gt; This retrieved content, along with your original question, is then provided to a Large Language Model (LLM).&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Phase 4:&lt;/strong&gt; The LLM subsequently generates an accurate, evidence-backed answer.&lt;/li&gt;
&lt;/ol&gt;
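&lt;p&gt;The four phases above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: production systems typically rank passages with vector embeddings, whereas this example substitutes simple word overlap, and &lt;code&gt;llm&lt;/code&gt; stands in for any text-in, text-out model call. All function names are hypothetical.&lt;/p&gt;

```python
import re
from typing import Callable


def tokenize(text: str) -> set[str]:
    """Lowercases the text and splits it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Phase 2: rank documents by how many query words they share."""
    query_words = tokenize(query)
    scored = [(len(query_words.intersection(tokenize(doc))), doc)
              for doc in documents]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in ranked[:top_k] if score > 0]


def build_prompt(query: str, context: list[str]) -> str:
    """Phase 3: combine retrieved passages with the original question."""
    joined = "\n".join(f"- {passage}" for passage in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


def answer(query: str, documents: list[str], llm: Callable[[str], str]) -> str:
    """Phases 1 to 4: take a query, retrieve, augment, and generate."""
    return llm(build_prompt(query, retrieve(query, documents)))
```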

&lt;p&gt;Basic RAG can be likened to a diligent information specialist. If you inquire about "a company's financial standing," it promptly consults its archives for the latest annual reports, financial statements, and relevant analyses, presenting these materials for your review. It does not invent data but ensures that every piece of information is verifiable. For organizations embarking on this journey, understanding &lt;a href="https://www.capsolver.com/blog/AI/ai-llm-practice?utm_source=dev.to&amp;amp;utm_medium=article&amp;amp;utm_campaign=agentic-rag"&gt;how AI LLM practices&lt;/a&gt; integrate with these retrieval systems marks the initial step towards mitigating hallucinations.&lt;/p&gt;

&lt;h3&gt;
  
  
  1.2 Graph RAG: The Enterprise's "Strategic Insights Analyst"
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Architectural Diagram:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fio4kn3p0jdm66wb4xf1v.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fio4kn3p0jdm66wb4xf1v.png" alt="Graph RAG Architecture" width="720" height="357"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fundamental Mechanism:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Phase 1:&lt;/strong&gt; You pose a question (Query), and the system automatically identifies key entities and their relational intentions (e.g., "competitors," "supply chain," "investment ties").&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Phase 2:&lt;/strong&gt; The system conducts graph traversal retrieval within a knowledge graph, not only locating relevant text but also uncovering multi-hop relationship paths between entities (e.g., A → Supplier → B → Shareholder → C).&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Phase 3:&lt;/strong&gt; The retrieved structured relational evidence (entities + relationships + attributes) is then passed to the LLM alongside the original question, forming a "relationship-enriched context."&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Phase 4:&lt;/strong&gt; The LLM generates an answer grounded in the network logic of these relationships, explaining not just "what" but also "why" and "what else is connected."&lt;/li&gt;
&lt;/ol&gt;
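&lt;p&gt;The multi-hop retrieval in Phase 2 can be illustrated with a breadth-first search over a tiny in-memory graph. This is a sketch only: real Graph RAG systems usually query a graph database, and the graph contents, function names, and hop limit here are assumptions that mirror the A, B, C example above.&lt;/p&gt;

```python
from collections import deque

# Hypothetical toy graph: entity -> list of (relation, neighbour) edges.
GRAPH = {
    "A": [("Supplier", "B")],
    "B": [("Shareholder", "C")],
    "C": [],
}


def find_path(graph, start, goal, max_hops=3):
    """Breadth-first search for the shortest relation path, or None."""
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        # The path alternates entity, relation, entity, so hops = len // 2.
        if len(path) // 2 >= max_hops:
            continue
        for relation, neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, path + [relation, neighbour]))
    return None


def path_to_context(path):
    """Phase 3: flatten a path into text that can be handed to the LLM."""
    return " -> ".join(path)
```

&lt;p&gt;The flattened path becomes part of the relationship-enriched context, so the model sees how the entities are linked and not only that they co-occur.&lt;/p&gt;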

&lt;p&gt;Graph RAG operates much like a strategic insights analyst skilled in understanding complex interconnections. It doesn't merely know "Jack works at Company A"; it comprehends that "Jack is the CTO of Company A, Company A and Company B are rivals, and Company B recently secured investment from Company C." When asked "Who is Jack?", it analyzes the entire relational network to offer profound insights. This progression is part of a broader trend where &lt;a href="https://nstarxinc.com/blog/the-next-frontier-of-rag-how-enterprise-knowledge-systems-will-evolve-2026-2030/" rel="nofollow noopener noreferrer"&gt;&lt;strong&gt;enterprise knowledge systems are evolving&lt;/strong&gt;&lt;/a&gt; to manage intricate, theme-level inquiries.&lt;/p&gt;

&lt;h3&gt;
  
  
  1.3 Agentic RAG: The Enterprise's "Autonomous Project Lead"
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Architectural Principle:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe6k4ait90c2wmda7xhfn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe6k4ait90c2wmda7xhfn.png" alt="Agentic RAG Architecture" width="800" height="442"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Core Mechanism:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Phase 1:&lt;/strong&gt; You present a complex task or question (Prompt + Query). The system not only grasps the intent but also pinpoints the actionable goals to be executed.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Phase 2:&lt;/strong&gt; The system independently devises a task pathway and orchestrates multiple AI agents to invoke tools/data sources (e.g., search, databases, APIs) for dynamic information retrieval.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Phase 3:&lt;/strong&gt; The integrated execution outcomes from various sources (including retrieved content, tool-generated data, and both long-term and short-term memory) are compiled into an augmented context and provided to the LLM.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Phase 4:&lt;/strong&gt; The LLM produces an actionable, iterative final response or an execution plan, capable of self-correction based on feedback (ReAct/CoT).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In contrast to Basic and Graph RAG, Agentic RAG functions more like a highly independent project lead. When you instruct it to "Help me formulate next quarter's marketing strategy," it doesn't just retrieve documents; it:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Self-Plans:&lt;/strong&gt; Breaks down the objective into sub-tasks such as "analyze previous quarter's data → research competitors → define user personas → draft the plan."&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Utilizes Tools:&lt;/strong&gt; Automatically accesses the CRM system, employs data analysis tools, and searches for market reports.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Iteratively Refines:&lt;/strong&gt; Adjusts subsequent steps based on the outcomes of each stage.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Delivers Results:&lt;/strong&gt; Ultimately presents a comprehensive market analysis report and promotional strategy.&lt;/li&gt;
&lt;/ol&gt;
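&lt;p&gt;The plan, tool use, and refinement loop above can be sketched as follows. This is a deliberately simplified illustration: in a real system the plan would come from an LLM, whereas here it is supplied as a fixed list, and the tool registry, retry policy, and function name are all hypothetical.&lt;/p&gt;

```python
from typing import Callable


def run_agent(plan: list[str],
              tools: dict[str, Callable[[dict], str]],
              max_retries: int = 1) -> dict:
    """Runs each planned step with its tool, feeding earlier results forward."""
    memory: dict[str, str] = {}
    for step in plan:
        tool = tools[step]
        attempts = 0
        while True:
            try:
                # Each tool sees the results of all earlier steps.
                memory[step] = tool(memory)
                break
            except RuntimeError as exc:
                attempts += 1
                if attempts > max_retries:
                    # Record the failure so later steps can react to it.
                    memory[step] = f"failed: {exc}"
                    break
    return memory
```

&lt;p&gt;Because every tool receives the accumulated &lt;code&gt;memory&lt;/code&gt;, later steps can adjust to what earlier steps produced, which is the essence of the iterative refinement described above.&lt;/p&gt;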




&lt;h2&gt;
  
  
  Chapter 2: From RAG to Agentic RAG: The Inevitable Progression of Enterprise Intelligence
&lt;/h2&gt;

&lt;h3&gt;
  
  
  2.1 Evolutionary Trajectory: Why RAG Must Advance Towards "Autonomous Agents"
&lt;/h3&gt;

&lt;p&gt;Retrieval-Augmented Generation (RAG) technology emerged to tackle the issues of LLM "hallucinations" and outdated knowledge. Early Basic RAG acted as an efficient information clerk: you ask a question, it searches the knowledge base, and it passes the findings to the LLM. This significantly boosted accuracy, reportedly lowering hallucination risk by over 70% and yielding an ROI of 150%–300%.&lt;/p&gt;

&lt;p&gt;However, as business complexity grew, enterprises ran into Basic RAG's limitation: it could answer "what" but struggled with "why" and "what else." This led to the development of Graph RAG, which layers a knowledge graph on top of vector retrieval to trace multi-hop relationships. This capability supports intricate reasoning tasks such as identifying fraud networks and understanding supply chain risk propagation, increasing relationship-mining depth threefold.&lt;/p&gt;

&lt;p&gt;Yet, Graph RAG remains a passive system—it requires human prompts and only offers analytical conclusions without initiating actions. When businesses desire AI not just to "analyze" but also to "act," Agentic RAG becomes the logical next step. It introduces three fundamental capabilities:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Autonomous Task Decomposition:&lt;/strong&gt; Automatically deconstructs ambiguous, complex objectives into executable sequences of sub-tasks.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;External Tool Integration:&lt;/strong&gt; Connects to external systems like CRM, ERP, BI, web browsers, and APIs via protocols such as MCP to actively fetch data and perform operations.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Dynamic Adaptation:&lt;/strong&gt; Self-corrects strategies based on intermediate results without requiring human intervention.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This evolution from an "information retrieval utility" to a "relational reasoning consultant" and then to an "autonomous action agent" is crucial for developing "digital employees" capable of end-to-end operations. Leading platforms are already identifying the &lt;a href="https://www.capsolver.com/blog/AI/best-ai-agents?utm_source=dev.to&amp;amp;utm_medium=article&amp;amp;utm_campaign=agentic-rag"&gt;most effective AI agents&lt;/a&gt; that can manage these intricate workflows.&lt;/p&gt;

&lt;h3&gt;
  
  
  2.2 Advantages and Disadvantages: Why Agentic RAG is Gaining Prominence
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;Basic RAG&lt;/th&gt;
&lt;th&gt;Graph RAG&lt;/th&gt;
&lt;th&gt;Agentic RAG&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Benefits&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;• Rapid deployment, minimal cost&lt;br&gt;• Substantial reduction in hallucinations&lt;br&gt;• Real-time access to operational data&lt;/td&gt;
&lt;td&gt;• Profound relational reasoning&lt;br&gt;• Uncovers hidden connections (e.g., fraud patterns)&lt;br&gt;• High degree of explainability&lt;/td&gt;
&lt;td&gt;• End-to-end automation, 50–80% labor savings&lt;br&gt;• Integrates CRM/ERP/BI systems&lt;br&gt;• Adapts dynamically to environmental shifts&lt;br&gt;• A single agent can manage numerous tasks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Drawbacks&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;• Incapable of handling multi-hop complex queries&lt;br&gt;• Retrieval quality dependent on vector precision&lt;br&gt;• Lacks action execution capability&lt;/td&gt;
&lt;td&gt;• High expenses for knowledge graph construction/maintenance&lt;br&gt;• Still limited to passive analysis, cannot execute actions&lt;br&gt;• Underutilization of unstructured data&lt;/td&gt;
&lt;td&gt;• High computational demands (+40–80% cost)&lt;br&gt;• Autonomous decisions necessitate human oversight&lt;br&gt;• Longer deployment timeframe (3–6 months)&lt;br&gt;• Must manage tool call exceptions (e.g., CAPTCHAs)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ROI Range&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;150–300%&lt;/td&gt;
&lt;td&gt;200–400%&lt;/td&gt;
&lt;td&gt;300–600%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;While Agentic RAG demands a higher initial investment, its gains in efficiency (over 80% workflow automation) and labor savings significantly surpass those of other RAG forms. It can accomplish tasks that Basic and Graph RAG simply cannot—such as automatically monitoring inventory, generating purchase orders, and adjusting pricing. This "query-to-action" cycle positions it as the most commercially appealing direction, as highlighted in &lt;a href="https://www.impactanalytics.ai/blog/agentic-rag" rel="nofollow noopener noreferrer"&gt;&lt;strong&gt;reports on Agentic RAG's enterprise advantages&lt;/strong&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  2.3 Practical Validation: Why Agentic RAG is the "Most Comprehensive and Applicable" Enterprise AI Solution
&lt;/h3&gt;

&lt;p&gt;Agentic RAG can permeate nearly all enterprise processes that involve "human + system" collaboration—including customer service, internal knowledge management, sales, marketing, financial risk control, and research &amp;amp; development.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Capability Aspect&lt;/th&gt;
&lt;th&gt;Basic RAG&lt;/th&gt;
&lt;th&gt;Graph RAG&lt;/th&gt;
&lt;th&gt;Agentic RAG&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Primary Task Type&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Single-hop Q&amp;amp;A, factual lookup&lt;/td&gt;
&lt;td&gt;Multi-hop reasoning, relationship discovery&lt;/td&gt;
&lt;td&gt;Multi-step, cross-system, closed-loop execution&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Interaction Paradigm&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Passive response&lt;/td&gt;
&lt;td&gt;Passive response&lt;/td&gt;
&lt;td&gt;Active planning + execution&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Data Scope&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Static knowledge bases/documents&lt;/td&gt;
&lt;td&gt;Knowledge graph + documents&lt;/td&gt;
&lt;td&gt;Multi-source heterogeneous systems (real-time)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Automated Tool/API Invocation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Handling Open-Ended Long Workflows&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;Partial (reasoning only)&lt;/td&gt;
&lt;td&gt;✅ (including actions)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Typical Task Completion Rate&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;95%+ (for simple tasks)&lt;/td&gt;
&lt;td&gt;70–85% (for complex reasoning)&lt;/td&gt;
&lt;td&gt;80–95% (for end-to-end complex tasks)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Deployment Duration&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;2–4 weeks&lt;/td&gt;
&lt;td&gt;2–3 months&lt;/td&gt;
&lt;td&gt;3–6 months&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Applicable Scenarios&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;30+&lt;/td&gt;
&lt;td&gt;15–20&lt;/td&gt;
&lt;td&gt;50+ (encompassing almost all business functions)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Agentic RAG integrates retrieval, analysis, and execution into a cohesive business cycle. For instance, starting from a customer inquiry, it can automatically access the knowledge base, diagnose the issue, create a support ticket, update CRM tags, and trigger a personalized resolution. By interfacing with enterprise systems, it achieves multi-system synergy and self-correction based on feedback, elevating AI from a mere "search utility" to a truly executable "intelligent agent."&lt;/p&gt;




&lt;h2&gt;
  
  
  Chapter 3: Overcoming Data Barriers: How Agentic RAG Navigates CAPTCHAs for Global Data Acquisition
&lt;/h2&gt;

&lt;h3&gt;
  
  
  3.1 The Discrepancy Between Ideal and Reality: The Unseen Limit of the MCP Toolchain
&lt;/h3&gt;

&lt;p&gt;Agentic RAG is lauded as the closest manifestation of a "true intelligent agent." However, when this "autonomous project lead" attempts to access web pages via the &lt;a href="https://www.anthropic.com/news/model-context-protocol" rel="nofollow noopener noreferrer"&gt;&lt;strong&gt;Model Context Protocol (MCP)&lt;/strong&gt;&lt;/a&gt; to gather real-time market intelligence or competitor dynamics, a straightforward yet frustrating obstacle emerges: CAPTCHAs.&lt;/p&gt;

&lt;p&gt;Imagine your Agentic RAG system is tasked with "analyzing competitor Q3 financial reports and formulating a response strategy." It confidently plans: Step 1, locate the latest reports; Step 2, scrape the official website; Step 3, cross-reference industry data. Yet, upon accessing the target site through an MCP tool, it's met not with data, but with a silent &lt;a href="https://www.capsolver.com/blog/reCAPTCHA/high-score-recaptcha-v3?utm_source=dev.to&amp;amp;utm_medium=article&amp;amp;utm_campaign=agentic-rag"&gt;reCAPTCHA v3 score&lt;/a&gt; or a &lt;a href="https://www.capsolver.com/blog/Cloudflare/how-to-pass-cloudflare-verifying-you-are-human?utm_source=dev.to&amp;amp;utm_medium=article&amp;amp;utm_campaign=agentic-rag"&gt;Cloudflare Turnstile "Please verify you are human" prompt&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;This represents a universal predicament for Agentic RAG in real-world web environments:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Data Access Obstacles:&lt;/strong&gt; High-value commercial information is frequently protected by CAPTCHAs. CAPTCHAs are designed as "human-machine differentiation tests," and autonomous agents are, by definition, "machines."&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Rate Limiting:&lt;/strong&gt; Frequent access easily triggers anti-scraping mechanisms, often resulting in IP bans.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Diversity of Challenges:&lt;/strong&gt; CAPTCHAs vary from simple text to complex semantic selections. No single strategy can effectively manage all scenarios.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If Agentic RAG cannot overcome this "digital gatekeeper," its capacity for autonomous action will be stalled at the outset, and its reasoning will remain theoretical. This is &lt;a href="https://www.capsolver.com/blog/AI/why-web-automation-keeps-failing-on-captcha?utm_source=dev.to&amp;amp;utm_medium=article&amp;amp;utm_campaign=agentic-rag"&gt;why web automation consistently fails on CAPTCHA&lt;/a&gt; without specialized solutions.&lt;/p&gt;

&lt;h3&gt;
  
  
  3.2 CapSolver: Empowering Autonomous Agents with "Intelligent Access Keys"
&lt;/h3&gt;

&lt;p&gt;How can Agentic RAG efficiently and reliably bypass CAPTCHA hurdles without compromising compliance? The solution lies in integrating specialized CAPTCHA-solving tools like &lt;strong&gt;&lt;a href="https://dashboard.capsolver.com/dashboard/overview/?utm_source=dev.to&amp;amp;utm_medium=article&amp;amp;utm_campaign=agentic-rag"&gt;CapSolver&lt;/a&gt;&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsg3lgs3n4shrhlujhn23.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsg3lgs3n4shrhlujhn23.png" width="800" height="121"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If Agentic RAG is a market researcher, then CapSolver serves as its "passport specialist." Regardless of whether a website employs reCAPTCHA, Cloudflare Turnstile, or AWS WAF, CapSolver can swiftly provide a "passport." It acts as a "locksmith" proficient in all entry systems, capable of:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Identifying Numerous CAPTCHA Variants:&lt;/strong&gt; Including reCAPTCHA v2/v3, AWS WAF, Cloudflare, image selection, slider simulations, and more.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Responding Within Seconds:&lt;/strong&gt; AI models analyze each challenge in real time and return a verification token, typically in under a second for simple challenges and within a few seconds for complex ones.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Delivering a High Success Rate at Low Cost:&lt;/strong&gt; An average success rate exceeding 90%, with costs significantly lower than manual processing.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;With CapSolver integrated into the toolchain, an Agentic RAG MCP tool that encounters a CAPTCHA automatically sends the challenge context to CapSolver, which returns a solution within seconds, allowing the agent to proceed unimpeded.&lt;/p&gt;
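&lt;p&gt;One way to wire this into a tool is a wrapper that catches a challenge, asks a solver for a token, and retries once. The sketch below is hypothetical throughout: &lt;code&gt;CaptchaChallenge&lt;/code&gt;, &lt;code&gt;with_captcha_solver&lt;/code&gt;, and the &lt;code&gt;fetch&lt;/code&gt; and &lt;code&gt;solve&lt;/code&gt; signatures are names invented for illustration and do not come from MCP or CapSolver.&lt;/p&gt;

```python
class CaptchaChallenge(Exception):
    """Raised by a fetch tool when the target page demands verification."""

    def __init__(self, website_url: str, website_key: str) -> None:
        super().__init__(f"CAPTCHA challenge at {website_url}")
        self.website_url = website_url
        self.website_key = website_key


def with_captcha_solver(fetch, solve):
    """Wraps a fetch tool so that a challenge triggers one solve-and-retry."""

    def wrapped(url, token=None):
        try:
            return fetch(url, token)
        except CaptchaChallenge as challenge:
            # Hand the challenge context to the solver, then retry once.
            token = solve(challenge.website_url, challenge.website_key)
            return fetch(url, token)

    return wrapped
```

&lt;p&gt;From the agent's point of view the wrapped tool behaves like the original one, so the planning logic does not need to know that a challenge was ever raised. A second challenge on the retry is not caught and propagates to the caller.&lt;/p&gt;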

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;CapSolver Performance&lt;/th&gt;
&lt;th&gt;Value Proposition for Agentic RAG&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Supported Types&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;reCAPTCHA, Cloudflare, AWS WAF, GeeTest, etc. (20+ types)&lt;/td&gt;
&lt;td&gt;Covers over 95% of prevalent scenarios; eliminates the need for site-specific custom logic.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Accuracy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Overall success rate ≥ 96%&lt;/td&gt;
&lt;td&gt;Task failure rate less than 5%, preventing workflow disruptions.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Response Speed&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Simple: &amp;lt; 1s; reCAPTCHA: &amp;lt; 3s; Complex: 4–6s&lt;/td&gt;
&lt;td&gt;5–10 times faster than manual input, ensuring real-time performance for &lt;a href="https://www.capsolver.com/blog/AI/solving-captchas-for-price-monitoring-ai-agents?utm_source=dev.to&amp;amp;utm_medium=article&amp;amp;utm_campaign=agentic-rag"&gt;AI agents monitoring prices&lt;/a&gt;.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The entire process remains transparent to the higher-level business logic. Agentic RAG maintains its "plan → execute → optimize" cycle as if the CAPTCHA never existed.&lt;/p&gt;

&lt;h3&gt;
  
  
  3.3 Integration Value: Truly Connecting Agentic RAG to Real-World Data
&lt;/h3&gt;

&lt;p&gt;Integrating CapSolver into the Agentic RAG MCP toolchain is more than just a functional addition; it is the crucial infrastructure that enables intelligent agents to operate effectively on the open internet. This integration delivers three core levels of value:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Firstly, a substantial increase in task completion rates.&lt;/strong&gt;&lt;br&gt;
Without CAPTCHA recognition, automation success rates often fall below 60%. With CapSolver, AI agents can access web pages as smoothly as human users, elevating end-to-end success rates to 92%–97%. This is essential for continuous, unattended operation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Secondly, the full realization of real-time data acquisition capabilities.&lt;/strong&gt;&lt;br&gt;
Many applications, such as financial surveillance or competitive price tracking, demand highly current data. Because CapSolver typically returns a solution within seconds, Agentic RAG can obtain the latest information with minimal delay. For corporate decision-making, this translates to data updates in minutes rather than days. Developers can learn more about &lt;a href="https://www.capsolver.com/blog/AI/integrating-capsolver-with-webmcp?utm_source=dev.to&amp;amp;utm_medium=article&amp;amp;utm_campaign=agentic-rag"&gt;integrating CapSolver with WebMCP&lt;/a&gt; to achieve this.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Thirdly, the cost advantage for large-scale automated operations.&lt;/strong&gt;&lt;br&gt;
Manual CAPTCHA resolution typically costs $0.05–$0.20 per instance. CapSolver's automated approach costs approximately $0.0002–$0.002 per solve, roughly 1/250th to 1/100th of the manual cost. In scenarios involving extensive data collection, this difference is substantial, decreasing overall system operational costs by 40%–60%.&lt;/p&gt;
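&lt;p&gt;The cost comparison can be checked with simple arithmetic. The per-solve prices below are the ranges quoted above; the volume of 100,000 solves in the note that follows is an assumed figure chosen only to make the difference concrete.&lt;/p&gt;

```python
# Per-solve price ranges quoted in the text above (USD).
MANUAL_RANGE = (0.05, 0.20)
AUTOMATED_RANGE = (0.0002, 0.002)


def cost_ratio(manual: float, automated: float) -> float:
    """How many times cheaper the automated price is than the manual one."""
    return manual / automated


def total_cost(solves: int, unit_price: float) -> float:
    """Total spend for a given number of solves at a fixed unit price."""
    return solves * unit_price


# Comparing like with like: low end to low end, high end to high end.
low_end = cost_ratio(MANUAL_RANGE[0], AUTOMATED_RANGE[0])    # about 250
high_end = cost_ratio(MANUAL_RANGE[1], AUTOMATED_RANGE[1])   # about 100
```

&lt;p&gt;At the upper prices, 100,000 solves would cost about $20,000 manually versus about $200 automated.&lt;/p&gt;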

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Experience it yourself! Use code &lt;code&gt;CAP26&lt;/code&gt; when registering at &lt;a href="https://dashboard.capsolver.com/dashboard/overview/?utm_source=dev.to&amp;amp;utm_medium=article&amp;amp;utm_campaign=agentic-rag"&gt;CapSolver&lt;/a&gt; to receive bonus credits!&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjsc8s07utx4yajxrcj87.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjsc8s07utx4yajxrcj87.png" width="472" height="140"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In essence, this integration transforms Agentic RAG from a "conceptual agent" into an &lt;strong&gt;enterprise-grade automated data system&lt;/strong&gt; capable of sustained operation in dynamic network environments.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;From Basic RAG to Graph RAG, and ultimately to Agentic RAG, we have observed the evolution of AI in enterprise knowledge management—progressing from a simple query tool to a relational reasoning consultant, and finally to a "digital employee" that can autonomously plan, execute, and iterate. Throughout this journey, Agentic RAG not only integrates diverse data but also leverages &lt;a href="https://dashboard.capsolver.com/dashboard/overview/?utm_source=dev.to&amp;amp;utm_medium=article&amp;amp;utm_campaign=agentic-rag"&gt;CapSolver&lt;/a&gt; to overcome CAPTCHA barriers, providing real-time, comprehensive, and actionable intelligent decision support.&lt;/p&gt;

&lt;p&gt;When AI truly embodies the "understand-execute-self-optimize" loop, enterprises no longer depend solely on manual search and analysis. They gain a 24/7, cost-effective, and highly efficient intelligent assistant that brings knowledge assets to life, fostering business innovation. The synergy of Agentic RAG and CapSolver makes this vision a tangible reality—intelligent agents are becoming a pivotal force for enterprises seeking a competitive edge.&lt;/p&gt;

&lt;h2&gt;
  
  
  Frequently Asked Questions (FAQ)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. What distinguishes Basic RAG from Agentic RAG?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Basic RAG functions as a passive information retrieval system, answering direct questions by locating relevant documents. Agentic RAG, conversely, is an active, autonomous system capable of comprehending complex objectives, breaking them into sequential steps, utilizing various tools (such as web browsers or APIs), and executing a plan from inception to completion, much like a human project manager.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Why is Agentic RAG considered the future of enterprise AI?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Agentic RAG is regarded as the future because it transcends simple data retrieval to achieve end-to-end task automation. It can connect disparate enterprise systems (CRM, ERP, BI), act upon information, and adapt to new circumstances without human intervention. This creates a "digital workforce" capable of managing complex workflows, leading to substantial efficiency gains and cost reductions (50-80% labor savings).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. What is the primary challenge for Agentic RAG in practical applications?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The foremost challenge involves accessing live, real-world data from the internet, as much of it is safeguarded by CAPTCHAs and other anti-bot measures. Without the ability to circumvent these barriers, an Agentic RAG system cannot reliably gather the external information necessary to perform tasks like market analysis, competitor tracking, or price monitoring.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. How does CapSolver assist Agentic RAG?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;CapSolver acts as a specialized tool within the Agentic RAG's toolchain, providing an "intelligent key" to bypass CAPTCHAs. When the AI agent encounters a CAPTCHA, it automatically invokes the CapSolver API to resolve it in real-time. This enables the agent to seamlessly access protected websites, ensuring high task completion rates (over 92%) and facilitating genuine automation on the open internet.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Is Agentic RAG challenging to implement?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Compared to Basic RAG, Agentic RAG is more intricate and has a longer deployment cycle (3–6 months). It demands greater computational resources and meticulous planning for tool integration and human oversight. However, its potential for a significantly higher ROI (up to 600%) and its capacity to automate entire workflows make it a highly valuable long-term investment for enterprises.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>How to Bypass Any CAPTCHA in HyperBrowser Using CapSolver (Comprehensive Setup Guide)</title>
      <dc:creator>luisgustvo</dc:creator>
      <pubDate>Tue, 31 Mar 2026 08:41:31 +0000</pubDate>
      <link>https://dev.to/luisgustvo/how-to-bypass-any-captcha-in-hyperbrowser-using-capsolver-comprehensive-setup-guide-5d3h</link>
      <guid>https://dev.to/luisgustvo/how-to-bypass-any-captcha-in-hyperbrowser-using-capsolver-comprehensive-setup-guide-5d3h</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgz0ditrkj43jkgrp0e07.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgz0ditrkj43jkgrp0e07.png" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;AI-driven browser agents are fundamentally transforming how developers engage with the internet. These agents can navigate web pages, complete forms, and extract data autonomously, for tasks ranging from data scraping to workflow automation. However, the appearance of a CAPTCHA often halts their progress.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;HyperBrowser&lt;/strong&gt; provides cloud-based browser infrastructure specifically engineered for AI agents, offering native CAPTCHA bypassing capabilities for Turnstile and reCAPTCHA. Nevertheless, the internet features a broader spectrum of CAPTCHA types. Challenges such as AWS WAF, GeeTest, various enterprise reCAPTCHA versions, and other anti-bot mechanisms often remain unaddressed by native tools alone.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.capsolver.com/?dev.to_source=official&amp;amp;dev.to_medium=blog&amp;amp;dev.to_campaign=hyperbrowser"&gt;CapSolver&lt;/a&gt;&lt;/strong&gt; bridges this gap. By directly uploading the CapSolver Chrome extension to HyperBrowser via its extension API, users gain extensive CAPTCHA coverage across all sessions, for every CAPTCHA type, and at any scale, without requiring modifications to their existing automation code.&lt;/p&gt;




&lt;h2&gt;
  
  
  Introduction to HyperBrowser
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/hyperbrowserai/HyperAgent" rel="noopener noreferrer"&gt;HyperBrowser&lt;/a&gt; is a cloud browser infrastructure platform specifically designed for AI agents. It delivers managed browser sessions with out-of-the-box native Chrome DevTools Protocol (CDP) access, proxy support, and advanced anti-detection features.&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Features
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cloud Browser Sessions&lt;/strong&gt;: Enables the on-demand creation of isolated browser instances, eliminating the need for local Chrome installations.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Native CDP Access&lt;/strong&gt;: Facilitates direct connection of Playwright, Puppeteer, or Selenium to cloud sessions via WebSocket.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HyperAgent&lt;/strong&gt;: An integrated AI browser automation agent for executing web tasks using natural language.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anti-Detection Capabilities&lt;/strong&gt;: Incorporates stealth profiles, residential proxies, and fingerprint randomization into every session.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Chrome Extension Support&lt;/strong&gt;: Offers a robust extension upload API, allowing users to ZIP an extension, upload it, and attach it to any session.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalable Infrastructure&lt;/strong&gt;: Supports running hundreds of concurrent sessions without the complexities of managing browser pools.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Why Developers Opt for HyperBrowser
&lt;/h3&gt;

&lt;p&gt;HyperBrowser alleviates the operational overhead associated with browser automation. Instead of managing Chromium binaries, configuring headless modes, rotating proxies, and implementing anti-fingerprinting measures, developers receive a streamlined API that provides a WebSocket URL. This allows for immediate automation by connecting existing Playwright or Puppeteer scripts.&lt;/p&gt;




&lt;h2&gt;
  
  
  Introduction to CapSolver
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.capsolver.com/?dev.to_source=official&amp;amp;dev.to_medium=blog&amp;amp;dev.to_campaign=hyperbrowser"&gt;CapSolver&lt;/a&gt; is a leading service for bypassing CAPTCHAs, offering AI-powered solutions to overcome various CAPTCHA challenges. With support for numerous CAPTCHA types and rapid response times, CapSolver integrates seamlessly into automated workflows.&lt;/p&gt;

&lt;h3&gt;
  
  
  Supported CAPTCHA Categories
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.capsolver.com/products/recaptchav2" rel="noopener noreferrer"&gt;&lt;strong&gt;reCAPTCHA v2&lt;/strong&gt;&lt;/a&gt; (including image-based and invisible variants)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.capsolver.com/products/recaptchav3" rel="noopener noreferrer"&gt;&lt;strong&gt;reCAPTCHA v3 &amp;amp; v3 Enterprise&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.capsolver.com/products/cloudflare" rel="noopener noreferrer"&gt;&lt;strong&gt;Cloudflare Turnstile&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.capsolver.com/en/guide/captcha/cloudflare_challenge/" rel="noopener noreferrer"&gt;&lt;strong&gt;Cloudflare 5-second Challenge&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.capsolver.com/products/awswaf" rel="noopener noreferrer"&gt;&lt;strong&gt;AWS WAF CAPTCHA&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.capsolver.com/en/guide/captcha/Geetest/" rel="noopener noreferrer"&gt;&lt;strong&gt;GeeTest v3/v4&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.capsolver.com/en/guide/api-server/" rel="noopener noreferrer"&gt;&lt;strong&gt;Other widely adopted CAPTCHA and anti-bot mechanisms&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;Before initiating the integration setup, ensure the following components are available:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;A HyperBrowser account&lt;/strong&gt; with an associated API key (&lt;a href="https://www.hyperbrowser.ai" rel="noopener noreferrer"&gt;sign up at hyperbrowser.ai&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;A CapSolver account&lt;/strong&gt; with an API key and sufficient credits (&lt;a href="https://www.capsolver.com/?dev.to_source=official&amp;amp;dev.to_medium=blog&amp;amp;dev.to_campaign=hyperbrowser"&gt;sign up here&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;The CapSolver Chrome extension&lt;/strong&gt; downloaded and properly configured.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Node.js 18+&lt;/strong&gt; with &lt;code&gt;@hyperbrowser/sdk&lt;/code&gt; and &lt;code&gt;playwright-core&lt;/code&gt; installed.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @hyperbrowser/sdk playwright-core
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Step-by-Step Configuration
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Acquire Your CapSolver API Key
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt; Register or log in at &lt;a href="https://www.capsolver.com/?dev.to_source=official&amp;amp;dev.to_medium=blog&amp;amp;dev.to_campaign=hyperbrowser"&gt;capsolver.com&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt; Navigate to your Dashboard.&lt;/li&gt;
&lt;li&gt; Copy your API key (it follows the format: &lt;code&gt;CAP-XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt; Add credits to your account (utilize bonus code &lt;strong&gt;HYPERBROWSER&lt;/strong&gt; for an additional 6% on your initial recharge).&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Step 2: Download and Configure the CapSolver Extension
&lt;/h3&gt;

&lt;p&gt;Download the CapSolver Chrome extension and set it up with your API key:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Visit the &lt;a href="https://github.com/capsolver/capsolver-browser-extension/releases" rel="noopener noreferrer"&gt;CapSolver extension releases on GitHub&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt; Download the most recent &lt;code&gt;CapSolver.Browser.Extension-chrome-vX.X.X.zip&lt;/code&gt; file.&lt;/li&gt;
&lt;li&gt; Extract the extension contents:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; capsolver-extension
unzip CapSolver.Browser.Extension-chrome-v&lt;span class="k"&gt;*&lt;/span&gt;.zip &lt;span class="nt"&gt;-d&lt;/span&gt; capsolver-extension/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="4"&gt;
&lt;li&gt; Open &lt;code&gt;capsolver-extension/assets/config.js&lt;/code&gt; and insert your API key:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;defaultConfig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;CAP-XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// your key here&lt;/span&gt;
  &lt;span class="na"&gt;useCapsolver&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="c1"&gt;// ... rest of config&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="5"&gt;
&lt;li&gt; Verify the extension's directory structure:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;ls &lt;/span&gt;capsolver-extension/manifest.json
&lt;span class="c"&gt;# This file should be present&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
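
&lt;p&gt;Before packaging, it can help to check the extension folder automatically. The sketch below is a minimal preflight, assuming the layout described in this step (&lt;code&gt;manifest.json&lt;/code&gt; at the root and the key in &lt;code&gt;assets/config.js&lt;/code&gt;). The function name &lt;code&gt;preflight&lt;/code&gt; and the regular expression are illustrative choices, not part of the CapSolver extension, and the check only confirms the &lt;code&gt;CAP-&lt;/code&gt; prefix rather than validating the key with the service.&lt;/p&gt;

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Returns a list of human-readable problems; an empty list means "looks fine".
function preflight(extensionDir: string): string[] {
  const problems: string[] = [];

  if (!fs.existsSync(path.join(extensionDir, "manifest.json"))) {
    problems.push("manifest.json is missing from the extension root");
  }

  const configPath = path.join(extensionDir, "assets", "config.js");
  if (!fs.existsSync(configPath)) {
    problems.push("assets/config.js is missing");
    return problems;
  }

  // Pull the apiKey string literal out of the config source.
  const match = /apiKey:\s*['"]([^'"]*)['"]/.exec(fs.readFileSync(configPath, "utf8"));
  const key = match ? match[1] : "";

  if (!key.startsWith("CAP-")) {
    problems.push("apiKey does not start with CAP-");
  } else if (/^CAP-X+$/.test(key)) {
    problems.push("apiKey is still the placeholder value");
  }
  return problems;
}

console.log(preflight("capsolver-extension"));
```

&lt;p&gt;Running this before the ZIP step catches the most common mistake, which is uploading an extension that still carries the placeholder key.&lt;/p&gt;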



&lt;h3&gt;
  
  
  Step 3: Compress the Extension Directory into a ZIP File
&lt;/h3&gt;

&lt;p&gt;HyperBrowser's extension upload API mandates a ZIP file. Package the configured extension:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;capsolver-extension &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; zip &lt;span class="nt"&gt;-r&lt;/span&gt; ../capsolver-extension.zip &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;cd&lt;/span&gt; ..
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This action generates &lt;code&gt;capsolver-extension.zip&lt;/code&gt; in your project's root directory, ready for upload.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 4: Upload the Extension to HyperBrowser
&lt;/h3&gt;

&lt;p&gt;Utilize the HyperBrowser SDK to upload the extension ZIP file. This is a one-time operation; the returned &lt;code&gt;extensionId&lt;/code&gt; can be reused across all subsequent sessions.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Hyperbrowser&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@hyperbrowser/sdk&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Hyperbrowser&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;HYPERBROWSER_API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// Upload the CapSolver extension (a one-time operation)&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ext&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;extensions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;filePath&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;capsolver-extension.zip&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Extension ID:&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;ext&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;// Retain this ID for reuse in every session&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Guidance&lt;/strong&gt;: Store the &lt;code&gt;ext.id&lt;/code&gt; in your environment variables or configuration. Re-uploading is only necessary if the extension version or API key is modified.&lt;/p&gt;
&lt;/blockquote&gt;
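
&lt;p&gt;One way to follow this guidance is to cache the returned ID next to a hash of the ZIP file, so that a changed extension version or API key triggers a fresh upload. The sketch below assumes a local JSON cache file; the file name &lt;code&gt;.capsolver-extension.json&lt;/code&gt; and the helper names are hypothetical and are not provided by the HyperBrowser SDK.&lt;/p&gt;

```typescript
import { createHash } from "node:crypto";
import * as fs from "node:fs";

// Hypothetical cache location; any writable path would do.
const CACHE_FILE = ".capsolver-extension.json";

type CacheEntry = { zipSha256: string; extensionId: string };

// Hash the ZIP so that any change to its contents invalidates the cache.
function sha256OfFile(filePath: string): string {
  return createHash("sha256").update(fs.readFileSync(filePath)).digest("hex");
}

// Returns the cached ID only if it was recorded for this exact ZIP.
function readCachedId(zipPath: string, cacheFile = CACHE_FILE): string | undefined {
  if (!fs.existsSync(cacheFile)) return undefined;
  const entry = JSON.parse(fs.readFileSync(cacheFile, "utf8")) as CacheEntry;
  return entry.zipSha256 === sha256OfFile(zipPath) ? entry.extensionId : undefined;
}

// Records the ID returned by the upload together with the ZIP's hash.
function writeCachedId(zipPath: string, extensionId: string, cacheFile = CACHE_FILE): void {
  const entry: CacheEntry = { zipSha256: sha256OfFile(zipPath), extensionId };
  fs.writeFileSync(cacheFile, JSON.stringify(entry, null, 2));
}
```

&lt;p&gt;In practice, a script would call &lt;code&gt;readCachedId&lt;/code&gt; first, upload only when it returns &lt;code&gt;undefined&lt;/code&gt;, and then pass the new ID to &lt;code&gt;writeCachedId&lt;/code&gt;.&lt;/p&gt;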

&lt;h3&gt;
  
  
  Step 5: Establish a Session with the Extension Enabled
&lt;/h3&gt;

&lt;p&gt;Create a HyperBrowser session that incorporates the CapSolver extension:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sessions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;extensionIds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;ext&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;useProxy&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// Requires a paid plan — omit for the free tier&lt;/span&gt;
  &lt;span class="na"&gt;bypassCaptchas&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// Utilizing CapSolver instead of native bypassing&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Session ID:&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;WebSocket URL:&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;wsEndpoint&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: Set &lt;code&gt;bypassCaptchas: false&lt;/code&gt; when using CapSolver to prevent conflicts between the two bypassing mechanisms. For a fallback chain, refer to the "When to Use Native vs CapSolver" section below.&lt;/p&gt;
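
&lt;p&gt;Because &lt;code&gt;useProxy&lt;/code&gt; should be left out on the free tier, the session options can be built conditionally. The helper below is a minimal sketch that uses only the option names shown above; &lt;code&gt;buildSessionOptions&lt;/code&gt; and the &lt;code&gt;onPaidPlan&lt;/code&gt; flag are hypothetical names introduced for illustration.&lt;/p&gt;

```typescript
// Shape of the options used in this guide; the SDK may accept more fields.
type SessionOptions = {
  extensionIds: string[];
  bypassCaptchas: boolean;
  useProxy?: boolean;
};

function buildSessionOptions(extensionId: string, onPaidPlan: boolean): SessionOptions {
  const options: SessionOptions = {
    extensionIds: [extensionId],
    bypassCaptchas: false, // leave solving to the CapSolver extension
  };
  if (onPaidPlan) {
    options.useProxy = true; // omitted entirely on the free tier
  }
  return options;
}

console.log(buildSessionOptions("your-extension-id", false));
```

&lt;p&gt;The result can be passed directly to &lt;code&gt;client.sessions.create&lt;/code&gt;, which keeps the plan-specific decision in one place.&lt;/p&gt;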

&lt;h3&gt;
  
  
  Step 6: Integrate Playwright with the Session
&lt;/h3&gt;

&lt;p&gt;Connect Playwright to the HyperBrowser session via its WebSocket endpoint:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;chromium&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;playwright-core&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;chromium&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connectOverCDP&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;wsEndpoint&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;contexts&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;pages&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;newPage&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// Navigate to a CAPTCHA-protected web page&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://www.google.com/recaptcha/api2/demo&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Allow time for the CapSolver extension to detect and bypass the CAPTCHA&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;waitForTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;30000&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Submit the form&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;click&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;#recaptcha-demo-submit&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;waitForLoadState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;networkidle&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Confirm successful bypass&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;textContent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;body&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Result:&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;// Expected outcome: the body text should contain "Verification Success"&lt;/span&gt;

&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sessions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
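
&lt;p&gt;The fixed 30-second pause above is simple but always waits the full duration. As an alternative, the sketch below waits only until a token appears, using Playwright's &lt;code&gt;page.waitForFunction&lt;/code&gt;. It assumes that the solver writes its token into the standard reCAPTCHA v2 &lt;code&gt;g-recaptcha-response&lt;/code&gt; textarea, which should be verified for your target page. &lt;code&gt;PageLike&lt;/code&gt; is a hypothetical stand-in type so that the example has no dependencies; a Playwright &lt;code&gt;Page&lt;/code&gt; is expected to fit it.&lt;/p&gt;

```typescript
// Minimal structural type covering the one Playwright method used here.
type PageLike = {
  waitForFunction: (
    expression: string,
    arg?: unknown,
    options?: { timeout?: number; polling?: number },
  ) => any;
};

// Evaluated in the browser. Assumption: the token lands in this textarea.
const TOKEN_READY =
  "(document.querySelector('textarea[name=g-recaptcha-response]')?.value ?? '').length > 0";

// Resolves once the token is present; rejects if timeoutMs elapses first.
async function waitForRecaptchaToken(page: PageLike, timeoutMs = 60000) {
  await page.waitForFunction(TOKEN_READY, undefined, {
    timeout: timeoutMs,
    polling: 500, // re-check twice per second
  });
}
```

&lt;p&gt;With this helper, &lt;code&gt;await waitForRecaptchaToken(page)&lt;/code&gt; could replace the &lt;code&gt;waitForTimeout&lt;/code&gt; call, so the script proceeds as soon as the CAPTCHA is solved and fails with a timeout error if it never is.&lt;/p&gt;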



&lt;h3&gt;
  
  
  Step 7: Validate on a reCAPTCHA Demonstration Page
&lt;/h3&gt;

&lt;p&gt;Below is a complete end-to-end script that uploads the extension, establishes a session, bypasses a CAPTCHA, and verifies the outcome:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Hyperbrowser&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@hyperbrowser/sdk&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;chromium&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;playwright-core&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;HYPERBROWSER_API_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;HYPERBROWSER_API_KEY&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;CAPSOLVER_EXTENSION_ID&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;CAPSOLVER_EXTENSION_ID&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Optional: for reusing an existing ID&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Hyperbrowser&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;HYPERBROWSER_API_KEY&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="c1"&gt;// Step 1: Upload extension (or utilize an existing ID)&lt;/span&gt;
  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;extensionId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;CAPSOLVER_EXTENSION_ID&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;extensionId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ext&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;extensions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;filePath&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;capsolver-extension.zip&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="nx"&gt;extensionId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;ext&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Uploaded extension:&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;extensionId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;// Step 2: Create a session with the CapSolver extension&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sessions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;extensionIds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;extensionId&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="na"&gt;useProxy&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// Requires a paid plan — omit for the free tier&lt;/span&gt;
    &lt;span class="na"&gt;bypassCaptchas&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Session initiated:&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// Step 3: Connect Playwright&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;chromium&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connectOverCDP&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;wsEndpoint&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;contexts&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;pages&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;newPage&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Step 4: Navigate to the reCAPTCHA demonstration page&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Navigating to reCAPTCHA demo...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://www.google.com/recaptcha/api2/demo&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// Step 5: Await CapSolver to bypass the CAPTCHA&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Awaiting CapSolver to bypass CAPTCHA...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;waitForTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;30000&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// Step 6: Submit the form&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Submitting form...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;click&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;#recaptcha-demo-submit&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;waitForLoadState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;networkidle&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// Step 7: Check the outcome&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;bodyText&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;textContent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;body&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;bodyText&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Verification Success&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;CAPTCHA bypassed successfully!&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Verification result:&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;bodyText&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;finally&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sessions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Session terminated.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="k"&gt;catch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To execute:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;HYPERBROWSER_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your_key npx tsx captcha-test.ts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Operational Mechanics
&lt;/h2&gt;

&lt;p&gt;Here is a detailed overview of the process, from extension upload to CAPTCHA bypassing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  Initial Configuration
  ═══════════════════════════════════════════════════════

  capsolver-extension/           HyperBrowser Cloud
  ├── manifest.json    ──ZIP──►  POST /extensions
  ├── assets/con
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  CAPTCHA Persistence (Form Submission Failure)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Symptom&lt;/strong&gt;: The page loads, but the CAPTCHA is still unsolved after the wait, so the form submission fails.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Possible Explanations&lt;/strong&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Insufficient wait duration&lt;/strong&gt; — Extend &lt;code&gt;waitForTimeout&lt;/code&gt; to 45-60 seconds.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Invalid API key&lt;/strong&gt; — Access your CapSolver dashboard to confirm the validity of the key.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Inadequate balance&lt;/strong&gt; — Replenish your CapSolver account credits.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Unsupported CAPTCHA type&lt;/strong&gt; — Consult the &lt;a href="https://docs.capsolver.com/en/guide/api-server/" rel="noopener noreferrer"&gt;CapSolver documentation&lt;/a&gt; for a list of supported types.&lt;/li&gt;
&lt;/ol&gt;
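
&lt;p&gt;For the first cause, a longer fixed sleep is not the only option: you can poll for a completion signal and stop as soon as it appears. The following is a minimal, framework-agnostic sketch. &lt;code&gt;waitUntil&lt;/code&gt;, &lt;code&gt;timeoutMs&lt;/code&gt;, and &lt;code&gt;intervalMs&lt;/code&gt; are hypothetical names for illustration and are not part of Playwright, HyperBrowser, or CapSolver.&lt;/p&gt;

```typescript
// Hypothetical helper (not part of Playwright, HyperBrowser, or CapSolver):
// polls a predicate until it returns a truthy value or the deadline passes.
async function waitUntil(
  predicate: () => unknown, // may return a boolean or a promise of one
  timeoutMs = 60000,
  intervalMs = 1000,
) {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    // `await` accepts plain values and promises alike.
    if (await predicate()) return true;
    // Give up if the next sleep would run past the deadline.
    if (Date.now() + intervalMs > deadline) return false;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

&lt;p&gt;With Playwright, the predicate could evaluate in the page whether the CAPTCHA response field has been filled in (on reCAPTCHA v2 pages this is usually a hidden &lt;code&gt;g-recaptcha-response&lt;/code&gt; textarea, but treat that selector as an assumption to verify on your target page). A &lt;code&gt;false&lt;/code&gt; result then points to one of the other causes in the list rather than to timing.&lt;/p&gt;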

&lt;h3&gt;
  
  
  Session WebSocket Connection Issues
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Symptom&lt;/strong&gt;: &lt;code&gt;chromium.connectOverCDP()&lt;/code&gt; generates a connection error.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Resolution&lt;/strong&gt;: Verify that the session is still active. Sessions have a predefined timeout (which varies by plan). If the previous session has expired, create a new one:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Declare outside try/catch so the connection stays usable afterwards&lt;/span&gt;
&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;browser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;chromium&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connectOverCDP&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;wsEndpoint&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// The connection failed, most likely because the session expired&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Session expired, initiating a new one...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;newSession&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sessions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;extensionIds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;extensionId&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="na"&gt;useProxy&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// Requires a paid plan; omit for the free tier&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="nx"&gt;browser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;chromium&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connectOverCDP&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;newSession&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;wsEndpoint&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Extension Discrepancy: Local vs. HyperBrowser Functionality
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Symptom&lt;/strong&gt;: The CapSolver extension operates correctly when loaded locally in Chrome but fails within HyperBrowser sessions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Possible Explanations&lt;/strong&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;&lt;code&gt;config.js&lt;/code&gt; exclusion from ZIP&lt;/strong&gt; — Double-check that the modified &lt;code&gt;assets/config.js&lt;/code&gt; file is included in the ZIP archive.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Network restrictions&lt;/strong&gt; — The extension requires access to &lt;code&gt;api.capsolver.com&lt;/code&gt;. Ensure that the HyperBrowser session's network configuration permits outbound HTTPS connections.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Extension version incompatibility&lt;/strong&gt; — For optimal compatibility, use the latest release of the CapSolver extension.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Recommended Practices
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Upload the Extension Once, Reuse the Identifier
&lt;/h3&gt;

&lt;p&gt;You only need to upload the extension once. Store the returned &lt;code&gt;extensionId&lt;/code&gt; and reuse it across all subsequent sessions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Upload once&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ext&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;extensions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;filePath&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;capsolver-extension.zip&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;CAPSOLVER_EXT_ID&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;ext&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// Reuse for each session&lt;/span&gt;
&lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;targetUrls&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sessions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;extensionIds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;CAPSOLVER_EXT_ID&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="na"&gt;useProxy&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// Requires a paid plan — omit for the free tier&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="c1"&gt;// ... automate&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sessions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Consistently Enable Proxies
&lt;/h3&gt;

&lt;p&gt;CAPTCHAs are more likely to appear, and are harder to bypass, when requests originate from datacenter IP addresses. HyperBrowser's integrated proxies help mitigate this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sessions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;extensionIds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;extensionId&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;useProxy&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// Requires a paid plan — omit for the free tier. Residential proxies reduce CAPTCHA frequency&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Employ Appropriate Waiting Periods
&lt;/h3&gt;

&lt;p&gt;Different CAPTCHA types take different amounts of time to bypass:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;CAPTCHA Type&lt;/th&gt;
&lt;th&gt;Typical Bypass Time&lt;/th&gt;
&lt;th&gt;Recommended Wait&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;reCAPTCHA v2 (checkbox)&lt;/td&gt;
&lt;td&gt;5-15 seconds&lt;/td&gt;
&lt;td&gt;30 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;reCAPTCHA v2 (invisible)&lt;/td&gt;
&lt;td&gt;5-15 seconds&lt;/td&gt;
&lt;td&gt;25 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;reCAPTCHA v3&lt;/td&gt;
&lt;td&gt;3-10 seconds&lt;/td&gt;
&lt;td&gt;20 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cloudflare Turnstile&lt;/td&gt;
&lt;td&gt;3-10 seconds&lt;/td&gt;
&lt;td&gt;20 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AWS WAF&lt;/td&gt;
&lt;td&gt;5-15 seconds&lt;/td&gt;
&lt;td&gt;30 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GeeTest v3/v4&lt;/td&gt;
&lt;td&gt;5-20 seconds&lt;/td&gt;
&lt;td&gt;30 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Hint&lt;/strong&gt;: When uncertain, a 30-second wait is generally advisable. It is preferable to wait slightly longer than to submit prematurely.&lt;/p&gt;
&lt;/blockquote&gt;
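
&lt;p&gt;One way to keep these numbers in a single place is a small lookup with a 30-second fallback. This is only a sketch: the type labels and the &lt;code&gt;recommendedWaitMs&lt;/code&gt; name are hypothetical, and the values are simply the "Recommended Wait" column from the table above.&lt;/p&gt;

```typescript
// Recommended waits from the table above, in milliseconds.
// The keys are hypothetical labels chosen for this sketch.
const RECOMMENDED_WAIT_MS = new Map([
  ["recaptcha-v2-checkbox", 30000],
  ["recaptcha-v2-invisible", 25000],
  ["recaptcha-v3", 20000],
  ["turnstile", 20000],
  ["aws-waf", 30000],
  ["geetest", 30000],
]);

// Falls back to 30 seconds for unknown types, following the hint above.
function recommendedWaitMs(captchaType: string): number {
  return RECOMMENDED_WAIT_MS.get(captchaType) ?? 30000;
}
```

&lt;p&gt;A call such as &lt;code&gt;await page.waitForTimeout(recommendedWaitMs("turnstile"))&lt;/code&gt; then replaces the hard-coded number in the script.&lt;/p&gt;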

&lt;h3&gt;
  
  
  4. Monitor Your CapSolver Account Balance
&lt;/h3&gt;

&lt;p&gt;Each CAPTCHA bypass consumes credits. Integrate balance checks into your automation to prevent interruptions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;axios&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;axios&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;checkBalance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;axios&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://api.capsolver.com/getBalance&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;clientKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;balance&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;balance&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;checkBalance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;CAPSOLVER_API_KEY&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;balance&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;warn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Low CapSolver balance! Top up at capsolver.com&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
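
&lt;p&gt;Note that the &lt;code&gt;|| 0&lt;/code&gt; fallback in the snippet above makes an API-level error (for example, a rejected key) look the same as an empty account. A minimal sketch of separating the two cases follows. It assumes the &lt;code&gt;getBalance&lt;/code&gt; response carries &lt;code&gt;errorId&lt;/code&gt;, &lt;code&gt;errorDescription&lt;/code&gt;, and &lt;code&gt;balance&lt;/code&gt; fields, which you should verify against the CapSolver API reference; &lt;code&gt;parseBalanceResponse&lt;/code&gt; is a hypothetical helper name.&lt;/p&gt;

```typescript
// Assumed response shape; verify against the CapSolver API reference.
type BalanceResponse = {
  errorId?: number;
  errorDescription?: string;
  balance?: number;
};

// Hypothetical helper: returns the balance, or throws on an API-level error
// instead of silently reporting 0.
function parseBalanceResponse(data: BalanceResponse): number {
  if (data.errorId) {
    throw new Error("getBalance failed: " + (data.errorDescription ?? "unknown error"));
  }
  if (typeof data.balance !== "number") {
    throw new Error("getBalance response has no numeric balance");
  }
  return data.balance;
}
```

&lt;p&gt;Inside &lt;code&gt;checkBalance&lt;/code&gt;, returning &lt;code&gt;parseBalanceResponse(response.data)&lt;/code&gt; would surface a bad key as an exception rather than as a misleading low-balance warning.&lt;/p&gt;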



&lt;h3&gt;
  
  
  5. Terminate Sessions Appropriately
&lt;/h3&gt;

&lt;p&gt;Always stop sessions once their purpose is fulfilled to avoid incurring unnecessary charges:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// ... your automation code&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;finally&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sessions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
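
&lt;p&gt;One caveat with the pattern above: if &lt;code&gt;browser.close()&lt;/code&gt; rejects, &lt;code&gt;client.sessions.stop()&lt;/code&gt; is never reached and the session may keep running. A minimal sketch of a cleanup runner that attempts every step is shown below; &lt;code&gt;runCleanup&lt;/code&gt; is a hypothetical helper, not part of either SDK.&lt;/p&gt;

```typescript
// Hypothetical helper: runs every cleanup step in order, even if an earlier
// one throws, and returns the errors it collected along the way.
async function runCleanup(steps: (() => unknown)[]) {
  const errors: unknown[] = [];
  for (const step of steps) {
    try {
      await step();
    } catch (err) {
      errors.push(err);
    }
  }
  return errors;
}
```

&lt;p&gt;In the &lt;code&gt;finally&lt;/code&gt; block, this could be called as &lt;code&gt;await runCleanup([() =&gt; browser.close(), () =&gt; client.sessions.stop(session.id)])&lt;/code&gt;, logging any returned errors afterwards.&lt;/p&gt;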



&lt;h3&gt;
  
  
  6. Re-ZIP After API Key Changes
&lt;/h3&gt;

&lt;p&gt;If your CapSolver API key is rotated, you must update &lt;code&gt;config.js&lt;/code&gt;, re-zip the extension, and re-upload it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Update the key in config.js, then:&lt;/span&gt;
&lt;span class="nb"&gt;cd &lt;/span&gt;capsolver-extension &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; zip &lt;span class="nt"&gt;-r&lt;/span&gt; ../capsolver-extension.zip &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;cd&lt;/span&gt; ..
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Subsequently, upload the new ZIP file and update your stored &lt;code&gt;extensionId&lt;/code&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Together, HyperBrowser and CapSolver cover a broad range of CAPTCHA types for AI browser automation:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;HyperBrowser&lt;/strong&gt; manages the underlying infrastructure, including cloud sessions, proxies, anti-detection features, and native Turnstile/reCAPTCHA bypassing.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;CapSolver&lt;/strong&gt; extends this coverage to include AWS WAF, GeeTest, enterprise reCAPTCHA, and other CAPTCHA types not addressed by the native bypasser.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The integration process is straightforward: compress the CapSolver extension into a ZIP file, upload it once via the HyperBrowser SDK, and then attach it to any session. This approach eliminates the need for code-level CAPTCHA detection, token injection, or API polling, as the extension handles these aspects within the browser context.&lt;/p&gt;

&lt;p&gt;Whether you are developing web scrapers, AI agents, or automated testing pipelines, this combination keeps the CAPTCHA types listed above from stalling your workflow.&lt;/p&gt;




&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Ready to begin?&lt;/strong&gt; &lt;a href="https://www.capsolver.com/?dev.to_source=official&amp;amp;dev.to_medium=blog&amp;amp;dev.to_campaign=hyperbrowser"&gt;Sign up for CapSolver&lt;/a&gt; and use bonus code &lt;strong&gt;HYPERBROWSER&lt;/strong&gt; for an extra 6% bonus on your initial recharge!&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqun530jr4q8fah2jldv6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqun530jr4q8fah2jldv6.png" width="526" height="234"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Frequently Asked Questions (FAQ)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What is HyperBrowser?
&lt;/h3&gt;

&lt;p&gt;HyperBrowser is a cloud browser infrastructure platform designed for AI agents. It provides managed, isolated browser sessions with native CDP access, enabling connection of Playwright, Puppeteer, or Selenium to cloud-hosted Chromium instances. It includes built-in proxies, anti-detection features, and native CAPTCHA bypassing for Turnstile and reCAPTCHA.&lt;/p&gt;

&lt;h3&gt;
  
  
  How does the extension upload process work?
&lt;/h3&gt;

&lt;p&gt;HyperBrowser features a dedicated extension API. You compress your Chrome extension directory into a ZIP file, upload it using &lt;code&gt;client.extensions.create()&lt;/code&gt;, and receive an &lt;code&gt;extensionId&lt;/code&gt;. This ID is then passed to &lt;code&gt;client.sessions.create()&lt;/code&gt;, and the extension is automatically loaded into the cloud browser session.&lt;/p&gt;

&lt;h3&gt;
  
  
  Which CAPTCHA types does CapSolver support?
&lt;/h3&gt;

&lt;p&gt;CapSolver supports reCAPTCHA v2 (both checkbox and invisible), reCAPTCHA v3, reCAPTCHA Enterprise, Cloudflare Turnstile, Cloudflare 5-second Challenge, AWS WAF, and GeeTest v3/v4, among others. The Chrome extension automatically detects the CAPTCHA type on the page and bypasses it.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is the cost of CapSolver?
&lt;/h3&gt;

&lt;p&gt;CapSolver offers competitive pricing structures based on CAPTCHA type and usage volume. Visit &lt;a href="https://www.capsolver.com/?dev.to_source=official&amp;amp;dev.to_medium=blog&amp;amp;dev.to_campaign=hyperbrowser"&gt;capsolver.com&lt;/a&gt; for current pricing details. Use the code &lt;strong&gt;HYPERBROWSER&lt;/strong&gt; to receive a 6% bonus on your first recharge.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is it necessary to re-upload the extension for every session?
&lt;/h3&gt;

&lt;p&gt;No. The extension needs to be uploaded only once. The returned &lt;code&gt;extensionId&lt;/code&gt; can be reused across all sessions. Re-uploading is only required if you modify the CapSolver API key within the extension or update the extension's version.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can Puppeteer be used as an alternative to Playwright?
&lt;/h3&gt;

&lt;p&gt;Yes. HyperBrowser is compatible with Playwright, Puppeteer, and Selenium. To use Puppeteer, replace the Playwright &lt;code&gt;connectOverCDP&lt;/code&gt; call with Puppeteer's equivalent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;puppeteer&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;puppeteer-core&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;puppeteer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;browserWSEndpoint&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;wsEndpoint&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The CapSolver extension functions identically regardless of the automation framework used for connection.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is HyperBrowser available for free?
&lt;/h3&gt;

&lt;p&gt;HyperBrowser provides a free tier with a limited number of sessions. Paid plans unlock additional sessions, extended timeouts, and advanced features. For current pricing and plan details, visit &lt;a href="https://www.hyperbrowser.ai" rel="noopener noreferrer"&gt;hyperbrowser.ai&lt;/a&gt;.&lt;/p&gt;




</description>
      <category>ai</category>
      <category>programming</category>
      <category>agents</category>
      <category>webdev</category>
    </item>
    <item>
      <title>How to Bypass CAPTCHA in Vibium Without Extensions (reCAPTCHA, Turnstile, AWS WAF)</title>
      <dc:creator>luisgustvo</dc:creator>
      <pubDate>Tue, 31 Mar 2026 08:17:44 +0000</pubDate>
      <link>https://dev.to/luisgustvo/how-to-bypass-captcha-in-vibium-without-extensions-recaptcha-turnstile-aws-waf-2n3e</link>
      <guid>https://dev.to/luisgustvo/how-to-bypass-captcha-in-vibium-without-extensions-recaptcha-turnstile-aws-waf-2n3e</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz8xnv7sdni4r4wblko4b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz8xnv7sdni4r4wblko4b.png" alt="Bypass CAPTCHA in Vibium " width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When AI agents automate browser interactions for real-world tasks, &lt;strong&gt;CAPTCHAs&lt;/strong&gt; are a frequent obstacle. They can block access to protected pages, prevent form submissions, and stall entire workflows until a human intervenes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Vibium&lt;/strong&gt; is a new browser automation tool designed for both AI agents and human users. Built by the creators of Selenium and Appium, it uses the WebDriver BiDi protocol to give agents a fast, standardized way to control a browser. Like other automation tools, however, it stalls when a page presents a CAPTCHA.&lt;/p&gt;

&lt;p&gt;One constraint shapes the whole integration: &lt;strong&gt;Vibium's Go launcher hardcodes &lt;code&gt;--disable-extensions&lt;/code&gt;&lt;/strong&gt; and provides no way to override it with custom Chrome flags. As a result, the Chrome extension-based approach commonly used with tools such as Playwright and Puppeteer does not work with Vibium.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://www.capsolver.com/?dev.to_source=official&amp;amp;dev.to_medium=blog&amp;amp;dev.to_campaign=vibium"&gt;CapSolver&lt;/a&gt;&lt;/strong&gt; works around this limitation without an extension. Your code calls CapSolver's REST API to bypass the CAPTCHA, then injects the returned token into the page using Vibium's JavaScript evaluation. Because every step is explicit, you control when detection, solving, and injection happen, and the approach fits Vibium's architecture.&lt;/p&gt;




&lt;h2&gt;
  
  
  Understanding Vibium
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/VibiumDev/vibium" rel="nofollow noopener noreferrer"&gt;&lt;strong&gt;Vibium&lt;/strong&gt;&lt;/a&gt; is a browser automation platform tailored for AI agents and human operators. It is distributed as a standalone Go binary, offering a zero-configuration installation, and leverages the modern WebDriver BiDi protocol for efficient, bidirectional communication with web browsers.&lt;/p&gt;

&lt;h3&gt;
  
  
  Core Capabilities
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;WebDriver BiDi protocol&lt;/strong&gt;: A standards-based, bidirectional communication method for browsers, distinct from the Chrome DevTools Protocol (CDP).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MCP server&lt;/strong&gt;: Features an integrated Model Context Protocol server, enabling AI agents to control browsers natively.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Semantic element identification&lt;/strong&gt;: Allows for locating web elements based on their meaning rather than solely on CSS selectors.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-language SDKs&lt;/strong&gt;: Provides client libraries for JavaScript/TypeScript, Python, and Java.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Single Go binary&lt;/strong&gt;: No dependencies or configuration; download it and run.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Developed by Selenium/Appium creators&lt;/strong&gt;: Benefits from extensive expertise in browser automation standards.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  AI Agent Application
&lt;/h3&gt;

&lt;p&gt;Vibium's MCP server lets AI agents issue browser commands through a standardized protocol. Through it, agents can perform actions such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Navigating to URLs and interacting with page elements.&lt;/li&gt;
&lt;li&gt;Semantically identifying elements (e.g., "the login button" instead of &lt;code&gt;#btn-login&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Executing arbitrary JavaScript on the page via &lt;code&gt;browser_evaluate&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Completing forms, clicking buttons, and extracting content.&lt;/li&gt;
&lt;li&gt;Managing multiple browser sessions.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In effect, AI agents get a browser they can drive with natural language commands.&lt;/p&gt;




&lt;h2&gt;
  
  
  Understanding CapSolver
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.capsolver.com/?dev.to_source=official&amp;amp;dev.to_medium=blog&amp;amp;dev.to_campaign=vibium"&gt;CapSolver&lt;/a&gt; is an AI-driven CAPTCHA bypassing service. It supports many CAPTCHA types with fast response times and integrates into automated workflows.&lt;/p&gt;

&lt;h3&gt;
  
  
  Supported CAPTCHA Categories
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.capsolver.com/products/recaptchav2" rel="noopener noreferrer"&gt;&lt;strong&gt;reCAPTCHA v2&lt;/strong&gt;&lt;/a&gt; (both image-based and invisible variants)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.capsolver.com/products/recaptchav3" rel="noopener noreferrer"&gt;&lt;strong&gt;reCAPTCHA v3 &amp;amp; v3 Enterprise&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.capsolver.com/products/cloudflare" rel="noopener noreferrer"&gt;&lt;strong&gt;Cloudflare Turnstile&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.capsolver.com/en/guide/captcha/cloudflare_challenge/" rel="noopener noreferrer"&gt;&lt;strong&gt;Cloudflare 5-second Challenge&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.capsolver.com/products/awswaf" rel="noopener noreferrer"&gt;&lt;strong&gt;AWS WAF CAPTCHA&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.capsolver.com/en/guide/api-server/" rel="noopener noreferrer"&gt;&lt;strong&gt;Other widely utilized CAPTCHA and anti-bot mechanisms&lt;/strong&gt;&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Distinctive Integration Approach
&lt;/h2&gt;

&lt;p&gt;Most browser automation tools, including Playwright, Puppeteer, OpenClaw, and NanoClaw, typically bypass CAPTCHAs by directly loading the CapSolver Chrome extension into the browser. This extension automatically detects CAPTCHAs, bypasses them in the background, and injects tokens without visible interaction.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Vibium, however, cannot employ this method.&lt;/strong&gt; Its Go launcher explicitly hardcodes &lt;code&gt;--disable-extensions&lt;/code&gt; when launching Chrome, precluding any configuration or workaround for loading extensions.&lt;/p&gt;

&lt;p&gt;Instead, this integration directly utilizes the &lt;strong&gt;CapSolver REST API&lt;/strong&gt;:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Extension-Based Approach (e.g., Playwright)&lt;/th&gt;
&lt;th&gt;API-Based Approach (Vibium)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Mechanism&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Extension autonomously detects and bypasses CAPTCHAs&lt;/td&gt;
&lt;td&gt;Your code initiates API calls, retrieves a token, and injects it&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Extension Requirement&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Yes (Chrome extension loaded via &lt;code&gt;--load-extension&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;No (relies purely on HTTP API calls)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Agent Awareness&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Agent operates without explicit knowledge of CAPTCHA handling&lt;/td&gt;
&lt;td&gt;Agent or script actively manages the bypassing process&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Chrome Flags&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Requires &lt;code&gt;--load-extension&lt;/code&gt; support&lt;/td&gt;
&lt;td&gt;Compatible with any Chrome flags, including &lt;code&gt;--disable-extensions&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Control Level&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Automated, opaque&lt;/td&gt;
&lt;td&gt;Explicit, offering granular control over each step&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Flexibility&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Limited to extension's predefined capabilities&lt;/td&gt;
&lt;td&gt;Allows customization of detection, retry logic, and token injection per site&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Optimal Use Case&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Tools that permit custom Chrome arguments&lt;/td&gt;
&lt;td&gt;Tools like Vibium that impose restrictions on Chrome arguments&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Key takeaway&lt;/strong&gt;: The API-based approach trades automatic handling for explicit control. Your code decides when to detect a CAPTCHA, when to bypass it, and exactly how to inject the token. It works with any browser automation tool that can run JavaScript in the page, regardless of Chrome flag restrictions.&lt;/p&gt;
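
&lt;p&gt;To make the control points concrete, here is a minimal sketch of the detect, bypass, and inject loop with per-site retry logic. It is an illustration under stated assumptions, not Vibium's or CapSolver's API: &lt;code&gt;detect&lt;/code&gt;, &lt;code&gt;solve&lt;/code&gt;, and &lt;code&gt;inject&lt;/code&gt; are hypothetical callables you would supply, and the &lt;code&gt;type&lt;/code&gt; key with the value &lt;code&gt;none&lt;/code&gt; follows the detection result used later in this guide.&lt;/p&gt;

```python
from typing import Callable, Optional


def handle_captcha(detect: Callable[[], dict],
                   solve: Callable[[dict], str],
                   inject: Callable[[dict, str], None],
                   max_retries: int = 2) -> Optional[str]:
    """Detect a CAPTCHA, obtain a token with retries, then inject it.

    All three callables are hypothetical hooks supplied by the caller.
    Returns the injected token, or None when no CAPTCHA was detected.
    """
    info = detect()
    if info["type"] == "none":
        return None  # nothing to do on this page

    last_error = None
    for _ in range(max_retries + 1):
        try:
            token = solve(info)
        except RuntimeError as exc:
            last_error = exc  # remember the failure and try again
            continue
        inject(info, token)
        return token
    raise RuntimeError(
        f"CAPTCHA not solved after {max_retries + 1} attempts"
    ) from last_error
```

&lt;p&gt;Because each stage is a plain function, a site that needs a different injection target or a longer retry budget can swap in its own callable without touching the rest of the flow.&lt;/p&gt;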




&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;Before configuring this integration, ensure the following are in place:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Vibium&lt;/strong&gt; is installed (&lt;a href="https://github.com/VibiumDev/vibium" rel="noopener noreferrer"&gt;download from GitHub&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A CapSolver account&lt;/strong&gt; with an active API key (&lt;a href="https://www.capsolver.com/?dev.to_source=official&amp;amp;dev.to_medium=blog&amp;amp;dev.to_campaign=vibium"&gt;sign up here&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;One of the following environments&lt;/strong&gt;: Node.js 18+ / Python 3.8+ / Java 17+&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Vibium Installation
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# For macOS / Linux — single binary, no dependencies&lt;/span&gt;
curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://vibium.dev/install.sh | bash

&lt;span class="c"&gt;# Alternatively, download directly from GitHub releases&lt;/span&gt;
&lt;span class="c"&gt;# https://github.com/VibiumDev/vibium/releases&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Verify the installation by running:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;vibium &lt;span class="nt"&gt;--version&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  No Dedicated Chrome Installation Required
&lt;/h3&gt;

&lt;p&gt;Vibium manages its own browser lifecycle, including downloading the browser it needs. There is no need to install Chrome for Testing, Playwright's bundled Chromium, or any other specific browser variant.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step-by-Step Configuration
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Obtain Your CapSolver API Key
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Register at &lt;a href="https://www.capsolver.com/?dev.to_source=official&amp;amp;dev.to_medium=blog&amp;amp;dev.to_campaign=vibium"&gt;capsolver.com&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Access your dashboard&lt;/li&gt;
&lt;li&gt;Copy your API key (it typically begins with &lt;code&gt;CAP-&lt;/code&gt;)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Set this key as an environment variable:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;CAPSOLVER_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"CAP-XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
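
&lt;p&gt;On the script side, it can help to read the variable once and fail fast when it is missing. The sketch below is one way to do that in Python; the helper name &lt;code&gt;load_api_key&lt;/code&gt; is an assumption for illustration, and only the variable name &lt;code&gt;CAPSOLVER_API_KEY&lt;/code&gt; comes from the step above.&lt;/p&gt;

```python
import os


def load_api_key(env=os.environ) -> str:
    """Return the CapSolver API key from the environment, or raise if unset.

    `env` defaults to the process environment; passing a dict makes the
    helper easy to exercise in tests.
    """
    key = env.get("CAPSOLVER_API_KEY", "").strip()
    if not key:
        raise RuntimeError("CAPSOLVER_API_KEY is not set")
    return key
```

&lt;p&gt;Failing at startup gives a clearer error than an authentication failure surfacing later from the first API call.&lt;/p&gt;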



&lt;h3&gt;
  
  
  Step 2: Install the Vibium SDK and HTTP Client
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;JavaScript:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install &lt;/span&gt;vibium
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Python:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;vibium requests
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Java (Gradle):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight groovy"&gt;&lt;code&gt;&lt;span class="n"&gt;implementation&lt;/span&gt; &lt;span class="s1"&gt;'com.vibium:vibium:26.3.18'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Develop a CAPTCHA Detection Utility
&lt;/h3&gt;

&lt;p&gt;Before bypassing a CAPTCHA, you need to identify its type and extract the site key. Both can be read from the page with Vibium's JavaScript evaluation: the &lt;code&gt;browser_evaluate&lt;/code&gt; tool over MCP, or &lt;code&gt;page.evaluate&lt;/code&gt; in the SDK examples that follow.&lt;/p&gt;

&lt;p&gt;The in-page detection script is identical in all three languages; only the surrounding host call differs:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;JavaScript:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;vibium/sync&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;detectCaptcha&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`(() =&amp;gt; {
    const v2 = document.querySelector('.g-recaptcha');
    if (v2) return { type: 'recaptcha-v2', siteKey: v2.getAttribute('data-sitekey') };

    for (const s of document.querySelectorAll('script[src*="recaptcha/api.js"]')) {
      const m = s.src.match(/render=([^&amp;amp;]+)/);
      if (m &amp;amp;&amp;amp; m[1] !== 'explicit') return { type: 'recaptcha-v3', siteKey: m[1] };
    }

    const t = document.querySelector('.cf-turnstile');
    if (t) return { type: 'turnstile', siteKey: t.getAttribute('data-sitekey') };

    return { type: 'none', siteKey: null };
  })()`&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Python:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;vibium&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;browser&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;detect_captcha&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;(() =&amp;gt; {
        const v2 = document.querySelector(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;.g-recaptcha&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;);
        if (v2) return { type: &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;recaptcha-v2&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, siteKey: v2.getAttribute(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;data-sitekey&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;) };

        for (const s of document.querySelectorAll(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;script[src*=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;recaptcha/api.js&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;]&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;)) {
            const m = s.src.match(/render=([^&amp;amp;]+)/);
            if (m &amp;amp;&amp;amp; m[1] !== &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;explicit&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;) return { type: &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;recaptcha-v3&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, siteKey: m[1] };
        }

        const t = document.querySelector(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;.cf-turnstile&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;);
        if (t) return { type: &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;turnstile&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, siteKey: t.getAttribute(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;data-sitekey&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;) };

        return { type: &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;none&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, siteKey: null };
    })()&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Java:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;evaluate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""
    (() =&amp;gt; {
        const v2 = document.querySelector('.g-recaptcha');
        if (v2) return { type: 'recaptcha-v2', siteKey: v2.getAttribute('data-sitekey') };

        for (const s of document.querySelectorAll('script[src*="recaptcha/api.js"]')) {
            const m = s.src.match(/render=([^&amp;amp;]+)/);
            if (m &amp;amp;&amp;amp; m[1] !== 'explicit') return { type: 'recaptcha-v3', siteKey: m[1] };
        }

        const t = document.querySelector('.cf-turnstile');
        if (t) return { type: 'turnstile', siteKey: t.getAttribute('data-sitekey') };

        return { type: 'none', siteKey: null };
    })()
    """&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;captchaType&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;((&lt;/span&gt;&lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"type"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;siteKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;((&lt;/span&gt;&lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"siteKey"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
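
&lt;p&gt;The reCAPTCHA v3 branch of the detection script relies on the &lt;code&gt;render&lt;/code&gt; query parameter of the &lt;code&gt;api.js&lt;/code&gt; script URL: v3 pages pass the site key there, while explicitly rendered v2 widgets pass the literal value &lt;code&gt;explicit&lt;/code&gt;. The following sketch mirrors that check outside the browser so it can be unit-tested; the function name &lt;code&gt;v3_site_key_from_script&lt;/code&gt; is hypothetical, and it uses a URL parser in place of the regular expression.&lt;/p&gt;

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def v3_site_key_from_script(src: str) -> Optional[str]:
    """Return the reCAPTCHA v3 site key carried in a script URL, if any.

    Mirrors the in-page check: the key is the value of `render`,
    unless that value is the literal string "explicit".
    """
    values = parse_qs(urlparse(src).query).get("render", [])
    if values and values[0] != "explicit":
        return values[0]
    return None
```

&lt;p&gt;Keeping the rule in a plain function makes it easy to check against script URLs collected from the sites you target before wiring it into a live session.&lt;/p&gt;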



&lt;h3&gt;
  
  
  Step 4: Implement the CAPTCHA Bypassing Function
&lt;/h3&gt;

&lt;p&gt;Create a task with the CapSolver API, then poll for its result until it is ready, fails, or times out.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;JavaScript:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;CAPSOLVER_API&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://api.capsolver.com&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;API_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;CAPSOLVER_API_KEY&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;createTask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;taskData&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;CAPSOLVER_API&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/createTask`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;clientKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;task&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;taskData&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;errorId&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`CapSolver: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;errorDescription&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;taskId&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;getTaskResult&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;taskId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;maxAttempts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;maxAttempts&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;setTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2000&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;CAPSOLVER_API&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/getTaskResult`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;clientKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;taskId&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ready&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;failed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Failed: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;errorDescription&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;CapSolver: Task timed out&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;bypassCaptcha&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;info&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;taskType&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;switch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;info&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;type&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;recaptcha-v2&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
      &lt;span class="nx"&gt;taskType&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ReCaptchaV2TaskProxyLess&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;recaptcha-v3&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
      &lt;span class="nx"&gt;taskType&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ReCaptchaV3TaskProxyLess&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;turnstile&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
      &lt;span class="nx"&gt;taskType&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;AntiTurnstileTaskProxyLess&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nl"&gt;default&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
      &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Unsupported CAPTCHA type: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;info&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;type&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;taskId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;createTask&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;taskType&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;websiteURL&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;websiteKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;info&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;siteKey&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;getTaskResult&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;taskId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;solution&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;gRecaptchaResponse&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;solution&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;token&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// Example Usage (JavaScript)&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;bro&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;start&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;bro&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;page&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="c1"&gt;// 1. Navigate&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;targetUrl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://www.google.com/recaptcha/api2/demo&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;go&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;targetUrl&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// 2. Detect&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;info&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;detectCaptcha&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;info&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;type&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;none&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;No CAPTCHA detected.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
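    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;bro&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stop&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;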
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Detected &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;info&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;type&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; — key &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;info&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;siteKey&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// 3. Bypass&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;token&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;bypassCaptcha&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;info&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;targetUrl&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Bypassed!&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// 4. Inject + submit&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`
    document.querySelector('textarea[name="g-recaptcha-response"]').value = "&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;token&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;";
    try { const c = ___grecaptcha_cfg.clients; for (const id in c) {
      const f = (o) =&amp;gt; { for (const k in o) { if (typeof o[k]==='object'&amp;amp;&amp;amp;o[k]!==null) {
        if (typeof o[k].callback==='function'){o[k].callback("&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;token&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;");return true}
        if(f(o[k]))return true}} return false}; f(c[id]) }} catch(e){}
  `&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`document.querySelector('#recaptcha-demo-form').submit()`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// 5. Verify&lt;/span&gt;
  &lt;span class="c1"&gt;// Await the delay so errors propagate to main() and the browser is stopped afterwards&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;resolve&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;setTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;resolve&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2000&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Result:&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;document.body.innerText&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;bro&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stop&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

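&lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="k"&gt;catch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;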
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Python:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;vibium&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;browser&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;

&lt;span class="n"&gt;CAPSOLVER_API&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.capsolver.com&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;API_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CAPSOLVER_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task_data&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;CAPSOLVER_API&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/createTask&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;clientKey&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;task&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;task_data&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;errorId&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CapSolver: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;errorDescription&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;taskId&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_task_result&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_attempts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_attempts&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;CAPSOLVER_API&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/getTaskResult&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;clientKey&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;taskId&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ready&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;failed&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Failed: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;errorDescription&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;CapSolver: Task timed out&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;bypass_captcha&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;task_type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;recaptcha-v2&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;task_type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ReCaptchaV2TaskProxyLess&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;recaptcha-v3&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;task_type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ReCaptchaV3TaskProxyLess&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;turnstile&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;task_type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;AntiTurnstileTaskProxyLess&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Unsupported CAPTCHA type: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;task_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_task&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;task_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;websiteURL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;websiteKey&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;siteKey&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_task_result&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;solution&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gRecaptchaResponse&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;solution&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;token&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;bro&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;start&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;page&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bro&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;page&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# 1. Navigate
&lt;/span&gt;    &lt;span class="n"&gt;target_url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://www.google.com/recaptcha/api2/demo&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;go&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target_url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# 2. Detect
&lt;/span&gt;    &lt;span class="n"&gt;info&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;(() =&amp;gt; {
        const el = document.querySelector(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;.g-recaptcha&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;);
        return el ? { type: &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;recaptcha-v2&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, siteKey: el.getAttribute(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;data-sitekey&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;) }
                   : { type: &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;none&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, siteKey: null };
    })()&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;none&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;No CAPTCHA detected.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
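        &lt;span class="n"&gt;bro&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stop&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt;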

    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Detected &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; — key &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;siteKey&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# 3. Bypass
&lt;/span&gt;    &lt;span class="n"&gt;token&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;bypass_captcha&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target_url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bypassed!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# 4. Inject + submit
&lt;/span&gt;    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
        document.querySelector(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;textarea[name=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;g-recaptcha-response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;]&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;).value = &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;;
        try {{ const c = ___grecaptcha_cfg.clients; for (const id in c) {{
            const f = (o) =&amp;gt; {{ for (const k in o) {{ if (typeof o[k]===&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;object&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;&amp;amp;&amp;amp;o[k]!==null) {{
                if (typeof o[k].callback===&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;function&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;){{o[k].callback(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;);return true}}
                if(f(o[k]))return true}}}} return false}}; f(c[id]) }}}} catch(e){{}}
        &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;document.querySelector(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;#recaptcha-demo-form&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;).submit()&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# 5. Verify
&lt;/span&gt;    &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Result:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;document.body.innerText&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;bro&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stop&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Java:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;com.vibium.Vibium&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.json.JSONObject&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.net.URI&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.net.http.HttpClient&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.net.http.HttpRequest&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.net.http.HttpResponse&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.Map&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;CapSolverIntegration&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="no"&gt;CAPSOLVER_API&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"https://api.capsolver.com"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="no"&gt;API_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getenv&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"CAPSOLVER_API_KEY"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;createTask&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;JSONObject&lt;/span&gt; &lt;span class="n"&gt;taskData&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="kd"&gt;throws&lt;/span&gt; &lt;span class="nc"&gt;Exception&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;HttpClient&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;HttpClient&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;newHttpClient&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
        &lt;span class="nc"&gt;HttpRequest&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;HttpRequest&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;newBuilder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;uri&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;URI&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;create&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;CAPSOLVER_API&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;"/createTask"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;header&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Content-Type"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"application/json"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;POST&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;HttpRequest&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;BodyPublishers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofString&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;JSONObject&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"clientKey"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;API_KEY&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                        &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"task"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;taskData&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;toString&lt;/span&gt;&lt;span class="o"&gt;()))&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
        &lt;span class="nc"&gt;HttpResponse&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;send&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;HttpResponse&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;BodyHandlers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofString&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
        &lt;span class="nc"&gt;JSONObject&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;JSONObject&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getInt&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"errorId"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Exception&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"CapSolver: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getString&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"errorDescription"&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getString&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"taskId"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="nc"&gt;JSONObject&lt;/span&gt; &lt;span class="nf"&gt;getTaskResult&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;taskId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;maxAttempts&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="kd"&gt;throws&lt;/span&gt; &lt;span class="nc"&gt;Exception&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;HttpClient&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;HttpClient&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;newHttpClient&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;maxAttempts&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="nc"&gt;Thread&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sleep&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2000&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="nc"&gt;HttpRequest&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;HttpRequest&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;newBuilder&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;uri&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;URI&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;create&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;CAPSOLVER_API&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;"/getTaskResult"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt;
                    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;header&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Content-Type"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"application/json"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;POST&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;HttpRequest&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;BodyPublishers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofString&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;JSONObject&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"clientKey"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;API_KEY&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"taskId"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;taskId&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;toString&lt;/span&gt;&lt;span class="o"&gt;()))&lt;/span&gt;
                    &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
            &lt;span class="nc"&gt;HttpResponse&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;send&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;HttpResponse&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;BodyHandlers&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ofString&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
            &lt;span class="nc"&gt;JSONObject&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;JSONObject&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getString&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"status"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;equals&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"ready"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getString&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"status"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;equals&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"failed"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Exception&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Failed: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getString&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"errorDescription"&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Exception&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"CapSolver: Task timed out"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="nf"&gt;bypassCaptcha&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Object&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="kd"&gt;throws&lt;/span&gt; &lt;span class="nc"&gt;Exception&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;taskType&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;switch&lt;/span&gt; &lt;span class="o"&gt;((&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"type"&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="s"&gt;"recaptcha-v2"&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;taskType&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"ReCaptchaV2TaskProxyLess"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
                &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="s"&gt;"recaptcha-v3"&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;taskType&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"ReCaptchaV3TaskProxyLess"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
                &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="s"&gt;"turnstile"&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;taskType&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"AntiTurnstileTaskProxyLess"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
                &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
            &lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Exception&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Unsupported CAPTCHA type: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"type"&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;

        &lt;span class="nc"&gt;JSONObject&lt;/span&gt; &lt;span class="n"&gt;taskData&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;JSONObject&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"type"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;taskType&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"websiteURL"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
                &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"websiteKey"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"siteKey"&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
        &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;taskId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;createTask&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;taskData&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="nc"&gt;JSONObject&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;getTaskResult&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;taskId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getJSONObject&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"solution"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;optString&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"gRecaptchaResponse"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getJSONObject&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"solution"&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;getString&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"token"&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="kd"&gt;throws&lt;/span&gt; &lt;span class="nc"&gt;Exception&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;bro&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Vibium&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;start&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bro&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;page&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;

        &lt;span class="c1"&gt;// 1. Navigate&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;targetUrl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"https://www.google.com/recaptcha/api2/demo"&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;go&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;targetUrl&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="c1"&gt;// 2. Detect&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;info&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Object&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;)&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;evaluate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""
            (() =&amp;gt; {
                const el = document.querySelector('.g-recaptcha');
                return el ? { type: 'recaptcha-v2', siteKey: el.getAttribute('data-sitekey') }
                           : { type: 'none', siteKey: null };
            })()"""&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"none"&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;equals&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"type"&lt;/span&gt;&lt;span class="o"&gt;)))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"No CAPTCHA detected."&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;

        &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;printf&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Detected %s — key %s%n"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"type"&lt;/span&gt;&lt;span class="o"&gt;),&lt;/span&gt; &lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"siteKey"&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;

        &lt;span class="c1"&gt;// 3. Bypass&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bypassCaptcha&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;targetUrl&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Bypassed!"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="c1"&gt;// 4. Inject + submit&lt;/span&gt;
        &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;evaluate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;format&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""
            document.querySelector('textarea[name="g-recaptcha-response"]').value = "%s";
            try { const c = ___grecaptcha_cfg.clients; for (const id in c) {
                const f = (o) =&amp;gt; { for (const k in o) { if (typeof o[k]==='object'&amp;amp;&amp;amp;o[k]!==null) {
                    if (typeof o[k].callback==='function'){o[k].callback("%s");return true}
                    if(f(o[k]))return true}} return false}; f(c[id]) }} catch(e){}
            """&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
        &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;evaluate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"document.querySelector('#recaptcha-demo-form').submit()"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="c1"&gt;// 5. Verify&lt;/span&gt;
        &lt;span class="nc"&gt;Thread&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sleep&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2000&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;println&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Result: "&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;evaluate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"document.body.innerText"&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
        &lt;span class="n"&gt;bro&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;stop&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Supported CAPTCHA Task Categories
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;CAPTCHA Type&lt;/th&gt;
&lt;th&gt;CapSolver Task Type&lt;/th&gt;
&lt;th&gt;Token Field&lt;/th&gt;
&lt;th&gt;Estimated Bypass Time&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;reCAPTCHA v2&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ReCaptchaV2TaskProxyLess&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;textarea[name="g-recaptcha-response"]&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;5-15 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;reCAPTCHA v2 (invisible)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ReCaptchaV2TaskProxyLess&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;textarea[name="g-recaptcha-response"]&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;5-15 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;reCAPTCHA v3&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ReCaptchaV3TaskProxyLess&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;input[name="g-recaptcha-response"]&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;3-10 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;reCAPTCHA Enterprise&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ReCaptchaV2EnterpriseTaskProxyLess&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;textarea[name="g-recaptcha-response"]&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;10-20 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cloudflare Turnstile&lt;/td&gt;
&lt;td&gt;&lt;code&gt;AntiTurnstileTaskProxyLess&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;input[name="cf-turnstile-response"]&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;3-10 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AWS WAF&lt;/td&gt;
&lt;td&gt;&lt;code&gt;AntiAwsWafTaskProxyLess&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Custom (site-dependent)&lt;/td&gt;
&lt;td&gt;5-15 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GeeTest v3/v4&lt;/td&gt;
&lt;td&gt;&lt;code&gt;GeeTestTaskProxyLess&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Custom (site-dependent)&lt;/td&gt;
&lt;td&gt;5-15 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
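
&lt;p&gt;The table can be carried into code as a small lookup, so the task type and the token field stay in one place. The following is a minimal sketch, not part of CapSolver's SDK: &lt;code&gt;CaptchaProfile&lt;/code&gt;, &lt;code&gt;PROFILES&lt;/code&gt;, and &lt;code&gt;build_task&lt;/code&gt; are hypothetical names, and the dictionary keys are illustrative labels (the first, second, and fourth match the strings used in the examples above). It assumes that only the site-key-based types share the &lt;code&gt;websiteURL&lt;/code&gt; / &lt;code&gt;websiteKey&lt;/code&gt; payload shape.&lt;/p&gt;

```python
from typing import NamedTuple, Optional


class CaptchaProfile(NamedTuple):
    task_type: str                 # CapSolver task type, copied from the table
    token_selector: Optional[str]  # None where the table says "Custom (site-dependent)"


# Hypothetical labels mapped to the rows of the table above.
PROFILES = {
    "recaptcha-v2": CaptchaProfile("ReCaptchaV2TaskProxyLess", 'textarea[name="g-recaptcha-response"]'),
    "recaptcha-v3": CaptchaProfile("ReCaptchaV3TaskProxyLess", 'input[name="g-recaptcha-response"]'),
    "recaptcha-enterprise": CaptchaProfile("ReCaptchaV2EnterpriseTaskProxyLess", 'textarea[name="g-recaptcha-response"]'),
    "turnstile": CaptchaProfile("AntiTurnstileTaskProxyLess", 'input[name="cf-turnstile-response"]'),
    "aws-waf": CaptchaProfile("AntiAwsWafTaskProxyLess", None),
    "geetest": CaptchaProfile("GeeTestTaskProxyLess", None),
}


def build_task(captcha_type: str, website_url: str, site_key: str) -> dict:
    """Build a task payload for types that are solved from a URL plus a site key."""
    profile = PROFILES.get(captcha_type)
    if profile is None:
        raise ValueError(f"Unsupported CAPTCHA type: {captcha_type}")
    if profile.token_selector is None:
        # Assumption: AWS WAF and GeeTest need site-specific parameters
        # that this sketch does not model.
        raise ValueError(f"{captcha_type} needs site-specific task parameters")
    return {
        "type": profile.task_type,
        "websiteURL": website_url,
        "websiteKey": site_key,
    }
```

&lt;p&gt;The returned dictionary is meant to be passed as the &lt;code&gt;task&lt;/code&gt; field of a &lt;code&gt;createTask&lt;/code&gt; request, and &lt;code&gt;PROFILES[...].token_selector&lt;/code&gt; tells the injection step which field to fill.&lt;/p&gt;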




&lt;h2&gt;
  
  
  Troubleshooting Guide
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Token Expiration Before Form Submission
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Symptom&lt;/strong&gt;: The form is submitted, but the server rejects the CAPTCHA response.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cause&lt;/strong&gt;: CAPTCHA tokens are valid only for a short time (typically 90-120 seconds for reCAPTCHA, 300 seconds for Turnstile). If too much time passes between receiving the token and submitting the form, the token expires and the server rejects it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Resolution&lt;/strong&gt;: Inject and submit the token immediately upon receipt. Avoid introducing unnecessary delays between the bypassing and submission steps.&lt;/p&gt;
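
&lt;p&gt;One way to catch an expired token before it reaches the server is to record when it was received and check its age just before submitting. The sketch below is illustrative only: &lt;code&gt;TimedToken&lt;/code&gt; and &lt;code&gt;TOKEN_TTL_SECONDS&lt;/code&gt; are hypothetical names, and the validity windows are simply the figures quoted above, not values reported by any API.&lt;/p&gt;

```python
import time

# Assumed validity windows in seconds, taken from the figures quoted above.
# The lower bound of the reCAPTCHA range is used to stay on the safe side.
TOKEN_TTL_SECONDS = {"recaptcha": 90, "turnstile": 300}


class TimedToken:
    """Pairs a token with the moment it was received (hypothetical helper)."""

    def __init__(self, value, family, received_at=None):
        self.value = value
        self.family = family  # "recaptcha" or "turnstile"
        self.received_at = time.monotonic() if received_at is None else received_at

    def is_fresh(self, safety_margin=10.0, now=None):
        """Return True while the token is still inside its window minus a margin."""
        now = time.monotonic() if now is None else now
        age = now - self.received_at
        return TOKEN_TTL_SECONDS[self.family] - safety_margin > age
```

&lt;p&gt;If &lt;code&gt;is_fresh()&lt;/code&gt; returns &lt;code&gt;False&lt;/code&gt;, request a new token instead of submitting the stale one.&lt;/p&gt;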

&lt;h3&gt;
  
  
  CAPTCHA Not Detected on Page
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Symptom&lt;/strong&gt;: The detection script reports &lt;code&gt;{ type: 'none' }&lt;/code&gt; even when a CAPTCHA is visibly present.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Potential Causes:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Incomplete page loading&lt;/strong&gt; — Introduce a waiting period after navigation (e.g., &lt;code&gt;time.sleep(3)&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CAPTCHA within an iframe&lt;/strong&gt; — Some reCAPTCHA implementations load inside an iframe. It may be necessary to detect the iframe and extract the site key from the page source or network requests.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dynamic loading&lt;/strong&gt; — The CAPTCHA widget might load asynchronously. Wait for the element to appear before attempting detection.&lt;/li&gt;
&lt;/ol&gt;
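
&lt;p&gt;For the loading-related causes, repeated detection with a timeout is usually more reliable than a single fixed sleep. The helper below is a minimal sketch: &lt;code&gt;wait_for_captcha&lt;/code&gt; is a hypothetical name, and &lt;code&gt;detect&lt;/code&gt; stands for any zero-argument callable that returns the same dictionary shape as the detection script (for example, a lambda wrapping your own detection function).&lt;/p&gt;

```python
import time


def wait_for_captcha(detect, timeout=15.0, interval=0.5,
                     sleep=time.sleep, clock=time.monotonic):
    """Call detect() repeatedly until it reports a CAPTCHA or the timeout passes.

    detect must return a dict shaped like {"type": ..., "siteKey": ...}.
    sleep and clock are injectable so the helper can be tested without waiting.
    """
    deadline = clock() + timeout
    while True:
        info = detect()
        if info.get("type") != "none":
            return info
        if clock() >= deadline:
            # Nothing appeared in time; return the last result and let the caller decide.
            return info
        sleep(interval)
```

&lt;p&gt;This covers incomplete and asynchronous loading. It does not help with iframe-hosted widgets, where the site key still has to come from the page source or network requests.&lt;/p&gt;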

&lt;h3&gt;
  
  
  CapSolver API Errors
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Common Error Scenarios:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Error Code&lt;/th&gt;
&lt;th&gt;Underlying Cause&lt;/th&gt;
&lt;th&gt;Corrective Action&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;ERROR_KEY_DOES_NOT_EXIST&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Invalid API key provided&lt;/td&gt;
&lt;td&gt;Verify your &lt;code&gt;CAPSOLVER_API_KEY&lt;/code&gt; setting&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;ERROR_ZERO_BALANCE&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Insufficient credits in your account&lt;/td&gt;
&lt;td&gt;Recharge your account at &lt;a href="https://www.capsolver.com/?utm_source=dev.to&amp;amp;utm_medium=blog&amp;amp;utm_campaign=vibium"&gt;capsolver.com&lt;/a&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;ERROR_WRONG_CAPTCHA_TYPE&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Incorrect task type specified for the CAPTCHA&lt;/td&gt;
&lt;td&gt;Confirm the CAPTCHA type using the detection utility&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;ERROR_CAPTCHA_UNSOLVABLE&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;The CAPTCHA could not be bypassed&lt;/td&gt;
&lt;td&gt;Attempt a retry, as transient failures can occur&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
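
&lt;p&gt;The table can be turned into a small retry policy, since only &lt;code&gt;ERROR_CAPTCHA_UNSOLVABLE&lt;/code&gt; is described as transient. The sketch below assumes the API response is a decoded JSON dictionary that carries the code in an &lt;code&gt;errorCode&lt;/code&gt; field. &lt;code&gt;CapSolverError&lt;/code&gt;, &lt;code&gt;FIX_HINTS&lt;/code&gt;, and &lt;code&gt;raise_for_error&lt;/code&gt; are hypothetical names, not part of any SDK.&lt;/p&gt;

```python
# Only this code is described as transient in the table above.
RETRYABLE_ERRORS = {"ERROR_CAPTCHA_UNSOLVABLE"}

# Codes that need a fix on the caller's side, with the suggested action.
FIX_HINTS = {
    "ERROR_KEY_DOES_NOT_EXIST": "check the CAPSOLVER_API_KEY setting",
    "ERROR_ZERO_BALANCE": "recharge the account",
    "ERROR_WRONG_CAPTCHA_TYPE": "re-run detection and pick the matching task type",
}


class CapSolverError(Exception):
    """Hypothetical exception carrying the error code and a retry hint."""

    def __init__(self, code, retryable, hint=""):
        super().__init__(f"{code}: {hint}" if hint else code)
        self.code = code
        self.retryable = retryable


def raise_for_error(response):
    """Return the response unchanged, or raise CapSolverError if it has an errorCode.

    Codes that are not in the table are treated as non-retryable; that is a
    conservative assumption, not documented behaviour.
    """
    code = response.get("errorCode")
    if not code:
        return response
    raise CapSolverError(code, retryable=code in RETRYABLE_ERRORS,
                         hint=FIX_HINTS.get(code, ""))
```

&lt;p&gt;A retry wrapper can then re-raise immediately when &lt;code&gt;retryable&lt;/code&gt; is &lt;code&gt;False&lt;/code&gt; rather than spending attempts on an invalid key or an empty balance.&lt;/p&gt;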

&lt;h3&gt;
  
  
  CORS Issues During CapSolver API Calls
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Symptom&lt;/strong&gt;: API requests originating from the browser fail due to Cross-Origin Resource Sharing (CORS) policies.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cause&lt;/strong&gt;: This occurs when attempting to invoke the CapSolver API from within &lt;code&gt;browser_evaluate&lt;/code&gt; (i.e., from the browser's context). The CapSolver API does not permit cross-origin requests from arbitrary websites.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Resolution&lt;/strong&gt;: Always make CapSolver API calls from your &lt;strong&gt;script's environment&lt;/strong&gt; (Node.js, Python, or Java process), not from within the browser. &lt;code&gt;browser_evaluate&lt;/code&gt; should be reserved for detection (reading the DOM) and injection (setting form values). API interactions must be handled server-side.&lt;/p&gt;

&lt;h3&gt;
  
  
  Form Submission Failure
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Symptom&lt;/strong&gt;: The token is injected, but the form either fails to submit or the server does not accept it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Potential Causes:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Missing callback trigger&lt;/strong&gt; — Many reCAPTCHA implementations require the callback function to be invoked with the token, not merely setting the textarea value. Refer to the &lt;code&gt;injectToken&lt;/code&gt; function example above, which traverses &lt;code&gt;___grecaptcha_cfg.clients&lt;/code&gt; to locate and trigger the callback.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Custom form validation&lt;/strong&gt; — The website may incorporate additional JavaScript validation. Inspect the form's submit handler in developer tools.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Token format discrepancy&lt;/strong&gt; — Ensure that &lt;code&gt;gRecaptchaResponse&lt;/code&gt; is used for reCAPTCHA and &lt;code&gt;token&lt;/code&gt; for Turnstile, as provided by the CapSolver result.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Best Practices
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Implement a Sensible Polling Interval
&lt;/h3&gt;

&lt;p&gt;Query &lt;code&gt;/getTaskResult&lt;/code&gt; every &lt;strong&gt;2 seconds&lt;/strong&gt;. More frequent polling can lead to wasted API calls and potential rate limiting. Less frequent polling introduces unnecessary latency.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// JavaScript: Optimal — 2-second interval&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;setTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2000&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Python: Optimal — 2-second interval
&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Java: Optimal — 2-second interval&lt;/span&gt;
&lt;span class="nc"&gt;Thread&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sleep&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2000&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
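
&lt;p&gt;The sleep calls above only set the interval; a polling loop also needs an upper bound so a stuck task cannot block the agent forever. The following sketch shows one way to combine the two. It assumes that the &lt;code&gt;/getTaskResult&lt;/code&gt; response is a JSON object with &lt;code&gt;status&lt;/code&gt;, &lt;code&gt;solution&lt;/code&gt;, &lt;code&gt;errorId&lt;/code&gt;, and &lt;code&gt;errorCode&lt;/code&gt; fields; &lt;code&gt;poll_task_result&lt;/code&gt; and &lt;code&gt;get_result&lt;/code&gt; are hypothetical names, and the 120-second cap is an arbitrary choice.&lt;/p&gt;

```python
import time


def poll_task_result(get_result, interval=2.0, max_wait=120.0, sleep=time.sleep):
    """Call get_result() every `interval` seconds until the task is ready.

    get_result is a zero-argument callable that performs the /getTaskResult
    request and returns the decoded JSON body as a dict.
    """
    waited = 0.0
    while True:
        result = get_result()
        if result.get("status") == "ready":
            return result.get("solution", {})
        if result.get("errorId"):
            # A non-zero errorId is assumed to mean the task failed.
            raise RuntimeError(result.get("errorCode", "unknown CapSolver error"))
        if waited >= max_wait:
            raise TimeoutError(f"task not ready after {max_wait} seconds")
        sleep(interval)
        waited += interval
```

&lt;p&gt;Passing the HTTP call in as &lt;code&gt;get_result&lt;/code&gt; keeps the loop independent of the HTTP library and makes it easy to test with canned responses.&lt;/p&gt;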



&lt;h3&gt;
  
  
  2. Incorporate Retry Logic with Exponential Backoff
&lt;/h3&gt;

&lt;p&gt;CAPTCHA bypassing occasionally fails for transient reasons. Wrap your bypass function in a retry loop that doubles the wait after each failed attempt (1 second, then 2 seconds with the default of three attempts):&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;JavaScript:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;bypassWithRetry&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;info&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;retries&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;retries&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;bypassCaptcha&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;info&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="nx"&gt;retries&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="nx"&gt;e&lt;/span&gt;
      &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;setTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Python:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;bypass_with_retry&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;retries&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;retries&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;bypass_captcha&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;retries&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;raise&lt;/span&gt;
            &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Utilize the Appropriate Task Type for Each CAPTCHA
&lt;/h3&gt;

&lt;p&gt;Using the wrong task type causes the bypass to fail. Always detect the CAPTCHA type first, then map it to the corresponding CapSolver task:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;CAPTCHA Type&lt;/th&gt;
&lt;th&gt;CapSolver Task Type&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;reCAPTCHA v2 (checkbox)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ReCaptchaV2TaskProxyLess&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;reCAPTCHA v2 (invisible)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ReCaptchaV2TaskProxyLess&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;reCAPTCHA v3&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ReCaptchaV3TaskProxyLess&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;reCAPTCHA v2 Enterprise&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ReCaptchaV2EnterpriseTaskProxyLess&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;reCAPTCHA v3 Enterprise&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ReCaptchaV3EnterpriseTaskProxyLess&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cloudflare Turnstile&lt;/td&gt;
&lt;td&gt;&lt;code&gt;AntiTurnstileTaskProxyLess&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AWS WAF&lt;/td&gt;
&lt;td&gt;&lt;code&gt;AntiAwsWafTaskProxyLess&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
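
&lt;p&gt;In code, the mapping can live in a single lookup table so that an unsupported type fails loudly instead of sending the wrong task. The sketch below covers only the three types that the detection script in this guide returns (&lt;code&gt;recaptcha-v2&lt;/code&gt;, &lt;code&gt;recaptcha-v3&lt;/code&gt;, &lt;code&gt;turnstile&lt;/code&gt;). &lt;code&gt;TASK_TYPES&lt;/code&gt; and &lt;code&gt;build_task&lt;/code&gt; are hypothetical names, and the &lt;code&gt;websiteURL&lt;/code&gt; and &lt;code&gt;websiteKey&lt;/code&gt; field names are assumptions about the task payload that should be checked against the CapSolver documentation.&lt;/p&gt;

```python
# Detection result type -> CapSolver task type, following the table above.
TASK_TYPES = {
    "recaptcha-v2": "ReCaptchaV2TaskProxyLess",
    "recaptcha-v3": "ReCaptchaV3TaskProxyLess",
    "turnstile": "AntiTurnstileTaskProxyLess",
}


def build_task(info, page_url):
    """Build the task payload for a detection result, or fail on unknown types.

    info is a dict shaped like {"type": ..., "siteKey": ...}.
    """
    task_type = TASK_TYPES.get(info.get("type"))
    if task_type is None:
        raise ValueError(f"no CapSolver task type mapped for {info.get('type')!r}")
    return {
        "type": task_type,
        "websiteURL": page_url,         # assumed field name
        "websiteKey": info["siteKey"],  # assumed field name
    }
```

&lt;p&gt;Enterprise and AWS WAF entries can be added to the same dictionary once the detection step is able to distinguish them.&lt;/p&gt;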

&lt;h3&gt;
  
  
  4. Immediate Injection and Submission
&lt;/h3&gt;

&lt;p&gt;CAPTCHA tokens have a limited lifespan. Once the token is received from CapSolver, inject it and submit the form as swiftly as possible. Avoid introducing artificial delays between the bypassing and submission phases.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Monitor Balance Before Extended Operations
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;JavaScript:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;CAPSOLVER_API&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/getBalance`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;clientKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;API_KEY&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;balance&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;balance&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;warn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Low CapSolver balance!&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Python:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;balance&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;CAPSOLVER_API&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/getBalance&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;clientKey&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;API_KEY&lt;/span&gt;&lt;span class="p"&gt;}).&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;balance&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;balance&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Low CapSolver balance!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  6. Maintain Server-Side API Calls
&lt;/h3&gt;

&lt;p&gt;Never invoke the CapSolver API from within &lt;code&gt;browser_evaluate&lt;/code&gt;. HTTP requests made from the browser context will fail due to CORS restrictions, and exposing your API key in browser-side JavaScript poses a security risk. All API calls must originate from your application's process (Node.js, Python, or Java).&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;The integration of Vibium with the CapSolver API demonstrates that browser extensions are not a prerequisite for bypassing CAPTCHAs in automated workflows. When a tool like Vibium imposes restrictions on Chrome flags, the API-based approach offers &lt;strong&gt;enhanced control, rather than diminished capabilities&lt;/strong&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Detect&lt;/strong&gt; the CAPTCHA type and site key using &lt;code&gt;browser_evaluate&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bypass&lt;/strong&gt; the CAPTCHA by invoking the CapSolver REST API from your script.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Inject&lt;/strong&gt; the obtained token back into the page via &lt;code&gt;browser_evaluate&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Submit&lt;/strong&gt; the form.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This approach works with any browser automation tool that supports JavaScript evaluation, not just Vibium. Whether you use WebDriver BiDi, CDP, or another protocol, the same detect, bypass, inject, and submit sequence applies.&lt;/p&gt;

&lt;p&gt;By combining Vibium's standards-compliant browser automation with CapSolver's efficient and dependable CAPTCHA bypassing API, a robust pipeline is established for seamless automated operations.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>browser</category>
      <category>cloudflarechallenge</category>
    </item>
    <item>
      <title>How to Bypass CAPTCHAs in Vibium: A Complete Guide for AI Agents</title>
      <dc:creator>luisgustvo</dc:creator>
      <pubDate>Fri, 27 Mar 2026 07:30:45 +0000</pubDate>
      <link>https://dev.to/luisgustvo/how-to-bypass-captchas-in-vibium-a-complete-guide-for-ai-agents-6ki</link>
      <guid>https://dev.to/luisgustvo/how-to-bypass-captchas-in-vibium-a-complete-guide-for-ai-agents-6ki</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz8xnv7sdni4r4wblko4b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz8xnv7sdni4r4wblko4b.png" alt="Bypass CAPTCHA in Vibium" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the world of &lt;strong&gt;AI browser automation&lt;/strong&gt;, CAPTCHAs remain the most significant hurdle. When AI agents attempt to navigate protected pages or submit forms, these security measures often stall workflows, requiring manual human intervention.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Vibium&lt;/strong&gt; has emerged as a powerful, next-generation automation tool designed specifically for AI agents. Built on the modern &lt;strong&gt;WebDriver BiDi protocol&lt;/strong&gt; by the creators of Selenium and Appium, it offers a high-performance, standards-based way to control browsers. However, Vibium presents a unique challenge: it hardcodes the &lt;code&gt;--disable-extensions&lt;/code&gt; flag, meaning traditional browser extension-based CAPTCHA bypassers won't work.&lt;/p&gt;

&lt;p&gt;This is where &lt;a href="https://www.capsolver.com/?utm_source=dev.to&amp;amp;utm_medium=blog&amp;amp;utm_campaign=vibium"&gt;CapSolver&lt;/a&gt; comes in. By utilizing the &lt;strong&gt;CapSolver REST API&lt;/strong&gt;, you can bypass CAPTCHAs programmatically without needing any browser extensions. This guide will show you how to integrate CapSolver with Vibium to create seamless, automated workflows for your AI agents.&lt;/p&gt;




&lt;h2&gt;
  
  
  Understanding Vibium
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/VibiumDev/vibium" rel="noopener noreferrer"&gt;Vibium&lt;/a&gt; is a streamlined browser automation platform. It is distributed as a single Go binary, making it incredibly easy to install and deploy. Unlike older tools that rely on the Chrome DevTools Protocol (CDP), Vibium leverages the &lt;strong&gt;WebDriver BiDi protocol&lt;/strong&gt; for faster, bidirectional communication.&lt;/p&gt;

&lt;h3&gt;
  
  
  Core Advantages of Vibium
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;WebDriver BiDi Support&lt;/strong&gt;: Provides a standardized, high-speed connection to the browser.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Native AI Integration&lt;/strong&gt;: Includes a built-in &lt;strong&gt;MCP (Model Context Protocol) server&lt;/strong&gt;, allowing AI agents to control the browser directly.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Semantic Interaction&lt;/strong&gt;: Agents can find elements based on their meaning (e.g., "the checkout button") rather than brittle CSS selectors.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Cross-Language SDKs&lt;/strong&gt;: Official support for Python, JavaScript/TypeScript, and Java.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Zero-Config Setup&lt;/strong&gt;: A single binary with no external dependencies.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For AI agents, Vibium acts as a bridge, allowing them to interact with the web using natural language commands while maintaining the precision of a programmatic API.&lt;/p&gt;




&lt;h2&gt;
  
  
  What is CapSolver?
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.capsolver.com/?utm_source=dev.to&amp;amp;utm_medium=blog&amp;amp;utm_campaign=vibium"&gt;CapSolver&lt;/a&gt; is an industry-leading &lt;strong&gt;CAPTCHA bypassing service&lt;/strong&gt; powered by advanced AI. It provides automated solutions for a wide variety of anti-bot challenges, ensuring your automation scripts remain uninterrupted.&lt;/p&gt;

&lt;h3&gt;
  
  
  Supported CAPTCHA Solutions
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;a href="https://www.capsolver.com/products/recaptchav2" rel="noopener noreferrer"&gt;&lt;strong&gt;reCAPTCHA v2 &amp;amp; v3&lt;/strong&gt;&lt;/a&gt; (including Enterprise versions)&lt;/li&gt;
&lt;li&gt;  &lt;a href="https://www.capsolver.com/products/cloudflare" rel="noopener noreferrer"&gt;&lt;strong&gt;Cloudflare Turnstile&lt;/strong&gt;&lt;/a&gt; &amp;amp; 5-second Challenges&lt;/li&gt;
&lt;li&gt;  &lt;a href="https://www.capsolver.com/products/awswaf" rel="noopener noreferrer"&gt;&lt;strong&gt;AWS WAF CAPTCHA&lt;/strong&gt;&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;a href="https://docs.capsolver.com/en/guide/captcha/geetest/" rel="noopener noreferrer"&gt;&lt;strong&gt;GeeTest v3/v4&lt;/strong&gt;&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;And many other anti-bot mechanisms.&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Why the API-Based Approach is Superior for Vibium
&lt;/h2&gt;

&lt;p&gt;With frameworks such as Playwright or Puppeteer, CAPTCHAs are commonly handled by loading a solver extension into Chrome. Because Vibium hardcodes the &lt;code&gt;--disable-extensions&lt;/code&gt; flag, that route is closed, so we call the &lt;strong&gt;CapSolver API&lt;/strong&gt; directly. This method is also more robust and gives the agent greater control.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Extension-Based (Playwright/Puppeteer)&lt;/th&gt;
&lt;th&gt;API-Based (Vibium + CapSolver)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Mechanism&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Automatic detection via extension&lt;/td&gt;
&lt;td&gt;Explicit API calls and token injection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Extension Required&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;No&lt;/strong&gt; (Pure HTTP)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Agent Control&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Opaque/Automatic&lt;/td&gt;
&lt;td&gt;Full programmatic control&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Compatibility&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Limited by browser flags&lt;/td&gt;
&lt;td&gt;Works with any configuration&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Flexibility&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Fixed logic&lt;/td&gt;
&lt;td&gt;Customizable retry and injection logic&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;By using the API, you can precisely manage when a CAPTCHA is bypassed and how the resulting token is submitted, making it the ideal choice for restricted environments.&lt;/p&gt;




&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;p&gt;To get started, ensure you have the following:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Vibium Installed&lt;/strong&gt;: Get it from the &lt;a href="https://github.com/VibiumDev/vibium" rel="noopener noreferrer"&gt;official GitHub repository&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;CapSolver Account&lt;/strong&gt;: &lt;a href="https://www.capsolver.com/?utm_source=dev.to&amp;amp;utm_medium=blog&amp;amp;utm_campaign=vibium"&gt;Sign up here&lt;/a&gt; to get your API key.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Development Environment&lt;/strong&gt;: Node.js 18+, Python 3.8+, or Java 17+.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Installing Vibium
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Quick install for macOS / Linux&lt;/span&gt;
curl &lt;span class="nt"&gt;-fsSL&lt;/span&gt; https://vibium.dev/install.sh | bash

&lt;span class="c"&gt;# Verify installation&lt;/span&gt;
vibium &lt;span class="nt"&gt;--version&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Vibium manages its own browser instances, so you don't need to worry about installing specific versions of Chromium or Chrome for Testing.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step-by-Step Integration Guide
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Configure Your API Key
&lt;/h3&gt;

&lt;p&gt;Sign up at &lt;a href="https://www.capsolver.com/?utm_source=dev.to&amp;amp;utm_medium=blog&amp;amp;utm_campaign=vibium"&gt;CapSolver&lt;/a&gt; and retrieve your API key from the dashboard. Set it as an environment variable:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;CAPSOLVER_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"CAP-YOUR_ACTUAL_API_KEY"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Install Dependencies
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;For Node.js:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install &lt;/span&gt;vibium
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;For Python:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;vibium requests
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Detect CAPTCHAs on the Page
&lt;/h3&gt;

&lt;p&gt;Use Vibium's &lt;code&gt;browser_evaluate&lt;/code&gt; to inspect the DOM and identify the CAPTCHA type and site key.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;JavaScript Example:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;vibium/sync&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;detectCaptcha&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`(() =&amp;gt; {
    const v2 = document.querySelector('.g-recaptcha');
    if (v2) return { type: 'recaptcha-v2', siteKey: v2.getAttribute('data-sitekey') };

    for (const s of document.querySelectorAll('script[src*="recaptcha/api.js"]')) {
      const m = s.src.match(/render=([^&amp;amp;]+)/);
      if (m &amp;amp;&amp;amp; m[1] !== 'explicit') return { type: 'recaptcha-v3', siteKey: m[1] };
    }

    const t = document.querySelector('.cf-turnstile');
    if (t) return { type: 'turnstile', siteKey: t.getAttribute('data-sitekey') };

    return { type: 'none', siteKey: null };
  })()`&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Python Example:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;vibium&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;browser&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;detect_captcha&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;(() =&amp;gt; {
        const v2 = document.querySelector(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;.g-recaptcha&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;);
        if (v2) return { type: &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;recaptcha-v2&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, siteKey: v2.getAttribute(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;data-sitekey&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;) };

        for (const s of document.querySelectorAll(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;script[src*=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;recaptcha/api.js&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;]&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;)) {
            const m = s.src.match(/render=([^&amp;amp;]+)/);
            if (m &amp;amp;&amp;amp; m[1] !== &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;explicit&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;) return { type: &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;recaptcha-v3&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, siteKey: m[1] };
        }

        const t = document.querySelector(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;.cf-turnstile&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;);
        if (t) return { type: &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;turnstile&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, siteKey: t.getAttribute(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;data-sitekey&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;) };

        return { type: &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;none&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, siteKey: null };
    })()&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Java Example:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;evaluate&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""
    (() =&amp;gt; {
        const v2 = document.querySelector('.g-recaptcha');
        if (v2) return { type: 'recaptcha-v2', siteKey: v2.getAttribute('data-sitekey') };

        for (const s of document.querySelectorAll('script[src*="recaptcha/api.js"]')) {
            const m = s.src.match(/render=([^&amp;amp;]+)/);
            if (m &amp;amp;&amp;amp; m[1] !== 'explicit') return { type: 'recaptcha-v3', siteKey: m[1] };
        }

        const t = document.querySelector('.cf-turnstile');
        if (t) return { type: 'turnstile', siteKey: t.getAttribute('data-sitekey') };

        return { type: 'none', siteKey: null };
    })()
    """&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;captchaType&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;((&lt;/span&gt;&lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"type"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;siteKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;((&lt;/span&gt;&lt;span class="nc"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"siteKey"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. Bypass and Inject the Token
&lt;/h3&gt;

&lt;p&gt;Once a CAPTCHA is detected, call the CapSolver API from your main script to solve the challenge, then inject the returned token into the page.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;JavaScript Implementation:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;CAPSOLVER_API&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://api.capsolver.com&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;API_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;CAPSOLVER_API_KEY&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;createTask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;taskData&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;CAPSOLVER_API&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/createTask`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;clientKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;task&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;taskData&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;errorId&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`CapSolver: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;errorDescription&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;taskId&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;getTaskResult&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;taskId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;maxAttempts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;maxAttempts&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;setTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2000&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;CAPSOLVER_API&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/getTaskResult`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;clientKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;taskId&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ready&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;failed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;errorId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Failed: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;errorDescription&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Timeout&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Full Workflow (Python):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;vibium&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;browser&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;

&lt;span class="n"&gt;CAPSOLVER_API&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.capsolver.com&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;API_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CAPSOLVER_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;bro&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;start&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;page&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bro&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;page&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# 1. Navigate to the target page
&lt;/span&gt;    &lt;span class="n"&gt;target_url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://example.com/protected-page&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;go&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target_url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# 2. Detect the CAPTCHA
&lt;/span&gt;    &lt;span class="n"&gt;info&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;(() =&amp;gt; {
        const el = document.querySelector(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;.g-recaptcha&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;);
        return el ? { type: &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;recaptcha-v2&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, siteKey: el.getAttribute(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;data-sitekey&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;) }
                   : { type: &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;none&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, siteKey: null };
    })()&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;none&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;No CAPTCHA found.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
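        &lt;span class="n"&gt;bro&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stop&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt;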

    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Detected &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; — key &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;siteKey&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# 3. Bypass via CapSolver API
&lt;/span&gt;    &lt;span class="c1"&gt;# (Assuming solve_captcha helper is implemented)
&lt;/span&gt;    &lt;span class="n"&gt;token&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;solve_captcha&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target_url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Solved!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# 4. Inject the token and submit the form (replace #recaptcha-demo-form with your form's selector)
&lt;/span&gt;    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
        document.querySelector(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;textarea[name=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;g-recaptcha-response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;]&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;).value = &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;;
        try {{ const c = ___grecaptcha_cfg.clients; for (const id in c) {{
            const f = (o) =&amp;gt; {{ for (const k in o) {{ if (typeof o[k]===&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;object&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;&amp;amp;&amp;amp;o[k]!==null) {{
                if (typeof o[k].callback===&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;function&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;){{o[k].callback(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;token&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;);return true}}
                if(f(o[k]))return true}}}} return false}}; f(c[id]) }}}} catch(e){{}}
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;document.querySelector(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;#recaptcha-demo-form&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;).submit()&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# 5. Verify success
&lt;/span&gt;    &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Result:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;document.body.innerText&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;bro&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stop&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
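
&lt;p&gt;The workflow above assumes a &lt;code&gt;solve_captcha&lt;/code&gt; helper without showing it. The following is a minimal sketch of one possible version, not a definitive implementation. It handles reCAPTCHA v2 only and mirrors the create-then-poll flow of the JavaScript functions. It uses the standard library's &lt;code&gt;urllib&lt;/code&gt; rather than &lt;code&gt;requests&lt;/code&gt; so that it runs without extra packages. The &lt;code&gt;post&lt;/code&gt; and &lt;code&gt;sleep&lt;/code&gt; parameters are hypothetical hooks added for testing, and the payload fields (&lt;code&gt;websiteURL&lt;/code&gt;, &lt;code&gt;websiteKey&lt;/code&gt;) and the &lt;code&gt;solution.gRecaptchaResponse&lt;/code&gt; result field are assumptions about CapSolver's reCAPTCHA v2 schema that you should check against the official documentation.&lt;/p&gt;

```python
import json
import os
import time
import urllib.request

CAPSOLVER_API = "https://api.capsolver.com"


def post_json(url, payload):
    """POST a JSON payload and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def solve_captcha(info, page_url, api_key=None, post=post_json,
                  sleep=time.sleep, max_attempts=60):
    """Create a CapSolver task for a detected reCAPTCHA v2 and poll for its token."""
    if info["type"] != "recaptcha-v2":
        raise ValueError("this sketch only handles recaptcha-v2, got " + info["type"])
    api_key = api_key or os.environ["CAPSOLVER_API_KEY"]

    created = post(CAPSOLVER_API + "/createTask", {
        "clientKey": api_key,
        "task": {
            "type": "ReCaptchaV2TaskProxyLess",
            "websiteURL": page_url,          # assumed field name
            "websiteKey": info["siteKey"],   # assumed field name
        },
    })
    if created.get("errorId") != 0:
        raise RuntimeError("CapSolver: " + str(created.get("errorDescription")))

    for _ in range(max_attempts):
        sleep(2)  # same 2-second interval as the JavaScript poller
        result = post(CAPSOLVER_API + "/getTaskResult", {
            "clientKey": api_key,
            "taskId": created["taskId"],
        })
        if result.get("status") == "ready":
            return result["solution"]["gRecaptchaResponse"]  # assumed result field
        if result.get("status") == "failed" or result.get("errorId"):
            raise RuntimeError("CapSolver: " + str(result.get("errorDescription")))
    raise TimeoutError("CapSolver task did not finish in time")
```

&lt;p&gt;With this helper defined above &lt;code&gt;main()&lt;/code&gt;, the call &lt;code&gt;solve_captcha(info, target_url)&lt;/code&gt; in step 3 reads the API key from the &lt;code&gt;CAPSOLVER_API_KEY&lt;/code&gt; environment variable and returns the token string.&lt;/p&gt;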






&lt;h2&gt;
  
  
  Supported CAPTCHA Task Types
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;CAPTCHA Type&lt;/th&gt;
&lt;th&gt;CapSolver Task Type&lt;/th&gt;
&lt;th&gt;Token Injection Field&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;reCAPTCHA v2&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ReCaptchaV2TaskProxyLess&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;textarea[name="g-recaptcha-response"]&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;reCAPTCHA v3&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;ReCaptchaV3TaskProxyLess&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;input[name="g-recaptcha-response"]&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cloudflare Turnstile&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;AntiTurnstileTaskProxyLess&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;input[name="cf-turnstile-response"]&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AWS WAF&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;AntiAwsWafTaskProxyLess&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Site-specific&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
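
&lt;p&gt;The table can be expressed as a lookup keyed by the &lt;code&gt;type&lt;/code&gt; value returned by the detection step. This is a rough sketch: &lt;code&gt;TASKS&lt;/code&gt; and &lt;code&gt;build_task&lt;/code&gt; are hypothetical names, the &lt;code&gt;websiteURL&lt;/code&gt; and &lt;code&gt;websiteKey&lt;/code&gt; payload fields are assumptions to verify against CapSolver's documentation, and some task types (reCAPTCHA v3 in particular) may need additional fields that are not shown here.&lt;/p&gt;

```python
# Detector "type" value -> (CapSolver task type, CSS selector of the response field).
# Mirrors the table above; AWS WAF is left out because its injection is site-specific.
TASKS = {
    "recaptcha-v2": ("ReCaptchaV2TaskProxyLess", 'textarea[name="g-recaptcha-response"]'),
    "recaptcha-v3": ("ReCaptchaV3TaskProxyLess", 'input[name="g-recaptcha-response"]'),
    "turnstile": ("AntiTurnstileTaskProxyLess", 'input[name="cf-turnstile-response"]'),
}


def build_task(info, page_url):
    """Return the task payload and the injection selector for a detection result."""
    if info["type"] not in TASKS:
        raise ValueError("unsupported CAPTCHA type: " + info["type"])
    task_type, selector = TASKS[info["type"]]
    task = {
        "type": task_type,
        "websiteURL": page_url,         # assumed field name
        "websiteKey": info["siteKey"],  # assumed field name
    }
    return task, selector
```

&lt;p&gt;Keeping the mapping in one place means the detection code, the API call, and the injection step cannot drift apart when a new CAPTCHA type is added.&lt;/p&gt;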




&lt;h2&gt;
  
  
  Troubleshooting &amp;amp; Best Practices
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Common Issues
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;strong&gt;Token Expiration&lt;/strong&gt;: reCAPTCHA tokens expire about 2 minutes after they are issued, and tokens from other providers are similarly short-lived. Inject the token and submit the form immediately after receiving it.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;CORS Errors&lt;/strong&gt;: Never call the CapSolver API from within &lt;code&gt;browser_evaluate&lt;/code&gt;. Always make API calls from your main script (Node/Python/Java) to avoid security and cross-origin issues.&lt;/li&gt;
&lt;li&gt;  &lt;strong&gt;Callback Functions&lt;/strong&gt;: Many sites use JavaScript callbacks to handle CAPTCHA submission. Use the injection script provided above to find and trigger these callbacks automatically.&lt;/li&gt;
&lt;/ul&gt;
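
&lt;p&gt;One way to guard against the token-expiration issue is to record when each token arrived and refuse to inject one that is close to expiring. The sketch below is illustrative only: &lt;code&gt;SolvedToken&lt;/code&gt; is a hypothetical helper, and the 15-second safety margin is an arbitrary assumption, not a CapSolver recommendation.&lt;/p&gt;

```python
import time

TOKEN_TTL_SECONDS = 120     # the two-minute lifetime quoted above
SAFETY_MARGIN_SECONDS = 15  # hypothetical buffer for injection and form submission


class SolvedToken:
    """A CAPTCHA token together with the moment it was received."""

    def __init__(self, value, clock=time.monotonic):
        self.value = value
        self._clock = clock
        self._received_at = clock()

    def age(self):
        """Seconds elapsed since the token was received."""
        return self._clock() - self._received_at

    def is_fresh(self):
        # Fresh means time remains even after reserving the safety margin.
        remaining = TOKEN_TTL_SECONDS - SAFETY_MARGIN_SECONDS - self.age()
        return remaining > 0
```

&lt;p&gt;If &lt;code&gt;is_fresh()&lt;/code&gt; returns &lt;code&gt;False&lt;/code&gt; just before injection, request a new token instead of submitting the stale one.&lt;/p&gt;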

&lt;h3&gt;
  
  
  Best Practices for High Reliability
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Polling Interval&lt;/strong&gt;: Poll the CapSolver API roughly every &lt;strong&gt;2 seconds&lt;/strong&gt;, as the examples above do. This returns tokens promptly without flooding the API with requests.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Retry Logic&lt;/strong&gt;: Implement exponential backoff for your API calls to handle transient network failures.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Balance Monitoring&lt;/strong&gt;: Check your CapSolver balance programmatically before starting large automation runs to avoid interruptions.&lt;/li&gt;
&lt;/ol&gt;
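
&lt;p&gt;The retry advice can be sketched as a small wrapper around any API call. This is one possible shape rather than a prescribed implementation: &lt;code&gt;with_backoff&lt;/code&gt; and its defaults (five attempts, a one-second base delay, a 30-second cap) are hypothetical choices. It retries on &lt;code&gt;OSError&lt;/code&gt;, which covers Python's built-in connection and timeout errors. Pass a different tuple via &lt;code&gt;retry_on&lt;/code&gt; if your HTTP client raises other exception types.&lt;/p&gt;

```python
import time


def with_backoff(call, attempts=5, base_delay=1.0, max_delay=30.0,
                 retry_on=(OSError,), sleep=time.sleep):
    """Run call(), retrying on transient errors with exponentially growing delays."""
    for attempt in range(attempts):
        try:
            return call()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error
            # Delay doubles on each failure (1s, 2s, 4s, ...) up to max_delay.
            sleep(min(base_delay * 2 ** attempt, max_delay))
```

&lt;p&gt;Wrap only the individual HTTP requests, not the whole polling loop, so that a single dropped connection does not create a second, separately billed task.&lt;/p&gt;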




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Integrating &lt;strong&gt;Vibium&lt;/strong&gt; with the &lt;strong&gt;CapSolver API&lt;/strong&gt; provides a robust, future-proof solution for bypassing CAPTCHAs in AI-driven browser automation. While Vibium's restriction on extensions might seem like a limitation, the API-based approach offers superior control and flexibility.&lt;/p&gt;

&lt;p&gt;By following this guide, you can ensure your AI agents navigate the web smoothly, overcoming security obstacles with ease. Ready to scale your automation? &lt;a href="https://www.capsolver.com/?utm_source=dev.to&amp;amp;utm_medium=blog&amp;amp;utm_campaign=vibium"&gt;Sign up for CapSolver&lt;/a&gt; today and start bypassing!&lt;/p&gt;

</description>
      <category>captcha</category>
      <category>ai</category>
      <category>agents</category>
      <category>browser</category>
    </item>
    <item>
      <title>Solving CAPTCHA Challenges with Vercel Agent Browser: A CapSolver Integration Guide</title>
      <dc:creator>luisgustvo</dc:creator>
      <pubDate>Mon, 23 Mar 2026 10:16:18 +0000</pubDate>
      <link>https://dev.to/luisgustvo/solving-captcha-challenges-with-vercel-agent-browser-a-capsolver-integration-guide-4l21</link>
      <guid>https://dev.to/luisgustvo/solving-captcha-challenges-with-vercel-agent-browser-a-capsolver-integration-guide-4l21</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8yrckkrgt4w8u2jfff5w.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8yrckkrgt4w8u2jfff5w.webp" alt="Solve CAPTCHA with Vercel Agent Browser " width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When an AI agent encounters a CAPTCHA, the automated workflow is disrupted. Navigation halts, form submissions fail, and data extraction becomes impossible, all due to security measures designed to prevent automated access. Vercel Agent Browser, a high-performance, native Rust CLI, is specifically engineered for headless browser automation in AI agent contexts. It offers features like accessibility-first element selection, semantic locators, and an LLM-optimized snapshot-ref workflow. However, like any browser automation tool, it can be impeded by CAPTCHAs.&lt;/p&gt;

&lt;p&gt;CapSolver offers a transformative solution. By integrating the CapSolver Chrome extension into Agent Browser via the &lt;code&gt;--extension&lt;/code&gt; flag, CAPTCHAs are automatically and seamlessly resolved in the background. This eliminates the need for manual intervention or complex API orchestrations. Your command-line operations continue uninterrupted, as if no CAPTCHA ever appeared.&lt;/p&gt;

&lt;p&gt;A significant advantage is Agent Browser's support for extensions in &lt;strong&gt;both headed and headless modes&lt;/strong&gt;, a capability not shared by tools like Playwright, which typically require headed mode for extensions. This ensures that your production pipelines, CI/CD workflows, and serverless deployments can operate without any display requirements. Your agent can then concentrate on its core functions—navigating web pages, extracting data, and automating tasks—while CapSolver efficiently manages CAPTCHA resolution.&lt;/p&gt;

&lt;h2&gt;
  
  
  Introduction to Vercel Agent Browser
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/vercel-labs/agent-browser" rel="noopener noreferrer"&gt;Vercel Agent Browser&lt;/a&gt; is a headless browser automation command-line interface developed in Rust for superior performance. Created by Vercel Labs, it provides a CLI to control Chrome without relying on Playwright or Node.js for the browser daemon. Its design prioritizes accessibility, utilizing semantic locators and snapshot references, making it an ideal tool for AI agents interacting with web content.&lt;/p&gt;

&lt;h3&gt;
  
  
  Core Capabilities
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Native Rust CLI&lt;/strong&gt;: A rapid, single-binary tool with no runtime dependencies for the browser daemon.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Snapshot-Ref Workflow&lt;/strong&gt;: Generates an accessibility tree with element references, enabling deterministic, fast, and AI-friendly interactions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Semantic Locators&lt;/strong&gt;: Facilitates element identification using ARIA roles, text content, labels, placeholders, or alt text, avoiding fragile CSS selectors.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Headless Extension Support&lt;/strong&gt;: Allows loading Chrome extensions in both headed and headless modes, leveraging Chrome's &lt;code&gt;--headless=new&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Session Management&lt;/strong&gt;: Provides isolated sessions, persistent profiles, encrypted state storage, and an authentication vault for credential handling.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;JSON Output Mode&lt;/strong&gt;: Delivers machine-readable output for agent pipelines when using &lt;code&gt;--json&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cloud Provider Integration&lt;/strong&gt;: Includes built-in support for hosted browser services such as Browserless, Browserbase, Browser Use, and Kernel, as well as the iOS Simulator.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security Features&lt;/strong&gt;: Incorporates domain allowlists, action policies, content boundaries, and confirmation gates to ensure secure AI agent deployments.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Agent Browser works across a range of web environments, including authenticated content and dynamic Single-Page Applications (SPAs), and, when paired with a solver such as CapSolver, CAPTCHA-protected sites. This makes it well suited to AI agent workflows, data collection, and automated testing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding CapSolver
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://dashboard.capsolver.com/dashboard/overview/?utm_source=dev.to&amp;amp;utm_medium=article&amp;amp;utm_campaign=agent-browser-capsolver"&gt;CapSolver&lt;/a&gt; is a prominent AI-driven CAPTCHA solving service designed to automatically overcome a wide array of CAPTCHA challenges. Known for its rapid response times and extensive compatibility, CapSolver integrates smoothly into automated processes.&lt;/p&gt;

&lt;h3&gt;
  
  
  Supported CAPTCHA Categories
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;reCAPTCHA v2 (both checkbox and invisible variants)&lt;/li&gt;
&lt;li&gt;reCAPTCHA v3 &amp;amp; v3 Enterprise&lt;/li&gt;
&lt;li&gt;Cloudflare Turnstile&lt;/li&gt;
&lt;li&gt;Cloudflare 5-second Challenge&lt;/li&gt;
&lt;li&gt;AWS WAF CAPTCHA&lt;/li&gt;
&lt;li&gt;And more&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Distinctive Advantage of This Integration
&lt;/h2&gt;

&lt;p&gt;Most CAPTCHA-solving integrations require boilerplate code for task creation, result polling, and token injection into hidden fields. This is the conventional approach with raw Playwright or Puppeteer scripts.&lt;/p&gt;

&lt;p&gt;However, the Agent Browser + CapSolver combination adopts a fundamentally different methodology:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Traditional (Code-Based)&lt;/th&gt;
&lt;th&gt;Agent Browser + CapSolver Extension&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Requires writing a CapSolver service class&lt;/td&gt;
&lt;td&gt;Simply add the &lt;code&gt;--extension&lt;/code&gt; flag to your command&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Involves calling &lt;code&gt;createTask()&lt;/code&gt; / &lt;code&gt;getTaskResult()&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;The extension manages all operations automatically&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Necessitates token injection via JavaScript evaluation&lt;/td&gt;
&lt;td&gt;Token injection occurs invisibly&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Requires handling errors, retries, and timeouts within your code&lt;/td&gt;
&lt;td&gt;The extension internally manages retries&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Demands different code for each CAPTCHA type&lt;/td&gt;
&lt;td&gt;Functions automatically for all supported types&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Headed mode is typically required for extensions&lt;/td&gt;
&lt;td&gt;Operates in both headed AND headless modes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;The core principle&lt;/strong&gt;: The CapSolver extension operates within Agent Browser's Chrome instance. When Agent Browser navigates to a page containing a CAPTCHA, the extension detects it, resolves it in the background, and injects the token while your script waits, so that subsequent commands run against an already-solved page. This keeps your automation scripts streamlined, focused, and free from CAPTCHA-related complexities.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites for Setup
&lt;/h2&gt;

&lt;p&gt;Before proceeding with the integration, ensure you have the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Vercel Agent Browser&lt;/strong&gt; installed (&lt;code&gt;npm install -g agent-browser&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A CapSolver account&lt;/strong&gt; with an API key (&lt;a href="https://www.capsolver.com/?utm_source=dev.to&amp;amp;utm_medium=article&amp;amp;utm_campaign=agent-browser-capsolver"&gt;register here&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Node.js version 16 or higher&lt;/strong&gt; (required for npm installation)&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Important&lt;/strong&gt;: Unlike Playwright-based tools, Agent Browser supports extensions in &lt;strong&gt;both headed and headless modes&lt;/strong&gt;. There is no need for Xvfb or virtual display setups on servers.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Step-by-Step Implementation Guide
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Install Agent Browser
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; agent-browser
agent-browser &lt;span class="nb"&gt;install&lt;/span&gt;  &lt;span class="c"&gt;# Downloads Chrome from Chrome for Testing (first-time execution only)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Alternative installation methods:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# For macOS via Homebrew&lt;/span&gt;
brew &lt;span class="nb"&gt;install &lt;/span&gt;agent-browser
agent-browser &lt;span class="nb"&gt;install&lt;/span&gt;

&lt;span class="c"&gt;# Using Cargo (Rust package manager)&lt;/span&gt;
cargo &lt;span class="nb"&gt;install &lt;/span&gt;agent-browser
agent-browser &lt;span class="nb"&gt;install&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For Linux systems, include necessary system dependencies:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;agent-browser &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--with-deps&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 2: Obtain the CapSolver Chrome Extension
&lt;/h3&gt;

&lt;p&gt;Download the CapSolver Chrome extension and extract its contents into a designated directory:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Visit the &lt;a href="https://github.com/capsolver/capsolver-browser-extension/releases/tag/v.1.17.0" rel="noopener noreferrer"&gt;CapSolver Chrome Extension v1.17.0 release page&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt; Download the &lt;code&gt;CapSolver.Browser.Extension-chrome-v1.17.0.zip&lt;/code&gt; file.&lt;/li&gt;
&lt;li&gt; Extract the archive:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; ~/capsolver-extension
unzip CapSolver.Browser.Extension-chrome-v&lt;span class="k"&gt;*&lt;/span&gt;.zip &lt;span class="nt"&gt;-d&lt;/span&gt; ~/capsolver-extension/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="4"&gt;
&lt;li&gt; Confirm successful extraction:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;ls&lt;/span&gt; ~/capsolver-extension/manifest.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Presence of &lt;code&gt;manifest.json&lt;/code&gt; verifies correct placement of the extension files.&lt;/p&gt;
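
&lt;p&gt;The check above can be wrapped in a small shell function for use in setup scripts. This is a minimal sketch: the function name &lt;code&gt;check_extension_dir&lt;/code&gt; is hypothetical, and it only tests that &lt;code&gt;manifest.json&lt;/code&gt; is present, not that the extension itself is valid.&lt;/p&gt;

```shell
# Hypothetical helper: succeed only if the directory holds an unpacked
# extension, meaning a manifest.json sits directly inside it.
check_extension_dir() {
  dir="$1"
  if [ -f "$dir/manifest.json" ]; then
    echo "ok: $dir/manifest.json found"
    return 0
  fi
  echo "error: no manifest.json in $dir (did the archive extract into a subfolder?)"
  return 1
}

# Usage (path taken from the steps above):
# check_extension_dir "$HOME/capsolver-extension"
```

&lt;p&gt;Because the function returns a non-zero status on failure, a setup script can stop before launching the browser with a bad path.&lt;/p&gt;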

&lt;h3&gt;
  
  
  Step 3: Configure Your CapSolver API Key
&lt;/h3&gt;

&lt;p&gt;Locate the extension's configuration file at &lt;code&gt;~/capsolver-extension/assets/config.js&lt;/code&gt; and update the &lt;code&gt;apiKey&lt;/code&gt; value with your personal key:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;defaultConfig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;CAP-XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// ← Insert your API key here&lt;/span&gt;
  &lt;span class="na"&gt;useCapsolver&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="c1"&gt;// ... rest of the config&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your API key can be retrieved from your &lt;a href="https://dashboard.capsolver.com/passport/login/?utm_source=dev.to&amp;amp;utm_medium=article&amp;amp;utm_campaign=agent-browser-capsolver"&gt;CapSolver dashboard&lt;/a&gt;.&lt;/p&gt;
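
&lt;p&gt;A forgotten placeholder key is easy to miss, so a pre-flight check may help. The sketch below assumes the config file still uses the &lt;code&gt;apiKey&lt;/code&gt; field and the &lt;code&gt;CAP-XXXX...&lt;/code&gt; placeholder shown above; the function name &lt;code&gt;api_key_configured&lt;/code&gt; is hypothetical, and the check does not verify that the key is actually valid.&lt;/p&gt;

```shell
# Hypothetical pre-flight check: fail if config.js still carries the
# placeholder key shown above (CAP- followed by a run of X characters).
api_key_configured() {
  config="$1"
  if [ ! -f "$config" ]; then
    echo "error: $config not found"
    return 1
  fi
  if grep -q "CAP-XXXXXXXX" "$config"; then
    echo "error: placeholder API key still present in $config"
    return 1
  fi
  if ! grep -q "apiKey" "$config"; then
    echo "error: no apiKey entry in $config"
    return 1
  fi
  echo "ok: apiKey entry looks customized"
  return 0
}

# Usage (path from this step):
# api_key_configured "$HOME/capsolver-extension/assets/config.js"
```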

&lt;h3&gt;
  
  
  Step 4: Launch Agent Browser with the CapSolver Extension Enabled
&lt;/h3&gt;

&lt;p&gt;Activating the extension requires a single flag, &lt;code&gt;--extension&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;agent-browser &lt;span class="nt"&gt;--extension&lt;/span&gt; ~/capsolver-extension open https://example.com/protected-page
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With this, the CapSolver extension is active within the browser and will automatically resolve any CAPTCHA it encounters.&lt;/p&gt;

&lt;p&gt;For &lt;strong&gt;headed mode&lt;/strong&gt; (to observe the browser visually):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;agent-browser &lt;span class="nt"&gt;--extension&lt;/span&gt; ~/capsolver-extension &lt;span class="nt"&gt;--headed&lt;/span&gt; open https://example.com/protected-page
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 5: Verify Extension Loading
&lt;/h3&gt;

&lt;p&gt;In headed mode, navigate to &lt;code&gt;chrome://extensions&lt;/code&gt; to confirm that the CapSolver extension is listed and active:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;agent-browser &lt;span class="nt"&gt;--extension&lt;/span&gt; ~/capsolver-extension &lt;span class="nt"&gt;--headed&lt;/span&gt; open chrome://extensions
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In headless mode, check the browser console for CapSolver's log messages:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;agent-browser &lt;span class="nt"&gt;--extension&lt;/span&gt; ~/capsolver-extension open https://example.com
agent-browser console
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
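
&lt;p&gt;In a CI job, the headless check can be scripted rather than read by eye. The following is a rough sketch under two assumptions that the extension does not guarantee: that its console messages contain the word "capsolver" (in any letter case), and that a session is already open. The function name and the &lt;code&gt;AGENT_BROWSER_BIN&lt;/code&gt; variable are hypothetical.&lt;/p&gt;

```shell
# Hypothetical check: succeed if the console output mentions CapSolver.
# Assumes the extension's log lines contain the word "capsolver".
console_mentions_capsolver() {
  bin="${AGENT_BROWSER_BIN:-agent-browser}"
  "$bin" console | grep -i -q "capsolver"
}

# Usage (after opening a page with the extension loaded):
# console_mentions_capsolver || echo "no CapSolver messages seen yet"
```

&lt;p&gt;A failed match is not proof that the extension is missing; it may simply not have logged anything on the current page.&lt;/p&gt;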



&lt;h2&gt;
  
  
  Practical Usage
&lt;/h2&gt;

&lt;p&gt;Once configured, using CapSolver with Agent Browser is straightforward; simply include the &lt;code&gt;--extension&lt;/code&gt; flag and a wait command.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Fundamental Principle
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Avoid implementing CAPTCHA-specific logic.&lt;/strong&gt; Instead, introduce a wait command after navigating to CAPTCHA-protected pages, allowing the extension to perform its function.&lt;/p&gt;
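
&lt;p&gt;The open-then-wait pattern can be captured once in a shell function so individual scripts stay short. This is only a sketch: &lt;code&gt;open_and_wait&lt;/code&gt;, &lt;code&gt;AGENT_BROWSER_BIN&lt;/code&gt;, and &lt;code&gt;EXTENSION_DIR&lt;/code&gt; are hypothetical names introduced here, while the &lt;code&gt;open&lt;/code&gt; and &lt;code&gt;wait&lt;/code&gt; commands are the ones used throughout this article.&lt;/p&gt;

```shell
# Hypothetical wrapper: open a URL with the extension loaded, then pause
# so the extension can work before any further commands run.
# AGENT_BROWSER_BIN and EXTENSION_DIR are illustrative variables.
open_and_wait() {
  url="$1"
  wait_ms="${2:-30000}"   # default: 30 seconds, expressed in milliseconds
  bin="${AGENT_BROWSER_BIN:-agent-browser}"
  ext="${EXTENSION_DIR:-$HOME/capsolver-extension}"

  "$bin" --extension "$ext" open "$url" || return 1
  "$bin" wait "$wait_ms"
}

# Usage:
# open_and_wait https://example.com/contact
# open_and_wait https://example.com/login 20000
```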

&lt;h3&gt;
  
  
  Scenario 1: Form Submission Protected by reCAPTCHA
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Navigate to the target page with the CapSolver extension loaded&lt;/span&gt;
agent-browser &lt;span class="nt"&gt;--extension&lt;/span&gt; ~/capsolver-extension open https://example.com/contact

&lt;span class="c"&gt;# Capture a snapshot to identify form elements&lt;/span&gt;
agent-browser snapshot &lt;span class="nt"&gt;-i&lt;/span&gt;
&lt;span class="c"&gt;# Expected Output:&lt;/span&gt;
&lt;span class="c"&gt;# - textbox "Name" [ref=e1]&lt;/span&gt;
&lt;span class="c"&gt;# - textbox "Email" [ref=e2]&lt;/span&gt;
&lt;span class="c"&gt;# - textbox "Message" [ref=e3]&lt;/span&gt;
&lt;span class="c"&gt;# - button "Submit" [ref=e4]&lt;/span&gt;

&lt;span class="c"&gt;# Populate the form fields&lt;/span&gt;
agent-browser fill @e1 &lt;span class="s2"&gt;"John Doe"&lt;/span&gt;
agent-browser fill @e2 &lt;span class="s2"&gt;"john@example.com"&lt;/span&gt;
agent-browser fill @e3 &lt;span class="s2"&gt;"Hello, I have a question about your services."&lt;/span&gt;

&lt;span class="c"&gt;# Allow CapSolver to resolve the CAPTCHA&lt;/span&gt;
agent-browser &lt;span class="nb"&gt;wait &lt;/span&gt;30000

&lt;span class="c"&gt;# Submit the form—the CAPTCHA token will have already been injected&lt;/span&gt;
agent-browser click @e4
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Scenario 2: Login Page Featuring Cloudflare Turnstile
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Access the login page&lt;/span&gt;
agent-browser &lt;span class="nt"&gt;--extension&lt;/span&gt; ~/capsolver-extension open https://example.com/login

&lt;span class="c"&gt;# Identify interactive elements&lt;/span&gt;
agent-browser snapshot &lt;span class="nt"&gt;-i&lt;/span&gt;

&lt;span class="c"&gt;# Input credentials&lt;/span&gt;
agent-browser find label &lt;span class="s2"&gt;"Email"&lt;/span&gt; fill &lt;span class="s2"&gt;"me@example.com"&lt;/span&gt;
agent-browser find label &lt;span class="s2"&gt;"Password"&lt;/span&gt; fill &lt;span class="s2"&gt;"mypassword123"&lt;/span&gt;

&lt;span class="c"&gt;# Wait for Turnstile resolution&lt;/span&gt;
agent-browser &lt;span class="nb"&gt;wait &lt;/span&gt;20000

&lt;span class="c"&gt;# Click the login button—Turnstile will have been handled&lt;/span&gt;
agent-browser find role button click &lt;span class="nt"&gt;--name&lt;/span&gt; &lt;span class="s2"&gt;"Log in"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Scenario 3: Data Extraction from Protected Web Pages
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Navigate to the protected page&lt;/span&gt;
agent-browser &lt;span class="nt"&gt;--extension&lt;/span&gt; ~/capsolver-extension open https://example.com/data

&lt;span class="c"&gt;# Wait for any CAPTCHA challenge to be cleared&lt;/span&gt;
agent-browser &lt;span class="nb"&gt;wait &lt;/span&gt;30000

&lt;span class="c"&gt;# Extract page content using a snapshot&lt;/span&gt;
agent-browser snapshot &lt;span class="nt"&gt;--json&lt;/span&gt;

&lt;span class="c"&gt;# Alternatively, retrieve specific element text&lt;/span&gt;
agent-browser get text &lt;span class="s2"&gt;"body"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Scenario 4: Chained Commands (Single Line Execution)
&lt;/h3&gt;

&lt;p&gt;Agent Browser supports command chaining for streamlined automation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Open a page, wait for CAPTCHA, fill a form, and submit—all in one command sequence&lt;/span&gt;
agent-browser &lt;span class="nt"&gt;--extension&lt;/span&gt; ~/capsolver-extension open https://example.com/contact &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  agent-browser &lt;span class="nb"&gt;wait &lt;/span&gt;30000 &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  agent-browser snapshot &lt;span class="nt"&gt;-i&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  agent-browser fill @e1 &lt;span class="s2"&gt;"John Doe"&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  agent-browser fill @e2 &lt;span class="s2"&gt;"john@example.com"&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
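  agent-browser fill @e3 &lt;span class="s2"&gt;"Hello, I have a question about your services."&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  agent-browser click @e4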
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Scenario 5: Scripted Workflow with JSON Output
&lt;/h3&gt;

&lt;p&gt;For AI agent pipelines, utilize &lt;code&gt;--json&lt;/code&gt; for machine-readable output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="nv"&gt;EXTENSION&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;~/capsolver-extension

&lt;span class="c"&gt;# Open page with extension&lt;/span&gt;
agent-browser &lt;span class="nt"&gt;--extension&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$EXTENSION&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; open https://example.com/protected-page

&lt;span class="c"&gt;# Wait for CAPTCHA to resolve&lt;/span&gt;
agent-browser &lt;span class="nb"&gt;wait &lt;/span&gt;30000

&lt;span class="c"&gt;# Obtain snapshot as JSON for AI processing&lt;/span&gt;
&lt;span class="nv"&gt;SNAPSHOT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;agent-browser snapshot &lt;span class="nt"&gt;-i&lt;/span&gt; &lt;span class="nt"&gt;--json&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# Interact using a ref read from $SNAPSHOT (@e2 is illustrative)&lt;/span&gt;
agent-browser click @e2
agent-browser get text &lt;span class="s2"&gt;"body"&lt;/span&gt; &lt;span class="nt"&gt;--json&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Recommended Waiting Durations
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;CAPTCHA Type&lt;/th&gt;
&lt;th&gt;Typical Resolution Time&lt;/th&gt;
&lt;th&gt;Suggested Wait Period&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;reCAPTCHA v2 (checkbox)&lt;/td&gt;
&lt;td&gt;5-15 seconds&lt;/td&gt;
&lt;td&gt;30-60 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;reCAPTCHA v2 (invisible)&lt;/td&gt;
&lt;td&gt;5-15 seconds&lt;/td&gt;
&lt;td&gt;30 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;reCAPTCHA v3&lt;/td&gt;
&lt;td&gt;3-10 seconds&lt;/td&gt;
&lt;td&gt;20-30 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cloudflare Turnstile&lt;/td&gt;
&lt;td&gt;3-10 seconds&lt;/td&gt;
&lt;td&gt;20-30 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Guidance&lt;/strong&gt;: When uncertain, use a 30-second wait. Waiting slightly longer is preferable to submitting prematurely, and within these ranges the extra time does not affect the outcome. Avoid waits of several minutes, however, because solved tokens expire after a limited time.&lt;/p&gt;
&lt;/blockquote&gt;
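
&lt;p&gt;If a script handles several kinds of pages, the table can be encoded as a lookup. The sketch below is one possible reading of it: it returns the upper bound of each suggested range in milliseconds, and the type labels passed to &lt;code&gt;suggested_wait_ms&lt;/code&gt; are hypothetical names chosen for this example.&lt;/p&gt;

```shell
# Hypothetical lookup: suggested wait in milliseconds per CAPTCHA type,
# using the upper bound of each range in the table above.
suggested_wait_ms() {
  case "$1" in
    recaptcha-v2-checkbox)  echo 60000 ;;
    recaptcha-v2-invisible) echo 30000 ;;
    recaptcha-v3)           echo 30000 ;;
    turnstile)              echo 30000 ;;
    *)                      echo 30000 ;;  # unknown type: 30-second default
  esac
}

# Usage:
# agent-browser wait "$(suggested_wait_ms turnstile)"
```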

&lt;h2&gt;
  
  
  Behind the Scenes: How It Functions
&lt;/h2&gt;

&lt;p&gt;Here's an overview of the process when Agent Browser operates with the CapSolver extension loaded:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Your Agent Browser Commands
───────────────────────────────────────────────────
agent-browser --extension       ──►  Chrome launches with extension
  ~/capsolver-extension
  open https://...
                                           │
                                           ▼
                               ┌─────────────────────────────┐
                               │  Page with CAPTCHA widget     │
                               │                               │
                               │  CapSolver Extension:         │
                               │  1. Content script detects    │
                               │     CAPTCHA on the page       │
                               │  2. Service worker calls      │
                               │     CapSolver API             │
                               │  3. Token received            │
                               │  4. Token injected into       │
                               │     hidden form field         │
                               └─────────────────────────────┘
                                           │
                                           ▼
agent-browser wait 30000         Extension resolves CAPTCHA...
                                           │
                                           ▼
agent-browser snapshot -i        Agent Browser reads elements
agent-browser click @e2          Form submits WITH valid token
                                           │
                                           ▼
                               "Verification successful!"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Extension Loading Mechanism
&lt;/h3&gt;

&lt;p&gt;When Agent Browser initiates Chrome with the &lt;code&gt;--extension&lt;/code&gt; flag:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Chrome starts with the CapSolver extension pre-loaded (utilizing &lt;code&gt;--headless=new&lt;/code&gt; in headless mode, which supports Manifest V3 extensions).&lt;/li&gt;
&lt;li&gt; The extension becomes active—its service worker begins operation, and content scripts are injected into every page.&lt;/li&gt;
&lt;li&gt; On pages containing CAPTCHAs, the content script identifies the widget, the service worker calls the CapSolver API, and the returned solution token is injected into the page.&lt;/li&gt;
&lt;li&gt; Agent Browser continues its normal operations—snapshots, clicks, and data extraction proceed as usual, with CAPTCHAs already addressed.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Comprehensive Configuration Reference
&lt;/h2&gt;

&lt;p&gt;Below is a complete setup guide detailing all configuration options for the Agent Browser + CapSolver integration:&lt;/p&gt;

&lt;h3&gt;
  
  
  Command-Line Interface (CLI) Flags
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;agent-browser &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--extension&lt;/span&gt; ~/capsolver-extension &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--headed&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--session-name&lt;/span&gt; my-session &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--profile&lt;/span&gt; ./browser-data &lt;span class="se"&gt;\&lt;/span&gt;
  open https://example.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Environment Variables
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Define the extension path as an environment variable (eliminates repetitive --extension usage)&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;AGENT_BROWSER_EXTENSIONS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;~/capsolver-extension

&lt;span class="c"&gt;# Subsequent commands will automatically load the extension&lt;/span&gt;
agent-browser open https://example.com
agent-browser &lt;span class="nb"&gt;wait &lt;/span&gt;30000
agent-browser snapshot &lt;span class="nt"&gt;-i&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Configuration File (&lt;code&gt;agent-browser.json&lt;/code&gt;)
&lt;/h3&gt;

&lt;p&gt;Create an &lt;code&gt;agent-browser.json&lt;/code&gt; file in your project directory to establish persistent default settings:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"extension"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"~/capsolver-extension"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sessionName"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"my-project"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"headed"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Available Configuration Options
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Option&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;--extension &amp;lt;path&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Specifies the path to an unpacked extension directory containing &lt;code&gt;manifest.json&lt;/code&gt; (here, the CapSolver extension). This flag can be repeated for multiple extensions.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;--headed&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Displays the browser window for visual debugging purposes. Extensions are functional in both modes.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;--session-name &amp;lt;name&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Automatically saves and restores cookies and local storage across browser restarts.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;--profile &amp;lt;path&amp;gt;&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Designates a persistent browser profile directory (for cookies, IndexedDB, cache).&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;AGENT_BROWSER_EXTENSIONS&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;An environment variable alternative to the &lt;code&gt;--extension&lt;/code&gt; flag. Accepts comma-separated paths for multiple extensions.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
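
&lt;p&gt;Since &lt;code&gt;AGENT_BROWSER_EXTENSIONS&lt;/code&gt; takes comma-separated paths, a small helper can build the value from a list of directories. This is a minimal sketch; &lt;code&gt;join_extensions&lt;/code&gt; is a hypothetical name, and the second path in the usage line is a placeholder rather than a real extension.&lt;/p&gt;

```shell
# Hypothetical helper: join extension directories with commas, the
# separator the table above gives for AGENT_BROWSER_EXTENSIONS.
join_extensions() {
  joined=""
  for dir in "$@"; do
    if [ -z "$joined" ]; then
      joined="$dir"
    else
      joined="$joined,$dir"
    fi
  done
  echo "$joined"
}

# Usage (the second path is a placeholder):
# export AGENT_BROWSER_EXTENSIONS="$(join_extensions "$HOME/capsolver-extension" "$HOME/another-extension")"
```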

&lt;p&gt;The CapSolver API key is configured directly within the extension's &lt;code&gt;assets/config.js&lt;/code&gt; file (refer to Step 3).&lt;/p&gt;

&lt;h2&gt;
  
  
  Troubleshooting Guide
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Extension Not Loading Correctly
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Symptom&lt;/strong&gt;: CAPTCHAs are not being resolved automatically.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Potential Causes:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Incorrect extension path—verify that &lt;code&gt;manifest.json&lt;/code&gt; exists in the specified directory.&lt;/li&gt;
&lt;li&gt;Extension incompatibility—ensure you are using the Chrome version of the CapSolver extension, not the Firefox version.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Resolution&lt;/strong&gt;: Confirm the path and test extension loading:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Verify manifest file existence&lt;/span&gt;
&lt;span class="nb"&gt;ls&lt;/span&gt; ~/capsolver-extension/manifest.json

&lt;span class="c"&gt;# Test visually in headed mode&lt;/span&gt;
agent-browser &lt;span class="nt"&gt;--extension&lt;/span&gt; ~/capsolver-extension &lt;span class="nt"&gt;--headed&lt;/span&gt; open chrome://extensions
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  CAPTCHA Resolution Failure (Form Submission Issues)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Potential Causes:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Insufficient wait time&lt;/strong&gt;—Increase the wait duration to 60 seconds.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Invalid API key&lt;/strong&gt;—Cross-reference your CapSolver dashboard for the correct key.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Insufficient balance&lt;/strong&gt;—Recharge your CapSolver account credits.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Extension not loaded&lt;/strong&gt;—Refer to the "Extension Not Loading Correctly" section above.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Debugging with console logs:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;agent-browser &lt;span class="nt"&gt;--extension&lt;/span&gt; ~/capsolver-extension open https://example.com
agent-browser &lt;span class="nb"&gt;wait &lt;/span&gt;30000
agent-browser console  &lt;span class="c"&gt;# Inspect CapSolver messages&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Chrome Executable Not Found
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Symptom&lt;/strong&gt;: &lt;code&gt;agent-browser&lt;/code&gt; is unable to locate a Chrome executable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Resolution&lt;/strong&gt;: Execute the install command to download Chrome for Testing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;agent-browser &lt;span class="nb"&gt;install&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Alternatively, specify a custom Chrome executable path:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;agent-browser &lt;span class="nt"&gt;--executable-path&lt;/span&gt; /path/to/chrome open https://example.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Utilizing Multiple Extensions
&lt;/h3&gt;

&lt;p&gt;You can load several extensions by repeating the &lt;code&gt;--extension&lt;/code&gt; flag:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;agent-browser &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--extension&lt;/span&gt; ~/capsolver-extension &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--extension&lt;/span&gt; ~/another-extension &lt;span class="se"&gt;\&lt;/span&gt;
  open https://example.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Best Practices for Integration
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Employ the &lt;code&gt;AGENT_BROWSER_EXTENSIONS&lt;/code&gt; environment variable.&lt;/strong&gt; Set this variable once in your shell profile or CI configuration. This ensures that every &lt;code&gt;agent-browser&lt;/code&gt; command automatically loads CapSolver without requiring the flag to be repeated.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Always allocate ample wait times.&lt;/strong&gt; A more generous wait period enhances reliability. While CAPTCHAs typically resolve within 5-20 seconds, network latency, complex challenges, or retries can extend this duration. A range of 30-60 seconds is generally optimal.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Maintain clean automation scripts.&lt;/strong&gt; Avoid embedding CAPTCHA-specific logic directly into your commands. The extension handles all CAPTCHA processes transparently, allowing your scripts to focus solely on navigation, interaction, and data extraction.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Regularly monitor your CapSolver balance.&lt;/strong&gt; Each CAPTCHA resolution consumes credits. Periodically check your balance at &lt;a href="https://www.capsolver.com/dashboard" rel="noopener noreferrer"&gt;capsolver.com/dashboard&lt;/a&gt; to prevent service interruptions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Utilize session persistence for recurring visits.&lt;/strong&gt; Employ &lt;code&gt;--session-name&lt;/code&gt; or &lt;code&gt;--profile&lt;/code&gt; to retain cookies across multiple browser sessions. This can potentially reduce the frequency of CAPTCHA encounters, as the website may recognize returning sessions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Leverage headless mode in production environments.&lt;/strong&gt; Unlike Playwright, Agent Browser fully supports extensions in headless mode. This eliminates the need for Xvfb or virtual displays on servers, allowing direct execution of your commands.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
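
&lt;p&gt;The first two practices can be sketched as a shell profile or CI snippet. This is an illustration under stated assumptions: &lt;code&gt;AGENT_BROWSER_EXTENSIONS&lt;/code&gt; is assumed to hold the path to the unpacked extension directory, and &lt;code&gt;captcha_wait_ms&lt;/code&gt; is a hypothetical helper, not an Agent Browser command.&lt;/p&gt;

```shell
# Assumption: the variable holds the path to the unpacked CapSolver extension.
# An existing value is kept, so CI can override the default location.
export AGENT_BROWSER_EXTENSIONS="${AGENT_BROWSER_EXTENSIONS:-$HOME/capsolver-extension}"

# Hypothetical helper: convert a wait given in whole seconds into the
# milliseconds that "agent-browser wait" takes, raising anything shorter
# than 30 seconds to the 30-second floor recommended above.
captcha_wait_ms() {
  seconds="${1:-30}"
  if [ "$seconds" -lt 30 ]; then
    seconds=30
  fi
  echo $((seconds * 1000))
}
```

&lt;p&gt;A command such as &lt;code&gt;agent-browser wait "$(captcha_wait_ms 45)"&lt;/code&gt; then waits 45 seconds, while a value that is too short falls back to 30 seconds.&lt;/p&gt;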

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Integrating Vercel Agent Browser with CapSolver adds background CAPTCHA solving to a browser automation CLI built for AI agents. Instead of developing intricate CAPTCHA-handling code, you only need to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Download and configure the CapSolver extension with your API key.&lt;/li&gt;
&lt;li&gt; Add &lt;code&gt;--extension ~/capsolver-extension&lt;/code&gt; to your Agent Browser commands.&lt;/li&gt;
&lt;li&gt; Include a wait command before interacting with forms protected by CAPTCHAs.&lt;/li&gt;
&lt;/ol&gt;
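
&lt;p&gt;The three steps above can be sketched as a single shell function. It assumes the extension is already unpacked and configured in &lt;code&gt;~/capsolver-extension&lt;/code&gt;. The &lt;code&gt;run_protected_form&lt;/code&gt; name, the &lt;code&gt;AGENT_BROWSER_BIN&lt;/code&gt; override, and the selector passed to &lt;code&gt;click&lt;/code&gt; are illustrative assumptions, so check the command names against your installed version.&lt;/p&gt;

```shell
# Hypothetical wrapper around the three steps: load the extension, wait for
# the CAPTCHA to be solved, then interact with the protected form.
# AGENT_BROWSER_BIN is an illustrative override that defaults to the real CLI.
run_protected_form() {
  url="$1"
  submit_selector="$2"
  bin="${AGENT_BROWSER_BIN:-agent-browser}"

  # Steps 1 and 2: open the page with the configured CapSolver extension loaded.
  "$bin" --extension "$HOME/capsolver-extension" open "$url" || return 1

  # Step 3: give the extension 30 seconds to detect and solve the CAPTCHA.
  "$bin" wait 30000 || return 1

  # Only then interact with the protected form (the selector is an example).
  "$bin" click "$submit_selector"
}
```

&lt;p&gt;For example, &lt;code&gt;run_protected_form https://example.com "#submit"&lt;/code&gt; opens the page, waits, and clicks the submit button. The function stops early if any step fails.&lt;/p&gt;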

&lt;p&gt;The CapSolver Chrome extension manages the entire process—detecting CAPTCHAs, resolving them via the CapSolver API, and injecting tokens into the page. Your Agent Browser commands can thus remain entirely oblivious to CAPTCHA challenges.&lt;/p&gt;

&lt;p&gt;Furthermore, in contrast to Playwright-based solutions that often necessitate headed mode and virtual displays, Agent Browser supports extensions in headless mode natively. This makes it a straightforward way to run automation that handles CAPTCHAs unattended in production settings.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Ready to begin?&lt;/strong&gt; &lt;a href="https://www.capsolver.com/?utm_source=dev.to&amp;amp;utm_medium=article&amp;amp;utm_campaign=agent-browser-capsolver"&gt;Sign up for CapSolver&lt;/a&gt; and use the bonus code &lt;strong&gt;AGENTBROWSER&lt;/strong&gt; to receive an additional 6% on your initial top-up!&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvzrqax7kzr00l6es7j0h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvzrqax7kzr00l6es7j0h.png" width="527" height="242"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Frequently Asked Questions (FAQ)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Is CAPTCHA-specific code necessary?
&lt;/h3&gt;

&lt;p&gt;No. The CapSolver extension operates entirely in the background within Agent Browser's Chrome instance. Add an &lt;code&gt;agent-browser wait 30000&lt;/code&gt; command before submitting forms to give the extension time to detect the CAPTCHA, resolve it, and inject the token.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can this be executed in headless mode?
&lt;/h3&gt;

&lt;p&gt;Yes! This represents a significant advantage over Playwright-based solutions. Agent Browser utilizes Chrome's &lt;code&gt;--headless=new&lt;/code&gt; mode, which supports Manifest V3 extensions, eliminating the need for Xvfb or virtual display setups.&lt;/p&gt;

&lt;h3&gt;
  
  
  Are Playwright or Node.js required?
&lt;/h3&gt;

&lt;p&gt;Playwright is not required, and Node.js is needed only for the &lt;code&gt;npm install&lt;/code&gt; step. Agent Browser is a self-contained Rust binary, and its browser daemon runs natively without any JavaScript runtime.&lt;/p&gt;

&lt;h3&gt;
  
  
  Which CAPTCHA types does CapSolver support?
&lt;/h3&gt;

&lt;p&gt;CapSolver supports a wide range of CAPTCHA types, including reCAPTCHA v2 (checkbox and invisible), reCAPTCHA v3, Cloudflare Turnstile, and AWS WAF CAPTCHA, among others. The extension automatically identifies and resolves the appropriate CAPTCHA type.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is the cost of CapSolver?
&lt;/h3&gt;

&lt;p&gt;CapSolver offers competitive pricing structures based on CAPTCHA type and volume. For current pricing details, please visit &lt;a href="https://www.capsolver.com" rel="noopener noreferrer"&gt;capsolver.com&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is Vercel Agent Browser free to use?
&lt;/h3&gt;

&lt;p&gt;Yes. Agent Browser is an open-source project released under the Apache 2.0 license. The CLI and all its features are available for free. Further information can be found on its &lt;a href="https://github.com/vercel-labs/agent-browser" rel="noopener noreferrer"&gt;GitHub repository&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is the recommended waiting period for CAPTCHA resolution?
&lt;/h3&gt;

&lt;p&gt;For most CAPTCHAs, a waiting period of 30-60 seconds is sufficient. Actual resolution times typically range from 5-20 seconds, but an extended buffer ensures greater reliability. When in doubt, use &lt;code&gt;agent-browser wait 30000&lt;/code&gt; for 30 seconds.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is this compatible with AI agents?
&lt;/h3&gt;

&lt;p&gt;Absolutely. Agent Browser was specifically developed for AI agents (&lt;a href="https://www.capsolver.com/blog/AI/best-ai-agents" rel="noopener noreferrer"&gt;explore various AI agent options here&lt;/a&gt;). It offers &lt;code&gt;--json&lt;/code&gt; for machine-readable output, a snapshot-ref workflow for precise element selection, and command chaining for efficient multi-step automation. The CapSolver extension operates transparently alongside your agent's commands.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>browser</category>
      <category>captcha</category>
    </item>
  </channel>
</rss>
