<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: 许映洲</title>
    <description>The latest articles on DEV Community by 许映洲 (@_ab214f84f83a01455a74b).</description>
    <link>https://dev.to/_ab214f84f83a01455a74b</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3953937%2Fa508106a-2da4-432e-9a1e-38c1608fc027.png</url>
      <title>DEV Community: 许映洲</title>
      <link>https://dev.to/_ab214f84f83a01455a74b</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/_ab214f84f83a01455a74b"/>
    <language>en</language>
    <item>
      <title>I Spent 3 Hours Debugging a CSS Selector. Then I Found a Better Way.</title>
      <dc:creator>许映洲</dc:creator>
      <pubDate>Fri, 29 May 2026 12:28:56 +0000</pubDate>
      <link>https://dev.to/_ab214f84f83a01455a74b/i-spent-3-hours-debugging-a-css-selector-then-i-found-a-better-way-5o4</link>
      <guid>https://dev.to/_ab214f84f83a01455a74b/i-spent-3-hours-debugging-a-css-selector-then-i-found-a-better-way-5o4</guid>
      <description>&lt;h1&gt;
  
  
  I Spent 3 Hours Debugging a CSS Selector. Then I Found a Better Way.
&lt;/h1&gt;

&lt;p&gt;Last Tuesday I needed to scrape product prices from an e-commerce site. "Simple," I thought. Just query the price element and done.&lt;/p&gt;

&lt;p&gt;Three hours later I was still staring at my screen, cursing whoever decided to use dynamically-generated class names.&lt;/p&gt;

&lt;h2&gt;
  
  
  What went wrong
&lt;/h2&gt;

&lt;p&gt;The page used something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;span&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;"price sc-dkzDqf jhKMRz eBvMAx"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;¥199&lt;span class="nt"&gt;&amp;lt;/span&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Those hashed classes? They change every deploy. My &lt;code&gt;querySelector('.price.sc-dkzDqf')&lt;/code&gt; worked for about 2 days, then the frontend team pushed an update and everything broke. No warning, no deprecation, just... gone.&lt;/p&gt;

&lt;p&gt;I tried being smarter. Used &lt;code&gt;[class*="price"]&lt;/code&gt; — worked until they added a "price-match" badge that had nothing to do with actual prices. Then I tried &lt;code&gt;span.price&lt;/code&gt; — but sometimes the price was in a &lt;code&gt;&amp;lt;div&amp;gt;&lt;/code&gt;.&lt;/p&gt;
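&lt;p&gt;The badge problem comes down to token matching versus substring matching. Here is a rough shell sketch of that distinction. The helper names and the sample class strings are made up for illustration, and no browser is involved:&lt;/p&gt;

```shell
# Hypothetical helpers: mimic how two CSS selectors treat a class attribute.

# Like ".price": the attribute must contain "price" as a whole token.
matches_class_token() {
  case " $1 " in
    *" price "*) echo yes ;;
    *) echo no ;;
  esac
}

# Like [class*="price"]: any substring match counts.
matches_class_substring() {
  case "$1" in
    *price*) echo yes ;;
    *) echo no ;;
  esac
}

for classes in "price sc-dkzDqf jhKMRz" "badge price-match"; do
  echo "$classes | token: $(matches_class_token "$classes") | substring: $(matches_class_substring "$classes")"
done
```

&lt;p&gt;The second class string only matches the substring check, which is the kind of false positive the "price-match" badge caused.&lt;/p&gt;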

&lt;p&gt;At 11 PM I realized: I was spending more time maintaining selectors than actually getting data.&lt;/p&gt;

&lt;h2&gt;
  
  
  The shift in thinking
&lt;/h2&gt;

&lt;p&gt;Here's what I eventually figured out: instead of hard-coding selectors, what if I could just describe what I wanted in plain English and let the tool figure out the DOM?&lt;/p&gt;

&lt;p&gt;Something like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Instead of writing this:&lt;/span&gt;
page.querySelector&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'.price.sc-dkzDqf'&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# What if I could just do this:&lt;/span&gt;
extract &lt;span class="s2"&gt;"the product price"&lt;/span&gt; &lt;span class="nt"&gt;--from&lt;/span&gt; https://example.com/product/123
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's basically what I've been using for the past month. The tool figures out the selector, and when the page changes, it re-discovers it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Real numbers from a real project
&lt;/h2&gt;

&lt;p&gt;I tracked my time for 2 weeks before and after:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Before&lt;/strong&gt;: ~45 min per site to write scraper, then 10-15 min/week fixing broken selectors&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;After&lt;/strong&gt;: ~5 min per site to set up, 0 min/week maintenance (selectors auto-recover)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Over a month, across 8 sites, that's roughly 12 hours saved. Not life-changing, but enough to actually go home on time.&lt;/p&gt;
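&lt;p&gt;For anyone who wants to check that estimate, here is the arithmetic as a small shell sketch. It assumes the weekly maintenance figure is per site and that a month is 4 weeks, neither of which I spelled out above:&lt;/p&gt;

```shell
# Assumptions: maintenance time is per site, and a month is 4 weeks.
sites=8
weeks=4
setup_saved=$(( (45 - 5) * sites ))            # one-time setup minutes saved
low=$(( setup_saved + 10 * weeks * sites ))    # with 10 min/week of fixes avoided
high=$(( setup_saved + 15 * weeks * sites ))   # with 15 min/week of fixes avoided

echo "Saved: $(( low / 60 )) h $(( low % 60 )) min to $(( high / 60 )) h $(( high % 60 )) min"
```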

&lt;h2&gt;
  
  
  The catch
&lt;/h2&gt;

&lt;p&gt;It's not perfect. Complex interactions (multi-step forms, SPAs with heavy JS) still need manual scripting. And if a page changes its entire structure, no amount of auto-discovery will save you.&lt;/p&gt;

&lt;p&gt;But for 80% of my scraping tasks — get text, get prices, get links — it's been surprisingly reliable.&lt;/p&gt;

&lt;p&gt;If you're spending more time fixing selectors than building features, maybe it's time to try a different approach. The &lt;a href="https://github.com/dyyz1993/xbrowser" rel="noopener noreferrer"&gt;xbrowser CLI&lt;/a&gt; is what I've been using — it's open source, works with your existing Chrome, and doesn't require writing 50 lines of Puppeteer every time.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>webdev</category>
      <category>testing</category>
    </item>
    <item>
      <title>I Replaced 30 Minutes of Daily Browser Chores with One Cron Job</title>
      <dc:creator>许映洲</dc:creator>
      <pubDate>Thu, 28 May 2026 16:05:27 +0000</pubDate>
      <link>https://dev.to/_ab214f84f83a01455a74b/i-replaced-30-minutes-of-daily-browser-chores-with-one-cron-job-44eb</link>
      <guid>https://dev.to/_ab214f84f83a01455a74b/i-replaced-30-minutes-of-daily-browser-chores-with-one-cron-job-44eb</guid>
      <description>&lt;p&gt;Every morning, I'd sit down at my desk, grab a coffee, and open my browser.&lt;/p&gt;

&lt;p&gt;Then the ritual began.&lt;/p&gt;

&lt;p&gt;Log into Google Search Console. Check yesterday's indexing stats. Switch to Ahrefs. Look up a handful of keyword rankings. Open Bing Webmaster Tools. Manually submit the three articles I published last night. Jump over to a backlink checker. See if those reciprocal links are still alive.&lt;/p&gt;

&lt;p&gt;Done? Not quite. I still had to open Medium and publish yesterday's draft. Then Dev.to. Then schedule a Reddit post in the relevant subreddit.&lt;/p&gt;

&lt;p&gt;By the time I finished all of this, my coffee was cold and a third of my morning was gone.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Do All These Tasks Have in Common?
&lt;/h2&gt;

&lt;p&gt;If you look closely, every single one of these operations follows the same pattern:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Open a browser&lt;/li&gt;
&lt;li&gt;Navigate to a URL&lt;/li&gt;
&lt;li&gt;Log in (and re-log in when the session expires)&lt;/li&gt;
&lt;li&gt;Find the right input field or button&lt;/li&gt;
&lt;li&gt;Fill in content or click something&lt;/li&gt;
&lt;li&gt;Wait for the result&lt;/li&gt;
&lt;li&gt;Close the tab&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;None of these steps are hard. But every single one requires &lt;strong&gt;your personal involvement&lt;/strong&gt;. Your hand bounces between mouse and keyboard. Your eyes dart across a dozen tabs.&lt;/p&gt;

&lt;p&gt;Here's the thing — you're a paid engineer, not a human Selenium.&lt;/p&gt;

&lt;p&gt;What if these operations could be turned into command-line tasks?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Submit sitemap&lt;/span&gt;
seo ping &lt;span class="nt"&gt;--sitemap&lt;/span&gt; https://mysite.com/sitemap.xml

&lt;span class="c"&gt;# Check Google rankings&lt;/span&gt;
google search &lt;span class="s2"&gt;"my keyword"&lt;/span&gt; &lt;span class="nt"&gt;--json&lt;/span&gt; | jq &lt;span class="s1"&gt;'.[0].position'&lt;/span&gt;

&lt;span class="c"&gt;# Publish an article&lt;/span&gt;
devto publish &lt;span class="nt"&gt;--file&lt;/span&gt; article.md &lt;span class="nt"&gt;--tags&lt;/span&gt; &lt;span class="s2"&gt;"javascript,webdev"&lt;/span&gt;

&lt;span class="c"&gt;# Check backlinks&lt;/span&gt;
curl-check &lt;span class="nt"&gt;--url&lt;/span&gt; https://partner.com &lt;span class="nt"&gt;--find&lt;/span&gt; &lt;span class="s2"&gt;"mysite.com"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One command. Press Enter. Done.&lt;/p&gt;

&lt;h2&gt;
  
  
  From Manual to Automated: A Three-Step Process
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Abstract Operations into Commands
&lt;/h3&gt;

&lt;p&gt;Any browser operation can be decomposed into: open page → locate element → perform action → get result.&lt;/p&gt;

&lt;p&gt;Take "submit sitemap to Google." Manually, you'd:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Open &lt;code&gt;https://www.google.com/ping?sitemap=&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Append your sitemap URL&lt;/li&gt;
&lt;li&gt;Press Enter&lt;/li&gt;
&lt;li&gt;See the success message&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;As a CLI command, this is one line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;xbrowser navigate &lt;span class="s2"&gt;"https://www.google.com/ping?sitemap=https://mysite.com/sitemap.xml"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
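&lt;p&gt;One detail worth handling when a URL is appended as a query value: it should be percent-encoded. A minimal sketch, assuming ASCII input. &lt;code&gt;urlencode&lt;/code&gt; is a helper name I made up, not part of xbrowser:&lt;/p&gt;

```shell
# Hypothetical helper: percent-encode a string so it can be used as a query value.
# Assumes ASCII input.
urlencode() {
  rest=$1
  out=""
  while [ -n "$rest" ]; do
    tail=${rest#?}            # everything after the first character
    c=${rest%"$tail"}         # the first character itself
    rest=$tail
    case "$c" in
      [a-zA-Z0-9.~_-]) out="$out$c" ;;                 # unreserved characters stay as-is
      *) out="$out$(printf '%%%02X' "'$c")" ;;         # everything else becomes %XX
    esac
  done
  printf '%s\n' "$out"
}

echo "https://www.google.com/ping?sitemap=$(urlencode "https://mysite.com/sitemap.xml")"
```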



&lt;p&gt;Or take "check if a new article is indexed." Manually:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Open Google&lt;/li&gt;
&lt;li&gt;Search &lt;code&gt;site:mysite.com/my-new-article&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Check if there are results&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;CLI version:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;xbrowser google search &lt;span class="s2"&gt;"site:mysite.com/my-new-article"&lt;/span&gt; &lt;span class="nt"&gt;--json&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You get structured JSON back. You can filter with &lt;code&gt;jq&lt;/code&gt;, search with &lt;code&gt;grep&lt;/code&gt;, pipe it into other commands.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Combine Commands into a Script
&lt;/h3&gt;

&lt;p&gt;Individual commands solve individual tasks. But your daily work is a sequence of tasks — the kind you repeat every morning.&lt;/p&gt;

&lt;p&gt;So write a script:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="c"&gt;# daily-seo-check.sh&lt;/span&gt;

&lt;span class="nv"&gt;SITE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"https://mysite.com"&lt;/span&gt;
&lt;span class="nv"&gt;KEYWORDS&lt;/span&gt;&lt;span class="o"&gt;=(&lt;/span&gt;&lt;span class="s2"&gt;"javascript tutorial"&lt;/span&gt; &lt;span class="s2"&gt;"node.js guide"&lt;/span&gt; &lt;span class="s2"&gt;"react best practices"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;LOG_FILE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"/var/log/seo-&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%Y%m%d&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;.log"&lt;/span&gt;

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"=== SEO Daily Report &lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt; ==="&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LOG_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

&lt;span class="c"&gt;# Submit sitemap&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"[1/4] Submitting sitemap..."&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LOG_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
xbrowser seo ping &lt;span class="nt"&gt;--sitemap&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$SITE&lt;/span&gt;&lt;span class="s2"&gt;/sitemap.xml"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LOG_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; 2&amp;gt;&amp;amp;1

&lt;span class="c"&gt;# Check rankings&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"[2/4] Checking rankings..."&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LOG_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;for &lt;/span&gt;kw &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;KEYWORDS&lt;/span&gt;&lt;span class="p"&gt;[@]&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
  &lt;/span&gt;&lt;span class="nv"&gt;position&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;xbrowser google search &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$kw&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;--json&lt;/span&gt; | jq &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s1"&gt;'.[0].position // "Not found"'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
  &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"  &lt;/span&gt;&lt;span class="nv"&gt;$kw&lt;/span&gt;&lt;span class="s2"&gt;: Position &lt;/span&gt;&lt;span class="nv"&gt;$position&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LOG_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;done&lt;/span&gt;

&lt;span class="c"&gt;# Check index count&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"[3/4] Checking index count..."&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LOG_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="nv"&gt;count&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;xbrowser google search &lt;span class="s2"&gt;"site:&lt;/span&gt;&lt;span class="nv"&gt;$SITE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;--json&lt;/span&gt; | jq &lt;span class="s1"&gt;'length'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"  Indexed pages: &lt;/span&gt;&lt;span class="nv"&gt;$count&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LOG_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

&lt;span class="c"&gt;# Check backlinks&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"[4/4] Checking backlinks..."&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LOG_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="nv"&gt;IFS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;','&lt;/span&gt; &lt;span class="nb"&gt;read&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; url anchor&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
  &lt;/span&gt;&lt;span class="nv"&gt;found&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;xbrowser crawl &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$url&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;--find&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$anchor&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | jq &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s1"&gt;'.found'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
  &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"  &lt;/span&gt;&lt;span class="nv"&gt;$url&lt;/span&gt;&lt;span class="s2"&gt;: &lt;/span&gt;&lt;span class="nv"&gt;$found&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$LOG_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;done&lt;/span&gt; &amp;lt; backlinks.csv

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Done. Report saved to &lt;/span&gt;&lt;span class="nv"&gt;$LOG_FILE&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
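&lt;p&gt;The script reads &lt;code&gt;backlinks.csv&lt;/code&gt; without showing what is in it. Here is a sketch of the loop on its own, assuming one "url,text-to-find" pair per line. &lt;code&gt;list_backlinks&lt;/code&gt; is a hypothetical stand-in that prints what would be checked instead of calling the crawler:&lt;/p&gt;

```shell
# Assumed format of backlinks.csv: one "url,text-to-find" pair per line.
# list_backlinks reads such lines from stdin and prints what would be checked.
list_backlinks() {
  while IFS=',' read -r url anchor; do
    [ -n "$url" ] || continue          # skip blank lines
    printf 'check %s for "%s"\n' "$url" "$anchor"
  done
}

printf '%s\n' \
  'https://partner1.com,mysite.com' \
  'https://partner2.com/links,mysite.com/blog' \
  | list_backlinks
```

&lt;p&gt;Cron usually starts jobs in your home directory, so an absolute path to the CSV file is safer than a relative one.&lt;/p&gt;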



&lt;p&gt;This script does everything you used to do manually every morning — submit sitemap, check rankings, count indexed pages, verify backlinks.&lt;/p&gt;

&lt;p&gt;Run it once:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;chmod&lt;/span&gt; +x daily-seo-check.sh
./daily-seo-check.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Twenty seconds later, you have your daily data. No browser. No clicking around. No waiting for pages to load.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Hand It Off to Cron
&lt;/h3&gt;

&lt;p&gt;The script works. Running it manually takes 20 seconds. But you'll still forget sometimes.&lt;/p&gt;

&lt;p&gt;So let the machine do it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Edit crontab&lt;/span&gt;
crontab &lt;span class="nt"&gt;-e&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Add these lines:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Daily SEO check at 9 AM&lt;/span&gt;
0 9 &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; /home/user/scripts/daily-seo-check.sh

&lt;span class="c"&gt;# Hourly ranking check, append to log&lt;/span&gt;
0 &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; xbrowser google search &lt;span class="s2"&gt;"javascript tutorial"&lt;/span&gt; &lt;span class="nt"&gt;--json&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; /var/log/rankings-hourly.log

&lt;span class="c"&gt;# Submit sitemap at 3 AM daily (off-peak)&lt;/span&gt;
0 3 &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; xbrowser seo ping &lt;span class="nt"&gt;--sitemap&lt;/span&gt; https://mysite.com/sitemap.xml

&lt;span class="c"&gt;# Weekly backlink audit, every Monday at 8 AM&lt;/span&gt;
0 8 &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; 1 /home/user/scripts/check-backlinks.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
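&lt;p&gt;Each of those entries starts with five schedule fields: minute, hour, day of month, month, and day of week. If the notation is new to you, this small sketch splits a schedule into named parts. &lt;code&gt;describe_schedule&lt;/code&gt; is a name I made up for the example:&lt;/p&gt;

```shell
# Split a five-field cron schedule into named parts.
# Field order (minute, hour, day of month, month, day of week) is standard crontab syntax.
describe_schedule() {
  printf '%s\n' "$1" | {
    read -r minute hour dom month dow
    printf 'minute=%s hour=%s day-of-month=%s month=%s day-of-week=%s\n' \
      "$minute" "$hour" "$dom" "$month" "$dow"
  }
}

describe_schedule "0 9 * * *"   # the daily 9 AM job
describe_schedule "0 8 * * 1"   # Mondays at 8 AM
```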



&lt;p&gt;Save and exit. From now on, these tasks run automatically.&lt;/p&gt;

&lt;p&gt;The first thing you do each morning isn't opening a browser — it's checking the log:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cat&lt;/span&gt; /var/log/seo-&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%Y%m%d&lt;span class="si"&gt;)&lt;/span&gt;.log
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output looks something like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;=== SEO Daily Report 2025-01-15 09:00:01 ===
[1/4] Submitting sitemap... OK
[2/4] Checking rankings...
  javascript tutorial: Position 7
  node.js guide: Position 12
  react best practices: Position 3
[3/4] Checking index count...
  Indexed pages: 847
[4/4] Checking backlinks...
  https://partner1.com: found
  https://partner2.com: found
  https://partner3.com: NOT FOUND ⚠️
Done.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
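&lt;p&gt;If you only care about what broke, you can count the flagged lines instead of reading the whole report. A minimal sketch, assuming the "NOT FOUND" wording from the sample output. &lt;code&gt;count_missing&lt;/code&gt; is a hypothetical helper:&lt;/p&gt;

```shell
# Count backlink lines flagged as missing in a report read from stdin.
# Assumes the "NOT FOUND" wording from the sample output.
count_missing() {
  grep -c 'NOT FOUND' || true   # grep exits non-zero when the count is 0
}

printf '%s\n' \
  '  https://partner1.com: found' \
  '  https://partner3.com: NOT FOUND' \
  | count_missing
```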



&lt;p&gt;Thirty seconds to review. Far faster than doing it by hand.&lt;/p&gt;

&lt;h2&gt;
  
  
  Not Just SEO: Any Repetitive Browser Task Can Be Automated
&lt;/h2&gt;

&lt;p&gt;Maybe you're not an SEO engineer. But you definitely have browser tasks you repeat daily.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Developers&lt;/strong&gt;: Check GitHub Actions build status every morning? See if CI passed? Verify npm package publication?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Hourly CI monitor&lt;/span&gt;
0 &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; xbrowser github actions &lt;span class="nt"&gt;--repo&lt;/span&gt; myorg/myrepo &lt;span class="nt"&gt;--status&lt;/span&gt; failed &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; /var/log/ci-monitor.log
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Content creators&lt;/strong&gt;: Need to cross-post to Medium, Dev.to, and Hashnode every day?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# One command, multiple platforms&lt;/span&gt;
xbrowser devto publish &lt;span class="nt"&gt;--file&lt;/span&gt; article.md &lt;span class="nt"&gt;--tags&lt;/span&gt; &lt;span class="s2"&gt;"js,webdev"&lt;/span&gt;
xbrowser medium publish &lt;span class="nt"&gt;--file&lt;/span&gt; article.md &lt;span class="nt"&gt;--tags&lt;/span&gt; &lt;span class="s2"&gt;"javascript,web-development"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;E-commerce operators&lt;/strong&gt;: Check competitor pricing, store ratings, inventory alerts daily?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Daily competitor price check at 8 AM&lt;/span&gt;
0 8 &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; xbrowser crawl &lt;span class="s2"&gt;"https://competitor.com/product/123"&lt;/span&gt; &lt;span class="nt"&gt;--extract&lt;/span&gt; &lt;span class="s1"&gt;'.price'&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; /var/log/price-monitor.log
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Social media managers&lt;/strong&gt;: Schedule posts, check engagement metrics, monitor brand mentions?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Hourly brand mention check&lt;/span&gt;
0 &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; xbrowser twitter search &lt;span class="s2"&gt;"mybrand"&lt;/span&gt; &lt;span class="nt"&gt;--json&lt;/span&gt; | jq &lt;span class="s1"&gt;'.[].text'&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; /var/log/brand-mentions.log
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The specific task doesn't matter. What matters is that &lt;strong&gt;the pattern is always the same&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Power of Pipes: CLI's Real Advantage
&lt;/h2&gt;

&lt;p&gt;You might be thinking: "I could do all this with a Python script."&lt;/p&gt;

&lt;p&gt;You're right. But CLI tools give you something a standalone Python script doesn't offer out of the box: &lt;strong&gt;pipe composition&lt;/strong&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Check rankings → filter top 10 → send to Slack&lt;/span&gt;
xbrowser google search &lt;span class="s2"&gt;"javascript tutorial"&lt;/span&gt; &lt;span class="nt"&gt;--json&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  | jq &lt;span class="s1"&gt;'[.[] | select(.position &amp;lt;= 10)]'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  | curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$SLACK_WEBHOOK&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt; @-
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Scrape competitor price → compare threshold → email alert&lt;/span&gt;
xbrowser crawl &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$COMPETITOR_URL&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;--extract&lt;/span&gt; &lt;span class="s1"&gt;'.price'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  | &lt;span class="nb"&gt;awk&lt;/span&gt; &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="nv"&gt;threshold&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;99 &lt;span class="s1"&gt;'{if ($1 &amp;lt; threshold) print "ALERT: Competitor price dropped to " $1}'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  | mail &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="s2"&gt;"Price Alert"&lt;/span&gt; me@example.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
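&lt;p&gt;One caveat with numeric comparisons: scraped prices often carry a currency symbol, so it helps to strip everything but the digits first. A rough sketch, assuming whole-number prices. &lt;code&gt;price_digits&lt;/code&gt; and the threshold value are made up for the example:&lt;/p&gt;

```shell
# Hypothetical helper: keep only the digits of a scraped price such as "¥199".
# Assumes whole-number prices; decimals would need extra handling.
price_digits() {
  printf '%s' "$1" | tr -cd '0-9'
  echo
}

threshold=250
price=$(price_digits '¥199')
if [ "$price" -lt "$threshold" ]; then
  echo "ALERT: Competitor price dropped to $price"
fi
```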





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Daily SEO report → generate Markdown → convert to PDF → email&lt;/span&gt;
&lt;span class="nb"&gt;cat&lt;/span&gt; /var/log/seo-&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; +%Y%m%d&lt;span class="si"&gt;)&lt;/span&gt;.log &lt;span class="se"&gt;\&lt;/span&gt;
  | pandoc &lt;span class="nt"&gt;-f&lt;/span&gt; markdown &lt;span class="nt"&gt;-o&lt;/span&gt; /tmp/seo-report.pdf &lt;span class="se"&gt;\&lt;/span&gt;
  | mail &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="s2"&gt;"Daily SEO Report"&lt;/span&gt; team@example.com &lt;span class="nt"&gt;-A&lt;/span&gt; /tmp/seo-report.pdf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each command does one thing well. Pipes chain them together. This is the Unix philosophy at work — and it's why CLI automation is more elegant than writing monolithic scripts.&lt;/p&gt;

&lt;p&gt;You don't need to write a new script for every new requirement. You just recombine existing commands with pipes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Common Concerns
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;"Won't automated actions get my account banned?"&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Keeping the frequency low reduces the risk. Cron's minimum granularity is one minute, but an hourly or daily schedule is much closer to what you'd do by hand. That said, some services (Google Search among them) restrict automated queries in their terms of service, so check the rules for each site you automate.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;"What if the operation fails?"&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Scripts can handle errors just fine. Cron can send emails on failure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Send email notification on failure&lt;/span&gt;
0 9 &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; /home/user/scripts/daily-seo.sh &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"SEO check failed"&lt;/span&gt; | mail &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="s2"&gt;"Alert"&lt;/span&gt; me@example.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or get more granular with &lt;code&gt;set -e&lt;/code&gt; and &lt;code&gt;trap&lt;/code&gt; inside the script:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-euo&lt;/span&gt; pipefail

cleanup&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
  &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Script failed at line &lt;/span&gt;&lt;span class="nv"&gt;$1&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | mail &lt;span class="nt"&gt;-s&lt;/span&gt; &lt;span class="s2"&gt;"SEO Script Error"&lt;/span&gt; me@example.com
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="nb"&gt;trap&lt;/span&gt; &lt;span class="s1"&gt;'cleanup $LINENO'&lt;/span&gt; ERR
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;"I don't have a server. Can I run cron locally?"&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Yes. macOS has &lt;code&gt;launchd&lt;/code&gt; (more powerful than cron), Linux has &lt;code&gt;systemd timers&lt;/code&gt;, and Windows has Task Scheduler. Cron itself is available on macOS and Linux, and on Windows through WSL.&lt;/p&gt;

&lt;p&gt;If you have a spare VPS (the $5/month kind), even better — upload your scripts, configure cron, and forget about it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Let's Do the Math
&lt;/h2&gt;

&lt;p&gt;Conservatively, here's how much time an SEO engineer spends on daily repetitive browser tasks:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Task&lt;/th&gt;
&lt;th&gt;Manual Time&lt;/th&gt;
&lt;th&gt;CLI Time&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Submit sitemap&lt;/td&gt;
&lt;td&gt;3 min&lt;/td&gt;
&lt;td&gt;2 sec&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Check 5 keyword rankings&lt;/td&gt;
&lt;td&gt;10 min&lt;/td&gt;
&lt;td&gt;10 sec&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Check index count&lt;/td&gt;
&lt;td&gt;2 min&lt;/td&gt;
&lt;td&gt;2 sec&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Check 10 backlinks&lt;/td&gt;
&lt;td&gt;15 min&lt;/td&gt;
&lt;td&gt;5 sec&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Publish to 3 platforms&lt;/td&gt;
&lt;td&gt;10 min&lt;/td&gt;
&lt;td&gt;15 sec&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Check competitor pricing&lt;/td&gt;
&lt;td&gt;5 min&lt;/td&gt;
&lt;td&gt;3 sec&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;45 min&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;37 sec&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;That's about 44 minutes saved per day. Over a 30-day month it adds up to roughly 22 hours, nearly three full workdays.&lt;/p&gt;
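&lt;p&gt;If you want to recheck the totals with your own numbers, the arithmetic fits in a few lines of shell. This sketch uses the figures from the table and assumes a 30-day month:&lt;/p&gt;

```shell
manual_min=$(( 3 + 10 + 2 + 15 + 10 + 5 ))   # manual minutes per day, from the table
cli_sec=$(( 2 + 10 + 2 + 5 + 15 + 3 ))       # CLI seconds per day, from the table
saved_sec=$(( manual_min * 60 - cli_sec ))   # net seconds saved per day
month_min=$(( saved_sec * 30 / 60 ))         # assuming a 30-day month

echo "Manual: $manual_min min, CLI: $cli_sec sec"
echo "Saved per day: $(( saved_sec / 60 )) min $(( saved_sec % 60 )) sec"
echo "Saved per month: $(( month_min / 60 )) h $(( month_min % 60 )) min"
```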

&lt;p&gt;You could spend that time writing code, doing analysis, or honestly, just taking a break. All of those are better than clicking around in a browser.&lt;/p&gt;

&lt;h2&gt;
  
  
  What You Can Do Today
&lt;/h2&gt;

&lt;p&gt;If you want to start right now, I'd suggest beginning with the &lt;strong&gt;smallest possible automation&lt;/strong&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Identify the browser task you repeat most often today&lt;/li&gt;
&lt;li&gt;Find a CLI command that does the same thing&lt;/li&gt;
&lt;li&gt;Verify the result is correct&lt;/li&gt;
&lt;li&gt;Add it to cron&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Don't try to automate everything at once. Automate one thing. Save some time. Then automate the next thing.&lt;/p&gt;

&lt;p&gt;Iterative. No rush.&lt;/p&gt;




&lt;p&gt;I've been using &lt;a href="https://github.com/yanqdinho/xbrowser" rel="noopener noreferrer"&gt;xbrowser&lt;/a&gt; for this — a browser automation CLI that wraps Puppeteer's capabilities into a command-line interface. It supports Google search, web scraping, SEO ping, and other common operations, and works well with cron for scheduled automation. If you have similar daily repetitive tasks, give it a try.&lt;/p&gt;

</description>
      <category>productivity</category>
      <category>automation</category>
      <category>webdev</category>
      <category>seo</category>
    </item>
    <item>
      <title>Playwright Is a Test Framework, Not an Automation Tool — Here's the Difference</title>
      <dc:creator>许映洲</dc:creator>
      <pubDate>Thu, 28 May 2026 15:57:15 +0000</pubDate>
      <link>https://dev.to/_ab214f84f83a01455a74b/playwright-is-a-test-framework-not-an-automation-tool-heres-the-difference-1158</link>
      <guid>https://dev.to/_ab214f84f83a01455a74b/playwright-is-a-test-framework-not-an-automation-tool-heres-the-difference-1158</guid>
      <description>&lt;p&gt;A friend asked me last week: "I want to search Google from the command line and get the results. How hard can it be?"&lt;/p&gt;

&lt;p&gt;"Easy," I said. "Just use Playwright."&lt;/p&gt;

&lt;p&gt;He came back the next day: "I spent all afternoon reading Playwright docs and I still haven't gotten it working. I'm at 30 lines of code and I don't even have the search results yet."&lt;/p&gt;

&lt;p&gt;I looked at his code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;chromium&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;playwright&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;chromium&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;launch&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;headless&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;newContext&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;newPage&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://www.google.com&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fill&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;input[name="q"]&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;node.js best practices&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;press&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;input[name="q"]&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Enter&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;waitForSelector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;#search&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;$&lt;/span&gt;&lt;span class="nf"&gt;$eval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.g&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;els&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;
    &lt;span class="nx"&gt;els&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;el&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;el&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;querySelector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;h3&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)?.&lt;/span&gt;&lt;span class="nx"&gt;textContent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;el&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;querySelector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;a&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)?.&lt;/span&gt;&lt;span class="nx"&gt;href&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}))&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;results&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;})();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Over 20 lines. Just to search Google.&lt;/p&gt;

&lt;p&gt;And that code is fragile — Google's search result page structure changes frequently, and those CSS selectors could break at any moment. When they do, you have to open the browser, debug the DOM, find new selectors, update the code, and run it again.&lt;/p&gt;

&lt;p&gt;"Isn't there a simpler way?"&lt;/p&gt;

&lt;p&gt;There is. But before we get to that, we need to clear up a fundamental misunderstanding.&lt;/p&gt;

&lt;h2&gt;
  
  
  Test Framework ≠ Automation Tool
&lt;/h2&gt;

&lt;p&gt;This is a distinction most people miss. Playwright, Selenium, Cypress — they are all &lt;strong&gt;test frameworks&lt;/strong&gt;, not &lt;strong&gt;automation tools&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;A test framework's core design goals are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Assertion-driven&lt;/strong&gt;: Verify that page behavior matches expectations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Report generation&lt;/strong&gt;: Produce HTML/JSON test reports&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parallel execution&lt;/strong&gt;: Run tests across multiple browsers and devices simultaneously&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CI integration&lt;/strong&gt;: Work within GitHub Actions, Jenkins, and other pipelines&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Debugging tools&lt;/strong&gt;: Trace viewers, screenshot diffing, video playback&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These features matter when you're writing tests. But if you just want to "search Google and get results"?&lt;/p&gt;

&lt;p&gt;You don't need assertions. You don't need test reports. You don't need parallel execution. You need: open page → extract data → move on.&lt;/p&gt;

&lt;p&gt;An automation tool's core design goals are completely different:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;CLI-driven&lt;/strong&gt;: One command, one operation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;One-shot execution&lt;/strong&gt;: No need for persistent test suites&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pipe-friendly&lt;/strong&gt;: Output can be processed by other commands&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fast feedback&lt;/strong&gt;: Seconds, not minutes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Script-friendly&lt;/strong&gt;: Embed directly in bash scripts or cron jobs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In other words: &lt;strong&gt;test frameworks care about "is it correct?" Automation tools care about "is it fast?"&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Three Approaches, One Task
&lt;/h2&gt;

&lt;p&gt;Let's compare three approaches using the same task: search Google for "node.js tutorial" and extract the top 10 results.&lt;/p&gt;

&lt;h3&gt;
  
  
  Approach 1: Playwright
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// search.js&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;chromium&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;playwright&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;chromium&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;launch&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;headless&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;newPage&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://www.google.com/search?q=node.js+tutorial&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;waitForSelector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;#search&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;$&lt;/span&gt;&lt;span class="nf"&gt;$eval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.g&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;elements&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;
    &lt;span class="nx"&gt;elements&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;el&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;el&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;querySelector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;h3&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)?.&lt;/span&gt;&lt;span class="nx"&gt;textContent&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;el&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;querySelector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;a&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)?.&lt;/span&gt;&lt;span class="nx"&gt;href&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;snippet&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;el&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;querySelector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.VwiC3b&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)?.&lt;/span&gt;&lt;span class="nx"&gt;textContent&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}))&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;results&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;})();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
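&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install &lt;/span&gt;playwright           &lt;span class="c"&gt;# Install the library&lt;/span&gt;
npx playwright &lt;span class="nb"&gt;install &lt;/span&gt;chromium  &lt;span class="c"&gt;# Download the browser (~300MB); recent Playwright releases no longer do this during npm install&lt;/span&gt;
node search.js                   &lt;span class="c"&gt;# Run&lt;/span&gt;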
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Cost&lt;/strong&gt;: ~300MB browser download + about 20 lines of code + ongoing selector maintenance.&lt;/p&gt;

&lt;h3&gt;
  
  
  Approach 2: Selenium
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# search.py
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;selenium&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;webdriver&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;selenium.webdriver.common.by&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;By&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;selenium.webdriver.support.ui&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;WebDriverWait&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;selenium.webdriver.support&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;expected_conditions&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;EC&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;

&lt;span class="n"&gt;driver&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;webdriver&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Chrome&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;driver&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://www.google.com/search?q=node.js+tutorial&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nc"&gt;WebDriverWait&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;driver&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;until&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;EC&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;presence_of_element_located&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;By&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="n"&gt;elements&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;driver&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find_elements&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;By&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CSS_SELECTOR&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.g&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)[:&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;el&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;elements&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;el&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find_element&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;By&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CSS_SELECTOR&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;h3&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;el&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find_elements&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;By&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CSS_SELECTOR&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;h3&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;
    &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;el&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find_element&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;By&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CSS_SELECTOR&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;a&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;get_attribute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;href&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;el&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find_elements&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;By&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CSS_SELECTOR&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;a&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;
    &lt;span class="n"&gt;snippet&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;el&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find_element&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;By&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CSS_SELECTOR&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.VwiC3b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;el&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find_elements&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;By&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CSS_SELECTOR&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;.VwiC3b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;snippet&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;snippet&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;indent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;driver&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;quit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;selenium          &lt;span class="c"&gt;# Install dependency&lt;/span&gt;
&lt;span class="c"&gt;# Selenium 4.6+ resolves a matching ChromeDriver automatically via Selenium Manager;&lt;/span&gt;
&lt;span class="c"&gt;# older versions need ChromeDriver downloaded separately&lt;/span&gt;
python search.py              &lt;span class="c"&gt;# Run&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Cost&lt;/strong&gt;: Install dependencies + a ChromeDriver (fetched automatically on Selenium 4.6+, manually before that) + about 25 lines of code + ongoing selector maintenance.&lt;/p&gt;

&lt;h3&gt;
  
  
  Approach 3: CLI Tool
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;xbrowser google search &lt;span class="s2"&gt;"node.js tutorial"&lt;/span&gt; &lt;span class="nt"&gt;--json&lt;/span&gt; &lt;span class="nt"&gt;--limit&lt;/span&gt; 10
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx xbrowser google search &lt;span class="s2"&gt;"node.js tutorial"&lt;/span&gt; &lt;span class="nt"&gt;--json&lt;/span&gt; &lt;span class="nt"&gt;--limit&lt;/span&gt; 10
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Cost&lt;/strong&gt;: One command.&lt;/p&gt;

&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Node.js Tutorial - W3Schools"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://www.w3schools.com/nodejs/"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"snippet"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Learn Node.js with our comprehensive tutorial..."&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"The Node.js Handbook - FreeCodeCamp"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://www.freecodecamp.org/news/the-nodejs-handbook/"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"snippet"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"This handbook is a getting-started guide to Node.js..."&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No browser to install. No code to write. No selectors to maintain.&lt;/p&gt;
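
&lt;p&gt;Because the output is plain JSON on stdout, any script can sit on the other side of a pipe. Here is a small Python sketch, assuming the array has the &lt;code&gt;title&lt;/code&gt; / &lt;code&gt;url&lt;/code&gt; / &lt;code&gt;snippet&lt;/code&gt; shape shown above. The script name and the &lt;code&gt;-&lt;/code&gt; argument convention are made up for illustration.&lt;/p&gt;

```python
import io
import json
import sys

# Sample copied from the output above so the sketch runs without the CLI.
SAMPLE = """[
  {"title": "Node.js Tutorial - W3Schools",
   "url": "https://www.w3schools.com/nodejs/",
   "snippet": "Learn Node.js with our comprehensive tutorial..."}
]"""


def to_tsv_lines(results):
    """Turn result dicts into tab-separated 'rank, title, url' lines."""
    lines = []
    for rank, item in enumerate(results, start=1):
        title = item.get("title", "").strip()
        url = item.get("url", "")
        lines.append(f"{rank}\t{title}\t{url}")
    return lines


if __name__ == "__main__":
    # Pass "-" to read from a real pipe; otherwise fall back to the sample.
    use_stdin = len(sys.argv) == 2 and sys.argv[1] == "-"
    stream = sys.stdin if use_stdin else io.StringIO(SAMPLE)
    for line in to_tsv_lines(json.load(stream)):
        print(line)
```

&lt;p&gt;Saved as, say, &lt;code&gt;results_to_tsv.py&lt;/code&gt;, it would slot in like this: &lt;code&gt;xbrowser google search "node.js tutorial" --json --limit 10 | python3 results_to_tsv.py -&lt;/code&gt;. The tab-separated lines can then go straight into a spreadsheet or a diff against yesterday's run.&lt;/p&gt;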

&lt;h2&gt;
  
  
  The Comparison Table
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;Playwright&lt;/th&gt;
&lt;th&gt;Selenium&lt;/th&gt;
&lt;th&gt;CLI Tool&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Purpose&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Test framework&lt;/td&gt;
&lt;td&gt;Test framework&lt;/td&gt;
&lt;td&gt;Automation tool&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Install size&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~300MB&lt;/td&gt;
&lt;td&gt;~150MB + Driver&lt;/td&gt;
&lt;td&gt;~50MB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Code per operation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;20-50 lines&lt;/td&gt;
&lt;td&gt;20-50 lines&lt;/td&gt;
&lt;td&gt;1 line&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Learning curve&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;High (Page Objects, Locators, Contexts)&lt;/td&gt;
&lt;td&gt;High (WebDriver protocol, wait strategies)&lt;/td&gt;
&lt;td&gt;Low (command-line args)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Selector maintenance&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Manual&lt;/td&gt;
&lt;td&gt;Manual&lt;/td&gt;
&lt;td&gt;Built-in&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Output format&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Raw (format yourself)&lt;/td&gt;
&lt;td&gt;Raw (format yourself)&lt;/td&gt;
&lt;td&gt;Native JSON&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Pipe support&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Requires extra handling&lt;/td&gt;
&lt;td&gt;Requires extra handling&lt;/td&gt;
&lt;td&gt;Native&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Best for&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Regression tests, E2E tests&lt;/td&gt;
&lt;td&gt;Compatibility tests, cross-browser&lt;/td&gt;
&lt;td&gt;Daily automation, data extraction&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Target audience&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;QA engineers&lt;/td&gt;
&lt;td&gt;QA engineers&lt;/td&gt;
&lt;td&gt;Developers, ops, content creators&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;CI integration&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Built-in&lt;/td&gt;
&lt;td&gt;Built-in&lt;/td&gt;
&lt;td&gt;Via shell scripts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Assertion capability&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Powerful&lt;/td&gt;
&lt;td&gt;Powerful&lt;/td&gt;
&lt;td&gt;None (use jq/awk)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  They're Not Replacements — They're Complements
&lt;/h2&gt;

&lt;p&gt;I'm not here to knock Playwright or Selenium. They're top-tier tools in their domains.&lt;/p&gt;

&lt;p&gt;If you're doing &lt;strong&gt;regression testing&lt;/strong&gt; — verifying 50 pages worth of core functionality before every release — Playwright is the right choice. You need assertions. You need test reports. You need the Trace Viewer to debug failing cases.&lt;/p&gt;

&lt;p&gt;If you're doing &lt;strong&gt;compatibility testing&lt;/strong&gt; (making sure your product works on Chrome, Firefox, and Safari), Selenium's broad browser and driver support is hard to beat.&lt;/p&gt;

&lt;p&gt;But if you're doing any of these:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Checking Google rankings on a schedule&lt;/li&gt;
&lt;li&gt;Scraping web data in bulk&lt;/li&gt;
&lt;li&gt;Submitting sitemaps automatically&lt;/li&gt;
&lt;li&gt;Monitoring competitor price changes&lt;/li&gt;
&lt;li&gt;Cross-posting content to multiple platforms&lt;/li&gt;
&lt;li&gt;Quickly searching and getting structured results from the command line&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You don't need a test framework. You need an &lt;strong&gt;automation tool&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Using a test framework for automation is like using a cannon to kill a mosquito. It works, but the cost is way too high.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to Use What
&lt;/h2&gt;

&lt;p&gt;Here's a simple heuristic:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You need to verify "is it correct?"&lt;/strong&gt; → Test framework&lt;br&gt;
&lt;strong&gt;You need to quickly "get something done"&lt;/strong&gt; → Automation tool&lt;/p&gt;

&lt;p&gt;More specifically:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use Playwright / Selenium when&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;E2E testing: Ensure signup flows and payment flows don't break&lt;/li&gt;
&lt;li&gt;Regression testing: Run full suite before every release&lt;/li&gt;
&lt;li&gt;Visual regression: Screenshot comparison to detect UI changes&lt;/li&gt;
&lt;li&gt;Performance testing: Measure page load times and interaction latency&lt;/li&gt;
&lt;li&gt;CI/CD integration: Run automatically in your pipeline&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Use CLI automation tools when&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Data extraction: Pull structured data from web pages&lt;/li&gt;
&lt;li&gt;SEO operations: Check rankings, submit links, verify indexing&lt;/li&gt;
&lt;li&gt;Content management: Bulk publishing, scheduled publishing&lt;/li&gt;
&lt;li&gt;Monitoring and alerting: Periodic page status checks, price change detection&lt;/li&gt;
&lt;li&gt;Daily chores: Search, screenshot, form filling, downloading&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;With shell scripts + CLI tools, you can even do things test frameworks aren't designed for:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Monitor competitor price every hour, auto-alert on Slack when it drops&lt;/span&gt;
watch &lt;span class="nt"&gt;-n&lt;/span&gt; 3600 &lt;span class="s1"&gt;'xbrowser crawl "$COMPETITOR_URL" --extract ".price" \
  | xargs -I{} bash -c "[[ {} -lt 99 ]] &amp;amp;&amp;amp; echo Price dropped to {} | slacksend"'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
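&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Daily ranking report at 9 AM, generate Markdown&lt;/span&gt;
&lt;span class="c"&gt;# (a crontab entry must be a single line: join the lines below or move the loop into a script)&lt;/span&gt;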
0 9 &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;for &lt;/span&gt;kw &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="s2"&gt;"js tutorial"&lt;/span&gt; &lt;span class="s2"&gt;"node guide"&lt;/span&gt; &lt;span class="s2"&gt;"react tips"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"### &lt;/span&gt;&lt;span class="nv"&gt;$kw&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; xbrowser google search &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$kw&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;--json&lt;/span&gt; &lt;span class="nt"&gt;--limit&lt;/span&gt; 3 &lt;span class="se"&gt;\&lt;/span&gt;
  | jq &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s2"&gt;".[] | &lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;- [&lt;/span&gt;&lt;span class="se"&gt;\(&lt;/span&gt;&lt;span class="s2"&gt;.title)](&lt;/span&gt;&lt;span class="se"&gt;\(&lt;/span&gt;&lt;span class="s2"&gt;.url))&lt;/span&gt;&lt;span class="se"&gt;\"&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;done&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /tmp/rank-report.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Playwright and Selenium aren't a good fit for this. It's technically possible with them, but it's not what they were designed for: they weren't built to "quickly execute one-off tasks."&lt;/p&gt;

&lt;h2&gt;
  
  
  My Recommendation
&lt;/h2&gt;

&lt;p&gt;If you're a &lt;strong&gt;developer&lt;/strong&gt; whose daily work involves browser operations but not testing:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Use CLI tools for 80% of your daily browser tasks&lt;/li&gt;
&lt;li&gt;Introduce Playwright only when you genuinely need testing&lt;/li&gt;
&lt;li&gt;Don't use Playwright for things CLI should handle — you'll just create maintenance burden&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If you're a &lt;strong&gt;QA engineer&lt;/strong&gt; whose primary job is testing:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Playwright or Selenium should be your primary tools&lt;/li&gt;
&lt;li&gt;But for occasional non-testing browser operations, CLI tools save a lot of time&lt;/li&gt;
&lt;li&gt;They complement each other. No conflict.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If you're an &lt;strong&gt;ops / content creator / SEO engineer&lt;/strong&gt; whose daily work is repetitive browser operations:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;CLI automation tools are your best friend&lt;/li&gt;
&lt;li&gt;Combined with cron and shell scripts, you can automate your entire daily workflow&lt;/li&gt;
&lt;li&gt;No need to learn a test framework. The command line is enough.&lt;/li&gt;
&lt;/ol&gt;




&lt;p&gt;If you're looking for a CLI browser automation tool, &lt;a href="https://github.com/yanqdinho/xbrowser" rel="noopener noreferrer"&gt;xbrowser&lt;/a&gt; is worth checking out. It wraps Playwright's browser control capabilities into a command-line interface — Google search, web scraping, SEO ping, all done in one line, with native JSON output and pipe support. Built for daily automation workflows and cron-friendly scheduled tasks.&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>testing</category>
      <category>webdev</category>
      <category>discuss</category>
    </item>
    <item>
      <title>My Web Scraper Died at 3 AM Because of reCAPTCHA</title>
      <dc:creator>许映洲</dc:creator>
      <pubDate>Thu, 28 May 2026 15:46:31 +0000</pubDate>
      <link>https://dev.to/_ab214f84f83a01455a74b/my-web-scraper-died-at-3-am-because-of-recaptcha-2191</link>
      <guid>https://dev.to/_ab214f84f83a01455a74b/my-web-scraper-died-at-3-am-because-of-recaptcha-2191</guid>
      <description>&lt;p&gt;3:17 AM. My phone buzzed.&lt;/p&gt;

&lt;p&gt;It was an alert from the monitoring system — the scheduled web scraper had crashed.&lt;/p&gt;

&lt;p&gt;I grabbed my laptop, VPN'd into the server, and pulled the logs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Error: ElementClickInterceptedException: element click intercepted
  by iframe element: &amp;lt;iframe src="https://www.google.com/recaptcha/..."&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The page was stuck on a reCAPTCHA challenge.&lt;/p&gt;

&lt;p&gt;Honestly, this wasn't the first time. Last month it was hCaptcha. The month before that, Cloudflare Turnstile. Same script every time: the target site upgrades their bot detection, my scraper gets caught off guard, data collection stops, and the downstream pipeline breaks.&lt;/p&gt;

&lt;p&gt;I stared at that CAPTCHA widget on the screen and thought: &lt;strong&gt;how am I supposed to deal with this?&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Approach 1: Go Stealth — Hide Your Automation Traces
&lt;/h2&gt;

&lt;p&gt;The first thing that comes to mind: "don't let them know you're a bot."&lt;/p&gt;

&lt;p&gt;The Puppeteer community has a popular solution: &lt;code&gt;puppeteer-extra-plugin-stealth&lt;/code&gt;. It patches various fingerprints that Headless Chrome exposes — &lt;code&gt;navigator.webdriver&lt;/code&gt;, Chrome DevTools Protocol artifacts, missing plugin lists, and so on.&lt;/p&gt;

&lt;p&gt;Here's what the code looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;puppeteer&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;puppeteer-extra&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// puppeteer-extra provides .use(); plain puppeteer does not&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;StealthPlugin&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;puppeteer-extra-plugin-stealth&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nx"&gt;puppeteer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;StealthPlugin&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;puppeteer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;launch&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;headless&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;newPage&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="c1"&gt;// Pretend to be human. Register the override before navigating so it runs&lt;/span&gt;
&lt;span class="c1"&gt;// ahead of the page's own detection scripts.&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluateOnNewDocument&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nb"&gt;Object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;defineProperty&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;navigator&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;webdriver&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://example.com/protected-page&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Sounds great, right? The problem is that &lt;strong&gt;it isn't guaranteed to work&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Cloudflare's bot detection team is continuously upgrading their systems. The maintainers of stealth plugins are essentially playing a cat-and-mouse game. You patch something today, they come up with a new detection method tomorrow.&lt;/p&gt;

&lt;p&gt;More critically, stealth only &lt;em&gt;reduces the probability&lt;/em&gt; of triggering a CAPTCHA — it doesn't eliminate it. For a scheduled task that needs to run 24/7, "it probably won't trigger" isn't good enough.&lt;/p&gt;
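
&lt;p&gt;To see why "probably won't trigger" falls short, here is a rough sketch of how a small per-run risk compounds over a day. The 2% per-run trigger rate and the hourly schedule are illustrative assumptions, not measurements, and the runs are treated as independent.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Chance of hitting at least one CAPTCHA across `runs` independent runs,
// given a per-run trigger probability `p`. Both inputs are hypothetical.
function chanceOfAtLeastOneCaptcha(p, runs) {
  return 1 - Math.pow(1 - p, runs);
}

// Assumption: stealth brings the per-run trigger rate down to 2%,
// and the job runs once per hour (24 runs per day).
const dailyRisk = chanceOfAtLeastOneCaptcha(0.02, 24);
console.log(dailyRisk.toFixed(2));
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Under those assumptions, an hourly job still has roughly a 38% chance of hitting at least one CAPTCHA on any given day.&lt;/p&gt;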

&lt;p&gt;&lt;strong&gt;Verdict: Useful, but not a silver bullet. Good for reducing trigger frequency, but not suitable as your only line of defense.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Approach 2: Pay to Bypass — CAPTCHA Solving Services
&lt;/h2&gt;

&lt;p&gt;If machines can't solve it, get a "smarter machine" to do it.&lt;/p&gt;

&lt;p&gt;Services like 2Captcha, Anti-Captcha, and CapSolver work on a simple principle: you send them the challenge (an image for classic CAPTCHAs, or the site key and page URL for reCAPTCHA), they either use AI to solve it or dispatch it to a real human to click through, and they send you back the answer or token.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;solve&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;2captcha-ts&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;bypassCaptcha&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;siteKey&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;solve&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;sitekey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;siteKey&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;pageurl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;url&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;recaptcha&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;token&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;querySelector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;#g-recaptcha-response&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;token&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;click&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;#submit-button&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each solve costs about $0.001 to $0.003. Doesn't sound like much. But let's do the math:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;One scraping job per day hits 1,000 pages&lt;/li&gt;
&lt;li&gt;30% of them trigger CAPTCHAs&lt;/li&gt;
&lt;li&gt;At the high-end price, that's 300 × $0.003 = $0.90/day, or about $27/month&lt;/li&gt;
&lt;/ul&gt;
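
&lt;p&gt;The same estimate as a small sketch, so you can plug in your own numbers. It assumes one job per day and a 30-day month; the function name and parameters are made up for illustration.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Monthly CAPTCHA-solving cost: pages per day, share of pages that
// trigger a CAPTCHA, and price per solve are all inputs you supply.
function monthlySolveCost(pagesPerDay, captchaRate, pricePerSolve, days = 30) {
  const solvesPerDay = pagesPerDay * captchaRate;
  return solvesPerDay * pricePerSolve * days;
}

// 1,000 pages/day with a 30% CAPTCHA rate, at both ends of the price range
console.log(monthlySolveCost(1000, 0.3, 0.001).toFixed(2)); // low end
console.log(monthlySolveCost(1000, 0.3, 0.003).toFixed(2)); // high end
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With the figures above this lands between about $9 and $27 per month.&lt;/p&gt;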

&lt;p&gt;A few dozen dollars a month is still within budget. But the real problems aren't about money:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Unstable success rates&lt;/strong&gt;: reCAPTCHA v3 scores users based on behavioral signals. The token returned by the solving service might not score high enough to pass verification.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Privacy concerns&lt;/strong&gt;: You're sending the target site's URL and site key to a third party.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;High latency&lt;/strong&gt;: From submission to result, it can take 10 seconds on the fast end, or a minute or two on the slow end — seriously dragging down scraping speed.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ethical gray area&lt;/strong&gt;: CAPTCHAs exist to distinguish humans from machines. Paying real humans to solve them for your bot... well, you see the issue.&lt;/li&gt;
&lt;/ol&gt;
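
&lt;p&gt;The latency point is easy to underestimate, so here is a back-of-the-envelope sketch. It reuses the 300 solves per job from the cost estimate and assumes solves are handled one at a time; a real scraper that solves in parallel would wait less.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;// Extra wall-clock time (in minutes) spent waiting on a solving service,
// assuming each solve blocks the scraper until the result comes back.
function solveOverheadMinutes(solves, secondsPerSolve) {
  return (solves * secondsPerSolve) / 60;
}

console.log(solveOverheadMinutes(300, 10));  // fast end: 10 s per solve
console.log(solveOverheadMinutes(300, 120)); // slow end: 2 min per solve
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;That works out to somewhere between 50 minutes and 10 hours of added waiting per job.&lt;/p&gt;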

&lt;p&gt;&lt;strong&gt;Verdict: Functional, but cost, stability, and compliance are all questionable. Suitable for small-scale, non-critical use cases.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Approach 3: Guerrilla Warfare — Rotate Through Proxy Pools
&lt;/h2&gt;

&lt;p&gt;Another approach: since too many requests from one IP trigger CAPTCHAs, just keep switching IPs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;puppeteer&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;puppeteer&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;proxyList&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;http://user:pass@proxy1:8080&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;http://user:pass@proxy2:8080&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;http://user:pass@proxy3:8080&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;crawlWithProxy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
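  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;proxy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;URL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;proxyList&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;floor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;proxyList&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;)]);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;puppeteer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;launch&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="c1"&gt;// Chrome ignores credentials in --proxy-server, so pass scheme, host, and port only&lt;/span&gt;
    &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;`--proxy-server=&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;proxy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;protocol&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;//&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;proxy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;host&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;newPage&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="c1"&gt;// Supply the proxy credentials separately&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;authenticate&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;username&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;proxy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;username&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;password&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;proxy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;password&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;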
  &lt;span class="c1"&gt;// ... scraping logic&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Anyone who's used proxy pools knows the drill:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Free proxies&lt;/strong&gt;: Don't bother. If you can even connect, consider yourself lucky. Speed and reliability are essentially zero.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Paid proxies&lt;/strong&gt;: Residential proxies have the best quality but are expensive — dozens of dollars per GB. Data center proxies are cheaper but easier to detect.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Inconsistent quality&lt;/strong&gt;: Some IP ranges are already blacklisted by major websites. Buying them is basically throwing money away.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A friend of mine does e-commerce data collection. He spends over $2,000/month on proxies alone. He once told me, with complete sincerity: "My proxy bill is ten times my server bill."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Verdict: Can reduce trigger frequency, but expensive, and effectiveness depends entirely on proxy quality. Suitable for teams with budgets.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Approach 4: Call for Help — Pause on CAPTCHA, Let a Human Handle It
&lt;/h2&gt;

&lt;p&gt;The three approaches above share a common flaw: &lt;strong&gt;they all try to "beat" the CAPTCHA&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;But think about it differently — CAPTCHAs exist for a reason. They're designed to tell humans and machines apart. So why not let a &lt;strong&gt;human&lt;/strong&gt; handle it?&lt;/p&gt;

&lt;p&gt;This is the core idea behind "human-in-the-loop" automation:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The automation script runs normally&lt;/li&gt;
&lt;li&gt;When a CAPTCHA is detected, &lt;strong&gt;pause immediately&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Notify a human (send a message, push a notification, open a preview link)&lt;/li&gt;
&lt;li&gt;The human manually completes the CAPTCHA in the browser&lt;/li&gt;
&lt;li&gt;The automation script resumes&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  CAPTCHA Detection Logic
&lt;/h3&gt;

&lt;p&gt;The first step is knowing when a CAPTCHA appears on the page. The detection logic is actually straightforward:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;CAPTCHA_SELECTORS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;iframe[src*="recaptcha"]&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;iframe[src*="hcaptcha"]&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;iframe[src*="challenges.cloudflare.com"]&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;iframe[src*="turnstile"]&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.g-recaptcha&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.h-captcha&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;#captcha&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;div[data-sitekey]&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;detectCaptcha&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;selector&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;CAPTCHA_SELECTORS&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;element&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;selector&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;element&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;box&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;element&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;boundingBox&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="na"&gt;detected&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;guessCaptchaType&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;selector&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
          &lt;span class="nx"&gt;selector&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;position&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;box&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;};&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="c1"&gt;// page.$ returns null when nothing matches; this only guards against errors such as a frame detached mid-navigation&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;detected&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;guessCaptchaType&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;selector&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;selector&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;recaptcha&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;reCAPTCHA&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;selector&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;hcaptcha&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;hCaptcha&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;selector&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;cloudflare&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;selector&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;turnstile&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Cloudflare Turnstile&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;unknown&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This covers the major CAPTCHA providers: Google reCAPTCHA, hCaptcha, and Cloudflare Turnstile. It uses iframe &lt;code&gt;src&lt;/code&gt; attributes and class names for feature matching — simple but effective.&lt;/p&gt;

&lt;h3&gt;
  
  
  What Happens After Detection?
&lt;/h3&gt;

&lt;p&gt;Once a CAPTCHA is detected, the key is to let a real person &lt;strong&gt;see the current page&lt;/strong&gt; and &lt;strong&gt;manually interact&lt;/strong&gt; with it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;handleCaptcha&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;captcha&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;detectCaptcha&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;captcha&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;detected&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`[CAPTCHA] &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;captcha&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;type&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; detected. Pausing for human intervention...`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// Generate a live preview URL&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;previewUrl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`http://localhost:9222/devtools/inspector.html?ws=localhost:9222`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="c1"&gt;// Send notification (Slack, Telegram, Discord, etc.)&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;sendNotification&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;CAPTCHA detected — manual intervention needed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`Type: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;captcha&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;type&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;\nPreview: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;previewUrl&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;urgent&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="c1"&gt;// Poll until CAPTCHA disappears (human has solved it)&lt;/span&gt;
  &lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2000&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;check&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;detectCaptcha&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;check&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;detected&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;[CAPTCHA] CAPTCHA resolved. Resuming...&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
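&lt;p&gt;The polling loop above waits indefinitely, which is fine while someone is on call but can leave a run hanging if nobody answers the notification. A minimal sketch of a bounded wait, assuming a hypothetical &lt;code&gt;waitUntilCleared&lt;/code&gt; helper (not part of the original script) that accepts any async check function:&lt;/p&gt;

```javascript
// Promise-based timer that ships with Node.js 15 and later.
const { setTimeout: delay } = require('timers/promises');

// Hypothetical helper: call isBlocked() every intervalMs until it reports
// false or timeoutMs has elapsed. Resolves to true if cleared, false on timeout.
async function waitUntilCleared(isBlocked, { intervalMs = 2000, timeoutMs = 10 * 60 * 1000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (deadline > Date.now()) {
    if (!(await isBlocked())) return true; // challenge is gone
    await delay(intervalMs);
  }
  return false; // nobody solved it in time
}
```

&lt;p&gt;Inside &lt;code&gt;handleCaptcha&lt;/code&gt;, the check function could simply wrap &lt;code&gt;detectCaptcha(page)&lt;/code&gt; and return its &lt;code&gt;detected&lt;/code&gt; flag. On a &lt;code&gt;false&lt;/code&gt; result the caller can skip the URL or re-send the alert.&lt;/p&gt;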



&lt;h3&gt;
  
  
  The Complete Workflow
&lt;/h3&gt;

&lt;p&gt;Embed the detection logic into your normal scraping flow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;smartCrawl&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;urls&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;puppeteer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;launch&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;headless&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt; &lt;span class="c1"&gt;// Note: non-headless mode&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;newPage&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;urls&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;waitUntil&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;networkidle2&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="c1"&gt;// Check for CAPTCHA after every page load&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;blocked&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;handleCaptcha&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;blocked&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="c1"&gt;// CAPTCHA handled, reload the page&lt;/span&gt;
      &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;reload&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;waitUntil&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;networkidle2&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// Normal scraping logic&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;querySelector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;h1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)?.&lt;/span&gt;&lt;span class="nx"&gt;textContent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;price&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;querySelector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.price&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)?.&lt;/span&gt;&lt;span class="nx"&gt;textContent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;saveData&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;randomInt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3000&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt; &lt;span class="c1"&gt;// Random delay to mimic human behavior&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
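&lt;p&gt;&lt;code&gt;smartCrawl&lt;/code&gt; calls &lt;code&gt;sleep&lt;/code&gt; and &lt;code&gt;randomInt&lt;/code&gt; without showing them here. If your script doesn't already define them, a minimal sketch could look like this (treating both bounds as inclusive is an assumption, not something the code above specifies):&lt;/p&gt;

```javascript
// Resolve after roughly ms milliseconds.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Random integer between min and max, both bounds inclusive.
function randomInt(min, max) {
  return Math.floor(Math.random() * (max - min + 1)) + min;
}
```

&lt;p&gt;With these in scope, &lt;code&gt;await sleep(randomInt(1000, 3000))&lt;/code&gt; pauses for one to three seconds between pages.&lt;/p&gt;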



&lt;h3&gt;
  
  
  Why This Approach Works
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Near-100% success rate&lt;/strong&gt;: A real human solves the CAPTCHA, so there's no automated "recognition failure" to worry about.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zero extra cost&lt;/strong&gt;: No proxy pools, no CAPTCHA solving services.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compliant&lt;/strong&gt;: A real person is operating the browser. Nothing is being "bypassed."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Resilient&lt;/strong&gt;: No matter how the target site upgrades their anti-bot system, the final step is always handled by a human.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The caveat, of course: &lt;strong&gt;this approach only works if your use case doesn't have to run fully unattended&lt;/strong&gt;. For most small-to-medium scraping tasks, occasional human intervention is perfectly acceptable.&lt;/p&gt;




&lt;h2&gt;
  
  
  Back to That 3 AM Alert
&lt;/h2&gt;

&lt;p&gt;Honestly, I don't really worry about CAPTCHAs anymore.&lt;/p&gt;

&lt;p&gt;My approach is simple: &lt;strong&gt;stealth plugin as the first line of defense to reduce trigger frequency, and when a CAPTCHA does appear, the script pauses and sends me a notification. I wake up, tap through it, and the scraper continues&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;It's cheaper than paying for CAPTCHA solving services, less hassle than maintaining proxy pools, and way more reliable than trying to outsmart the CAPTCHA.&lt;/p&gt;

&lt;p&gt;CAPTCHAs exist for a reason. They protect websites from being overwhelmed by malicious bots. That's a valid design goal. Instead of trying to "beat" them, make your automation smart enough to know when to &lt;strong&gt;ask for help&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;After all, &lt;strong&gt;the best code isn't code that can solve every problem — it's code that knows when to get a human involved&lt;/strong&gt;.&lt;/p&gt;




&lt;p&gt;If you want to try this "human-in-the-loop" approach, check out &lt;a href="https://github.com/xuyingzhou/xbrowser" rel="noopener noreferrer"&gt;xbrowser&lt;/a&gt; — it has built-in CAPTCHA detection and live preview. When a CAPTCHA is detected, it automatically pauses and generates a preview link so you can take over with one click.&lt;/p&gt;

</description>
      <category>automation</category>
      <category>monitoring</category>
      <category>programming</category>
      <category>webscraping</category>
    </item>
    <item>
      <title>I Just Wanted to Scrape One Page. Why Did I Write 50 Lines of Puppeteer?</title>
      <dc:creator>许映洲</dc:creator>
      <pubDate>Thu, 28 May 2026 15:37:25 +0000</pubDate>
      <link>https://dev.to/_ab214f84f83a01455a74b/i-just-wanted-to-scrape-one-page-why-did-i-write-50-lines-of-puppeteer-2mfa</link>
      <guid>https://dev.to/_ab214f84f83a01455a74b/i-just-wanted-to-scrape-one-page-why-did-i-write-50-lines-of-puppeteer-2mfa</guid>
      <description>&lt;p&gt;Last Friday at 4:30 PM, my product manager walked over: "Hey, can you grab the titles from the Hacker News homepage and send me an Excel file?"&lt;/p&gt;

&lt;p&gt;I thought: That's it? Five minutes tops.&lt;/p&gt;

&lt;p&gt;Two hours later, I was still debugging CSS selectors.&lt;/p&gt;

&lt;h2&gt;
  
  
  How Things Spiraled Out of Control
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1: Initialize the Project
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir &lt;/span&gt;hacker-news-scraper &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;cd &lt;/span&gt;hacker-news-scraper
npm init &lt;span class="nt"&gt;-y&lt;/span&gt;
npm &lt;span class="nb"&gt;install &lt;/span&gt;puppeteer
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Hit enter, waited three minutes. Puppeteer needs to download a full Chromium browser — over 200 MB. I stared at the progress bar and started questioning my life choices.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 2: Write the Code
&lt;/h3&gt;

&lt;p&gt;"It's just a &lt;code&gt;document.querySelectorAll&lt;/code&gt;, right?" That's what I thought. Then I opened my editor:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;puppeteer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;require&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;puppeteer&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;puppeteer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;launch&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;headless&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;--no-sandbox&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;--disable-setuid-sandbox&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;newPage&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://news.ycombinator.com&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;waitUntil&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;networkidle2&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;30000&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;waitForSelector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.titleline &amp;gt; a&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;titles&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;items&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;querySelectorAll&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.titleline &amp;gt; a&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;Array&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;items&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;el&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;el&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;textContent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;el&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;href&lt;/span&gt;
      &lt;span class="p"&gt;}));&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;titles&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Scraping failed:&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;finally&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;})();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I counted: about 30 lines. And this is the minimal version: no User-Agent spoofing, no retry logic, no proxy support, no concurrency control. Add all of that and you're well past 50 lines.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 3: Run It
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;node index.js
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Error: &lt;code&gt;Navigation timeout of 30000 ms exceeded&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Switched to &lt;code&gt;domcontentloaded&lt;/code&gt;, got past that. But then &lt;code&gt;waitForSelector&lt;/code&gt; timed out, because the &lt;code&gt;.storylink&lt;/code&gt; class I had first reached for no longer existed. Hacker News had silently replaced it with &lt;code&gt;.titleline&lt;/code&gt; at some point, and nobody sent me the memo.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 4: Debug
&lt;/h3&gt;

&lt;p&gt;Set &lt;code&gt;headless: false&lt;/code&gt;, watched the browser open. Oh right, the selector did change. Fixed it, ran it again, finally got results.&lt;/p&gt;
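&lt;p&gt;One way to soften this kind of breakage is to try a short list of candidate selectors instead of hard-coding one. The sketch below assumes a hypothetical &lt;code&gt;firstMatchingSelector&lt;/code&gt; helper; it relies only on Puppeteer's &lt;code&gt;page.$()&lt;/code&gt;, which resolves to &lt;code&gt;null&lt;/code&gt; when nothing matches:&lt;/p&gt;

```javascript
// Hypothetical helper: return the first selector that matches an element
// on the page, or null if none of the candidates match.
async function firstMatchingSelector(page, candidates) {
  for (const selector of candidates) {
    const handle = await page.$(selector); // null when nothing matches
    if (handle !== null) return selector;
  }
  return null;
}
```

&lt;p&gt;Called with &lt;code&gt;['.titleline &amp;gt; a', '.storylink']&lt;/code&gt;, it would pick whichever class the site currently serves. It doesn't help when a site switches to a name you never listed.&lt;/p&gt;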

&lt;h3&gt;
  
  
  Step 5: Wrap Up
&lt;/h3&gt;

&lt;p&gt;Formatted the data as CSV, sent it to the PM. Then deleted the project directory — because I knew the next time someone wanted to scrape a different website, none of this code would be reusable.&lt;/p&gt;
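&lt;p&gt;For reference, the CSV step can be sketched in a few lines. This assumes rows shaped like the &lt;code&gt;{ title, url }&lt;/code&gt; objects returned by the script above; the &lt;code&gt;toCsv&lt;/code&gt; name and the quoting rules (double quotes around fields containing commas, quotes, or line breaks) are illustrative choices, not the exact code I used:&lt;/p&gt;

```javascript
// Hypothetical helper: turn [{ title, url }, ...] into CSV text with a header row.
function toCsv(rows) {
  // Quote a field only when it contains a comma, a double quote, or a line break.
  const quote = (value) => {
    const text = String(value ?? '');
    return /[",\r\n]/.test(text) ? '"' + text.replace(/"/g, '""') + '"' : text;
  };
  const header = 'title,url';
  const lines = rows.map((row) => [row.title, row.url].map(quote).join(','));
  return [header, ...lines].join('\r\n');
}
```

&lt;p&gt;Writing the result to disk with &lt;code&gt;fs.writeFileSync('titles.csv', toCsv(titles))&lt;/code&gt; gives a file that Excel can open directly.&lt;/p&gt;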

&lt;p&gt;&lt;strong&gt;Total time: two hours.&lt;/strong&gt; For 30 titles.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Is "Simple" Browser Scraping So Complicated?
&lt;/h2&gt;

&lt;p&gt;Let's think about this calmly. Where does the complexity come from?&lt;/p&gt;

&lt;h3&gt;
  
  
  The Framework Is Overkill
&lt;/h3&gt;

&lt;p&gt;Puppeteer and Playwright are, at their core, general-purpose &lt;strong&gt;browser automation and testing tools&lt;/strong&gt;. They're built for jobs like complex E2E test suites: simulating user logins, filling out forms, verifying page states. Scraping webpage titles? That's maybe 1% of what they can do, but you pay the price for the other 99%.&lt;/p&gt;

&lt;p&gt;Installing Puppeteer literally installs an entire browser on your machine. It's like wanting to open a can of soup and having to assemble an entire kitchen first.&lt;/p&gt;

&lt;h3&gt;
  
  
  Starting from Scratch Every Time
&lt;/h3&gt;

&lt;p&gt;I wrote a scraper for Hacker News. Can I reuse it for Reddit? Nope. Different selectors, different loading strategies, different anti-bot measures. Every website is a brand new adventure.&lt;/p&gt;

&lt;p&gt;There's no "I scraped this site before" memory, no universal selector strategy, no ability to automatically adapt when pages change. Every single time, you start from zero.&lt;/p&gt;

&lt;h3&gt;
  
  
  The async/await Marathon
&lt;/h3&gt;

&lt;p&gt;Look at any Puppeteer script — it's a sea of &lt;code&gt;await&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;launch&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;newPage&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;waitForSelector&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;evaluate&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every single operation is asynchronous. Every one needs &lt;code&gt;await&lt;/code&gt;. I'm not saying async is bad — browser operations genuinely need to be async. But for an "open page, grab data" task, the cognitive overhead is excessive.&lt;/p&gt;

&lt;h3&gt;
  
  
  Error Handling Explosion
&lt;/h3&gt;

&lt;p&gt;Timeouts, missing elements, network errors, page redirects, SSL errors… every step can fail, every step needs a try-catch. A robust scraping script often has more error handling code than actual business logic.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;30000&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;TimeoutError&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Retry with a different waitUntil strategy?&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Actually broken?&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;waitForSelector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;sel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;10000&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// Selector changed? Page not loaded? Blocked by anti-bot?&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You think you're scraping data, but you're actually writing an error-handling framework.&lt;/p&gt;
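&lt;p&gt;To make that concrete, here is a sketch of the kind of helper these scripts tend to grow: a generic retry wrapper with a linearly increasing delay. The &lt;code&gt;withRetry&lt;/code&gt; name and its options are hypothetical, and it retries on any error, which a real script would probably want to narrow down:&lt;/p&gt;

```javascript
// Hypothetical helper: run an async task up to `attempts` times, waiting
// baseDelayMs * attemptNumber between tries, then rethrow the last error.
async function withRetry(task, { attempts = 3, baseDelayMs = 500 } = {}) {
  let lastError;
  for (let attempt = 1; attempts >= attempt; attempt += 1) {
    try {
      return await task(attempt);
    } catch (err) {
      lastError = err;
      if (attempt === attempts) break; // out of tries
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * attempt));
    }
  }
  throw lastError;
}
```

&lt;p&gt;Wrapping the &lt;code&gt;page.goto&lt;/code&gt; call in &lt;code&gt;withRetry&lt;/code&gt; removes one try-catch from the main flow, but it is still code you have to write, test, and carry from project to project.&lt;/p&gt;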

&lt;h3&gt;
  
  
  Not Reusable
&lt;/h3&gt;

&lt;p&gt;Switch to a different website and everything changes — selectors, loading strategies, anti-bot mechanisms. The only reusable part from your last script is the &lt;code&gt;puppeteer.launch()&lt;/code&gt; boilerplate. Everything else gets rewritten.&lt;/p&gt;

&lt;p&gt;It's like having to reinvent the knife every time you want to cook a meal.&lt;/p&gt;

&lt;h2&gt;
  
  
  What If Browser Operations Were as Simple as curl?
&lt;/h2&gt;

&lt;p&gt;curl is beautifully simple:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl https://api.github.com/users/octocat | jq &lt;span class="s1"&gt;'.login'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One line, you get your data. But curl has a fatal flaw: &lt;strong&gt;it doesn't execute JavaScript&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;It's 2026. A huge number of websites are client-side rendered. When you curl them, you get an empty HTML shell and a bunch of &lt;code&gt;&amp;lt;script&amp;gt;&lt;/code&gt; tags. The actual data only appears after a browser executes the JavaScript.&lt;/p&gt;

&lt;p&gt;So what we need is a &lt;strong&gt;curl that can execute JavaScript&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Not a testing framework. Not a browser automation library. Just a command-line tool. You give it a command, it gives you data. Done.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Can One Line Do?
&lt;/h2&gt;

&lt;p&gt;Let's go back to the Hacker News titles scenario:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;xbrowser scrape https://news.ycombinator.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. The page content in Markdown format goes straight to your terminal.&lt;/p&gt;

&lt;p&gt;Only want the titles? Add a selector:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;xbrowser goto https://news.ycombinator.com , text &lt;span class="nt"&gt;--selector&lt;/span&gt; &lt;span class="s2"&gt;".titleline"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Want JSON output?&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;xbrowser goto https://news.ycombinator.com , text &lt;span class="nt"&gt;--selector&lt;/span&gt; &lt;span class="s2"&gt;".titleline"&lt;/span&gt; &lt;span class="nt"&gt;--json&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No &lt;code&gt;npm init&lt;/code&gt;. No &lt;code&gt;async/await&lt;/code&gt;. No try-catch. One command, results come out.&lt;/p&gt;

&lt;h3&gt;
  
  
  Search Engine Results
&lt;/h3&gt;

&lt;p&gt;PM says: "Check where our company ranks on Google for 'AI agent'."&lt;/p&gt;

&lt;p&gt;The traditional approach? Fire up Puppeteer, simulate a search, parse the SERP page, handle Google's dynamic loading… another 50 lines right there.&lt;/p&gt;

&lt;p&gt;Now:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;xbrowser search &lt;span class="s2"&gt;"AI agent"&lt;/span&gt; &lt;span class="nt"&gt;--engine&lt;/span&gt; google &lt;span class="nt"&gt;--limit&lt;/span&gt; 10 &lt;span class="nt"&gt;--full&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Returns titles, URLs, and summaries. Supports Google, Bing, Baidu, DuckDuckGo — multiple engines out of the box.&lt;/p&gt;

&lt;h3&gt;
  
  
  Screenshots
&lt;/h3&gt;

&lt;p&gt;"Take a screenshot of this page."&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;xbrowser goto https://news.ycombinator.com , screenshot &lt;span class="nt"&gt;--full-page&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Full-page screenshot. No need to worry about browser window size, lazy-loaded images, or viewport settings.&lt;/p&gt;

&lt;h3&gt;
  
  
  Fill and Submit Forms
&lt;/h3&gt;

&lt;p&gt;"Test the signup flow."&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;xbrowser goto https://example.com/signup , fill &lt;span class="s2"&gt;"#email"&lt;/span&gt; &lt;span class="s2"&gt;"test@example.com"&lt;/span&gt; , fill &lt;span class="s2"&gt;"#password"&lt;/span&gt; &lt;span class="s2"&gt;"123456"&lt;/span&gt; , click &lt;span class="s2"&gt;"#submit"&lt;/span&gt; , screenshot
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Comma-separated command chain, one line. As natural as writing a shell pipeline.&lt;/p&gt;
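
&lt;p&gt;To make the chain format concrete, here is a minimal sketch of how such a line could be split into steps. It is a hypothetical parser written for illustration, not xbrowser's actual implementation: &lt;code&gt;parseChain&lt;/code&gt; is an invented name, and it assumes steps are separated by a comma with whitespace on both sides and that arguments are either bare words or double-quoted strings.&lt;/p&gt;

```javascript
// Hypothetical sketch: split a comma-separated command chain into
// { verb, args } steps. This is NOT xbrowser's real parser.
function parseChain(chain) {
  return chain
    .split(/\s+,\s+/)               // steps are separated by " , "
    .map((step) => step.trim())
    .filter((step) => step.length > 0)
    .map((step) => {
      // Match either a double-quoted argument or a run of non-space characters.
      const tokens = step.match(/"[^"]*"|\S+/g) || [];
      const [verb, ...rest] = tokens;
      // Strip the surrounding quotes from quoted arguments.
      const args = rest.map((t) => t.replace(/^"(.*)"$/, "$1"));
      return { verb, args };
    });
}

const steps = parseChain(
  'goto https://example.com/signup , fill "#email" "test@example.com" , click "#submit" , screenshot'
);
console.log(steps.map((s) => s.verb).join(" -> "));
```

&lt;p&gt;Each step ends up as plain data (a verb plus its arguments), which is roughly why a chain like this is easy to read back and replay.&lt;/p&gt;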

&lt;h3&gt;
  
  
  Monitor Page Changes
&lt;/h3&gt;

&lt;p&gt;"Notify me when this price drops below 500."&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;do
  &lt;/span&gt;xbrowser text &lt;span class="nt"&gt;--selector&lt;/span&gt; &lt;span class="s2"&gt;".price"&lt;/span&gt; | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-Eq&lt;/span&gt; &lt;span class="s2"&gt;"^[1-4]?[0-9]{1,2}$"&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; notify-send &lt;span class="s2"&gt;"Price dropped!"&lt;/span&gt;
  &lt;span class="nb"&gt;sleep &lt;/span&gt;3600
&lt;span class="k"&gt;done&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Integrates naturally with cron, shell scripts, CI/CD pipelines. Because it's a command-line tool, not an API library.&lt;/p&gt;
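
&lt;p&gt;If the extracted text carries a currency symbol or decimals, a numeric comparison is sturdier than a pattern match. A minimal sketch, assuming the price text looks like &lt;code&gt;¥499.00&lt;/code&gt; or &lt;code&gt;1,299.00&lt;/code&gt;; the helper names are illustrative and not part of any tool:&lt;/p&gt;

```javascript
// Illustrative helpers: turn scraped price text into a number and compare it
// against a threshold. The input formats are assumptions, not xbrowser output.
function parsePrice(text) {
  // Drop currency symbols, thousands separators, and whitespace.
  const cleaned = text.replace(/[^0-9.]/g, "");
  const value = Number.parseFloat(cleaned);
  return Number.isNaN(value) ? null : value;
}

function isBelow(text, threshold) {
  const value = parsePrice(text);
  // Unparseable text never triggers a notification.
  return value !== null ? threshold > value : false;
}

console.log(isBelow("¥499.00", 500)); // true
console.log(isBelow("1,299.00", 500)); // false
```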

&lt;h2&gt;
  
  
  It's Not Just About "Simple"
&lt;/h2&gt;

&lt;p&gt;You might be thinking: Isn't this just Puppeteer wrapped in a CLI?&lt;/p&gt;

&lt;p&gt;Not quite. There's a &lt;strong&gt;fundamentally different philosophy&lt;/strong&gt; behind this.&lt;/p&gt;

&lt;h3&gt;
  
  
  Waterfall vs. Faucet
&lt;/h3&gt;

&lt;p&gt;Puppeteer and Playwright are like a waterfall — powerful, but you have to stand underneath to collect water, and you'll get drenched in the process. You have to manage async operations, handle lifecycles, write boilerplate.&lt;/p&gt;

&lt;p&gt;A CLI tool should be like a faucet — turn it on, water comes out. Turn it off, it stops. Simple, direct, on-demand.&lt;/p&gt;

&lt;h3&gt;
  
  
  Framework vs. Tool
&lt;/h3&gt;

&lt;p&gt;A framework demands you think its way. You must understand its conceptual model: Browser → Page → Frame → Element, each step is async, each step can fail.&lt;/p&gt;

&lt;p&gt;A tool should think your way. What do you want? "Open this page" — &lt;code&gt;goto&lt;/code&gt;. "Get this text" — &lt;code&gt;text&lt;/code&gt;. "Take a screenshot" — &lt;code&gt;screenshot&lt;/code&gt;. Simple as that.&lt;/p&gt;

&lt;h3&gt;
  
  
  Programming Interface vs. Command Interface
&lt;/h3&gt;

&lt;p&gt;The flexibility of a programming interface (API) is irreplaceable — complex automation scenarios genuinely need fine-grained control. But for 80% of "open a page, grab some data" use cases, a command interface (CLI) is 10x more efficient.&lt;/p&gt;

&lt;p&gt;Think of it like Git: you &lt;em&gt;can&lt;/em&gt; use libgit2 to write a program that manipulates your repository, but most of the time you just run &lt;code&gt;git commit -m "xxx"&lt;/code&gt; and call it a day.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to Use What?
&lt;/h2&gt;

&lt;p&gt;To be clear: I'm not saying Puppeteer or Playwright are bad. They're incredibly powerful in their domain. The problem is using them for the wrong jobs.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Scenario&lt;/th&gt;
&lt;th&gt;Recommended Tool&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Scrape one page's data&lt;/td&gt;
&lt;td&gt;CLI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Extract search engine results&lt;/td&gt;
&lt;td&gt;CLI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Quick screenshot&lt;/td&gt;
&lt;td&gt;CLI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Integrate with shell scripts&lt;/td&gt;
&lt;td&gt;CLI&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Complex E2E test suites&lt;/td&gt;
&lt;td&gt;Playwright&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fine-grained browser control&lt;/td&gt;
&lt;td&gt;Puppeteer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Performance testing&lt;/td&gt;
&lt;td&gt;Lighthouse / k6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Large-scale crawling systems&lt;/td&gt;
&lt;td&gt;Scrapy / Custom&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Tools should fit the scenario, not the other way around. Using a sledgehammer to drive a nail isn't the hammer's fault — it's yours.&lt;/p&gt;

&lt;h2&gt;
  
  
  Back to That Friday Afternoon
&lt;/h2&gt;

&lt;p&gt;If I'd had this tool back then, my Friday would have gone like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;xbrowser scrape https://news.ycombinator.com &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; hn.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three seconds. Then I'd toss the Markdown file to the PM and get back to my actual work.&lt;/p&gt;

&lt;p&gt;Not because the technology is revolutionary, but because &lt;strong&gt;the tool matches the scale of the problem&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Scraping one page's titles should never require a full project setup.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;I built &lt;a href="https://github.com/dyyz1993/xbrowser" rel="noopener noreferrer"&gt;xbrowser&lt;/a&gt; to solve exactly this — a tool that turns browser operations into command-line commands. If you're also tired of writing full projects for one-off scraping tasks, give it a try.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>webdev</category>
      <category>productivity</category>
      <category>beginners</category>
    </item>
    <item>
      <title>My AI Agent Burned 26K Tokens Doing the Same Browser Task 10 Times</title>
      <dc:creator>许映洲</dc:creator>
      <pubDate>Thu, 28 May 2026 15:04:23 +0000</pubDate>
      <link>https://dev.to/_ab214f84f83a01455a74b/my-ai-agent-burned-26k-tokens-doing-the-same-browser-task-10-times-emp</link>
      <guid>https://dev.to/_ab214f84f83a01455a74b/my-ai-agent-burned-26k-tokens-doing-the-same-browser-task-10-times-emp</guid>
      <description>&lt;p&gt;Last week I ran an experiment: I asked my AI agent to search Juejin for "Rust tutorial" articles — once a day, for 10 days straight.&lt;/p&gt;

&lt;p&gt;The result? &lt;strong&gt;It consumed roughly 26,000 tokens on browser operations alone.&lt;/strong&gt; And every single day, the execution flow was identical: find the search box, type the keyword, wait for results, extract text. The AI treated each session like its first visit, fumbling from scratch every time.&lt;/p&gt;

&lt;p&gt;I did the math. At GPT-4o pricing, that's about &lt;strong&gt;$0.40–0.60&lt;/strong&gt; wasted. Not a fortune, but the real pain is that &lt;strong&gt;every token was spent on pure, unadulterated repetition.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Do AI Agents Burn So Many Tokens on Browser Tasks?
&lt;/h2&gt;

&lt;p&gt;Let's look at a typical scenario. I ask my AI agent to search CSDN for an article title:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;First attempt (exploration phase):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[Agent] Screenshots the page to analyze layout... ~800 tokens
[Agent] Tries to locate search box → guesses #search-box... ~600 tokens
[Agent] Selector doesn't exist, inspects DOM to find the real one... ~500 tokens
[Agent] Found it: input[placeholder="搜CSDN"]
[Agent] Types search term, waits for page load... ~400 tokens
[Agent] Extracts search result list text... ~300 tokens
Total: ~2,600 tokens
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Second attempt (exact same task):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[Agent] Screenshots the page to analyze layout... ~800 tokens (again)
[Agent] Locates search box... ~600 tokens (guessing again)
[Agent] Guesses wrong once... ~500 tokens (retrying again)
[Agent] Types, waits, extracts... ~700 tokens
Total: ~2,600 tokens
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;10 iterations = &lt;strong&gt;26,000 tokens&lt;/strong&gt;, with nearly identical execution paths every time.&lt;/p&gt;
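
&lt;p&gt;The arithmetic behind that figure is easy to check. The snippet below simply adds up the per-step estimates from the first-attempt trace; the property names are labels I made up for the steps.&lt;/p&gt;

```javascript
// Per-step token estimates copied from the first-attempt trace.
// The property names are illustrative labels only.
const firstAttempt = {
  screenshotAndLayout: 800,
  guessSearchBox: 600,
  inspectDomAfterMiss: 500,
  typeAndWait: 400,
  extractResults: 300,
};

const perRun = Object.values(firstAttempt).reduce((sum, n) => sum + n, 0);
const runs = 10;

console.log(perRun);        // 2600
console.log(perRun * runs); // 26000
```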

&lt;p&gt;I broke down the problem into four root causes:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. No Memory Between Sessions
&lt;/h3&gt;

&lt;p&gt;AI agents have zero cross-session memory. Yesterday it figured out that &lt;code&gt;input[placeholder="探索"]&lt;/code&gt; is Juejin's search box. Today? Gone. Forgotten. That same exploration process repeats, day after day.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Selector Guessing Is a Probabilistic Game
&lt;/h3&gt;

&lt;p&gt;When AI analyzes page structure — usually via screenshots or DOM snapshots — &lt;strong&gt;finding the right CSS selector is essentially guesswork.&lt;/strong&gt; Lucky guess? Great. Wrong guess? That's a few hundred tokens down the drain per retry. From my measurements, on complex pages the AI averages &lt;strong&gt;2–3 attempts&lt;/strong&gt; before landing on the correct selector.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Context Bloat from Page Data
&lt;/h3&gt;

&lt;p&gt;A typical webpage's HTML runs tens of kilobytes. A single screenshot is hundreds of kilobytes. Every browser operation shoves this data into the AI's context window. &lt;strong&gt;Just transmitting page structure costs 800–1,500 tokens per operation.&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Zero Reusability
&lt;/h3&gt;

&lt;p&gt;The results of the first exploration — selectors, workflows, error-handling experience — vanish entirely. &lt;strong&gt;Every session is Groundhog Day. Every time, the AI pays the same tuition.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Core Insight: Turn Exploration Results into Reusable Commands
&lt;/h2&gt;

&lt;p&gt;Once I understood the problem, the solution felt almost embarrassingly obvious:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Let the AI explore once. Record what it learns (selectors, flows). Invoke the recorded command every time after.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Concrete example. After the first Juejin search, I recorded this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;Juejin Search:
&lt;span class="p"&gt;-&lt;/span&gt; Search box selector: input[placeholder="探索"]
&lt;span class="p"&gt;-&lt;/span&gt; Result list: .content-main .entry
&lt;span class="p"&gt;-&lt;/span&gt; Flow: click search box → type keyword → press Enter → wait for load → extract text
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then I wrapped it into a CLI command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;juejin search &lt;span class="s2"&gt;"Rust 教程"&lt;/span&gt; &lt;span class="nt"&gt;--json&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;From that point on, the AI just constructs this command string. No more page analysis.&lt;/strong&gt;&lt;/p&gt;
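
&lt;p&gt;One way to picture the recorded knowledge is as a small data object plus a function that fills in the keyword. This is a sketch under assumptions: the field names and &lt;code&gt;buildCommand&lt;/code&gt; are hypothetical, and JSON quoting stands in for proper shell quoting.&lt;/p&gt;

```javascript
// Hypothetical shape for a recorded exploration. The selectors and flow are
// the ones noted in the article; the field names are illustrative only.
const juejinSearch = {
  site: "juejin",
  action: "search",
  selectors: {
    searchBox: 'input[placeholder="探索"]',
    resultList: ".content-main .entry",
  },
  flow: ["click searchBox", "type keyword", "press Enter", "wait for load", "extract text"],
};

// The agent only has to supply the keyword; everything else is replayed.
function buildCommand(recipe, keyword) {
  // JSON.stringify adds quotes and escapes inner quotes (a stand-in for
  // real shell quoting, which this sketch does not attempt).
  const quoted = JSON.stringify(keyword);
  return `${recipe.site} ${recipe.action} ${quoted} --json`;
}

console.log(buildCommand(juejinSearch, "Rust 教程")); // juejin search "Rust 教程" --json
```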

&lt;p&gt;Token cost comparison:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Per-operation tokens&lt;/th&gt;
&lt;th&gt;10x total&lt;/th&gt;
&lt;th&gt;Repetition rate&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;AI operates browser directly&lt;/td&gt;
&lt;td&gt;~2,600&lt;/td&gt;
&lt;td&gt;~26,000&lt;/td&gt;
&lt;td&gt;100%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Calls pre-built command&lt;/td&gt;
&lt;td&gt;~50&lt;/td&gt;
&lt;td&gt;~500&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;From 26,000 down to 500. A 98% reduction.&lt;/strong&gt; That 50 tokens is just the cost of the AI assembling the command string.&lt;/p&gt;

&lt;h2&gt;
  
  
  In Practice: From 3,000+ Tokens Down to 50
&lt;/h2&gt;

&lt;p&gt;Let me walk through a real scenario — "AI agent monitors the price of a JD.com product daily."&lt;/p&gt;

&lt;h3&gt;
  
  
  Before: AI manually operates the browser (~3,000 tokens/time)
&lt;/h3&gt;

&lt;p&gt;The AI generates code like this every single time:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Code the AI generates fresh every session&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;browser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;newPage&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;goto&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://item.jd.com/100012043978.html&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;waitUntil&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;domcontentloaded&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// AI guesses selectors... attempt 1&lt;/span&gt;
&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;price&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;$eval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.p-price span&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;el&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;el&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;textContent&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="k"&gt;catch&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;price&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// attempt 2&lt;/span&gt;
  &lt;span class="nx"&gt;price&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;$eval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.price J-p-100012043978&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;el&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;el&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;textContent&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="k"&gt;catch&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;price&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// attempt 3, inspect DOM...&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;html&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;content&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="c1"&gt;// parse HTML structure, re-locate...&lt;/span&gt;
  &lt;span class="nx"&gt;price&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;$eval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;[class*="price"] span&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;el&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;el&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;textContent&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// handle anti-scraping, wait for dynamic loading, catch errors...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This code looks reasonable, right? The problem is &lt;strong&gt;the AI regenerates this entire flow from scratch every time.&lt;/strong&gt; And in practice, AI-generated code is even more bloated — it adds excessive try-catch blocks, verbose comments, debug logging, because it's "thinking out loud" as it writes.&lt;/p&gt;

&lt;h3&gt;
  
  
  After: Wrapped as a command (~50 tokens/time)
&lt;/h3&gt;

&lt;p&gt;I packaged common browser operations into plugins. Now the AI agent just calls a CLI command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# The AI only needs to generate this one line&lt;/span&gt;
xbrowser jd price &lt;span class="nt"&gt;--url&lt;/span&gt; &lt;span class="s2"&gt;"https://item.jd.com/100012043978.html"&lt;/span&gt; &lt;span class="nt"&gt;--json&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Returns:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"price"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2999.00"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"originalPrice"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"3299.00"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"discount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"-9.1%"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"inStock"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"timestamp"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2025-05-28T10:30:00Z"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;50 tokens. Done.&lt;/strong&gt; The AI doesn't need to know JD's price selector. It doesn't need anti-scraping logic. It doesn't need to wait for dynamic content. The plugin handles all the dirty work internally.&lt;/p&gt;
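
&lt;p&gt;Structured output also means downstream logic stays trivial. As a rough sketch, the snippet below recomputes the discount from the two price fields in the sample response; &lt;code&gt;discountPercent&lt;/code&gt; is an illustrative helper, and the payload is trimmed to the fields it needs.&lt;/p&gt;

```javascript
// Sample payload trimmed from the response shown in the article;
// prices arrive as strings.
const sample = JSON.parse(
  '{"price": "2999.00", "originalPrice": "3299.00", "inStock": true}'
);

// Percentage change from the original price, rounded to one decimal place.
function discountPercent(item) {
  const now = Number(item.price);
  const before = Number(item.originalPrice);
  const change = ((now - before) / before) * 100;
  return `${change.toFixed(1)}%`;
}

console.log(discountPercent(sample)); // -9.1%
```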

&lt;h3&gt;
  
  
  Another example: searching content platforms
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Before: AI operates browser to search Juejin articles, ~2,600 tokens&lt;/span&gt;
&lt;span class="c"&gt;# After:&lt;/span&gt;
xbrowser juejin search &lt;span class="s2"&gt;"Rust async programming"&lt;/span&gt; &lt;span class="nt"&gt;--limit&lt;/span&gt; 5 &lt;span class="nt"&gt;--json&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Returns:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"results"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Rust 异步编程完全指南"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"author"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"xxx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://juejin.cn/post/7xxx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"likes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;328&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2025-05-25"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Clean. Structured. Predictable.&lt;/strong&gt; The AI gets JSON it can directly reason over — no more scraping text out of HTML.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Deeper Principle: Amortize Exploration Cost into Reusable Assets
&lt;/h2&gt;

&lt;p&gt;After using this approach for a month, I've arrived at a simple formula:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Total Token Cost = First Exploration Cost + Repeat Count × Per-operation Cost

Without encapsulation:  Total = 2,600 + N × 2,600
With encapsulation:     Total = 2,600 + N × 50

When N = 10:  saves ~89%
When N = 50:  saves ~96%
As N grows:   savings approach ~98%
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;That initial 2,600 tokens is a necessary investment&lt;/strong&gt; — the AI has to figure out the page structure at least once. The key insight: you only pay this tax &lt;strong&gt;once.&lt;/strong&gt;&lt;/p&gt;
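
&lt;p&gt;To see how the one-time cost gets amortized, here is a small sketch that plugs the article's estimates into both totals. With the exploration counted on both sides, the saving is roughly 89% at N = 10 and climbs toward about 98% as N grows. The constant and function names are mine.&lt;/p&gt;

```javascript
// Token totals for N repeat runs, using the article's estimates.
const EXPLORE = 2600; // one-time exploration cost
const REPLAY = 50;    // cost of emitting a pre-built command

function totals(n) {
  const without = EXPLORE + n * EXPLORE;
  const withCommand = EXPLORE + n * REPLAY;
  const saved = 1 - withCommand / without;
  // Round the saving to one decimal place, as a percentage.
  return { without, withCommand, savedPercent: Math.round(saved * 1000) / 10 };
}

for (const n of [10, 50, 1000]) {
  console.log(n, totals(n));
}
```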

&lt;p&gt;Here's an analogy: &lt;strong&gt;an AI agent doing browser tasks is like a new hire who orders lunch from the same restaurant every day, yet re-reads the menu, asks the waiter for recommendations, and agonizes over the choice every single time.&lt;/strong&gt; Encapsulating commands is like letting them memorize their regular order. Next time, they just say the dish name.&lt;/p&gt;

&lt;h2&gt;
  
  
  Plugin Architecture: One Plugin Per Website
&lt;/h2&gt;

&lt;p&gt;The natural extension of this approach is &lt;strong&gt;encapsulating by website.&lt;/strong&gt; Each plugin packages operations for a specific site, and the AI agent calls them on demand:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Search engines&lt;/span&gt;
xbrowser baidu search &lt;span class="s2"&gt;"AI Agent browser automation"&lt;/span&gt; &lt;span class="nt"&gt;--json&lt;/span&gt;
xbrowser google search &lt;span class="s2"&gt;"playwright token cost"&lt;/span&gt; &lt;span class="nt"&gt;--json&lt;/span&gt;
xbrowser bing search &lt;span class="s2"&gt;"CDP protocol guide"&lt;/span&gt; &lt;span class="nt"&gt;--json&lt;/span&gt;

&lt;span class="c"&gt;# Content platforms&lt;/span&gt;
xbrowser juejin search &lt;span class="s2"&gt;"Rust 教程"&lt;/span&gt; &lt;span class="nt"&gt;--limit&lt;/span&gt; 10 &lt;span class="nt"&gt;--json&lt;/span&gt;
xbrowser csdn search &lt;span class="s2"&gt;"TypeScript generics"&lt;/span&gt; &lt;span class="nt"&gt;--json&lt;/span&gt;
xbrowser zhihu search &lt;span class="s2"&gt;"how to learn system design"&lt;/span&gt; &lt;span class="nt"&gt;--json&lt;/span&gt;

&lt;span class="c"&gt;# E-commerce&lt;/span&gt;
xbrowser jd price &lt;span class="nt"&gt;--url&lt;/span&gt; &lt;span class="s2"&gt;"https://item.jd.com/xxx"&lt;/span&gt; &lt;span class="nt"&gt;--json&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each plugin is an npm package containing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Selector definitions&lt;/strong&gt; — which CSS selectors this website uses&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Operation flows&lt;/strong&gt; — what to do first, what next, how to handle errors&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data extraction&lt;/strong&gt; — how to transform raw HTML into structured JSON&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The AI doesn't need to know any of this.&lt;/strong&gt; It just needs to know "there's a command for that."&lt;/p&gt;
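
&lt;p&gt;As a rough mental model, the three parts could live in one object like the sketch below. This is a hypothetical layout, not xbrowser's real plugin API: the property names, the step format, and the shape of the scraped rows are all assumptions made for illustration.&lt;/p&gt;

```javascript
// Hypothetical plugin layout; this is NOT xbrowser's real plugin API.
const juejinPlugin = {
  // 1. Selector definitions
  selectors: {
    searchBox: 'input[placeholder="探索"]',
    resultItem: ".content-main .entry",
  },
  // 2. Operation flow, expressed as data a runner could replay
  searchFlow(keyword) {
    return [
      { op: "click", target: this.selectors.searchBox },
      { op: "type", target: this.selectors.searchBox, value: keyword },
      { op: "press", key: "Enter" },
      { op: "waitFor", target: this.selectors.resultItem },
    ];
  },
  // 3. Data extraction: raw scraped rows in, structured JSON out
  toJson(rows, limit) {
    return {
      results: rows.slice(0, limit).map((row) => ({
        title: row.title.trim(),
        url: row.url,
      })),
    };
  },
};

console.log(juejinPlugin.searchFlow("Rust").length);
```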

&lt;h2&gt;
  
  
  Scaling the Approach: Beyond Token Savings
&lt;/h2&gt;

&lt;p&gt;Once you start thinking this way, the benefits compound beyond just token efficiency.&lt;/p&gt;

&lt;h3&gt;
  
  
  Reliability
&lt;/h3&gt;

&lt;p&gt;An AI guessing selectors will fail sometimes. A plugin with hardcoded selectors doesn't guess — it &lt;strong&gt;knows.&lt;/strong&gt; When a website changes its layout, you update one plugin file instead of hoping the AI figures it out on the next attempt.&lt;/p&gt;

&lt;h3&gt;
  
  
  Speed
&lt;/h3&gt;

&lt;p&gt;Constructing a CLI command takes the AI one inference step. Operating a browser manually takes 5–10 inference steps (screenshot → analyze → guess → retry → act). &lt;strong&gt;That's not just token savings — it's latency savings.&lt;/strong&gt; Your agent responds faster.&lt;/p&gt;

&lt;h3&gt;
  
  
  Composability
&lt;/h3&gt;

&lt;p&gt;Once operations are CLI commands, you can compose them:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Search multiple platforms and compare results&lt;/span&gt;
xbrowser baidu search &lt;span class="s2"&gt;"Rust async"&lt;/span&gt; &lt;span class="nt"&gt;--json&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; baidu.json
xbrowser google search &lt;span class="s2"&gt;"Rust async"&lt;/span&gt; &lt;span class="nt"&gt;--json&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; google.json
diff &amp;lt;&lt;span class="o"&gt;(&lt;/span&gt;jq &lt;span class="s1"&gt;'.results[].title'&lt;/span&gt; baidu.json&lt;span class="o"&gt;)&lt;/span&gt; &amp;lt;&lt;span class="o"&gt;(&lt;/span&gt;jq &lt;span class="s1"&gt;'.results[].title'&lt;/span&gt; google.json&lt;span class="o"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Try doing that with raw browser automation. You'd need the AI to orchestrate multiple browser sessions, manage context windows, and merge results. &lt;strong&gt;With commands, it's just shell scripting.&lt;/strong&gt;&lt;/p&gt;
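
&lt;p&gt;The same comparison is just as short in a script once the output is JSON. A minimal sketch, assuming both files follow the &lt;code&gt;results[].title&lt;/code&gt; shape used in the &lt;code&gt;jq&lt;/code&gt; filter; the sample data is made up and &lt;code&gt;compareTitles&lt;/code&gt; is an illustrative name.&lt;/p&gt;

```javascript
// Compare titles from two result sets shaped like { results: [{ title }] }.
function compareTitles(a, b) {
  const titlesA = new Set(a.results.map((r) => r.title));
  const titlesB = new Set(b.results.map((r) => r.title));
  return {
    onlyInA: [...titlesA].filter((t) => !titlesB.has(t)),
    onlyInB: [...titlesB].filter((t) => !titlesA.has(t)),
    shared: [...titlesA].filter((t) => titlesB.has(t)),
  };
}

// Made-up sample data standing in for baidu.json and google.json.
const baidu = { results: [{ title: "Async Rust book" }, { title: "Tokio tutorial" }] };
const google = { results: [{ title: "Tokio tutorial" }, { title: "Rust async-std" }] };
console.log(compareTitles(baidu, google));
```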

&lt;h3&gt;
  
  
  Auditability
&lt;/h3&gt;

&lt;p&gt;When something goes wrong, you can inspect the command and its output directly. No need to replay the AI's entire chain-of-thought to figure out where the selector guess went wrong.&lt;/p&gt;

&lt;h2&gt;
  
  
  Some Broader Reflections
&lt;/h2&gt;

&lt;p&gt;Here's what I think is the deeper issue.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The biggest cost of AI agents isn't the API bill — it's the waste of repetitive labor.&lt;/strong&gt; A single browser operation costing 3,000 tokens isn't expensive. Repeating it 100 times is 300,000 tokens. And worse, these repetitions have &lt;strong&gt;zero learning value&lt;/strong&gt; — the AI explores the same page every time but doesn't get any smarter.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Converting one-time exploration cost into reusable assets is the right engineering direction for AI agents.&lt;/strong&gt; It's the same logic behind every abstraction we already use: we don't manually configure environments every deployment, so we have Docker. We don't manually compile every build, so we have CI/CD. AI agent browser operations need the same "encapsulation" mindset.&lt;/p&gt;

&lt;p&gt;Imagine if every commonly-used website had a pre-built command library — GitHub operations, Jira queries, Slack message reads, Confluence page fetches. How much would AI agent token efficiency improve? How much more reliable would agents become? (Pre-built commands don't guess wrong selectors.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;That's the leap from "AI agents that sort of work" to "AI agents that actually work reliably."&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;If you're building browser automation for AI agents, I'd suggest trying this encapsulation approach. I've been using &lt;a href="https://github.com/dyyz1993/xbrowser" rel="noopener noreferrer"&gt;xbrowser&lt;/a&gt; — it's an open-source CLI tool purpose-built for this pattern. Plugin-based, per-website commands, and it's on GitHub if you want to take a look.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>webdev</category>
      <category>automation</category>
    </item>
    <item>
      <title>I Replaced 50-Line Puppeteer Scripts with Single CLI Commands</title>
      <dc:creator>许映洲</dc:creator>
      <pubDate>Thu, 28 May 2026 00:48:33 +0000</pubDate>
      <link>https://dev.to/_ab214f84f83a01455a74b/i-replaced-50-line-puppeteer-scripts-with-single-cli-commands-3jlc</link>
      <guid>https://dev.to/_ab214f84f83a01455a74b/i-replaced-50-line-puppeteer-scripts-with-single-cli-commands-3jlc</guid>
      <description>&lt;h1&gt;
  
  
  I Replaced 50-Line Puppeteer Scripts with Single CLI Commands: Here's How
&lt;/h1&gt;

&lt;p&gt;Last month I spent 3 hours debugging a Puppeteer script. The task? Go to Hacker News, click the top story, scrape the content, and save it. That's it. Three actions. Three hours, because the selector changed, the page loaded slower than my timeout, and the async/await chain threw an error I couldn't reproduce locally.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F11t302dxq04mooi0ivmt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F11t302dxq04mooi0ivmt.png" alt="Hero - Terminal browser automation" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That's when I decided to build &lt;a href="https://xbrowser.dev" rel="noopener noreferrer"&gt;xbrowser&lt;/a&gt;, a CLI tool that treats browser automation as commands, not code. One line to search Google. One line to scrape any page. One line to chain a complete multi-step workflow. No scripts. No async management. No boilerplate. Think of it as the &lt;strong&gt;web scraping CLI&lt;/strong&gt; that sits between Playwright and curl, purpose-built for developers who need to interact with the web, not test it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# What took me 50 lines of Puppeteer&lt;/span&gt;
xbrowser &lt;span class="s2"&gt;"goto https://news.ycombinator.com , click '.titleline &amp;gt; a' , text"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. One command. Readable. Replayable. Done.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why I Couldn't Keep Using Puppeteer and Playwright
&lt;/h2&gt;

&lt;p&gt;Don't get me wrong: Playwright is incredible for testing. Selenium pioneered the space. But I'm not testing web apps. I'm &lt;em&gt;using&lt;/em&gt; the web: scraping competitors, checking SEO rankings, automating my social media posting, monitoring price changes. And &lt;strong&gt;every headless browser tutorial&lt;/strong&gt; I found was about testing, not about building real web scraping pipelines or automation workflows.&lt;/p&gt;

&lt;p&gt;For those tasks, the testing tools feel like using a sledgehammer to hang a picture frame:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The setup tax is real.&lt;/strong&gt; Every new task means running &lt;code&gt;npm init&lt;/code&gt;, installing dependencies, downloading a browser, and writing boilerplate. I just want to scrape one page. Why do I need a project?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Scripts don't compose.&lt;/strong&gt; My "scrape Hacker News" script doesn't help me "scrape Reddit." The selectors are different, the structure is different, but the core operation is the same: go to URL, extract content.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AI agents can't use them.&lt;/strong&gt; I build AI tools, and giving an LLM a 50-line async script to manage is a recipe for hallucinated selectors and broken promises. Agents need simple, declarative commands.&lt;/p&gt;

&lt;p&gt;I wanted &lt;code&gt;curl&lt;/code&gt; for interactive browser tasks. So I built it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Install Once, Automate Everything
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm i &lt;span class="nt"&gt;-g&lt;/span&gt; @dyyz1993/xbrowser
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's the entire setup. No WebDriver. No config files. xbrowser ships with its own managed Chromium that includes CDP fingerprint protection, so sites can't easily detect automation.&lt;/p&gt;

&lt;p&gt;From there, you have 35+ composable commands at your fingertips.&lt;/p&gt;

&lt;h2&gt;
  
  
  Feature 1: Search the Web From Your Terminal
&lt;/h2&gt;

&lt;p&gt;No API keys. No OAuth. No rate limits. Just search.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Google&lt;/span&gt;
xbrowser search &lt;span class="s2"&gt;"best headless browser 2026"&lt;/span&gt; &lt;span class="nt"&gt;--engine&lt;/span&gt; google &lt;span class="nt"&gt;--num&lt;/span&gt; 10

&lt;span class="c"&gt;# Bing&lt;/span&gt;
xbrowser search &lt;span class="s2"&gt;"best headless browser 2026"&lt;/span&gt; &lt;span class="nt"&gt;--engine&lt;/span&gt; bing &lt;span class="nt"&gt;--num&lt;/span&gt; 10

&lt;span class="c"&gt;# Baidu (for Chinese-language results)&lt;/span&gt;
xbrowser search &lt;span class="s2"&gt;"æ—&amp;nbsp;å¤´æµè§ˆå™¨è‡ªåŠ¨åŒ–"&lt;/span&gt; &lt;span class="nt"&gt;--engine&lt;/span&gt; baidu &lt;span class="nt"&gt;--num&lt;/span&gt; 10
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkai9n69mv81walukedoq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkai9n69mv81walukedoq.png" alt="Multi-engine search visualization" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Each command returns structured JSON: titles, URLs, snippets. Pipe to &lt;code&gt;jq&lt;/code&gt;, save to file, or feed directly into your AI agent's context.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Real use case:&lt;/strong&gt; I track how my open-source project ranks across search engines. Every Monday I run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;xbrowser search &lt;span class="s2"&gt;"xbrowser browser automation"&lt;/span&gt; &lt;span class="nt"&gt;--engine&lt;/span&gt; google &lt;span class="nt"&gt;--num&lt;/span&gt; 30 &lt;span class="se"&gt;\&lt;/span&gt;
  | jq &lt;span class="s1"&gt;'.results[] | select(.url | contains("xbrowser.dev")) | .position'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Takes 5 seconds. No script. No API key. Just results.&lt;/p&gt;
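
&lt;p&gt;If you would rather do the filtering in a script than in &lt;code&gt;jq&lt;/code&gt;, here is a minimal Python sketch of the same idea. It assumes the JSON has the shape the &lt;code&gt;jq&lt;/code&gt; filter above implies (a top-level &lt;code&gt;results&lt;/code&gt; list whose items have &lt;code&gt;url&lt;/code&gt; and &lt;code&gt;position&lt;/code&gt; fields). The sample data and the &lt;code&gt;positions_for&lt;/code&gt; helper are made up for illustration and are not part of xbrowser.&lt;/p&gt;

```python
import json

# Hypothetical sample shaped like the JSON the jq filter above expects:
# a top-level "results" list whose items carry "url" and "position".
SAMPLE = json.dumps({
    "results": [
        {"position": 1, "url": "https://example.com/tools", "title": "Tools"},
        {"position": 4, "url": "https://xbrowser.dev/docs", "title": "Docs"},
        {"position": 9, "url": "https://xbrowser.dev/", "title": "Home"},
    ]
})


def positions_for(raw_json, needle):
    """Return the position of every result whose URL contains `needle`."""
    data = json.loads(raw_json)
    return [
        item["position"]
        for item in data.get("results", [])
        if needle in item.get("url", "")
    ]


if __name__ == "__main__":
    # In real use, raw_json would be the text captured from the search command.
    print(positions_for(SAMPLE, "xbrowser.dev"))
```

&lt;p&gt;Returning a list rather than the first match keeps every ranking URL visible, which is what the &lt;code&gt;jq&lt;/code&gt; version does too.&lt;/p&gt;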

&lt;h2&gt;
  
  
  Feature 2: Scrape Without Writing a Scraper
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;scrape&lt;/code&gt; command handles JavaScript rendering, lazy-loaded content, and complex layouts, and outputs clean Markdown by default. It's the &lt;strong&gt;web scraping tool&lt;/strong&gt; I always wished Playwright had built in:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Any page â†’ clean Markdown&lt;/span&gt;
xbrowser scrape https://example.com/blog/my-article

&lt;span class="c"&gt;# Crawl an entire site (respects robots.txt)&lt;/span&gt;
xbrowser crawl https://example.com &lt;span class="nt"&gt;--depth&lt;/span&gt; 3 &lt;span class="nt"&gt;--max-pages&lt;/span&gt; 100

&lt;span class="c"&gt;# Generate a complete URL map&lt;/span&gt;
xbrowser map https://example.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fncefpysqhxv8mrh0vocp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fncefpysqhxv8mrh0vocp.png" alt="Web scraping concept" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I use &lt;code&gt;scrape&lt;/code&gt; daily for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Content research&lt;/strong&gt;: Scrape competitor articles → feed to LLM for analysis&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SEO auditing&lt;/strong&gt;: Map all URLs on a site, check for orphaned pages&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Documentation&lt;/strong&gt;: Scrape API docs and convert to Markdown for offline reading&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Web crawler workflows&lt;/strong&gt;: Chain scrape with crawl for bulk data extraction&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;code&gt;crawl&lt;/code&gt; command follows internal links, respects &lt;code&gt;robots.txt&lt;/code&gt;, deduplicates URLs, and handles SPA hash routes. It's an ethical, complete &lt;strong&gt;web crawler&lt;/strong&gt; in one command.&lt;/p&gt;
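
&lt;p&gt;To make those crawl rules concrete, here is a rough Python sketch of the filtering step using only the standard library. It is not xbrowser's implementation. The &lt;code&gt;plan_crawl&lt;/code&gt; helper, the &lt;code&gt;example-bot&lt;/code&gt; user agent, and the sample &lt;code&gt;robots.txt&lt;/code&gt; are all assumptions for illustration. Hash routes are kept as distinct URLs here, on the assumption that an SPA treats them as separate pages.&lt;/p&gt;

```python
from urllib.parse import urljoin, urlsplit
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would download it first.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
"""


def plan_crawl(start_url, hrefs, robots_txt, agent="example-bot"):
    """Return the internal, allowed, de-duplicated URLs found in `hrefs`."""
    rules = RobotFileParser()
    rules.parse(robots_txt.splitlines())
    host = urlsplit(start_url).netloc

    seen = set()
    queue = []
    for href in hrefs:
        url = urljoin(start_url, href)      # resolve relative links
        if urlsplit(url).netloc != host:    # skip external links
            continue
        if url in seen:                     # skip duplicates
            continue
        seen.add(url)
        if rules.can_fetch(agent, url):     # honor robots.txt
            queue.append(url)
    return queue


if __name__ == "__main__":
    links = ["/blog", "/blog", "https://other.example/x", "/admin/users"]
    print(plan_crawl("https://example.com/", links, ROBOTS_TXT))
```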

&lt;h2&gt;
  
  
  Feature 3: Chain Commands: The Real Magic
&lt;/h2&gt;

&lt;p&gt;This is the feature that makes people say "wait, you can do that?"&lt;/p&gt;

&lt;p&gt;Instead of writing scripts, you chain operations with 6 operators (&lt;code&gt;&amp;amp;&amp;amp;&lt;/code&gt;, &lt;code&gt;,&lt;/code&gt;, &lt;code&gt;+&lt;/code&gt;, &lt;code&gt;-&amp;gt;&lt;/code&gt;, &lt;code&gt;;&lt;/code&gt;, &lt;code&gt;||&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Go to a page, click the top link, extract text&lt;/span&gt;
xbrowser &lt;span class="s2"&gt;"goto https://news.ycombinator.com , click '.titleline &amp;gt; a:first-of-type' , text"&lt;/span&gt;

&lt;span class="c"&gt;# Complete workflow: navigate â†’ fill form â†’ submit â†’ extract&lt;/span&gt;
xbrowser &lt;span class="s2"&gt;"goto https://app.example.com/login &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="s2"&gt;
  + fill '#email' 'user@example.com' &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="s2"&gt;
  + fill '#password' 'secret' &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="s2"&gt;
  + click '#login' &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="s2"&gt;
  -&amp;gt; wait '#dashboard' &lt;/span&gt;&lt;span class="se"&gt;\&lt;/span&gt;&lt;span class="s2"&gt;
  , screenshot --output dashboard.png"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fct61ua277yexutjz32ru.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fct61ua277yexutjz32ru.png" alt="Command chaining pipeline" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The syntax reads like natural language. &lt;code&gt;&amp;amp;&amp;amp;&lt;/code&gt; means "then and only then" (stop on error). &lt;code&gt;,&lt;/code&gt; means "do all of these." &lt;code&gt;-&amp;gt;&lt;/code&gt; means "pipe to next." &lt;code&gt;||&lt;/code&gt; means "fallback if failed."&lt;/p&gt;
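
&lt;p&gt;If it helps to think in code, the sketch below models three of those meanings as plain Python functions: stop on error (&lt;code&gt;&amp;amp;&amp;amp;&lt;/code&gt;), pipe to next (&lt;code&gt;-&amp;gt;&lt;/code&gt;), and fallback (&lt;code&gt;||&lt;/code&gt;). This is only an illustration of the semantics as described above, under my own assumptions. It says nothing about how xbrowser actually parses or runs a chain, and every function name is hypothetical.&lt;/p&gt;

```python
def run_until_error(steps):
    """Stop-on-error chain: run steps in order, halt at the first failure."""
    done = []
    for step in steps:
        try:
            done.append(step())
        except Exception as err:
            return done, err
    return done, None


def pipe(value, *stages):
    """Pipe chain: feed each stage's result into the next stage."""
    for stage in stages:
        value = stage(value)
    return value


def run_with_fallback(primary, fallback):
    """Fallback chain: call `fallback` only when `primary` raises."""
    try:
        return primary()
    except Exception:
        return fallback()


# Hypothetical stand-ins for browser actions.
def load_page():
    return "loaded"


def click_missing_button():
    raise RuntimeError("selector not found")


if __name__ == "__main__":
    print(run_until_error([load_page, click_missing_button, load_page]))
    print(pipe("  Top Story  ", str.strip, str.upper))
    print(run_with_fallback(click_missing_button, load_page))
```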

&lt;p&gt;&lt;strong&gt;For AI agent developers&lt;/strong&gt;, this is transformative. Instead of generating 50-line scripts, your agent constructs a single chain string:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User: "Go to Hacker News, click the top story, and summarize it"

Agent builds:
xbrowser "goto https://news.ycombinator.com , click '.titleline &amp;gt; a:first-of-type' , text"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No async. No error handling boilerplate. No debugging. Just intent → command → result.&lt;/p&gt;
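
&lt;p&gt;On the agent side, building that string can be a few lines of glue. Here is a minimal Python sketch, assuming the agent plans its steps as tuples of a command name plus arguments. The tuple format, the &lt;code&gt;build_chain&lt;/code&gt; helper, and the naive quoting rule (wrap any argument containing a space in single quotes) are my own assumptions for illustration, not an xbrowser API.&lt;/p&gt;

```python
def build_chain(steps, operator=","):
    """Join (command, arg, ...) tuples into one chain string.

    Naive quoting: arguments containing a space are wrapped in single
    quotes. Arguments that themselves contain quotes are not handled.
    """
    parts = []
    for command, *args in steps:
        quoted = [f"'{arg}'" if " " in arg else arg for arg in args]
        parts.append(" ".join([command, *quoted]))
    return f" {operator} ".join(parts)


def to_argv(chain):
    """Build an argument list suitable for subprocess.run (not executed here)."""
    return ["xbrowser", chain]


if __name__ == "__main__":
    plan = [
        ("goto", "https://news.ycombinator.com"),
        ("click", ".titleline > a:first-of-type"),
        ("text",),
    ]
    print(to_argv(build_chain(plan)))
```

&lt;p&gt;Passing the chain as a single list element avoids a second round of shell quoting when the command is eventually launched.&lt;/p&gt;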

&lt;h2&gt;
  
  
  Feature 4: Record Once, Replay Forever
&lt;/h2&gt;

&lt;p&gt;Some workflows are too complex for a one-liner. That's where recording comes in:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Start recording (opens visible browser)&lt;/span&gt;
xbrowser record start &lt;span class="nt"&gt;--url&lt;/span&gt; https://example.com

&lt;span class="c"&gt;# Do your thing â€” click around, fill forms, navigate&lt;/span&gt;
&lt;span class="c"&gt;# xbrowser captures every action&lt;/span&gt;

&lt;span class="c"&gt;# Stop and save&lt;/span&gt;
xbrowser record stop &lt;span class="nt"&gt;--output&lt;/span&gt; my-workflow.yaml

&lt;span class="c"&gt;# Replay headlessly anytime&lt;/span&gt;
xbrowser replay my-workflow.yaml &lt;span class="nt"&gt;--headless&lt;/span&gt;

&lt;span class="c"&gt;# Export to Python, JavaScript, or Bash&lt;/span&gt;
xbrowser convert my-workflow.yaml &lt;span class="nt"&gt;--lang&lt;/span&gt; python &lt;span class="nt"&gt;--output&lt;/span&gt; workflow.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe48xh956uu6uqgwp4q8a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe48xh956uu6uqgwp4q8a.png" alt="Record and replay workflow" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I use this for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Daily standup reports&lt;/strong&gt;: Record the Jira → Confluence navigation once, replay every morning&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Price monitoring&lt;/strong&gt;: Record a competitor's pricing page, replay daily, diff the output&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Onboarding&lt;/strong&gt;: Record a complex internal tool setup, give new hires the replay script&lt;/li&gt;
&lt;/ul&gt;
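
&lt;p&gt;For the price-monitoring case, the "diff the output" step can be as small as the Python sketch below. It assumes you have saved yesterday's and today's replay output as plain text. The &lt;code&gt;price_changes&lt;/code&gt; helper and the sample pricing lines are hypothetical.&lt;/p&gt;

```python
import difflib


def price_changes(previous, current):
    """Return only the removed (-) and added (+) lines between two snapshots."""
    diff = difflib.unified_diff(
        previous.splitlines(),
        current.splitlines(),
        fromfile="yesterday",
        tofile="today",
        lineterm="",
        n=0,  # no context lines, only the changes
    )
    # The first two lines are the file headers; hunk markers start with "@@".
    body = list(diff)[2:]
    return [line for line in body if line.startswith(("+", "-"))]


if __name__ == "__main__":
    yesterday = "Basic: $10\nPro: $30\n"
    today = "Basic: $10\nPro: $35\n"
    for line in price_changes(yesterday, today):
        print(line)
```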

&lt;p&gt;The &lt;code&gt;convert&lt;/code&gt; command is particularly powerful: it auto-generates working Puppeteer/Playwright/Selenium scripts from your recorded actions. Record in the browser, ship as code.&lt;/p&gt;

&lt;h2&gt;
  
  
  Feature 5: 68 Plugins for Every Platform
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft9o61mx43ratf7b7ckic.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft9o61mx43ratf7b7ckic.png" alt="Plugin ecosystem" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;xbrowser ships with 68 built-in plugins that encapsulate site-specific knowledge:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;Plugins&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Search Engines&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Google, Bing, Baidu&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AI Assistants&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;DeepSeek, ChatGPT, Claude, Doubao, QianWen, YuanBao&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Social Media&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Twitter/X, Reddit, Quora, Weibo, Zhihu, XiaoHongShu, Douyin&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Developer&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;GitHub, Dev.to, Medium, Hashnode, CSDN, Juejin&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Image Platforms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Unsplash, Pexels, Pinterest, Getty, Shutterstock, and 15 more&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;E-Commerce&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Taobao, JD, 1688&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;SEO&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Backlink analysis, site audit, keyword tracking&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Each plugin provides high-level commands tailored to that platform:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# GitHub: Get any user's profile&lt;/span&gt;
xbrowser github get-profile &lt;span class="nt"&gt;--username&lt;/span&gt; torvalds

&lt;span class="c"&gt;# Unsplash: Search and download images&lt;/span&gt;
xbrowser unsplash search &lt;span class="s2"&gt;"mountain sunset"&lt;/span&gt; &lt;span class="nt"&gt;--download&lt;/span&gt; first

&lt;span class="c"&gt;# Doubao: Generate AI images&lt;/span&gt;
xbrowser doubao image &lt;span class="nt"&gt;--prompt&lt;/span&gt; &lt;span class="s2"&gt;"cyberpunk city at sunset"&lt;/span&gt; &lt;span class="nt"&gt;--cdp&lt;/span&gt; 9221

&lt;span class="c"&gt;# SEO: Audit any page&lt;/span&gt;
xbrowser seo audit https://your-website.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No selectors to write. No DOM to inspect. The plugin handles the complexity.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to Use xbrowser vs. Playwright vs. Selenium
&lt;/h2&gt;

&lt;p&gt;I'll be direct:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;xbrowser&lt;/th&gt;
&lt;th&gt;Playwright&lt;/th&gt;
&lt;th&gt;Selenium&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Best for&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Web tasks &amp;amp; automation&lt;/td&gt;
&lt;td&gt;App testing&lt;/td&gt;
&lt;td&gt;Legacy testing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Setup&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;npm i -g&lt;/code&gt; (1 step)&lt;/td&gt;
&lt;td&gt;npm + browser download&lt;/td&gt;
&lt;td&gt;npm + WebDriver&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Learning curve&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;CLI commands&lt;/td&gt;
&lt;td&gt;JavaScript API&lt;/td&gt;
&lt;td&gt;Language bindings&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Search/Scrape&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Built-in helpers&lt;/td&gt;
&lt;td&gt;Write it yourself&lt;/td&gt;
&lt;td&gt;Write it yourself&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Plugins&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;68 built-in&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Anti-detection&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Built-in CDP protection&lt;/td&gt;
&lt;td&gt;Third-party plugins&lt;/td&gt;
&lt;td&gt;External tools&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AI agent friendly&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ CLI commands&lt;/td&gt;
&lt;td&gt;❌ Scripts&lt;/td&gt;
&lt;td&gt;❌ Scripts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Test framework&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Not a test tool&lt;/td&gt;
&lt;td&gt;⭐ Best in class&lt;/td&gt;
&lt;td&gt;Good&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Use Playwright&lt;/strong&gt; for testing your web app. &lt;strong&gt;Use xbrowser&lt;/strong&gt; for everything else.&lt;/p&gt;

&lt;h2&gt;
  
  
  Real-World Workflows I Run Daily
&lt;/h2&gt;

&lt;p&gt;Here's what my actual automation looks like:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Morning competitive check:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;xbrowser search &lt;span class="s2"&gt;"xbrowser alternatives"&lt;/span&gt; &lt;span class="nt"&gt;--engine&lt;/span&gt; google &lt;span class="nt"&gt;--num&lt;/span&gt; 20 &lt;span class="nt"&gt;--json&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /tmp/competitors.json
xbrowser search &lt;span class="s2"&gt;"browser automation CLI"&lt;/span&gt; &lt;span class="nt"&gt;--engine&lt;/span&gt; bing &lt;span class="nt"&gt;--num&lt;/span&gt; 20 &lt;span class="nt"&gt;--json&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; /tmp/competitors.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
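
&lt;p&gt;One thing to watch with the append above: the file ends up holding two JSON documents back to back, so a plain &lt;code&gt;json.load&lt;/code&gt; will reject it. Here is a small Python sketch for reading such a file, assuming each &lt;code&gt;--json&lt;/code&gt; run writes exactly one JSON document (I have not confirmed the exact output format). The &lt;code&gt;read_json_documents&lt;/code&gt; helper is hypothetical.&lt;/p&gt;

```python
import json


def read_json_documents(text):
    """Decode every JSON document stored back to back in `text`."""
    decoder = json.JSONDecoder()
    documents = []
    text = text.strip()
    index = 0
    while index != len(text):
        document, index = decoder.raw_decode(text, index)
        documents.append(document)
        # raw_decode does not skip whitespace, so step over it manually.
        while index != len(text) and text[index].isspace():
            index += 1
    return documents


if __name__ == "__main__":
    # Hypothetical file content: two search runs appended to one file.
    combined = '{"results": [{"position": 1}]}\n{"results": [{"position": 2}]}\n'
    for doc in read_json_documents(combined):
        print(len(doc["results"]))
```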



&lt;p&gt;&lt;strong&gt;Weekly SEO audit:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;xbrowser map https://xbrowser.dev &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /tmp/sitemap.txt
xbrowser seo audit https://xbrowser.dev &lt;span class="nt"&gt;--json&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /tmp/seo-report.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Content research for blog posts:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;xbrowser scrape https://competitor-blog.com/latest-post | llm summarize
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;AI search intelligence&lt;/strong&gt; (query 14 AI engines at once):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;xbrowser ai-search-engines &lt;span class="s2"&gt;"how to do browser automation in 2026"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each of these would be a 30-80 line Puppeteer script. With xbrowser, they're single commands I can alias, schedule with cron, or embed in AI agent workflows.&lt;/p&gt;

&lt;h2&gt;
  
  
  Get Started in 30 Seconds
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm i &lt;span class="nt"&gt;-g&lt;/span&gt; @dyyz1993/xbrowser
xbrowser search &lt;span class="s2"&gt;"hello world"&lt;/span&gt; &lt;span class="nt"&gt;--engine&lt;/span&gt; google
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. Two commands and you're automating the web.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Docs &amp;amp; examples&lt;/strong&gt;: &lt;a href="https://xbrowser.dev" rel="noopener noreferrer"&gt;xbrowser.dev&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Source code&lt;/strong&gt;: &lt;a href="https://github.com/dyyz1993/xbrowser" rel="noopener noreferrer"&gt;github.com/dyyz1993/xbrowser&lt;/a&gt; (MIT license)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;68 plugins&lt;/strong&gt;: Search, scrape, crawl, record, replay, and automate 68+ platforms&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're tired of writing 50-line scripts for tasks that should take one command, or if you're building AI agents that need to browse the web, give it a try.&lt;/p&gt;

</description>
      <category>showdev</category>
      <category>webdev</category>
      <category>javascript</category>
      <category>tutorial</category>
    </item>
  </channel>
</rss>
