<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Nik G</title>
    <description>The latest articles on DEV Community by Nik G (@greythinkinglab).</description>
    <link>https://dev.to/greythinkinglab</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3904702%2F3205c46a-0e26-4bb7-a9d8-3a5d0acbf5d8.png</url>
      <title>DEV Community: Nik G</title>
      <link>https://dev.to/greythinkinglab</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/greythinkinglab"/>
    <language>en</language>
    <item>
      <title>I built a "polite scraper" Chrome extension instead of a server-side scraper. Here's why.</title>
      <dc:creator>Nik G</dc:creator>
      <pubDate>Fri, 01 May 2026 14:13:27 +0000</pubDate>
      <link>https://dev.to/greythinkinglab/i-built-a-polite-scraper-chrome-extension-instead-of-a-server-side-scraper-heres-why-254h</link>
      <guid>https://dev.to/greythinkinglab/i-built-a-polite-scraper-chrome-extension-instead-of-a-server-side-scraper-heres-why-254h</guid>
      <description>&lt;p&gt;Six weeks ago I started building SlotOwl — a Chrome extension that watches&lt;br&gt;
government appointment portals (visa, immigration, passport, Global Entry)&lt;br&gt;
and notifies you the moment a slot opens. This week I shipped it.&lt;/p&gt;

&lt;p&gt;This post is about ONE design decision I made early on that turned out to&lt;br&gt;
shape the whole product: I scrape inside the user's browser tab instead of&lt;br&gt;
on a server somewhere.&lt;/p&gt;

&lt;p&gt;If you're building anything that watches a third-party website on a user's&lt;br&gt;
behalf — appointment monitors, restock alerts, ticket trackers, hotel-rate&lt;br&gt;
watchers, anything — I think this pattern is worth considering.&lt;/p&gt;


&lt;h2&gt;
  
  
  The problem in 30 seconds
&lt;/h2&gt;

&lt;p&gt;Government appointment portals are a nightmare. US visa dropbox, Schengen&lt;br&gt;
visa, INM Mexico cita, passport renewals, Global Entry — all of them&lt;br&gt;
release slots at random hours, and slots get grabbed in 6 minutes.&lt;/p&gt;

&lt;p&gt;The existing tools to catch a slot fall into two camps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Manual&lt;/strong&gt; — sit on F5 for hours/days&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sketchy paid bots&lt;/strong&gt; — $50–200 services that ask for your portal
login and run a scraper on their server farm&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Camp 2 has three structural problems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Security:&lt;/strong&gt; sharing your portal login with a third party is, at
best, against the portal's ToS, and at worst gets your account locked&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reliability:&lt;/strong&gt; server-side scrapers get IP-banned constantly, breaking
for hundreds of users at once&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scale economics:&lt;/strong&gt; every user costs CPU + bandwidth on the operator's
servers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I wanted a third option. The simplest version of that idea: what if the&lt;br&gt;
scraper just... ran in the user's own already-logged-in browser tab?&lt;/p&gt;


&lt;h2&gt;
  
  
  The architecture
&lt;/h2&gt;

&lt;p&gt;Here's the whole thing on a napkin:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌────────────────────────────────────────────────┐
│  User's Chrome browser                         │
│                                                │
│ ┌─────────────┐           ┌──────────────────┐ │
│ │ Portal tab  │  ←poll──  │ Service worker   │ │
│ │ (logged in) │           │ (background)     │ │
│ └─────────────┘           └────────┬─────────┘ │
│                                    │           │
│                                    │ "slot found"
└────────────────────────────────────┼───────────┘
                                     │
                                     ▼
                        ┌─────────────────────────┐
                        │  Firebase Cloud Func    │
                        │  alertFanout            │
                        └────┬──────┬──────┬──────┘
                             │      │      │
                           email   push  desktop
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Important: the portal HTML never leaves the browser. The only thing that&lt;br&gt;
travels to my server is "workflow X went available at timestamp T". That's&lt;br&gt;
the entire payload.&lt;/p&gt;
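&lt;p&gt;To make "that's the entire payload" concrete, here is a minimal sketch of what such an event could look like. The helper name and the field names (&lt;code&gt;workflowId&lt;/code&gt;, &lt;code&gt;state&lt;/code&gt;, &lt;code&gt;observedAt&lt;/code&gt;) are assumptions for illustration, not SlotOwl's actual wire format.&lt;/p&gt;

```javascript
// Hypothetical helper: builds the minimal event sent to the backend.
// Note what is absent: no page HTML, no cookies, no credentials.
function buildAlertPayload(workflowId, state, now = new Date()) {
  return {
    workflowId,
    state,
    observedAt: now.toISOString(),
  };
}

// Example with a fixed timestamp so the output is reproducible.
const payload = buildAlertPayload("schengen-stockholm", "available", new Date(0));
console.log(JSON.stringify(payload));
// {"workflowId":"schengen-stockholm","state":"available","observedAt":"1970-01-01T00:00:00.000Z"}
```

&lt;p&gt;Keeping the payload to a fixed, tiny set of fields also makes the privacy claim easy to audit: anything not listed in the object never leaves the browser.&lt;/p&gt;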


&lt;h2&gt;
  
  
  Why this beats server-side scraping
&lt;/h2&gt;
&lt;h3&gt;
  
  
  1. Security: zero shared logins
&lt;/h3&gt;

&lt;p&gt;The user is already logged into the portal in their own browser. The&lt;br&gt;
extension's content script reads the page DOM in that tab. We never see&lt;br&gt;
the user's portal credentials, never store them, and never send them&lt;br&gt;
anywhere.&lt;/p&gt;

&lt;p&gt;If you're a security-minded user, you can audit the extension's source&lt;br&gt;
and verify this in 10 minutes. With a server-side competitor, you have&lt;br&gt;
to take their word for it.&lt;/p&gt;
&lt;h3&gt;
  
  
  2. Anti-ban: each user looks like one human
&lt;/h3&gt;

&lt;p&gt;Server-side scrapers funnel hundreds of users through a small pool of IPs&lt;br&gt;
and user agents. Portals notice this pattern within days and IP-ban the&lt;br&gt;
operator, breaking the service for everyone.&lt;/p&gt;

&lt;p&gt;When the scraper IS the user, that pattern disappears. Each user's&lt;br&gt;
traffic looks like — well, that user. There's nothing to fingerprint&lt;br&gt;
beyond "this person opens the portal page periodically", which is&lt;br&gt;
indistinguishable from a real user being slightly anxious.&lt;/p&gt;
&lt;h3&gt;
  
  
  3. Cost economics: zero per-user CPU on my server
&lt;/h3&gt;

&lt;p&gt;The polling happens on the user's machine. My only server-side&lt;br&gt;
work is the alert fan-out (an HTTP call → Firestore write → email +&lt;br&gt;
push). At 1000 active users, I estimate my Firebase bill at under&lt;br&gt;
$50/mo. A server-side equivalent would be running a small fleet of&lt;br&gt;
headless browsers around the clock.&lt;/p&gt;
&lt;h3&gt;
  
  
  4. Captcha resilience: the user solves it
&lt;/h3&gt;

&lt;p&gt;Portals often throw captchas to deter automation. A server-side scraper&lt;br&gt;
gets stuck or has to pipe the captcha to a human-solver service (slow,&lt;br&gt;
expensive, sketchy).&lt;/p&gt;

&lt;p&gt;In my model, when the polling script hits a captcha, the page state&lt;br&gt;
becomes "captcha required" and we fire that as the alert. The user&lt;br&gt;
solves the captcha (it's their browser!) and polling resumes. No&lt;br&gt;
automated solving, ever. By design.&lt;/p&gt;
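&lt;p&gt;One way to sketch the "captcha becomes an alert" idea is to alert only on transitions into a state the user cares about. The state names and the &lt;code&gt;shouldAlert&lt;/code&gt; helper below are assumptions for illustration, not the extension's real code.&lt;/p&gt;

```javascript
// States worth interrupting the user for (an assumption for this sketch).
const ALERT_STATES = new Set(["available", "captcha_required"]);

// Returns true only when the page moves INTO an alert-worthy state,
// so a captcha that stays on screen does not re-alert on every poll.
function shouldAlert(previousState, currentState) {
  if (previousState === currentState) return false;
  return ALERT_STATES.has(currentState);
}

console.log(shouldAlert("unavailable", "captcha_required"));      // true
console.log(shouldAlert("captcha_required", "captcha_required")); // false
console.log(shouldAlert("captcha_required", "unavailable"));      // false
```

&lt;p&gt;Once the user solves the captcha, the next poll sees an ordinary state again and the cycle continues without any automated solving.&lt;/p&gt;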


&lt;h2&gt;
  
  
  The downsides (because there always are some)
&lt;/h2&gt;
&lt;h3&gt;
  
  
  1. The user has to keep their browser open (or service-worker awake)
&lt;/h3&gt;

&lt;p&gt;Chrome aggressively suspends extension service workers. To keep polling&lt;br&gt;
running, I use the &lt;code&gt;chrome.alarms&lt;/code&gt; API (minimum period: 30 seconds since&lt;br&gt;
Chrome 120, 1 minute before that), which wakes the service worker briefly&lt;br&gt;
to do its check.&lt;/p&gt;

&lt;p&gt;This is reliable enough, but it does mean that if the user closes Chrome&lt;br&gt;
entirely, polling pauses. For most use cases this is fine: they only&lt;br&gt;
need monitoring during the 12 or so hours a day when slots could realistically open.&lt;/p&gt;
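&lt;p&gt;A minimal sketch of alarm-driven polling, assuming a 2-minute period and a hypothetical &lt;code&gt;checkPortal&lt;/code&gt; callback. &lt;code&gt;chrome.alarms.create&lt;/code&gt; and &lt;code&gt;chrome.alarms.onAlarm.addListener&lt;/code&gt; are the real extension APIs; the alarm name and the wrapper function are made up for illustration.&lt;/p&gt;

```javascript
const ALARM_NAME = "poll-portal"; // hypothetical alarm name

// Registers a repeating alarm and runs checkPortal each time it fires.
// chromeApi is passed in so the sketch can be exercised outside an extension;
// in a real service worker you would pass the global `chrome` object.
function registerPolling(chromeApi, checkPortal, periodInMinutes = 2) {
  chromeApi.alarms.create(ALARM_NAME, { periodInMinutes });
  chromeApi.alarms.onAlarm.addListener((alarm) => {
    // Ignore alarms that belong to other features of the extension.
    if (alarm.name === ALARM_NAME) checkPortal();
  });
}
```

&lt;p&gt;In a Manifest V3 extension this would be called at the top level of the service worker, so the listener is registered every time Chrome wakes the worker, and the manifest needs the &lt;code&gt;alarms&lt;/code&gt; permission.&lt;/p&gt;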
&lt;h3&gt;
  
  
  2. Per-user polls are slower than centralized polls
&lt;/h3&gt;

&lt;p&gt;A server farm could in theory check the portal every 10 seconds for&lt;br&gt;
all users at once, then fan out. My architecture polls every ~2 minutes&lt;br&gt;
per user, per workflow. The expected wait for the next poll is half the&lt;br&gt;
interval (about 60 seconds versus 5), so in theory the centralized version&lt;br&gt;
catches a slot about a minute faster on average.&lt;/p&gt;

&lt;p&gt;In practice, slot windows are 5–15 minutes wide on the portals I've&lt;br&gt;
tested, so a 2-min poll catches them comfortably. The structural&lt;br&gt;
benefits (security, anti-ban, cost) easily outweigh the polling lag.&lt;/p&gt;
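&lt;p&gt;A quick back-of-the-envelope sketch of the polling lag, assuming a slot opens at a uniformly random point in the polling cycle, so the expected wait until the next poll is half the interval. The function name is made up for illustration.&lt;/p&gt;

```javascript
// Expected wait until the next poll, assuming the slot opens at a uniformly
// random moment within the polling cycle: half the interval.
function expectedLagSeconds(pollIntervalSeconds) {
  return pollIntervalSeconds / 2;
}

const perUser = expectedLagSeconds(120);    // 60
const centralized = expectedLagSeconds(10); // 5
console.log(perUser - centralized);         // 55
```

&lt;p&gt;Under that assumption the average gap is just under a minute. The worst-case gap (a slot opening right after a 2-minute poll) is 110 seconds, still well inside a 5-minute slot window.&lt;/p&gt;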
&lt;h3&gt;
  
  
  3. Workflow definitions need to be portable
&lt;/h3&gt;

&lt;p&gt;Server-side scrapers can hard-code per-portal logic. I need users&lt;br&gt;
(or me) to define workflows declaratively, because the same extension&lt;br&gt;
runs against many portals.&lt;/p&gt;

&lt;p&gt;Solution: workflows are JSON definitions:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;schengen-stockholm&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;entryUrl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://visa.vfsglobal.com/swe/en/...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;selectors&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;match&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;no available slots&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;unavailable&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;match&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;available&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;available&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Anyone can define a new workflow without me shipping code. (In practice&lt;br&gt;
I curate the popular ones.)&lt;/p&gt;
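&lt;p&gt;Here is a rough sketch of how a definition like that could be evaluated against page text. The first-match-wins rule, the case-insensitive substring matching, and the &lt;code&gt;"unknown"&lt;/code&gt; fallback are my assumptions for illustration. The ordering is worth noticing: "no available slots" contains "available", so the more specific rule has to come first.&lt;/p&gt;

```javascript
const workflow = {
  id: "schengen-stockholm",
  selectors: [
    { match: "no available slots", state: "unavailable" },
    { match: "available", state: "available" },
  ],
};

// Returns the state of the first selector whose text appears in the page,
// or "unknown" when nothing matches.
function classifyPage(pageText, selectors) {
  const haystack = pageText.toLowerCase();
  const hit = selectors.find((s) => haystack.includes(s.match.toLowerCase()));
  return hit ? hit.state : "unknown";
}

console.log(classifyPage("Sorry, no available slots.", workflow.selectors)); // unavailable
console.log(classifyPage("3 appointments available", workflow.selectors));   // available
console.log(classifyPage("Please log in", workflow.selectors));              // unknown
```

&lt;p&gt;If the two selectors were swapped, the "no available slots" page would be classified as available, which is exactly the kind of false alert a slot monitor cannot afford.&lt;/p&gt;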


&lt;h2&gt;
  
  
  Stack details (the boring-but-useful section)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Extension:&lt;/strong&gt; Manifest V3, vanilla JS (no React/Vue — fewer build steps,
smaller bundle, faster to iterate). esbuild for bundling.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Backend:&lt;/strong&gt; Firebase Cloud Functions (Node 20). One function per
responsibility — &lt;code&gt;alertFanout&lt;/code&gt;, &lt;code&gt;linkMintToken&lt;/code&gt;, &lt;code&gt;linkConsumeToken&lt;/code&gt;,
&lt;code&gt;webPushSubscribe&lt;/code&gt;, &lt;code&gt;sendEmail&lt;/code&gt;, &lt;code&gt;getUsage&lt;/code&gt;, &lt;code&gt;joinWaitlist&lt;/code&gt;, etc.
Eleven functions total. Each is small enough to keep in your head.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Database:&lt;/strong&gt; Firestore. Workflows under &lt;code&gt;users/{uid}/workflows/{id}&lt;/code&gt;,
alert quotas under &lt;code&gt;users/{uid}/usage/{yyyy-mm}&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Email:&lt;/strong&gt; Resend. Way cleaner API than SES or Mailgun for transactional email.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cross-device push:&lt;/strong&gt; Web Push API + VAPID keys. I considered Firebase
Cloud Messaging but went with raw Web Push because (a) one fewer
dependency, (b) iOS Safari supports Web Push for Home Screen web apps
(since iOS 16.4), so Web Push works there natively. FCM would have meant another adapter.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Marketing site:&lt;/strong&gt; hand-rolled static site (no Next.js, no Nuxt). A
build script reads partials and writes the dist folder. Total weight
is ~30 KB CSS + 12 KB JS.&lt;/li&gt;
&lt;/ul&gt;
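&lt;p&gt;As a small illustration of the Firestore layout in the Database bullet, the monthly usage path could be derived like this. The helper name is hypothetical, and using UTC for the month boundary is an assumption; the layout itself does not say which timezone applies.&lt;/p&gt;

```javascript
// Builds the Firestore document path for a user's monthly usage counter,
// following the users/{uid}/usage/{yyyy-mm} layout.
// UTC is an assumption here, chosen so every device agrees on the month.
function usageDocPath(uid, date = new Date()) {
  const year = date.getUTCFullYear();
  const month = String(date.getUTCMonth() + 1).padStart(2, "0");
  return `users/${uid}/usage/${year}-${month}`;
}

console.log(usageDocPath("abc123", new Date(Date.UTC(2026, 4, 1))));
// users/abc123/usage/2026-05
```

&lt;p&gt;Keying the document by month means a quota resets simply because a new document is used, with no cleanup job required.&lt;/p&gt;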


&lt;h2&gt;
  
  
  What I'd do differently
&lt;/h2&gt;

&lt;p&gt;If I were starting over today, three things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Define the workflow JSON schema even more strictly, sooner.&lt;/strong&gt; I&lt;br&gt;
added a Zod schema in week 4. Should have done it day 1 — would have&lt;br&gt;
saved me from a class of "user submitted half-broken workflow" bugs.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Build the alert quota system before the alert system.&lt;/strong&gt; I built the&lt;br&gt;
alerts first, the quotas later. The day I added quotas, I had to&lt;br&gt;
retrofit every existing alert path. Quotas first would have been&lt;br&gt;
trivial.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Treat the privacy story as the marketing story from day 1.&lt;/strong&gt; The&lt;br&gt;
biggest objection to "an extension that watches portals" is "wait,&lt;br&gt;
is this safe?" Spending a week polishing the privacy policy&lt;br&gt;
wording is marketing work, not legal work.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
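&lt;p&gt;On the first point: the real schema is written with Zod, which I won't reproduce here. The sketch below is a dependency-free stand-in that shows the kind of check that catches a half-broken workflow before it is saved. The specific rules (https-only &lt;code&gt;entryUrl&lt;/code&gt;, non-empty &lt;code&gt;selectors&lt;/code&gt;) are assumptions for illustration.&lt;/p&gt;

```javascript
// Returns a list of problems; an empty list means the workflow looks usable.
function validateWorkflow(workflow) {
  if (workflow === null || typeof workflow !== "object") {
    return ["workflow must be an object"];
  }
  const problems = [];
  if (typeof workflow.id !== "string" || workflow.id.length === 0) {
    problems.push("id must be a non-empty string");
  }
  if (typeof workflow.entryUrl !== "string" || !workflow.entryUrl.startsWith("https://")) {
    problems.push("entryUrl must be an https URL");
  }
  if (!Array.isArray(workflow.selectors) || workflow.selectors.length === 0) {
    problems.push("selectors must be a non-empty array");
  } else {
    workflow.selectors.forEach((s, i) => {
      const broken = s === null || typeof s !== "object" ||
        typeof s.match !== "string" || typeof s.state !== "string";
      if (broken) problems.push(`selectors[${i}] needs string match and state`);
    });
  }
  return problems;
}

console.log(validateWorkflow({ id: "x", selectors: [] }));
// [ 'entryUrl must be an https URL', 'selectors must be a non-empty array' ]
```

&lt;p&gt;Returning every problem at once, rather than failing on the first, makes it easier to show a user exactly what to fix in a submitted workflow.&lt;/p&gt;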


&lt;h2&gt;
  
  
  What's next
&lt;/h2&gt;

&lt;p&gt;SlotOwl is currently in Chrome Web Store review (3 days in, fingers&lt;br&gt;
crossed). The waitlist is at &lt;a href="https://slotowl.app/" rel="noopener noreferrer"&gt;slotowl.app&lt;/a&gt;.&lt;br&gt;
If you (or anyone you know) are hunting for an appointment, please share.&lt;/p&gt;

&lt;p&gt;Honest about the future: I don't know yet whether this is a $1k MRR&lt;br&gt;
side project or a real business. I'll know more after the first 100&lt;br&gt;
real users tell me what they're willing to pay.&lt;/p&gt;

&lt;p&gt;If you're building something similar, or you've shipped a Chrome&lt;br&gt;
extension recently and have war stories, I'd love to hear from you.&lt;br&gt;
DMs are open on insta &lt;a class="mentioned-user" href="https://dev.to/greythinkinglab"&gt;@greythinkinglab&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;— Nik / greythinkinglab&lt;br&gt;
Day 19 of a 150-day solo-founder challenge. Onwards.&lt;br&gt;
&lt;a href="https://greythinkinglab.com" rel="noopener noreferrer"&gt;https://greythinkinglab.com&lt;/a&gt;&lt;/p&gt;

</description>
      <category>chrome</category>
      <category>webdev</category>
      <category>buildinpublic</category>
    </item>
  </channel>
</rss>
