<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: George Kioko</title>
    <description>The latest articles on DEV Community by George Kioko (@the_aientrepreneur_7ae85).</description>
    <link>https://dev.to/the_aientrepreneur_7ae85</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3819055%2Fd9abfd38-f5cf-4c9c-bb04-30b1ea57dd40.jpg</url>
      <title>DEV Community: George Kioko</title>
      <link>https://dev.to/the_aientrepreneur_7ae85</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/the_aientrepreneur_7ae85"/>
    <language>en</language>
    <item>
      <title>How I lost $540/month in 30 days to silent user churn (and didn't notice)</title>
      <dc:creator>George Kioko</dc:creator>
      <pubDate>Fri, 24 Apr 2026 04:34:05 +0000</pubDate>
      <link>https://dev.to/the_aientrepreneur_7ae85/how-i-lost-540month-in-30-days-to-silent-user-churn-and-didnt-notice-4m5c</link>
      <guid>https://dev.to/the_aientrepreneur_7ae85/how-i-lost-540month-in-30-days-to-silent-user-churn-and-didnt-notice-4m5c</guid>
      <description>&lt;p&gt;Last week my 30 day profit dropped from $268 to $50 and I assumed it was a bug in the dashboard.&lt;/p&gt;

&lt;p&gt;It wasn't a bug. It was me not paying attention for 29 days straight.&lt;/p&gt;

&lt;p&gt;This is a postmortem of how I lost roughly $540/month from two agency buyers who just quietly stopped running my actor on March 25. I found out on April 23. That's 29 days of ambient denial while every other part of my portfolio was also quietly rotting.&lt;/p&gt;

&lt;p&gt;Writing this partly so I remember the lesson, partly because I know at least three other Apify devs are about to make the same mistake.&lt;/p&gt;

&lt;h2&gt;
  
  
  The numbers
&lt;/h2&gt;

&lt;p&gt;I run a bunch of scrapers and APIs on Apify. Here's what the revenue split actually looked like in early April before things went sideways.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Actor&lt;/th&gt;
&lt;th&gt;Users&lt;/th&gt;
&lt;th&gt;Monthly revenue&lt;/th&gt;
&lt;th&gt;% of total&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Google Maps Lead Intel&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;$540&lt;/td&gt;
&lt;td&gt;83%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LinkedIn Employee Scraper&lt;/td&gt;
&lt;td&gt;37&lt;/td&gt;
&lt;td&gt;$42&lt;/td&gt;
&lt;td&gt;6.5%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;YouTube Transcript&lt;/td&gt;
&lt;td&gt;40&lt;/td&gt;
&lt;td&gt;$28&lt;/td&gt;
&lt;td&gt;4.3%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Google Scholar&lt;/td&gt;
&lt;td&gt;18&lt;/td&gt;
&lt;td&gt;$14&lt;/td&gt;
&lt;td&gt;2.1%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Email Validator API&lt;/td&gt;
&lt;td&gt;46&lt;/td&gt;
&lt;td&gt;$11&lt;/td&gt;
&lt;td&gt;1.7%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Website Intelligence API&lt;/td&gt;
&lt;td&gt;22&lt;/td&gt;
&lt;td&gt;$8&lt;/td&gt;
&lt;td&gt;1.2%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Everything else (5 actors)&lt;/td&gt;
&lt;td&gt;188&lt;/td&gt;
&lt;td&gt;$7&lt;/td&gt;
&lt;td&gt;1.1%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;353&lt;/td&gt;
&lt;td&gt;$650&lt;/td&gt;
&lt;td&gt;100%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Read that first row again. Two users. $540. More than the other 351 users combined, by a factor of about 5x.&lt;/p&gt;

&lt;p&gt;I told myself this was fine because the product was working and the buyers were happy. Both things were true in early March. Neither was true by late March. I just didn't know.&lt;/p&gt;

&lt;h2&gt;
  
  
  The silence
&lt;/h2&gt;

&lt;p&gt;March 25 was the last run either agency executed. I didn't flag it because nothing explicit broke. No error email, no angry message, no refund request. My Apify dashboard just showed fewer runs but the rolling 30 day number still looked okay because it was still averaging in the fat weeks from before.&lt;/p&gt;

&lt;p&gt;Here's roughly how the number moved:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Early April: $268 rolling 30d profit. I'm feeling smug.&lt;/li&gt;
&lt;li&gt;Mid April: $92. I figure maybe it's a slow week.&lt;/li&gt;
&lt;li&gt;April 22: $50. I finally open the run logs.&lt;/li&gt;
&lt;/ul&gt;
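&lt;p&gt;A minimal sketch of why a rolling number lags, using made-up figures rather than real dashboard data: a hypothetical buyer pays $18/day for 30 days and then stops. The function name &lt;code&gt;rollingSum&lt;/code&gt; and the daily amounts are assumptions for illustration.&lt;/p&gt;

```javascript
// Hypothetical series: $18/day for 30 days, then $0/day for 30 days.
const daily = [...Array(30).fill(18), ...Array(30).fill(0)];

// Sum of the `window` days ending at index `end` (inclusive).
function rollingSum(series, end, window = 30) {
    const start = Math.max(0, end - window + 1);
    return series.slice(start, end + 1).reduce((sum, v) => sum + v, 0);
}

// Day 29 is the last paying day; look again 1, 15 and 30 days later.
const snapshots = [29, 30, 44, 59].map((day) => rollingSum(daily, day));
console.log(snapshots); // [ 540, 522, 270, 0 ]
```

&lt;p&gt;Fifteen days after the last paying run, the window in this toy series still shows half the old number, which is easy to read as a slow week.&lt;/p&gt;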

&lt;p&gt;By the time I looked, the last run from either agency was 29 days ago. Whatever issue they had (I still don't fully know), they decided it wasn't worth telling me about. They just left.&lt;/p&gt;

&lt;p&gt;Agency users don't complain. They just stop paying. If you're building for them, burn that into your forehead.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I found when I actually looked
&lt;/h2&gt;

&lt;p&gt;This is the part that made me feel physically ill. Once I started doing a proper audit of my portfolio, I found that 6 of my other 10 monetized actors were broken in some way. Not all catastrophic. Some were returning partial data. One had a silently failing selector from a site redesign in February. One was charging $0 per run because of a broken &lt;code&gt;Actor.charge()&lt;/code&gt; signature I'd introduced in a refactor.&lt;/p&gt;

&lt;p&gt;Let me repeat that: I had actors that were executing successfully, returning data to users, and billing them exactly nothing. For weeks.&lt;/p&gt;

&lt;p&gt;If one of those 2 agencies had tried a second actor of mine during that period, they'd have gotten rot. That's probably why they didn't come back.&lt;/p&gt;

&lt;p&gt;The root cause wasn't the bugs though. Bugs happen. The root cause was that my attention was entirely on the $25/run cash cow because it was paying the bills. The cash cow was hiding the state of the herd.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why 2 agency users paid more than 351 devs
&lt;/h2&gt;

&lt;p&gt;I want to sit with this one because I think most solo devs misunderstand it.&lt;/p&gt;

&lt;p&gt;My Google Maps Lead Intel actor charges $25 per successful run. It scrapes a geography, enriches each business with a website audit, scores them, and hands the agency a ranked list of cold outreach targets. One agency was running it on a schedule against 40 US cities per week. At $25 a pop, that's serious money.&lt;/p&gt;

&lt;p&gt;The 351 devs on my other actors were paying $0.003 to $0.01 per row. They're hobbyists, students, one guy building a thesis scraper. They're lovely. They're also economically irrelevant to whether I can pay rent.&lt;/p&gt;

&lt;p&gt;Two lessons fell out of this.&lt;/p&gt;

&lt;p&gt;First, your paying users and your popular users are almost never the same people. Popularity on Apify Store is a vanity metric. Agency retention is the only metric that buys groceries.&lt;/p&gt;

&lt;p&gt;Second, concentrated revenue is fragile in ways that only hurt you once. When I had 2 whales, my revenue was 83% dependent on their mood. The moment either whale left, my month was destroyed. Worse, because they paid so much, I built no alerting around them. I assumed I'd notice. I did not notice.&lt;/p&gt;

&lt;p&gt;If you're reading this as an agency owner looking for scraping tools, what you actually want is a vendor who is not dependent on you. Someone with 50 paying clients will answer your ticket faster than someone with 2, because the guy with 2 is terrified of you and therefore weirdly slow to respond to bad news.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'm changing
&lt;/h2&gt;

&lt;p&gt;Not writing a manifesto. Just the four things I'm doing this week.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Push alerting on every run.&lt;/strong&gt; Apify has webhooks. I never wired them up because my dashboard was enough. It wasn't. Every successful run, every failed run, every billing event now pings a private Telegram channel I actually read. Here's the whole snippet, it's embarrassingly short:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// webhooks config in actor.json points at this endpoint&lt;/span&gt;
&lt;span class="c1"&gt;// payload is whatever Apify sends plus the run metadata&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;notify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;eventType&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;eventData&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;resource&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`[&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;eventType&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;] &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;actId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; run &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;\n`&lt;/span&gt;
               &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s2"&gt;`status: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;\n`&lt;/span&gt;
               &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s2"&gt;`charged: $&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;usageTotalUsd&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`https://api.telegram.org/bot&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;TG_TOKEN&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/sendMessage`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;chat_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;TG_CHAT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;end&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. If I'd had this on March 25, I'd have noticed the absence of runs within 48 hours, not 29 days. If you run anything that bills users, stop reading and go wire this up. Seriously.&lt;/p&gt;
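&lt;p&gt;A ping on every run only helps if someone notices the pings stopping, so the absence check can be made explicit too. A minimal sketch, assuming the latest run timestamp per actor is already stored somewhere: the function name &lt;code&gt;findSilentActors&lt;/code&gt;, the actor names, the timestamps and the 48 hour threshold are all hypothetical.&lt;/p&gt;

```javascript
// Flags actors whose most recent run is older than `maxSilenceHours`.
// `lastRuns` maps an actor name to the ISO timestamp of its latest run
// (how that map gets filled, e.g. from stored webhook payloads, is not shown).
function findSilentActors(lastRuns, now, maxSilenceHours = 48) {
    const limitMs = maxSilenceHours * 60 * 60 * 1000;
    return Object.entries(lastRuns)
        .filter(([, iso]) => now.getTime() - new Date(iso).getTime() > limitMs)
        .map(([name]) => name);
}

// Hypothetical data: one actor silent for 72 hours, one for 25 hours.
const silent = findSilentActors(
    {
        'google-maps-lead-intel': '2026-03-25T09:00:00Z',
        'email-validator': '2026-03-27T08:00:00Z',
    },
    new Date('2026-03-28T09:00:00Z'),
);
console.log(silent); // [ 'google-maps-lead-intel' ]
```

&lt;p&gt;Run on a schedule, the returned list could go to the same Telegram channel as the per run alerts.&lt;/p&gt;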

&lt;p&gt;&lt;strong&gt;Weekly self test of every monetized actor.&lt;/strong&gt; Every Sunday I run each of my paid actors against a known input and diff the output against last week's. If the schema changes or the row count collapses, I know before the user does. This is stupid simple and I should have been doing it from day one.&lt;/p&gt;
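&lt;p&gt;The diff itself can be sketched in a few lines. This is an illustration under stated assumptions, not the exact check described above: &lt;code&gt;diffRuns&lt;/code&gt; is a hypothetical helper, rows are plain objects, and a run fails when a field appears or disappears or when the row count drops below half of last week's.&lt;/p&gt;

```javascript
// Compares this week's output rows against last week's for one actor.
// Returns a list of human-readable problems; an empty list means it passed.
function diffRuns(previousRows, currentRows, minRowRatio = 0.5) {
    const problems = [];
    const fieldsOf = (rows) => new Set(rows.flatMap((row) => Object.keys(row)));
    const before = fieldsOf(previousRows);
    const after = fieldsOf(currentRows);

    for (const field of before) {
        if (!after.has(field)) problems.push(`missing field: ${field}`);
    }
    for (const field of after) {
        if (!before.has(field)) problems.push(`new field: ${field}`);
    }
    // Row count collapse: fewer than minRowRatio of last week's rows.
    if (previousRows.length * minRowRatio > currentRows.length) {
        problems.push(`row count fell from ${previousRows.length} to ${currentRows.length}`);
    }
    return problems;
}

// Made-up sample rows for a known input.
const lastWeek = [
    { name: 'A', email: 'a@example.com' },
    { name: 'B', email: 'b@example.com' },
    { name: 'C', email: 'c@example.com' },
];
const thisWeek = [{ name: 'A' }];
console.log(diffRuns(lastWeek, thisWeek));
```

&lt;p&gt;With the sample rows above, the result lists the missing &lt;code&gt;email&lt;/code&gt; field and the drop from 3 rows to 1.&lt;/p&gt;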

&lt;p&gt;&lt;strong&gt;Diversifying the buyer pool.&lt;/strong&gt; $25/run is staying. But I'm actively building out the $5 to $10 tier with two new actors targeted at small agencies, because I want the bottom of the revenue chart to be less wobbly. A base of ten $50/month buyers survives any one of them leaving. A base of two $270/month buyers doesn't.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Public status page.&lt;/strong&gt; Still building this but the idea is: if an actor is degraded, the user knows before they hit it. Trust compounds and I just burned a chunk of it, so I'm paying interest now.&lt;/p&gt;

&lt;h2&gt;
  
  
  Close
&lt;/h2&gt;

&lt;p&gt;If you run any product that bills users on autopilot, go wire a Telegram or Slack webhook today. Not tomorrow. The 30 day rolling dashboard lies to you when things trend down because it's still averaging in good weeks. Push alerts don't lie. Runs either happen or they don't.&lt;/p&gt;

&lt;p&gt;I'm writing this mostly for me. But if you want to see what the portfolio looks like now, or hire the actor that caused all this drama in the first place, it's here:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Apify profile: &lt;a href="https://apify.com/george.the.developer" rel="noopener noreferrer"&gt;https://apify.com/george.the.developer&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Google Maps Lead Intel (the $25/run one): on the same profile&lt;/li&gt;
&lt;li&gt;Everything else I've shipped (27 public actors, some broken last week, all fixed now): same profile&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Ask me anything in the comments. Especially if you're an agency buyer thinking about pulling the trigger on a vendor. I have thoughts about what you should actually be looking for.&lt;/p&gt;

</description>
      <category>apify</category>
      <category>postmortem</category>
      <category>webscraping</category>
      <category>saas</category>
    </item>
    <item>
      <title>Two APIs I Built This Week That Cost Nothing to Run</title>
      <dc:creator>George Kioko</dc:creator>
      <pubDate>Thu, 23 Apr 2026 11:50:41 +0000</pubDate>
      <link>https://dev.to/the_aientrepreneur_7ae85/two-apis-i-built-this-week-that-cost-nothing-to-run-g3e</link>
      <guid>https://dev.to/the_aientrepreneur_7ae85/two-apis-i-built-this-week-that-cost-nothing-to-run-g3e</guid>
      <description>&lt;p&gt;Most APIs have a dirty secret in their pricing: the upstream service they call costs money, and that cost gets passed to you plus margin. LLM based APIs charge you for tokens. Geocoding APIs charge you for lookups. Data enrichment APIs charge you for the enrichment source.&lt;/p&gt;

&lt;p&gt;I wanted to build APIs where the underlying operation costs literally zero. Here are two I shipped this week.&lt;/p&gt;

&lt;h2&gt;
  
  
  API 1: DNS Record Checker
&lt;/h2&gt;

&lt;p&gt;Node.js ships with a built in &lt;code&gt;dns&lt;/code&gt; module. It can resolve A records, MX records, CNAME, TXT, NS, and more. No external API call needed. No third party service. The &lt;code&gt;resolve&lt;/code&gt; functions query the configured DNS servers directly over the network, which is free.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;dns&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;dns/promises&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;records&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;dns&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;resolveAny&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;example.com&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;// Returns A, AAAA, MX, TXT, NS, SOA records&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. Zero dependencies, zero API cost, zero rate limits from upstream providers.&lt;/p&gt;

&lt;p&gt;The actor wraps this into a clean JSON API. Pass it a domain, get back every DNS record type with TTLs, priorities for MX records, and SPF/DKIM/DMARC validation. The whole thing runs on Apify's Standby infrastructure so it responds in under a second.&lt;/p&gt;
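&lt;p&gt;To give a feel for the SPF and DMARC part, here is a minimal sketch that only checks whether a policy record is present. It is not the actor's actual code. It assumes the input has the shape Node's &lt;code&gt;dns.resolveTxt()&lt;/code&gt; returns (an array of records, each an array of string chunks); &lt;code&gt;findPolicy&lt;/code&gt; and the sample records are hypothetical, and DKIM is left out because it needs a selector.&lt;/p&gt;

```javascript
// `txtRecords` has the shape returned by Node's dns.resolveTxt():
// an array of records, each an array of string chunks to be joined.
function findPolicy(txtRecords, prefix) {
    const joined = txtRecords.map((chunks) => chunks.join(''));
    return joined.find((record) => record.toLowerCase().startsWith(prefix)) ?? null;
}

// Sample data standing in for resolveTxt('example.com') and
// resolveTxt('_dmarc.example.com'); the values are made up.
const rootTxt = [['v=spf1 include:_spf.example.com ', '~all'], ['some-site-verification=abc123']];
const dmarcTxt = [['v=DMARC1; p=reject']];

const report = {
    spf: findPolicy(rootTxt, 'v=spf1'),
    dmarc: findPolicy(dmarcTxt, 'v=dmarc1'),
};
console.log(report);
```

&lt;p&gt;A &lt;code&gt;null&lt;/code&gt; in either field would mean no matching record was found, which is the kind of gap a deliverability check wants to surface.&lt;/p&gt;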

&lt;p&gt;Use cases that keep coming up: automated domain verification for SaaS onboarding, email deliverability checks (MX + SPF + DKIM in one call), security audits scanning for misconfigured DNS, and monitoring tools that alert when records change unexpectedly.&lt;/p&gt;

&lt;h2&gt;
  
  
  API 2: Sentiment Analysis
&lt;/h2&gt;

&lt;p&gt;The common approach to sentiment analysis is sending text to an LLM and paying per token. That works but it's expensive at scale and adds latency.&lt;/p&gt;

&lt;p&gt;Instead I used a word level lexicon approach. The API scores text using a pre built dictionary of ~7,000 words with known sentiment values. No LLM call. No external API. The scoring runs entirely in memory on the Node.js process.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Simplified version of the scoring logic&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;words&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;reduce&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;word&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;lexicon&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;word&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nx"&gt;words&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The result includes an overall sentiment score, confidence level, and breakdown of positive vs negative word matches. It handles negation ("not good" scores negative) and intensifiers ("very good" scores higher than "good").&lt;/p&gt;
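&lt;p&gt;Negation and intensifiers can be layered onto a lexicon score with a running multiplier. The sketch below is one possible way to do it, not the production logic: the four word lexicon, the modifier lists and &lt;code&gt;scoreText&lt;/code&gt; are made up for illustration, and it averages over matched sentiment words (rather than all words) so that modifiers do not dilute the score.&lt;/p&gt;

```javascript
// Tiny illustrative lexicon; a real dictionary would have thousands of entries.
const lexicon = new Map([['good', 2], ['great', 3], ['bad', -2], ['terrible', -3]]);
const negators = new Set(['not', 'never', 'no']);
const intensifiers = new Map([['very', 1.5], ['extremely', 2]]);

function scoreText(text) {
    const words = text.toLowerCase().match(/[a-z']+/g) ?? [];
    let total = 0;
    let matched = 0;
    let multiplier = 1; // carried forward until the next sentiment word
    for (const word of words) {
        if (negators.has(word)) {
            multiplier *= -1;
        } else if (intensifiers.has(word)) {
            multiplier *= intensifiers.get(word);
        } else if (lexicon.has(word)) {
            total += lexicon.get(word) * multiplier;
            matched += 1;
            multiplier = 1;
        }
    }
    // Average over matched sentiment words; 0 when nothing matched.
    return matched === 0 ? 0 : total / matched;
}

console.log(scoreText('good'));          // 2
console.log(scoreText('very good'));     // 3
console.log(scoreText('not good'));      // -2
console.log(scoreText('not very good')); // -3
```

&lt;p&gt;Using a &lt;code&gt;Map&lt;/code&gt; instead of a plain object also avoids accidental hits on inherited keys such as &lt;code&gt;constructor&lt;/code&gt;.&lt;/p&gt;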

&lt;p&gt;Is it as nuanced as GPT? No. But for brand monitoring, review analysis, social media tracking, and content moderation at scale, a deterministic lexicon approach that returns in 50ms beats a 2 second LLM call that costs 10x more.&lt;/p&gt;

&lt;h2&gt;
  
  
  The pattern worth noticing
&lt;/h2&gt;

&lt;p&gt;Both of these APIs follow the same principle: use what's already built into the runtime or ship a static dataset with the code. No external dependencies that cost money per call.&lt;/p&gt;

&lt;p&gt;This matters because of what I've seen with my existing domain tools. The WHOIS Lookup actor has power users running 262 lookups per user on average. Domain and DNS tools get embedded in automated workflows and run at high volume. When your per call cost is zero, your margin stays healthy no matter how much a single user hammers the API.&lt;/p&gt;

&lt;h2&gt;
  
  
  Pricing
&lt;/h2&gt;

&lt;p&gt;DNS Record Checker: $0.003 per lookup. Sentiment Analysis: $0.003 per text analysis. Both running on Apify Standby mode for instant responses.&lt;/p&gt;

&lt;p&gt;The infrastructure cost is just Apify compute time. No upstream API bills eating into revenue.&lt;/p&gt;

&lt;p&gt;Try them on Apify:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://apify.com/george.the.developer/dns-record-checker" rel="noopener noreferrer"&gt;DNS Record Checker&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://apify.com/george.the.developer/sentiment-analysis-api" rel="noopener noreferrer"&gt;Sentiment Analysis API&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Built in Nairobi. 52 actors, zero external API costs on these two. Comments and questions welcome.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
    </item>
    <item>
      <title>2 Users Pay Me More Than 353 Users: The Pricing Lesson That Changed Everything</title>
      <dc:creator>George Kioko</dc:creator>
      <pubDate>Thu, 23 Apr 2026 11:49:30 +0000</pubDate>
      <link>https://dev.to/the_aientrepreneur_7ae85/2-users-pay-me-more-than-353-users-the-pricing-lesson-that-changed-everything-4pf8</link>
      <guid>https://dev.to/the_aientrepreneur_7ae85/2-users-pay-me-more-than-353-users-the-pricing-lesson-that-changed-everything-4pf8</guid>
      <description>&lt;p&gt;I have 48 actors running on Apify. Same platform, same developer, same tech stack. Two of those actors tell completely different stories about how software makes money.&lt;/p&gt;

&lt;p&gt;My LinkedIn Employee Scraper has 353 users. It runs thousands of times per month. It charges $0.005 per profile scraped. Total monthly revenue from all those users and all those runs? About $9.&lt;/p&gt;

&lt;p&gt;My Google Maps Lead Intel actor has 2 users. Two. They run it about 22 times per month between them, paying roughly $25 per run. Monthly revenue? Around $540.&lt;/p&gt;

&lt;p&gt;That is a 60x difference in total revenue, and roughly a 10,000x difference in revenue per user ($270 versus about $0.03). Same platform. Same developer. Same billing system.&lt;/p&gt;
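&lt;p&gt;The arithmetic is worth spelling out, because total revenue and revenue per user give very different ratios. A quick sketch using only the rounded monthly figures quoted above (the variable names are just for illustration):&lt;/p&gt;

```javascript
// Figures quoted above; revenue values are rounded monthly numbers.
const linkedin = { users: 353, monthlyRevenue: 9 };
const maps = { users: 2, monthlyRevenue: 540 };

const perUser = (actor) => actor.monthlyRevenue / actor.users;

const totalRatio = maps.monthlyRevenue / linkedin.monthlyRevenue;
const perUserRatio = perUser(maps) / perUser(linkedin);

console.log(totalRatio);               // 60
console.log(perUser(maps));            // 270
console.log(Math.round(perUserRatio)); // 10590
```

&lt;p&gt;So the gap is 60x on total revenue and on the order of 10,000x per user.&lt;/p&gt;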

&lt;h2&gt;
  
  
  What Makes Google Maps Worth $25 a Run
&lt;/h2&gt;

&lt;p&gt;The LinkedIn scraper returns raw data. Names, titles, company info. It does one thing and does it well, but developers treat it like a commodity. They plug it into their own pipelines and expect it to cost almost nothing. At $0.005 per profile, it basically does.&lt;/p&gt;

&lt;p&gt;Google Maps Lead Intel returns something different. For every business it finds, you get validated email addresses, a lead score based on 12 online presence signals, Google Ads detection, website tech stack analysis, social media profiles, and review sentiment. It is not scraping. It is intelligence.&lt;/p&gt;

&lt;p&gt;The two users paying $25 per run are lead generation agencies. One services appointment setting clients across 15 metro areas. The other runs local SEO audits. For both of them, a single $25 run replaces 3 to 4 hours of manual research that would cost $200+ if done by a VA.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Buyer Problem
&lt;/h2&gt;

&lt;p&gt;Here is what I missed for months: the LinkedIn scraper attracts developers. Developers are price sensitive. They can build their own scraper given enough time, so they benchmark your tool against their hourly rate. If your scraper costs more than 20 minutes of their time is worth, they will build it themselves.&lt;/p&gt;

&lt;p&gt;The Google Maps actor attracts agencies. Agency buyers think in terms of client value, not engineering time. If their client pays $1,500/month for lead gen services and your tool costs $25 per market, that is a rounding error in their margin. They do not negotiate. They do not churn. They run it more as they sign more clients.&lt;/p&gt;

&lt;p&gt;Same platform. Totally different buyer psychology.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Actually Changed
&lt;/h2&gt;

&lt;p&gt;The technical shift was not dramatic. I stopped returning raw JSON blobs and started returning enriched, scored, validated output. Specifically:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Raw Google Maps results became leads with quality scores&lt;/li&gt;
&lt;li&gt;Guessed emails became validated emails with deliverability checks&lt;/li&gt;
&lt;li&gt;Basic business info became competitive intelligence with ad spend signals&lt;/li&gt;
&lt;li&gt;Flat data became actionable reports that agencies could forward to clients&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The pricing shift followed naturally. When your output saves someone 4 hours of work and costs them $25, you are not competing on data volume. You are competing on time saved and decision quality.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Numbers I Wish I Knew Earlier
&lt;/h2&gt;

&lt;p&gt;353 users at $0.005/profile = roughly $9/month. Those users submit support tickets, request features, and compare you to 6 other LinkedIn scrapers in the Apify Store.&lt;/p&gt;

&lt;p&gt;2 users at $25/run = roughly $540/month. Those users send you "thank you" messages and ask if you can build them something custom.&lt;/p&gt;

&lt;p&gt;If I could go back and rebuild my portfolio from scratch, I would build fewer tools and make each one solve a complete problem for a specific buyer. Not "scrape this website" but "find me qualified leads in this market with contact info I can trust."&lt;/p&gt;

&lt;h2&gt;
  
  
  The Takeaway
&lt;/h2&gt;

&lt;p&gt;Stop counting users. Start counting revenue per user. Build for the buyer who measures your tool against the cost of the alternative, not against the cost of building it themselves. Package intelligence, not data.&lt;/p&gt;

&lt;p&gt;The developer who needs 10,000 LinkedIn profiles will always shop on price. The agency owner who needs 200 qualified leads by Friday will pay whatever gets it done.&lt;/p&gt;

&lt;p&gt;I know which buyer I am building for now.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built in Nairobi. 48 actors in production. Questions? Drop them below.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>saas</category>
      <category>api</category>
      <category>startup</category>
    </item>
    <item>
      <title>The 5 APIs That Run 200+ Times Per User (And Why That Matters)</title>
      <dc:creator>George Kioko</dc:creator>
      <pubDate>Thu, 23 Apr 2026 11:47:43 +0000</pubDate>
      <link>https://dev.to/the_aientrepreneur_7ae85/the-5-apis-that-run-200-times-per-user-and-why-that-matters-48fp</link>
      <guid>https://dev.to/the_aientrepreneur_7ae85/the-5-apis-that-run-200-times-per-user-and-why-that-matters-48fp</guid>
      <description>&lt;p&gt;Most developer tools get used a handful of times. Someone finds your API, tries it on a test case, maybe runs it a dozen more times, then moves on. That is the normal pattern. Out of 38 actors I have running on Apify, most average 5 to 20 runs per user. Respectable numbers.&lt;/p&gt;

&lt;p&gt;But five of them break the pattern completely. These five average 100 to 260 runs per user. Not because of better marketing or a viral tweet. Because they solve problems that require bulk processing by design.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Numbers
&lt;/h2&gt;

&lt;p&gt;Here is the actual usage data from my Apify dashboard:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;API&lt;/th&gt;
&lt;th&gt;Runs Per User&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Domain WHOIS Lookup&lt;/td&gt;
&lt;td&gt;262&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Google Scholar Scraper&lt;/td&gt;
&lt;td&gt;230&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AI Content Detector&lt;/td&gt;
&lt;td&gt;132&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Website Tech Detector&lt;/td&gt;
&lt;td&gt;126&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Email Validator&lt;/td&gt;
&lt;td&gt;105&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Compare that to something like the LinkedIn Employee Scraper, which has 37 users but averages about 17 runs each. LinkedIn users grab the data they need and stop. WHOIS users feed in hundreds of domains every single session.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why These Five?
&lt;/h2&gt;

&lt;p&gt;The common thread is not the subject matter. It is the workflow. Every one of these tools plugs into a process where the user already has a list and needs to process all of it:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Domain WHOIS Lookup (262 runs/user):&lt;/strong&gt; Security researchers and domain investors run this on batches of suspicious domains. When a phishing campaign registers 10,000 domains with similar naming patterns, someone needs registrar data, creation dates, and nameservers for every single one. That is not a one time task. New domains appear daily.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Google Scholar Scraper (230 runs/user):&lt;/strong&gt; Academic researchers doing systematic literature reviews or bibliometric analysis. They need every paper matching a query, with citations, h index scores, and author profiles exported as structured JSON. One research project can require pulling data on thousands of papers across multiple search terms.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AI Content Detector (132 runs/user):&lt;/strong&gt; Content moderation teams, academic integrity offices, and publishers who need to scan entire content catalogs. Checking one essay at a time is pointless when you have 500 submissions or 2,000 product descriptions to verify. The bulk API call is the only thing that makes this practical.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Website Tech Detector (126 runs/user):&lt;/strong&gt; Sales development teams that need technology intelligence on their entire prospect list. If you are selling a React migration service, you need to know which of your 3,000 target companies still run Angular or jQuery. Feed in the list, get back frameworks, CDNs, analytics tools, CMS platforms in clean JSON.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Email Validator (105 runs/user):&lt;/strong&gt; Cold outreach operators who clean their lists before every campaign. A 5% bounce rate destroys your sender reputation, so smart operators validate 500 to 5,000 emails before hitting send. They do this before every single campaign, not once.&lt;/p&gt;

&lt;h2&gt;
  
  
  What This Means for Builders
&lt;/h2&gt;

&lt;p&gt;The lesson is simple: if your API solves a problem that people encounter once, you need constant marketing to keep new users flowing in. If your API solves a problem that people encounter in batches, repeatedly, you get sticky users who come back on their own.&lt;/p&gt;

&lt;p&gt;None of these five APIs went viral. None of them got featured in a newsletter. The WHOIS lookup has 7 total users. But those 7 users have collectively run it 1,837 times. That is revenue without marketing spend.&lt;/p&gt;

&lt;p&gt;The best APIs are not the ones with the most users. They are the ones where each user cannot stop running them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try Them
&lt;/h2&gt;

&lt;p&gt;All five are live on the Apify Store under my profile (george.the.developer), priced per call with no monthly subscription. Domain WHOIS at $0.005/lookup, Scholar at $0.004/paper, AI Detector at $0.003/text, Tech Detector at $0.005/site, Email Validator at $0.002/email.&lt;/p&gt;
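
&lt;p&gt;To get a feel for what a batch costs at those per-call prices, here is a small sketch. The object keys are labels made up for this example, not actor IDs, and prices are stored in thousandths of a dollar so the arithmetic stays in exact integers.&lt;/p&gt;

```javascript
// Per-call prices quoted above, in thousandths of a dollar (hypothetical keys).
const PRICE_MILLS = {
  whois: 5,
  scholar: 4,
  aiDetector: 3,
  techDetector: 5,
  emailValidator: 2,
};

// Cost in dollars of running `calls` billable calls against one actor.
function batchCostUsd(actor, calls) {
  const mills = PRICE_MILLS[actor];
  if (mills === undefined) throw new Error(`Unknown actor: ${actor}`);
  return (mills * calls) / 1000;
}

console.log(batchCostUsd('techDetector', 3000)); // 15
console.log(batchCostUsd('emailValidator', 5000)); // 10
```

&lt;p&gt;So the 3,000-company prospect list from the Tech Detector example would come to about $15, and validating 5,000 emails to about $10.&lt;/p&gt;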

&lt;p&gt;Built in Nairobi. 38 actors, 700+ users, 14,000+ total runs.&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>webdev</category>
      <category>api</category>
      <category>saas</category>
    </item>
    <item>
      <title>Google Scholar Has No API Either. Here's What 5,000 Runs Taught Me</title>
      <dc:creator>George Kioko</dc:creator>
      <pubDate>Thu, 23 Apr 2026 11:46:16 +0000</pubDate>
      <link>https://dev.to/the_aientrepreneur_7ae85/google-scholar-has-no-api-either-heres-what-5000-runs-taught-me-3i44</link>
      <guid>https://dev.to/the_aientrepreneur_7ae85/google-scholar-has-no-api-either-heres-what-5000-runs-taught-me-3i44</guid>
      <description>&lt;p&gt;Google Scholar is the single most important search engine for academic research. Hundreds of millions of papers indexed by most estimates, citation counts, author profiles, related work links. And Google has never released an official API for it.&lt;/p&gt;

&lt;p&gt;Not deprecated. Not restricted. Just... never built one.&lt;/p&gt;

&lt;p&gt;If you want to programmatically search Google Scholar, grab paper titles, authors, citation counts, and PDF links, you are on your own. So I built an actor that does exactly that.&lt;/p&gt;

&lt;h2&gt;
  
  
  What It Pulls
&lt;/h2&gt;

&lt;p&gt;You give it a search query (like "transformer architecture attention mechanism") and it returns structured data:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Attention Is All You Need"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"authors"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"A Vaswani, N Shazeer, N Parmar..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"citationCount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;112847&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"year"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2017"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://arxiv.org/abs/1706.03762"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"pdfUrl"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://arxiv.org/pdf/1706.03762"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"snippet"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"The dominant sequence transduction models are based on complex recurrent..."&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Paper titles, author lists, citation counts, publication year, direct links, and PDF URLs when available. Everything a researcher needs to build a literature review or track citations over time.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Numbers Tell a Story
&lt;/h2&gt;

&lt;p&gt;Here's where it gets interesting. The actor has &lt;strong&gt;22 users&lt;/strong&gt; and &lt;strong&gt;5,065 total runs&lt;/strong&gt;. Do the math on that ratio: 230 runs per user on average.&lt;/p&gt;
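
&lt;p&gt;The ratio itself is a one-line calculation. A tiny sketch, using the two figures above (the function name is just for illustration):&lt;/p&gt;

```javascript
// Average runs per user; returns null when there are no users to divide by.
function runsPerUser(totalRuns, users) {
  if (users === 0) return null;
  return totalRuns / users;
}

const ratio = runsPerUser(5065, 22);
console.log(ratio.toFixed(1)); // 230.2
```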

&lt;p&gt;These are not casual users clicking "Run" once to test it. These are power users running it at scale. Academics building citation databases. Research firms tracking publication trends across thousands of queries. AI companies monitoring new papers in their domain.&lt;/p&gt;

&lt;p&gt;That run-to-user ratio is the strongest signal I have that this tool solves a real problem. When someone runs your tool 200+ times, they have built it into a workflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Scholar Is Hard to Scrape
&lt;/h2&gt;

&lt;p&gt;Google Scholar is notoriously aggressive about blocking automated access. It will throw CAPTCHAs after just a handful of requests from the same IP. Most simple scraping scripts break within minutes.&lt;/p&gt;

&lt;p&gt;The actor handles this with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Proxy rotation across residential IPs&lt;/li&gt;
&lt;li&gt;Session management to maintain cookies between requests&lt;/li&gt;
&lt;li&gt;Randomized delays that mimic human browsing patterns&lt;/li&gt;
&lt;li&gt;Automatic retry logic when a request gets blocked&lt;/li&gt;
&lt;/ul&gt;
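
&lt;p&gt;The last two items in that list, randomized delays and retry on block, can be sketched roughly like this. This is a simplified illustration and not the actor's actual code: &lt;code&gt;fetchPage&lt;/code&gt; is a hypothetical callback, and the 2 to 5 second window and four attempts are assumed values. Proxy rotation and cookie handling are left out.&lt;/p&gt;

```javascript
// Random delay in [minMs, maxMs), so requests never land on a fixed interval.
function randomDelayMs(minMs, maxMs, random = Math.random) {
  return minMs + random() * (maxMs - minMs);
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls fetchPage until it stops reporting a block or attempts run out.
// fetchPage is a hypothetical callback that resolves to { blocked, html }.
async function fetchWithRetry(fetchPage, { maxAttempts = 4, wait = sleep } = {}) {
  for (let attempt = 1; ; attempt += 1) {
    const result = await fetchPage(attempt);
    if (!result.blocked) return result;
    if (attempt === maxAttempts) {
      throw new Error(`Still blocked after ${maxAttempts} attempts`);
    }
    // Back off longer on each retry, with jitter on top (assumed 2-5 s base).
    await wait(randomDelayMs(2000, 5000) * attempt);
  }
}
```

&lt;p&gt;Passing &lt;code&gt;wait&lt;/code&gt; in as an option keeps the function easy to test without real pauses.&lt;/p&gt;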

&lt;p&gt;I also had to deal with Google's inconsistent HTML. Scholar's markup changes subtly over time. Element class names shift, layout structures get tweaked. The parser needs regular maintenance to keep working.&lt;/p&gt;

&lt;h2&gt;
  
  
  Who Uses This
&lt;/h2&gt;

&lt;p&gt;Three main groups keep showing up:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Academics and PhD students&lt;/strong&gt; building systematic literature reviews. Instead of manually searching and copying results, they run batch queries and get structured data they can feed into reference managers or spreadsheets.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Research firms and think tanks&lt;/strong&gt; tracking publication trends. They want to know how many papers mention "large language models" per quarter, or which authors are publishing most frequently in a specific subfield.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AI and ML teams&lt;/strong&gt; monitoring state of the art. When a new paper drops with high early citation velocity, they want to know about it fast.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;

&lt;p&gt;The actor is on the Apify Store with pay-per-result pricing ($0.004 per paper): &lt;a href="https://apify.com/george.the.developer/google-scholar-scraper" rel="noopener noreferrer"&gt;Google Scholar Scraper&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you have ever copy-pasted results from Google Scholar into a spreadsheet, this will save you hours. And if you are doing it at scale, it will save you from getting IP-banned.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built in Nairobi by George. 40+ actors, 5,000+ runs on Scholar alone.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>webdev</category>
      <category>api</category>
      <category>research</category>
    </item>
    <item>
      <title>YouTube Has No Transcript API So I Built One (150 Users Later)</title>
      <dc:creator>George Kioko</dc:creator>
      <pubDate>Thu, 23 Apr 2026 11:44:51 +0000</pubDate>
      <link>https://dev.to/the_aientrepreneur_7ae85/youtube-has-no-transcript-api-so-i-built-one-150-users-later-4p56</link>
      <guid>https://dev.to/the_aientrepreneur_7ae85/youtube-has-no-transcript-api-so-i-built-one-150-users-later-4p56</guid>
      <description>&lt;p&gt;You know what's wild? YouTube, a Google product, has no official API for pulling transcripts from videos you don't own. You can upload, search, and manage playlists through their API. But if you want the actual words spoken in someone else's video? Good luck.&lt;/p&gt;

&lt;p&gt;I ran into this wall in late 2025 while building a content repurposing tool. I needed transcripts from YouTube videos to feed into an LLM for summarization. The YouTube Data API v3 gives you metadata, thumbnails, view counts. Its captions endpoint only lets you download a track through OAuth when you have edit access to the video, so for anyone else's videos the answer is still nope.&lt;/p&gt;

&lt;p&gt;So I built my own.&lt;/p&gt;

&lt;h2&gt;
  
  
  What It Actually Does
&lt;/h2&gt;

&lt;p&gt;The actor loads a YouTube video page, grabs the captions (usually the auto-generated track that YouTube creates for most videos), and returns clean text with timestamps. It supports multiple languages because a video can carry caption tracks in more than one language.&lt;/p&gt;

&lt;p&gt;Here's what the output looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"videoUrl"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://www.youtube.com/watch?v=dQw4w9WgXcQ"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Example Video Title"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"language"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"en"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"transcript"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Welcome to this tutorial"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"start"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"duration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;2.5&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Today we are going to cover"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"start"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;2.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"duration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;3.1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No API key needed. No OAuth flows. Just pass in a video URL and get the transcript back.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Numbers After 8 Months
&lt;/h2&gt;

&lt;p&gt;I published this on the Apify Store and kind of forgot about it for a while. Then I checked the dashboard:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;154 users&lt;/strong&gt; have tried it&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;1,737 total runs&lt;/strong&gt; across all users&lt;/li&gt;
&lt;li&gt;It's one of my most popular actors out of 40+&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The thing that surprised me was who's using it. I expected developers. And yes, developers building AI pipelines are a big chunk. But I also see researchers pulling transcripts from lecture series, content creators repurposing their own videos into blog posts, and marketing teams analyzing competitor video content.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Hard Parts
&lt;/h2&gt;

&lt;p&gt;YouTube does not make this easy. Captions are loaded dynamically through a separate request after the page renders. The URL for the caption track is embedded inside a massive JSON blob in the page source. Finding and parsing that reliably took more debugging than the actual extraction logic.&lt;/p&gt;

&lt;p&gt;The other challenge: some videos have manually uploaded captions, some have auto-generated ones, and some have both. The actor handles all three cases and lets you pick which language you want.&lt;/p&gt;
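
&lt;p&gt;Here is a rough sketch of those two steps: pulling the JSON blob out of the page source, then choosing a caption track. It is not the actor's code. The field names (&lt;code&gt;captionTracks&lt;/code&gt;, &lt;code&gt;languageCode&lt;/code&gt;, &lt;code&gt;kind&lt;/code&gt;) and the &lt;code&gt;ytInitialPlayerResponse&lt;/code&gt; marker are assumptions about YouTube's undocumented page structure, which can change without notice.&lt;/p&gt;

```javascript
// Returns the JSON object literal that follows `marker` in the page source,
// or null if the marker or a balanced object cannot be found.
function extractJsonAfter(html, marker) {
  const markerAt = html.indexOf(marker);
  if (markerAt === -1) return null;
  const start = html.indexOf('{', markerAt);
  if (start === -1) return null;

  let depth = 0;
  let inString = false;
  let escaped = false;
  for (let i = start; i !== html.length; i += 1) {
    const ch = html[i];
    if (inString) {
      // Braces inside string values must not affect the depth count.
      if (escaped) escaped = false;
      else if (ch === '\\') escaped = true;
      else if (ch === '"') inString = false;
    } else if (ch === '"') {
      inString = true;
    } else if (ch === '{') {
      depth += 1;
    } else if (ch === '}') {
      depth -= 1;
      if (depth === 0) return JSON.parse(html.slice(start, i + 1));
    }
  }
  return null;
}

// Prefers a manually uploaded track in the requested language and falls back
// to an auto-generated one (assumed to be marked with kind === 'asr').
function pickCaptionTrack(playerResponse, language) {
  const tracks =
    playerResponse?.captions?.playerCaptionsTracklistRenderer?.captionTracks ?? [];
  const matching = tracks.filter((track) => track.languageCode === language);
  return matching.find((track) => track.kind !== 'asr') ?? matching[0] ?? null;
}
```

&lt;p&gt;Usage would look like &lt;code&gt;pickCaptionTrack(extractJsonAfter(html, 'ytInitialPlayerResponse'), 'en')&lt;/code&gt;. Counting braces instead of using a regex keeps the extraction from cutting the blob short when a string value happens to contain a closing brace.&lt;/p&gt;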

&lt;p&gt;Rate limiting is real too. YouTube will throttle you if you hammer it. The actor spaces out requests and uses session management to stay under the radar.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Not Just Use a Python Library?
&lt;/h2&gt;

&lt;p&gt;There are Python packages like &lt;code&gt;youtube_transcript_api&lt;/code&gt; that do something similar. They work fine for one-off scripts. But when you need to run this at scale, on a schedule, with proxy rotation and automatic retries, you want infrastructure around it.&lt;/p&gt;

&lt;p&gt;That's what Apify gives you. The actor runs in the cloud, handles failures gracefully, and stores results in a dataset you can export to JSON, CSV, or push to a webhook.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;

&lt;p&gt;The actor is free to run on Apify (you just pay for compute, which is pennies per video): &lt;a href="https://apify.com/george.the.developer/youtube-transcript-scraper" rel="noopener noreferrer"&gt;YouTube Transcript Extractor&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you are building anything that needs video content as text, save yourself the headache of reverse engineering YouTube's caption system. Someone already did that part for you.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built in Nairobi by George. 40+ actors on the Apify Store, 154 users on this one alone.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>webdev</category>
      <category>api</category>
      <category>youtube</category>
    </item>
    <item>
      <title>My LinkedIn Scraper Just Hit Top 20 on Apify — Here's How I Built It</title>
      <dc:creator>George Kioko</dc:creator>
      <pubDate>Thu, 23 Apr 2026 11:41:25 +0000</pubDate>
      <link>https://dev.to/the_aientrepreneur_7ae85/my-linkedin-scraper-just-hit-top-20-on-apify-heres-how-i-built-it-3j5p</link>
      <guid>https://dev.to/the_aientrepreneur_7ae85/my-linkedin-scraper-just-hit-top-20-on-apify-heres-how-i-built-it-3j5p</guid>
      <description>&lt;p&gt;I woke up last week to an email from Apify saying my LinkedIn Employee Scraper had earned the Rising Star badge — meaning it cracked the top 20 actors on the entire platform. 176 users, 2,430 runs, and counting.&lt;/p&gt;

&lt;p&gt;This is the story of how a side project built in Nairobi turned into one of the most-used LinkedIn scrapers on Apify.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem: LinkedIn Has No Real API for Employee Data
&lt;/h2&gt;

&lt;p&gt;If you've ever tried to pull employee data from LinkedIn programmatically, you already know the pain. LinkedIn's official API is locked down tight — you need partner status or a Sales Navigator license ($800–$1,200/month) just to get basic company employee info.&lt;/p&gt;

&lt;p&gt;For indie developers, recruiters building internal tools, or startups doing competitive intel, that price tag kills the project before it starts. I needed a different approach.&lt;/p&gt;

&lt;h2&gt;
  
  
  How It Works: Playwright + Crawlee + Anti-Detection
&lt;/h2&gt;

&lt;p&gt;The scraper runs as an Apify Actor using Crawlee (Apify's open-source crawling framework) with Playwright driving a real Chromium browser. Here's the core pattern:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Actor&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;apify&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;PlaywrightCrawler&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;crawlee&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;Actor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;init&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;input&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;Actor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getInput&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;linkedinUrls&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="nx"&gt;maxProfiles&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;crawler&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;PlaywrightCrawler&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;proxyConfiguration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;Actor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createProxyConfiguration&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;groups&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;RESIDENTIAL&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="p"&gt;}),&lt;/span&gt;
  &lt;span class="na"&gt;launchContext&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;launchOptions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;headless&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;--no-sandbox&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;--disable-blink-features=AutomationControlled&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;minConcurrency&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;maxConcurrency&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

  &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;requestHandler&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;request&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Human-like delay between actions&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;waitForTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2000&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;random&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;3000&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;employees&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;$&lt;/span&gt;&lt;span class="nf"&gt;$eval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.org-people-profile-card&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;cards&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;
      &lt;span class="nx"&gt;cards&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;card&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;card&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;querySelector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.artdeco-entity-lockup__title&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)?.&lt;/span&gt;&lt;span class="nx"&gt;innerText&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nf"&gt;trim&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;card&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;querySelector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.artdeco-entity-lockup__subtitle&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)?.&lt;/span&gt;&lt;span class="nx"&gt;innerText&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nf"&gt;trim&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="na"&gt;profileUrl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;card&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;querySelector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;a&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)?.&lt;/span&gt;&lt;span class="nx"&gt;href&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;}))&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;Actor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;charge&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;eventName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;profile-scraped&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;Actor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;pushData&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;employees&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;crawler&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addRequests&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;linkedinUrls&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt; &lt;span class="p"&gt;})));&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;crawler&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;Actor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The architecture isn't complicated, but the details are what make it survive in production. LinkedIn is one of the most aggressive anti-bot platforms out there, so every layer matters.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Actually Keeps It Running
&lt;/h2&gt;

&lt;p&gt;Three things separate a LinkedIn scraper that works once from one that runs 2,430 times without breaking:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Session management.&lt;/strong&gt; Instead of logging in fresh every run, the scraper persists cookies and reuses sessions. This mimics real user behavior and avoids triggering LinkedIn's "new device" verification flow.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Residential proxies.&lt;/strong&gt; Datacenter IPs get flagged within minutes on LinkedIn. The actor routes through Apify's residential proxy pool, rotating IPs per request. Each request looks like it comes from a different home internet connection.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Randomized timing.&lt;/strong&gt; No fixed delays. Every pause between actions uses &lt;code&gt;Math.random()&lt;/code&gt; to vary between 2 and 5 seconds. Perfectly regular timing is one of the easiest signals for bot detection systems to catch.&lt;/p&gt;

&lt;p&gt;I also limit concurrency to 1–2 parallel requests max. It's slower, but LinkedIn's rate limiting is harsh enough that going faster just burns through proxy credits with nothing to show for it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Numbers
&lt;/h2&gt;

&lt;p&gt;Here's where the scraper stands today:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;176 users&lt;/strong&gt; on Apify Store&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;2,430 total runs&lt;/strong&gt; in production&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rising Star badge&lt;/strong&gt; — top 20 actor on the platform&lt;/li&gt;
&lt;li&gt;Pay-per-event pricing at $0.004 per profile scraped&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For context, I launched this about a year ago as one of my first Apify actors. It started getting steady traction around the 500-run mark, and growth has been compounding since. The Rising Star badge was a genuine surprise — I didn't realize it had climbed that high until the notification hit my inbox.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lessons Learned Building Scrapers at Scale
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;LinkedIn changes its DOM constantly.&lt;/strong&gt; I've had to update selectors at least four times. If you build a LinkedIn scraper, abstract your selectors into a config object so you can patch them without rewriting handler logic.&lt;/p&gt;
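
&lt;p&gt;A minimal sketch of what that config object could look like, reusing the selectors from the snippet earlier in this post. The names &lt;code&gt;SELECTORS&lt;/code&gt; and &lt;code&gt;parseCard&lt;/code&gt; are made up for this example.&lt;/p&gt;

```javascript
// All LinkedIn-specific selectors live in one place, so a markup change
// means editing this object rather than the handler logic.
const SELECTORS = {
  card: '.org-people-profile-card',
  name: '.artdeco-entity-lockup__title',
  title: '.artdeco-entity-lockup__subtitle',
  profileLink: 'a',
};

// Reads one employee card. `card` is any object with a DOM-style querySelector.
function parseCard(card, selectors = SELECTORS) {
  const text = (selector) => card.querySelector(selector)?.innerText?.trim() ?? null;
  return {
    name: text(selectors.name),
    title: text(selectors.title),
    profileUrl: card.querySelector(selectors.profileLink)?.href ?? null,
  };
}
```

&lt;p&gt;One caveat: the callback given to Playwright's &lt;code&gt;$$eval&lt;/code&gt; runs inside the browser, so the selectors have to be passed in through its third argument and the parsing logic inlined in the callback rather than referenced from Node.&lt;/p&gt;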

&lt;p&gt;&lt;strong&gt;Users will throw anything at your actor.&lt;/strong&gt; Company pages with 50,000 employees, URLs with typos, private profiles, pages behind auth walls. Defensive coding isn't optional — it's the entire job. Every edge case that crashes your actor is a 1-star review waiting to happen.&lt;/p&gt;
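
&lt;p&gt;As one example of that defensive layer, input URLs can be checked before any browser is launched. This is a sketch under assumptions: the function name is hypothetical, and accepting only &lt;code&gt;/company/&lt;/code&gt; paths is a guess at what the actor takes as input.&lt;/p&gt;

```javascript
// Splits raw input into usable LinkedIn company URLs and rejected entries.
function partitionLinkedinUrls(rawUrls) {
  const valid = [];
  const rejected = [];
  for (const raw of rawUrls) {
    let parsed;
    try {
      parsed = new URL(String(raw).trim());
    } catch {
      rejected.push({ input: raw, reason: 'not a valid URL' });
      continue;
    }
    const host = parsed.hostname.toLowerCase();
    const isHttp = ['http:', 'https:'].includes(parsed.protocol);
    const isLinkedin = host === 'linkedin.com' || host.endsWith('.linkedin.com');
    if (!isHttp) {
      rejected.push({ input: raw, reason: 'not an http(s) URL' });
    } else if (!isLinkedin) {
      rejected.push({ input: raw, reason: 'not a linkedin.com URL' });
    } else if (!parsed.pathname.startsWith('/company/')) {
      // Assumption: only company pages are supported.
      rejected.push({ input: raw, reason: 'not a company page' });
    } else {
      valid.push(parsed.href);
    }
  }
  return { valid, rejected };
}
```

&lt;p&gt;Returning the rejected entries with a reason, instead of throwing on the first bad one, lets a run continue with the good URLs and report the rest.&lt;/p&gt;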

&lt;p&gt;&lt;strong&gt;Pay-per-event pricing works.&lt;/strong&gt; Charging per profile scraped instead of per run aligns cost with value. Users scraping 10 profiles pay less than users scraping 10,000. This keeps casual users happy while still generating real revenue from power users.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Good README = more users.&lt;/strong&gt; My most-used actors all have detailed READMEs with input/output examples, Mermaid architecture diagrams, and clear pricing breakdowns. Developers don't install tools they can't understand in 30 seconds.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;I'm currently running 38+ actors on Apify covering everything from Google Scholar to Telegram channels to OFAC sanctions data. The LinkedIn scraper remains my top performer, and I'm working on v2 with better pagination handling and support for scraping by department filters.&lt;/p&gt;

&lt;p&gt;If you're building scrapers and want to see the code, everything is on GitHub. If you just need LinkedIn employee data without building anything, the actor is ready to run on Apify Store.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Apify Store&lt;/strong&gt;: &lt;a href="https://apify.com/george.the.developer" rel="noopener noreferrer"&gt;https://apify.com/george.the.developer&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/the-ai-entrepreneur-ai-hub" rel="noopener noreferrer"&gt;https://github.com/the-ai-entrepreneur-ai-hub&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built in Nairobi. Questions about the scraper or Apify actors in general — drop them in the comments.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>webdev</category>
      <category>scraping</category>
      <category>node</category>
    </item>
    <item>
      <title>I Built a TikTok Shop Affiliate Finder That Maps Every Creator Promoting a Product in 2 Minutes</title>
      <dc:creator>George Kioko</dc:creator>
      <pubDate>Fri, 17 Apr 2026 07:59:48 +0000</pubDate>
      <link>https://dev.to/the_aientrepreneur_7ae85/i-built-a-tiktok-shop-affiliate-finder-that-maps-every-creator-promoting-a-product-in-2-minutes-7mc</link>
      <guid>https://dev.to/the_aientrepreneur_7ae85/i-built-a-tiktok-shop-affiliate-finder-that-maps-every-creator-promoting-a-product-in-2-minutes-7mc</guid>
      <description>

&lt;p&gt;TikTok Shop is the most interesting place in ecommerce right now. Products go from zero to 200K sales in a week. The reason is the creator economy: thousands of TikTok accounts run affiliate videos, and the good ones can move inventory faster than any paid ad.&lt;/p&gt;

&lt;p&gt;But if you are a seller, an affiliate manager, or just a researcher, one question never gets a clean answer: &lt;strong&gt;"Who is actually promoting this product?"&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;TikTok does not give you an API for that. You have to scroll through the product page, click into videos, click into creator profiles, one by one. An hour of clicking for one product. If you are researching a catalog of 50 products, good luck.&lt;/p&gt;

&lt;p&gt;So I built a scraper that does it in two minutes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Live example
&lt;/h2&gt;

&lt;p&gt;I tested it on a real TikTok Shop product earlier today: a PRETTYGARDEN two-piece lounge set with 205K sales and a 4.6-star rating. Here is what came back:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"product"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"PRETTYGARDEN Crewneck Two-Piece Set For Women Summer Casual Oversized Split Hem Shirts &amp;amp; Side Pocket Biker Shorts"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"price"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"$33.99"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"salesCount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"205915"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"rating"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;4.6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://shop.tiktok.com/view/product/1730228441641554287"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"affiliates"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"username"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Kay and Tay"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"followers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"685.6K"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"username"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Anna Scott"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"followers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"176.7K"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"username"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Christina Moceanu Chapman"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"followers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"153.5K"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"username"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Kelsey Pumel Woods"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"followers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"132.3K"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"username"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"stay at sam"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"followers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"122.6K"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"username"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Candy Holcombe"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"followers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"79.5K"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"qualityState"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"verified_success"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That took one product URL as input and came back with a ranked creator list in about two minutes. No login, no cookies, no scrolling.&lt;/p&gt;
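&lt;p&gt;One practical detail: the &lt;code&gt;followers&lt;/code&gt; values in this output are display strings such as "685.6K", so sorting them as plain text would put "79.5K" ahead of "685.6K". Here is a minimal Python sketch for converting and re-ranking them. &lt;code&gt;parse_followers&lt;/code&gt; and &lt;code&gt;rank_affiliates&lt;/code&gt; are hypothetical helper names, and the K/M/B suffix table is an assumption based on the sample above, not something the actor documents.&lt;/p&gt;

```python
# Assumption: follower strings use K/M/B suffixes, as in the sample output.
SUFFIXES = {"K": 1_000, "M": 1_000_000, "B": 1_000_000_000}


def parse_followers(text):
    """Convert a display string such as '685.6K' into an integer count."""
    cleaned = text.strip().upper().replace(",", "")
    if cleaned and cleaned[-1] in SUFFIXES:
        return round(float(cleaned[:-1]) * SUFFIXES[cleaned[-1]])
    return int(float(cleaned))


def rank_affiliates(affiliates):
    """Return a new list sorted by numeric follower count, largest first."""
    return sorted(
        affiliates,
        key=lambda a: parse_followers(a["followers"]),
        reverse=True,
    )


if __name__ == "__main__":
    sample = [
        {"username": "Candy Holcombe", "followers": "79.5K"},
        {"username": "Kay and Tay", "followers": "685.6K"},
    ]
    for creator in rank_affiliates(sample):
        print(creator["username"], parse_followers(creator["followers"]))
```

&lt;p&gt;Keeping the original string next to the parsed number is a reasonable choice if you want reports to show the same figures TikTok displays.&lt;/p&gt;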

&lt;h2&gt;
  
  
  Why TikTok Shop is painful to scrape
&lt;/h2&gt;

&lt;p&gt;If you tried to build this yourself, you would hit three walls in the first hour:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Anti-bot fingerprinting.&lt;/strong&gt; Naive HTTP requests get blocked instantly. The product page renders behind JS and their bot detection is aggressive. You need a real Chromium with realistic fingerprints.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dynamic creator blocks.&lt;/strong&gt; The "videos for this product" section loads through a GraphQL-style API call triggered after scroll. You have to wait for it, capture it, parse it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Frontend changes.&lt;/strong&gt; TikTok ships UI updates every couple of weeks. Selectors that worked last month break this month. A reliable scraper needs fallback sources, not just one happy path.&lt;/p&gt;

&lt;h2&gt;
  
  
  The architecture: multi-stage fallback
&lt;/h2&gt;

&lt;p&gt;This is the part that makes it actually work in production.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Product URL
  --&amp;gt; Stage 1: Direct TikTok Shop page via residential proxy + Chromium
      --&amp;gt; Extract product data, scroll to creator block, capture API response
  --&amp;gt; Stage 2 (if Stage 1 blocks): TikTok product detail API fallback
      --&amp;gt; Structured product JSON recovered
  --&amp;gt; Stage 3 (if Stages 1-2 fail): Google SERP snippet extraction
      --&amp;gt; Product name, price, sales recovered from cached pages
  --&amp;gt; Output: structured JSON + qualityState tag
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every result gets a &lt;code&gt;qualityState&lt;/code&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;verified_success&lt;/code&gt; — product and creators both confirmed from live TikTok&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;partial_verified&lt;/code&gt; — product confirmed, some creator data from cached sources&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;serp_enriched&lt;/code&gt; — product recovered from Google cached pages, creators not available&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The quality tag is the part I am most proud of. You always know what you got. No silent data quality drops.&lt;/p&gt;
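&lt;p&gt;The fallback chain and quality tags described above can be sketched in a few lines of Python. This is an illustration of the pattern only, not the actor's source: &lt;code&gt;run_with_fallback&lt;/code&gt;, the stand-in stage functions, the pairing of each stage with a tag, and the extra &lt;code&gt;failed&lt;/code&gt; state are all assumptions made for the example.&lt;/p&gt;

```python
def run_with_fallback(product_url, stages):
    """Try each (quality_state, fetch) pair in order and tag the first hit.

    Each fetch is a callable that takes the URL and returns a dict, or
    raises when that source is blocked or unavailable.
    """
    errors = []
    for quality_state, fetch in stages:
        try:
            data = fetch(product_url)
        except Exception as exc:  # a real scraper would catch narrower errors
            errors.append(f"{quality_state}: {exc}")
            continue
        if data:
            return {**data, "qualityState": quality_state}
        errors.append(f"{quality_state}: empty result")
    # Hypothetical terminal state for when every source fails.
    return {"qualityState": "failed", "errors": errors}


# Stand-in stages for demonstration; real stages would drive a browser,
# call an API, or parse search snippets.
def blocked_page(url):
    raise RuntimeError("blocked by bot detection")


def detail_api(url):
    return {"product": {"url": url, "name": "Example product"}, "affiliates": []}


if __name__ == "__main__":
    stages = [
        ("verified_success", blocked_page),
        ("partial_verified", detail_api),
    ]
    result = run_with_fallback("https://shop.tiktok.com/view/product/123", stages)
    print(result["qualityState"])
```

&lt;p&gt;Attaching the tag at the point where a stage succeeds means the caller never has to guess which source produced the data.&lt;/p&gt;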

&lt;h2&gt;
  
  
  Use cases I did not expect
&lt;/h2&gt;

&lt;p&gt;When I built this I thought it was for dropshippers picking products. Turns out the most interesting users are:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Affiliate managers at agencies.&lt;/strong&gt; They run it across a client's product catalog to see which creators are already pushing similar products. Instead of cold outreach to random influencers, they recruit creators who have proven they can sell in the category.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Competitor researchers.&lt;/strong&gt; Someone is winning a niche. You want to know why. Run the scraper on their top three products. You get the exact creator list. You see whose videos are driving the sales.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Freelancers fulfilling TikTok Shop research gigs.&lt;/strong&gt; People charge $200 to $500 on Upwork for TikTok product research reports. Running the actor costs a fraction of that. The margin is the point.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sellers validating before inventory.&lt;/strong&gt; Before buying 1000 units of a trending product, check if creators are actually running videos for it. If there is no affiliate activity, the sales might be paid ads, not organic.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;The actor runs on Apify. One-click deploy, no infrastructure to manage.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://apify.com/george.the.developer/tiktok-shop-affiliate-sales-scraper" rel="noopener noreferrer"&gt;TikTok Shop Affiliate Scraper on Apify&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Input: product URL or search query. Output: structured JSON with product data and creator list. You pay per verified product returned.&lt;/p&gt;

&lt;p&gt;Source code and full README (with Chinese translation for SEA markets):&lt;br&gt;
&lt;a href="https://github.com/the-ai-entrepreneur-ai-hub/tiktok-shop-scraper" rel="noopener noreferrer"&gt;github.com/the-ai-entrepreneur-ai-hub/tiktok-shop-scraper&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Related tools I built
&lt;/h2&gt;

&lt;p&gt;If you found this useful, these might help too:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://apify.com/george.the.developer/influencer-marketing-intel" rel="noopener noreferrer"&gt;Influencer Marketing Intelligence&lt;/a&gt;: finds influencers across Instagram, TikTok, and YouTube by name or keyword&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://apify.com/george.the.developer/google-maps-lead-intel" rel="noopener noreferrer"&gt;Google Maps Lead Intel&lt;/a&gt;: local business leads with email validation baked in&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://apify.com/george.the.developer/email-validator-api" rel="noopener noreferrer"&gt;Email Validator API&lt;/a&gt;: SMTP check, not just regex&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Full catalog: &lt;a href="https://apify.com/george.the.developer" rel="noopener noreferrer"&gt;apify.com/george.the.developer&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;Built this on Apify with Crawlee + Puppeteer. Questions, bugs, or weird TikTok edge cases: &lt;a href="https://x.com/ai_in_it" rel="noopener noreferrer"&gt;@ai_in_it&lt;/a&gt; on X.&lt;/p&gt;

</description>
      <category>apify</category>
      <category>tiktok</category>
      <category>webscraping</category>
      <category>affiliatemarketing</category>
    </item>
    <item>
      <title>One API Call to Know Everything About a Company (Domain to Intel)</title>
      <dc:creator>George Kioko</dc:creator>
      <pubDate>Thu, 16 Apr 2026 17:15:03 +0000</pubDate>
      <link>https://dev.to/the_aientrepreneur_7ae85/one-api-call-to-know-everything-about-a-company-domain-to-intel-1nj4</link>
      <guid>https://dev.to/the_aientrepreneur_7ae85/one-api-call-to-know-everything-about-a-company-domain-to-intel-1nj4</guid>
      <description>&lt;p&gt;Your sales team sends you a list of 200 company domains. They want: tech stack, employee count estimate, social profiles, contact emails, and "anything else useful." By tomorrow.&lt;/p&gt;

&lt;p&gt;You could open each website manually, run BuiltWith, check LinkedIn, look up WHOIS, verify emails. At about 10 minutes per company, 200 companies comes to roughly 33 hours.&lt;/p&gt;

&lt;p&gt;Or you could make one API call per domain and get everything back as JSON.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the API returns
&lt;/h2&gt;

&lt;p&gt;Pass in a domain like &lt;code&gt;stripe.com&lt;/code&gt; and get:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tech stack&lt;/strong&gt;: React, Next.js, Google Analytics, Vercel. Detected from HTML patterns, meta tags, and response headers. 60+ technologies across CMS, frameworks, analytics, infrastructure, ecommerce, and security.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Social profiles&lt;/strong&gt;: Twitter, LinkedIn, Facebook, Instagram, GitHub. Extracted from page HTML, not from a database that is 6 months stale.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Contact info&lt;/strong&gt;: Email addresses found on the site plus phone numbers. Not scraped from a third party. Found on the actual company website.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;DNS intelligence&lt;/strong&gt;: MX records (tells you their email provider: Google Workspace, Microsoft 365, Zoho), SPF/DMARC records (tells you email security posture), A records (tells you hosting).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SEO signals&lt;/strong&gt;: Title, meta description, H1 tags, canonical URL, OG image. Useful for competitive analysis.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SSL status&lt;/strong&gt;: Certificate valid, issuer, expiry date.&lt;/p&gt;
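&lt;p&gt;To make the tech stack detection concrete, here is a minimal Python sketch of signature matching against page HTML and response headers. The signature table, the patterns in it, and the &lt;code&gt;detect_technologies&lt;/code&gt; name are illustrative assumptions, not the rules the API actually uses.&lt;/p&gt;

```python
import re

# Hypothetical signatures: (name, category, source, pattern).
# "html" matches the page body; "header:NAME" matches one response header.
SIGNATURES = [
    ("nextjs", "framework", "html", re.compile(r"/_next/static/|__NEXT_DATA__")),
    ("react", "framework", "html", re.compile(r"data-reactroot|react-dom")),
    ("google-analytics", "analytics", "html",
     re.compile(r"googletagmanager\.com/gtag|google-analytics\.com")),
    ("vercel", "infra", "header:server", re.compile(r"vercel", re.I)),
    ("wordpress", "cms", "html", re.compile(r"wp-content/|wp-includes/")),
]


def detect_technologies(html, headers):
    """Return the technologies whose signature matches the HTML or a header."""
    lowered = {key.lower(): value for key, value in headers.items()}
    found = []
    for name, category, source, pattern in SIGNATURES:
        if source == "html":
            haystack = html
        else:
            header_name = source.split(":", 1)[1]
            haystack = lowered.get(header_name, "")
        if pattern.search(haystack):
            found.append({"name": name, "category": category})
    return found


if __name__ == "__main__":
    sample_html = 'script src="/_next/static/chunks/main.js"'
    print(detect_technologies(sample_html, {"Server": "Vercel"}))
```

&lt;p&gt;Pattern matching like this is cheap but can miss technologies that leave no trace in the initial HTML, so treat a missing entry as "not detected" and not as "not used".&lt;/p&gt;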

&lt;h2&gt;
  
  
  Real output
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"domain"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"stripe.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Stripe | Financial Infrastructure to Grow Your Revenue"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"technologies"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"react"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"category"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"framework"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"nextjs"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"category"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"framework"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"google-analytics"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"category"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"analytics"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"vercel"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"category"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"infra"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"socialProfiles"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"twitter"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"twitter.com/stripe"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"linkedin"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"linkedin.com/company/stripe"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"facebook"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"facebook.com/StripeHQ"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"instagram"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"instagram.com/stripehq"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"github"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"github.com/stripe"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"emails"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"contact@stripe.com"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mxRecords"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="nl"&gt;"exchange"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"aspmx.l.google.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"priority"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"ssl"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
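&lt;p&gt;The &lt;code&gt;mxRecords&lt;/code&gt; field is what lets you infer the email provider. A minimal Python sketch of that mapping, assuming records shaped like the output above; the &lt;code&gt;MX_PROVIDERS&lt;/code&gt; table and the &lt;code&gt;email_provider&lt;/code&gt; helper are hypothetical and cover only a few common hosts.&lt;/p&gt;

```python
# Hypothetical suffix table; extend it for the providers you care about.
MX_PROVIDERS = {
    "google.com": "Google Workspace",
    "googlemail.com": "Google Workspace",
    "outlook.com": "Microsoft 365",
    "zoho.com": "Zoho",
}


def email_provider(mx_records):
    """Guess the provider from the most preferred MX host.

    In DNS, the record with the lowest priority number is tried first.
    Returns None when there are no MX records at all.
    """
    if not mx_records:
        return None
    primary = min(mx_records, key=lambda record: record["priority"])
    host = primary["exchange"].lower().rstrip(".")
    for suffix, provider in MX_PROVIDERS.items():
        if host == suffix or host.endswith("." + suffix):
            return provider
    return "unknown"


if __name__ == "__main__":
    records = [{"exchange": "aspmx.l.google.com", "priority": 10}]
    print(email_provider(records))
```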



&lt;h2&gt;
  
  
  Cost
&lt;/h2&gt;

&lt;p&gt;$0.01 per domain. Enrich 1,000 companies for $10.&lt;/p&gt;

&lt;p&gt;Compare: BuiltWith API starts at $295/month. Clearbit enrichment is $99/month minimum. Hunter.io is $49/month for 500 lookups.&lt;/p&gt;
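&lt;p&gt;To see where pay-per-domain pricing stops being the cheaper option, here is a small Python sketch that computes the break-even volume against a flat monthly fee. It uses only the prices quoted above; the function names are hypothetical, and the comparison ignores feature differences between the services.&lt;/p&gt;

```python
from decimal import Decimal

PRICE_PER_DOMAIN = Decimal("0.01")  # per-domain price quoted above


def pay_per_use_cost(domains):
    """Total cost in dollars for enriching the given number of domains."""
    return PRICE_PER_DOMAIN * domains


def break_even_domains(monthly_fee):
    """Domains per month at which pay-per-use equals a flat subscription."""
    return int(Decimal(str(monthly_fee)) / PRICE_PER_DOMAIN)


if __name__ == "__main__":
    print(pay_per_use_cost(1000))
    for label, fee in [("BuiltWith", 295), ("Clearbit", 99), ("Hunter.io", 49)]:
        print(label, break_even_domains(fee))
```

&lt;p&gt;Below the break-even volume the per-domain price costs less per month; above it, a flat plan may be the better deal.&lt;/p&gt;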

&lt;h2&gt;
  
  
  Quick start
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;apify_client&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ApifyClient&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ApifyClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YOUR_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;actor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;george.the.developer/company-enrichment-api&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;run_input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;domain&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stripe.com&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;defaultDatasetId&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;span class="nf"&gt;list_items&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Tech: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;technologies&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Social: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;socialProfiles&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Emails: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;emails&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Also available on RapidAPI with a free tier: &lt;a href="https://rapidapi.com/georgethedeveloper3046" rel="noopener noreferrer"&gt;https://rapidapi.com/georgethedeveloper3046&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I build data tools. 57 actors on Apify Store, 869 users. @ai_in_it on X.&lt;/p&gt;

</description>
      <category>python</category>
    </item>
    <item>
      <title>TikTok Shop Affiliate Intelligence: Find Who Promotes Any Product</title>
      <dc:creator>George Kioko</dc:creator>
      <pubDate>Thu, 16 Apr 2026 17:07:31 +0000</pubDate>
      <link>https://dev.to/the_aientrepreneur_7ae85/tiktok-shop-affiliate-intelligence-find-who-promotes-any-product-1a0p</link>
      <guid>https://dev.to/the_aientrepreneur_7ae85/tiktok-shop-affiliate-intelligence-find-who-promotes-any-product-1a0p</guid>
      <description>&lt;p&gt;TikTok Shop is a $20 billion marketplace and growing. Sellers, agencies, and affiliate managers all need to answer the same question: who is actually promoting this product, and is it working?&lt;/p&gt;

&lt;p&gt;There is no public dashboard for this. TikTok Shop does not expose affiliate data through any official API. The seller center shows you your own affiliates, not your competitors'. So everyone does manual research, scrolling through product pages and creator profiles for hours.&lt;/p&gt;

&lt;p&gt;I built a tool that automates this entire process.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it does
&lt;/h2&gt;

&lt;p&gt;Give it a TikTok Shop product URL. It returns:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Verified product details (name, price, sales count, rating)&lt;/li&gt;
&lt;li&gt;Which creators are actively promoting it&lt;/li&gt;
&lt;li&gt;Each creator's follower count and profile URL&lt;/li&gt;
&lt;li&gt;Whether the data is verified from live sources or recovered from cached data&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The output is structured JSON that you can plug into spreadsheets, dashboards, or CRM tools.&lt;/p&gt;
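&lt;p&gt;As one way to get that JSON into a spreadsheet, here is a minimal Python sketch that flattens results into CSV with one row per product and creator pair. The field names (&lt;code&gt;product&lt;/code&gt;, &lt;code&gt;affiliates&lt;/code&gt;, &lt;code&gt;salesCount&lt;/code&gt;, &lt;code&gt;qualityState&lt;/code&gt;) are assumptions based on the actor's sample output, and &lt;code&gt;affiliates_to_csv&lt;/code&gt; is a hypothetical helper.&lt;/p&gt;

```python
import csv
import io


def affiliates_to_csv(items):
    """Flatten actor output into CSV text, one row per (product, creator)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["product", "price", "sales", "creator", "followers", "quality"])
    for item in items:
        product = item.get("product", {})
        for creator in item.get("affiliates", []):
            writer.writerow([
                product.get("name", ""),
                product.get("price", ""),
                product.get("salesCount", ""),
                creator.get("username", ""),
                creator.get("followers", ""),
                item.get("qualityState", ""),
            ])
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [{
        "product": {"name": "Lounge set", "price": "$33.99", "salesCount": "205915"},
        "affiliates": [{"username": "Kay and Tay", "followers": "685.6K"}],
        "qualityState": "verified_success",
    }]
    print(affiliates_to_csv(sample))
```

&lt;p&gt;Products with no creators produce no rows in this sketch; add a fallback row if you need them to show up in the sheet.&lt;/p&gt;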

&lt;h2&gt;
  
  
  Why this matters for sellers
&lt;/h2&gt;

&lt;p&gt;Before you launch a product or invest in inventory, you want to know if similar products already have affiliate momentum. If 10 creators with 500K+ followers are pushing a competitor product and it has 50K sales, that tells you there is demand and a distribution channel.&lt;/p&gt;

&lt;p&gt;If nobody is promoting similar products, either you found a gap or the category does not sell on TikTok Shop. Either way, the data saves you from guessing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why agencies care
&lt;/h2&gt;

&lt;p&gt;If you manage creator partnerships for brands, you need to know who is already driving sales in a category. Those creators have proven they can convert TikTok traffic into purchases. Recruiting a creator who already sells skincare is cheaper and faster than finding someone who might.&lt;/p&gt;

&lt;h2&gt;
  
  
  The freelancer angle
&lt;/h2&gt;

&lt;p&gt;This is where the math gets interesting. There are gigs on Upwork and Fiverr where clients pay $200 to $500 for "TikTok Shop product research" or "find affiliates for my product." The deliverable is a report showing which products are winning and who promotes them.&lt;/p&gt;

&lt;p&gt;Running this actor on 20 products costs about $1. Formatting the output into a report takes 20 minutes. Before counting your own time, that is a margin of over 99% on a $200 gig.&lt;/p&gt;
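&lt;p&gt;The arithmetic is easy to check. Here is a small Python sketch, assuming $0.05 per product (which is what "20 products for about $1" works out to). The &lt;code&gt;gig_margin&lt;/code&gt; helper and its optional labor parameters are hypothetical, included so you can plug in your own hourly rate.&lt;/p&gt;

```python
PRICE_PER_PRODUCT = 0.05  # assumption: 20 products for about $1


def gig_margin(gig_price, products, hourly_rate=0.0, hours=0.0):
    """Return (tool_cost, margin), with margin as a fraction of the gig price."""
    tool_cost = PRICE_PER_PRODUCT * products
    labor_cost = hourly_rate * hours
    margin = (gig_price - tool_cost - labor_cost) / gig_price
    return round(tool_cost, 2), round(margin, 4)


if __name__ == "__main__":
    # Tool cost only, for the low and high end of the quoted gig prices.
    print(gig_margin(200, 20))
    print(gig_margin(500, 20))
    # Same gig, but valuing half an hour of your time at $30/hour.
    print(gig_margin(200, 20, hourly_rate=30, hours=0.5))
```

&lt;p&gt;Once you price in your own time the margin drops, but the tool cost stays a rounding error next to the gig price.&lt;/p&gt;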

&lt;h2&gt;
  
  
  How to use it
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Direct product URL
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"productUrl"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://www.tiktok.com/view/product/1234567890"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Search by keyword
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"searchQuery"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"vitamin c serum"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"maxProducts"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Python
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;apify_client&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ApifyClient&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ApifyClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YOUR_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;actor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;george.the.developer/tiktok-shop-affiliate-sales-scraper&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;run_input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;productUrl&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://www.tiktok.com/view/product/1234567890&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;defaultDatasetId&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;span class="nf"&gt;iterate_items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Product: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;product&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;affiliates&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[]):&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;  @&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;username&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; (&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;followers&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; followers)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Pricing
&lt;/h2&gt;

&lt;p&gt;$0.05 per product analyzed. Check 20 products for $1. No monthly subscription.&lt;/p&gt;

&lt;p&gt;Free tier available on Apify. Try it at: &lt;a href="https://apify.com/george.the.developer/tiktok-shop-affiliate-sales-scraper" rel="noopener noreferrer"&gt;https://apify.com/george.the.developer/tiktok-shop-affiliate-sales-scraper&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;33 users already use this. Follow the build log at @ai_in_it on X.&lt;/p&gt;

</description>
      <category>marketing</category>
    </item>
    <item>
      <title>How to Build a RAG Pipeline from YouTube Videos (Without an API)</title>
      <dc:creator>George Kioko</dc:creator>
      <pubDate>Thu, 16 Apr 2026 06:43:05 +0000</pubDate>
      <link>https://dev.to/the_aientrepreneur_7ae85/how-to-build-a-rag-pipeline-from-youtube-videos-without-an-api-4f73</link>
      <guid>https://dev.to/the_aientrepreneur_7ae85/how-to-build-a-rag-pipeline-from-youtube-videos-without-an-api-4f73</guid>
      <description>
</description>
      <category>python</category>
    </item>
    <item>
      <title>Why Regex Email Validation Is Lying to You (And What Actually Works)</title>
      <dc:creator>George Kioko</dc:creator>
      <pubDate>Wed, 15 Apr 2026 09:34:37 +0000</pubDate>
      <link>https://dev.to/the_aientrepreneur_7ae85/why-regex-email-validation-is-lying-to-you-and-what-actually-works-451p</link>
      <guid>https://dev.to/the_aientrepreneur_7ae85/why-regex-email-validation-is-lying-to-you-and-what-actually-works-451p</guid>
      <description>&lt;p&gt;If you validate emails with regex, you are checking if a string looks like an email. You are not checking if anyone will receive your message.&lt;/p&gt;

&lt;p&gt;I learned this the hard way. 12% bounce rate on a 5,000 email campaign. Sender reputation tanked. Half the "valid" emails were dead mailboxes that regex happily approved.&lt;/p&gt;

&lt;p&gt;Here is what actually works and why the difference matters.&lt;/p&gt;

&lt;h2&gt;
  
  
  What regex checks
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This tells you the string has an @ symbol, a dot, and some characters around them. That is it.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;totally.fake.person@nonexistent-domain-12345.com&lt;/code&gt; passes regex. Nobody will ever receive that email.&lt;/p&gt;
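
&lt;p&gt;A quick way to see this for yourself, as a minimal Python sketch. The pattern is the one above; the helper name &lt;code&gt;looks_like_email&lt;/code&gt; is just something I made up for the example.&lt;/p&gt;

```python
import re

# Same pattern as above, compiled once.
EMAIL_RE = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")


def looks_like_email(value: str) -> bool:
    """Return True when the string has the shape of an email address."""
    return EMAIL_RE.match(value) is not None


# Shape is all that gets checked: the fake address passes.
print(looks_like_email("totally.fake.person@nonexistent-domain-12345.com"))  # True
print(looks_like_email("not-an-email"))  # False
```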

&lt;h2&gt;
  
  
  What SMTP verification checks
&lt;/h2&gt;

&lt;p&gt;SMTP verification actually talks to the mail server. It connects to the domain's MX host on port 25, introduces itself with EHLO, then sends MAIL FROM and RCPT TO, which amounts to asking "would you accept mail for this address?" The server responds with a code:&lt;/p&gt;

&lt;ul&gt;
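&lt;li&gt;250 = yes, the server will accept mail for this address&lt;/li&gt;
&lt;li&gt;550 = no, user unknown&lt;/li&gt;
&lt;li&gt;252 = the server will not confirm the user but will accept the message anyway, so treat the address as unverified, much like a catch all domain that accepts everything&lt;/li&gt;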
&lt;/ul&gt;

&lt;p&gt;This is the same protocol your email client uses to send mail. You are just stopping before actually delivering anything.&lt;/p&gt;
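
&lt;p&gt;Here is roughly what that conversation looks like with Python's standard &lt;code&gt;smtplib&lt;/code&gt;. Treat it as a sketch, not the validator's actual code: the function names, the &lt;code&gt;probe@example.com&lt;/code&gt; sender and the 10 second timeout are placeholders I picked for the example, and plenty of networks block outbound port 25.&lt;/p&gt;

```python
import smtplib


def classify_rcpt_code(code: int) -> str:
    """Map the reply to RCPT TO onto a coarse verdict."""
    if code == 250:
        return "valid"    # server would accept mail for this address
    if code == 550:
        return "invalid"  # mailbox unavailable / user unknown
    return "unknown"      # anything else: no firm answer either way


def probe_mailbox(mx_host: str, address: str,
                  sender: str = "probe@example.com",
                  timeout: float = 10.0) -> str:
    """Open an SMTP session and stop after RCPT TO. No DATA is ever sent."""
    with smtplib.SMTP(mx_host, 25, timeout=timeout) as smtp:
        smtp.ehlo()                          # introduce ourselves
        smtp.mail(sender)                    # MAIL FROM
        code, _message = smtp.rcpt(address)  # RCPT TO: the actual question
    return classify_rcpt_code(code)          # leaving the block sends QUIT
```

&lt;p&gt;Keeping the code mapping separate from the network call means you can unit test the verdict logic without touching a real mail server. Connection errors and timeouts are not handled here; a real checker would catch them and report "unknown".&lt;/p&gt;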

&lt;h2&gt;
  
  
  The 5 layers of real validation
&lt;/h2&gt;

&lt;p&gt;I built an email validator that runs 5 checks in sequence:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer 1: Format check.&lt;/strong&gt; Yes, regex. But just as a first filter to reject obvious garbage.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer 2: Domain check.&lt;/strong&gt; Does the domain have MX records? If there are no mail servers configured, nobody is receiving email there. &lt;code&gt;dig MX example.com&lt;/code&gt; tells you instantly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer 3: Disposable detection.&lt;/strong&gt; Is this mailinator, guerrillamail, tempmail? I maintain a list of 400+ disposable domains. These addresses work for about 10 minutes then disappear.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer 4: Role detection.&lt;/strong&gt; admin@, info@, support@ are role addresses. They usually go to a shared inbox that nobody monitors for cold outreach.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Layer 5: SMTP handshake.&lt;/strong&gt; The real check. Connect to the MX server, ask if the mailbox exists. This catches the dead addresses that everything else misses.&lt;/p&gt;
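
&lt;p&gt;The five layers chain together roughly like this. This is a simplified sketch, not the production code: &lt;code&gt;validate&lt;/code&gt;, &lt;code&gt;has_mx&lt;/code&gt; and &lt;code&gt;smtp_probe&lt;/code&gt; are hypothetical names, the two lookups are passed in as plain callables so the example runs offline, the domain and prefix sets are tiny samples, and flagging role addresses without rejecting them is an assumption made for the example.&lt;/p&gt;

```python
import re
from typing import Callable

EMAIL_RE = re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")
DISPOSABLE_DOMAINS = {"mailinator.com", "guerrillamail.com"}  # sample, not the real list
ROLE_PREFIXES = {"admin", "info", "support"}                  # sample, not the real list


def validate(email: str,
             has_mx: Callable[[str], bool],
             smtp_probe: Callable[[str], str]) -> dict:
    """Run the checks cheapest first and stop at the first hard failure."""
    result = {"email": email, "valid": False, "format_valid": False,
              "mx_found": False, "smtp_check": None,
              "is_disposable": False, "is_role_based": False}

    # Layer 1: format
    if not EMAIL_RE.match(email):
        result["reason"] = "Invalid format"
        return result
    result["format_valid"] = True
    local, domain = email.rsplit("@", 1)
    domain = domain.lower()

    # Layer 2: does the domain have any mail servers?
    if not has_mx(domain):
        result["reason"] = "No MX records"
        return result
    result["mx_found"] = True

    # Layer 3: disposable domains are rejected before any SMTP traffic
    if domain in DISPOSABLE_DOMAINS:
        result["is_disposable"] = True
        result["reason"] = "Disposable email address"
        return result

    # Layer 4: role addresses are flagged, not rejected (assumption)
    result["is_role_based"] = local.lower() in ROLE_PREFIXES

    # Layer 5: the SMTP handshake decides
    result["smtp_check"] = smtp_probe(email)
    result["valid"] = result["smtp_check"] == "valid"
    return result


# Offline demo with stubbed lookups
print(validate("test@mailinator.com",
               has_mx=lambda domain: True,
               smtp_probe=lambda address: "valid"))
```

&lt;p&gt;Ordering the checks from cheapest to most expensive means the slow network round trip only happens for addresses that survived everything else.&lt;/p&gt;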

&lt;h2&gt;
  
  
  What this looks like in practice
&lt;/h2&gt;

&lt;p&gt;Input: &lt;code&gt;hello@stripe.com&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"email"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"hello@stripe.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"valid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"format_valid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mx_found"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"smtp_check"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"valid"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"is_disposable"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"is_free"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"is_role_based"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"domain"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"stripe.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mx_records"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"priority"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"exchange"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"aspmx.l.google.com"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"score"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Input: &lt;code&gt;test@mailinator.com&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"email"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"test@mailinator.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"valid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"format_valid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mx_found"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"smtp_check"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"is_disposable"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"score"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;40&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"reason"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Disposable email address"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The disposable check caught it before we even bothered with SMTP.&lt;/p&gt;

&lt;h2&gt;
  
  
  The bounce rate difference
&lt;/h2&gt;

&lt;p&gt;Before SMTP validation: 12% bounce rate, emails landing in spam, sender score dropping.&lt;/p&gt;

&lt;p&gt;After: under 2% bounce rate. Same email copy, same sending infrastructure. The only change was filtering out dead addresses before hitting send.&lt;/p&gt;
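
&lt;p&gt;To put those percentages in absolute terms, here is the arithmetic for a 5,000 email send. It assumes the same list size before and after, and it uses 2% as the upper bound for the "after" case.&lt;/p&gt;

```python
def bounced(sent: int, bounce_rate: float) -> int:
    """Number of bounced emails for a given send size and bounce rate."""
    return round(sent * bounce_rate)


before = bounced(5000, 0.12)     # 12% bounce rate
after_max = bounced(5000, 0.02)  # "under 2%", so this is a ceiling

print(f"Before: {before} bounces, after: at most {after_max}")
```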

&lt;h2&gt;
  
  
  Cost comparison
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Method&lt;/th&gt;
&lt;th&gt;Cost per 1,000 emails&lt;/th&gt;
&lt;th&gt;Accuracy&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Regex only&lt;/td&gt;
&lt;td&gt;Free&lt;/td&gt;
&lt;td&gt;~60% (misses dead mailboxes)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ZeroBounce&lt;/td&gt;
&lt;td&gt;$1.60&lt;/td&gt;
&lt;td&gt;~95%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hunter.io&lt;/td&gt;
&lt;td&gt;$3.00&lt;/td&gt;
&lt;td&gt;~93%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;NeverBounce&lt;/td&gt;
&lt;td&gt;$3.00&lt;/td&gt;
&lt;td&gt;~96%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;This API&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$2.00&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~98% (real SMTP)&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
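
&lt;p&gt;For a sense of scale, this small sketch turns the per 1,000 prices from the table into the cost of validating a 5,000 email list. The prices are copied from the table; the helper name &lt;code&gt;campaign_cost&lt;/code&gt; is made up for the example.&lt;/p&gt;

```python
# Cost per 1,000 emails, copied from the table above.
PRICE_PER_1000 = {
    "ZeroBounce": 1.60,
    "Hunter.io": 3.00,
    "NeverBounce": 3.00,
    "This API": 2.00,
}


def campaign_cost(emails: int, price_per_1000: float) -> float:
    """Total validation cost in dollars for a list of the given size."""
    return round(emails / 1000 * price_per_1000, 2)


for name, price in PRICE_PER_1000.items():
    print(f"{name}: ${campaign_cost(5000, price):.2f}")
```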

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;The validator is on Apify Store with a free tier: &lt;a href="https://apify.com/george.the.developer/email-validator-api" rel="noopener noreferrer"&gt;Email Validator API&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Also available on &lt;a href="https://rapidapi.com/georgethedeveloper3046" rel="noopener noreferrer"&gt;RapidAPI&lt;/a&gt; if you prefer REST.&lt;/p&gt;

&lt;h3&gt;
  
  
  Python quickstart
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;apify_client&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ApifyClient&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ApifyClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YOUR_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;actor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;george.the.developer/email-validator-api&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;run_input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;email&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hello@stripe.com&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;defaultDatasetId&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;span class="nf"&gt;list_items&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Valid: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;valid&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;, Score: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;score&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
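
&lt;p&gt;If you have a whole list to check, one option is to loop the same call. This sketch assumes one &lt;code&gt;email&lt;/code&gt; per run, as in the quickstart; the &lt;code&gt;validate_many&lt;/code&gt; helper and the fallback record for empty results are my own additions, not part of the actor.&lt;/p&gt;

```python
ACTOR_ID = "george.the.developer/email-validator-api"


def validate_many(client, emails):
    """Validate each address with its own actor run and collect the results.

    `client` is an ApifyClient instance, created as in the quickstart above.
    """
    results = []
    for email in emails:
        run = client.actor(ACTOR_ID).call(run_input={"email": email})
        items = []
        if run is not None:
            items = client.dataset(run["defaultDatasetId"]).list_items().items
        # Fall back to a placeholder record if the run produced nothing.
        results.append(items[0] if items else {"email": email, "valid": None})
    return results


# Usage (needs the apify-client package and a real token):
#   from apify_client import ApifyClient
#   client = ApifyClient("YOUR_TOKEN")
#   for r in validate_many(client, ["hello@stripe.com", "test@mailinator.com"]):
#       print(r["email"], r["valid"])
```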



&lt;h3&gt;
  
  
  curl
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="s2"&gt;"https://george-the-developer--email-validator-api.apify.actor/validate?email=hello@stripe.com"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer YOUR_TOKEN"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
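
&lt;p&gt;The same request from Python using only the standard library, as a rough equivalent of the curl call. The endpoint and header are the ones shown above; &lt;code&gt;build_request&lt;/code&gt; is a name I made up, and the response is assumed to be JSON.&lt;/p&gt;

```python
import urllib.parse
import urllib.request

BASE_URL = "https://george-the-developer--email-validator-api.apify.actor/validate"


def build_request(email: str, token: str) -> urllib.request.Request:
    """Build the GET request; the email is URL encoded into the query string."""
    query = urllib.parse.urlencode({"email": email})
    return urllib.request.Request(
        f"{BASE_URL}?{query}",
        headers={"Authorization": f"Bearer {token}"},
    )


req = build_request("hello@stripe.com", "YOUR_TOKEN")
print(req.full_url)

# To actually send it (needs a real token and network access):
#   import json
#   with urllib.request.urlopen(req, timeout=30) as resp:
#       print(json.load(resp))
```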



&lt;p&gt;I build data tools. 57 actors on Apify Store, 869 users. Follow the build log at &lt;a href="https://x.com/ai_in_it" rel="noopener noreferrer"&gt;@ai_in_it on X&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>tutorial</category>
    </item>
  </channel>
</rss>
