<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Sami</title>
    <description>The latest articles on DEV Community by Sami (@sami_8858131362756585e4f4).</description>
    <link>https://dev.to/sami_8858131362756585e4f4</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3877584%2F63d2c24c-ec4e-457f-8a71-2b79bb969554.png</url>
      <title>DEV Community: Sami</title>
      <link>https://dev.to/sami_8858131362756585e4f4</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/sami_8858131362756585e4f4"/>
    <language>en</language>
    <item>
      <title>Google Ads can spend up to 2x your daily budget. I built a Chrome extension that catches it before it happens.</title>
      <dc:creator>Sami</dc:creator>
      <pubDate>Thu, 30 Apr 2026 15:17:28 +0000</pubDate>
      <link>https://dev.to/sami_8858131362756585e4f4/google-ads-can-spend-up-to-2x-your-daily-budget-i-built-a-chrome-extension-that-catches-it-before-j0</link>
      <guid>https://dev.to/sami_8858131362756585e4f4/google-ads-can-spend-up-to-2x-your-daily-budget-i-built-a-chrome-extension-that-catches-it-before-j0</guid>
      <description>&lt;p&gt;If you've ever opened Google Ads and noticed your campaign spent way more than the daily budget you set, you're not imagining it. Google's documentation explicitly says they may spend up to &lt;strong&gt;twice your daily budget&lt;/strong&gt; on any given day, evening it out across the month. That's not a bug — it's how their pacing engine has always worked.&lt;/p&gt;

&lt;p&gt;What changed in March 2026: Google now aggressively targets &lt;strong&gt;100% of your monthly limit&lt;/strong&gt; — which is 30.4× your daily budget. Even with ad scheduling. So if your campaigns only run 22 days a month (weekdays only, for example), Google can push up to &lt;strong&gt;38% more spend per active day&lt;/strong&gt; than you'd expect from your daily budget setting.&lt;/p&gt;
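
&lt;p&gt;To see where the 38% figure comes from, here is the arithmetic as a small Python sketch. The function name is illustrative; the 30.4 multiplier and the 22 active days are the figures quoted above.&lt;/p&gt;

```python
AVG_DAYS_PER_MONTH = 30.4  # monthly limit = 30.4 x daily budget


def per_active_day_uplift(active_days: int) -> float:
    """Extra spend per active day, as a fraction of the daily budget."""
    return AVG_DAYS_PER_MONTH / active_days - 1.0


# Weekdays-only schedule with 22 active days in the month.
print(f"{per_active_day_uplift(22):.0%}")
```

&lt;p&gt;The fewer days a campaign is scheduled to run, the larger this per-day uplift becomes.&lt;/p&gt;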

&lt;p&gt;Most PPC managers don't notice until the damage is done. The Campaigns tab in Google Ads doesn't tell you whether you're on pace or headed for overspend. You'd need a spreadsheet, a calendar, and a calculator open in another window — or a SaaS tool that costs $49 to $749 per month.&lt;/p&gt;

&lt;p&gt;I got tired of the spreadsheet route, so I built a Chrome extension that does the calculation inside Google Ads, in real time, free for up to 3 campaigns. I'm walking through the build because the technical approach is interesting and the pricing math against SaaS tools is lopsided.&lt;/p&gt;

&lt;h2&gt;
  
  
  What budget pacing actually requires
&lt;/h2&gt;

&lt;p&gt;The math is simple. For each campaign:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;expected_spend_to_date = daily_budget × 30.4 × (days_elapsed_in_month / total_days_in_month)
pacing_ratio = actual_spend_to_date / expected_spend_to_date

# pacing_ratio &amp;lt; 1.10 → on pace
# pacing_ratio 1.10–1.20 → slight overspend
# pacing_ratio &amp;gt; 1.20 → overspend risk
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's the whole core logic. SaaS tools wrap this in dashboards, alerts, multi-account aggregation, and reporting. But the underlying calculation is only a few lines of code.&lt;/p&gt;
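
&lt;p&gt;As a rough sketch of that logic (this is not AdPacer's source, which runs as JavaScript in the browser), the calculation could look like the Python below. It assumes expected spend is measured month to date against the 30.4× monthly limit mentioned earlier; the function names are illustrative.&lt;/p&gt;

```python
import calendar
from datetime import date

AVG_DAYS_PER_MONTH = 30.4  # assumption: monthly limit = 30.4 x daily budget


def pacing_ratio(daily_budget: float, spend_to_date: float, today: date) -> float:
    """Ratio of actual month-to-date spend to the spend expected by today."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    monthly_limit = daily_budget * AVG_DAYS_PER_MONTH
    expected_to_date = monthly_limit * today.day / days_in_month
    return spend_to_date / expected_to_date


def classify(ratio: float) -> str:
    """Map a pacing ratio onto the three bands listed above."""
    if ratio > 1.20:
        return "overspend risk"
    if ratio >= 1.10:
        return "slight overspend"
    return "on pace"
```

&lt;p&gt;For example, a campaign with a $100 daily budget that has spent $1,900 by April 15 has a ratio of 1.25, which lands in the overspend-risk band.&lt;/p&gt;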

&lt;p&gt;The reason SaaS tools charge $49+/month isn't the math — it's the data plumbing. They connect to the Google Ads API (OAuth, refresh tokens, quota management), run server-side jobs to pull your accounts on a schedule, store results in a database, render charts. Real infrastructure cost.&lt;/p&gt;

&lt;p&gt;But here's the thing: &lt;strong&gt;your campaign data is already visible on your Google Ads screen&lt;/strong&gt;. Names, budgets, costs, statuses — the information is sitting in the DOM right there. If you're already looking at Google Ads, why does anyone need to call an API to tell you what you're already looking at?&lt;/p&gt;

&lt;h2&gt;
  
  
  The Chrome extension approach
&lt;/h2&gt;

&lt;p&gt;I built AdPacer as a Manifest V3 Chrome extension that reads the campaign data from the Google Ads page DOM and overlays three pacing indicators directly on the interface you're already using. Architecture:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Content script&lt;/strong&gt; runs on &lt;code&gt;ads.google.com&lt;/code&gt; URLs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MutationObserver&lt;/strong&gt; detects when the campaigns table renders or updates (Google Ads is a heavy SPA so this matters)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DOM parsing&lt;/strong&gt; extracts campaign name, daily budget, current spend per row&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pacing math&lt;/strong&gt; runs locally on the extracted values&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DOM injection&lt;/strong&gt; adds the colored pacing bars and projected-spend badges next to each campaign row&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Notifications API&lt;/strong&gt; for the periodic overspend checks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Zero API calls. Zero authentication flows. Zero backend. Zero data leaves the user's browser. Everything runs in the page's content-script context.&lt;/p&gt;

&lt;p&gt;The privacy implication is meaningful: AdPacer cannot exfiltrate your Google Ads data even if it wanted to, because it makes no network requests at all. SaaS tools, however careful their privacy policies are, send your campaign data to their servers as a fundamental part of how they work.&lt;/p&gt;

&lt;h2&gt;
  
  
  What you actually see in Google Ads after install
&lt;/h2&gt;

&lt;p&gt;Three additions to the standard Campaigns tab:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Pacing bars&lt;/strong&gt; — a color-coded bar next to each campaign:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Green:&lt;/strong&gt; on pace, within 10% of expected spend&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Yellow:&lt;/strong&gt; ahead of pace, 10–20% over expected&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Red:&lt;/strong&gt; overspend risk, 20%+ over expected&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;2. Projected end-of-month spend&lt;/strong&gt; — a badge showing what your monthly spend will be if you continue at the current daily run rate. Updates as the page data updates. No spreadsheet required.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Browser notifications&lt;/strong&gt; — when any campaign crosses your threshold (configurable from 10% to 25%). Checks automatically every 30 minutes. Catch problems early instead of at month-end reconciliation.&lt;/p&gt;

&lt;p&gt;That's it. Install, open Google Ads, see your pacing.&lt;/p&gt;
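
&lt;p&gt;The projected end-of-month badge boils down to extrapolating the current daily run rate. A minimal Python sketch, assuming the run rate is simply month-to-date spend divided by days elapsed (the extension's exact formula may differ, and the function name is illustrative):&lt;/p&gt;

```python
import calendar
from datetime import date


def projected_month_end_spend(spend_to_date: float, today: date) -> float:
    """Extrapolate month-to-date spend to the end of the month at the current run rate."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    daily_run_rate = spend_to_date / today.day
    return daily_run_rate * days_in_month
```

&lt;p&gt;With $1,500 spent by April 15, the run rate is $100 per day, so the projection for the 30-day month is $3,000.&lt;/p&gt;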

&lt;h2&gt;
  
  
  Pricing — and why this is structured the way it is
&lt;/h2&gt;

&lt;p&gt;I deliberately wanted to make this accessible to freelancers and small teams, not enterprise-priced.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tier&lt;/th&gt;
&lt;th&gt;Price&lt;/th&gt;
&lt;th&gt;Limit&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Free&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$0&lt;/td&gt;
&lt;td&gt;Up to 3 campaigns. All core features. No credit card, no trial expiration.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Pro&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$14/mo&lt;/td&gt;
&lt;td&gt;Unlimited campaigns, custom thresholds, priority support.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Agency&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$29/mo&lt;/td&gt;
&lt;td&gt;Multi-account support, PDF pacing reports, team sharing.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The free tier covers most freelancers managing 1-3 client accounts at a time, or small e-commerce teams running a couple of brand/generic/shopping campaigns. Pro is for in-house PPC managers running 5-50 campaigns. Agency is for teams managing multiple clients.&lt;/p&gt;

&lt;p&gt;For comparison with the SaaS landscape:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Lowest tier&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;TrueClicks&lt;/td&gt;
&lt;td&gt;$49/mo&lt;/td&gt;
&lt;td&gt;Broader PPC management&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Optmyzr&lt;/td&gt;
&lt;td&gt;$129/mo&lt;/td&gt;
&lt;td&gt;Optimization suite&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;WordStream&lt;/td&gt;
&lt;td&gt;$299/mo+&lt;/td&gt;
&lt;td&gt;Enterprise tier&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AdPacer&lt;/td&gt;
&lt;td&gt;$0–$14/mo&lt;/td&gt;
&lt;td&gt;Pacing only&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;If you need full PPC management — bid optimization, A/B testing, audience suggestions, the whole stack — the SaaS tools are doing a lot more than pacing. But if all you actually need is "tell me when a campaign is going to overspend," paying $49-299/month for that single feature is overkill.&lt;/p&gt;

&lt;h2&gt;
  
  
  Who I built this for
&lt;/h2&gt;

&lt;p&gt;PPC managers running Google Ads daily who want instant budget visibility without context-switching to another tool. Freelancers managing 1-5 client accounts where SaaS pricing eats too much of the margin. Agency teams who need quick pacing checks across multiple campaigns. E-commerce advertisers watching ROAS and budget efficiency in real time.&lt;/p&gt;

&lt;p&gt;If you're an enterprise team running 200+ campaigns with complex bid strategies, this isn't for you — you probably already have an Optmyzr-class tool. If you're somewhere between "spreadsheet" and "expensive SaaS," this fills the gap.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it doesn't do (yet)
&lt;/h2&gt;

&lt;p&gt;Being honest about scope:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No Microsoft Ads / Bing Ads support&lt;/strong&gt; yet (Google Ads only)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No Meta / TikTok Ads&lt;/strong&gt; (different DOMs, different challenges, would be a separate extension)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No historical pacing trends&lt;/strong&gt; beyond current month&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No bid suggestions or campaign optimization&lt;/strong&gt; (that's a different problem space)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Pacing is a single, focused use case. The extension does that one thing well rather than trying to be a half-decent everything tool.&lt;/p&gt;

&lt;h2&gt;
  
  
  Install link
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://chromewebstore.google.com/detail/adpacer-%E2%80%94-budget-pacing-f/mfgliiabejphemhkhlnapbebmkfhfjfm" rel="noopener noreferrer"&gt;AdPacer on the Chrome Web Store&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Free tier covers up to 3 campaigns with no credit card and no trial expiration — install and see if it solves your problem before paying anything.&lt;/p&gt;

&lt;p&gt;If you're a PPC manager and the spending pattern Google introduced in March 2026 has been causing you headaches, this is the lowest-friction way to catch overspend before it happens. If you're a developer reading this for the technical approach: yes, the entire thing runs client-side via DOM parsing — no API key, no backend, no data leaves the browser.&lt;/p&gt;

&lt;p&gt;Happy to answer questions about either side.&lt;/p&gt;

</description>
      <category>chrome</category>
      <category>marketing</category>
      <category>productivity</category>
      <category>javascript</category>
    </item>
    <item>
      <title>How to scrape Weibo (微博) data with Python in 2026 — the Sina Visitor System and how to handle it</title>
      <dc:creator>Sami</dc:creator>
      <pubDate>Thu, 30 Apr 2026 14:58:19 +0000</pubDate>
      <link>https://dev.to/sami_8858131362756585e4f4/how-to-scrape-weibo-wei-bo-data-with-python-in-2026-the-sina-visitor-system-and-how-to-handle-it-1j6g</link>
      <guid>https://dev.to/sami_8858131362756585e4f4/how-to-scrape-weibo-wei-bo-data-with-python-in-2026-the-sina-visitor-system-and-how-to-handle-it-1j6g</guid>
      <description>&lt;p&gt;Weibo is China's Twitter — the platform where Chinese public opinion forms, brand crises break first, and government statements land. 580M+ monthly active users, mostly mainstream demographics. If you're doing China market intelligence, brand monitoring, or PR analytics, Weibo is one of the platforms you can't skip.&lt;/p&gt;

&lt;p&gt;The challenge: Weibo's developer API requires a Chinese business license, has severe rate limits, and exposes very limited data. For Western teams, web scraping is the practical option. The interesting twist is Weibo's Sina Visitor System — an auth flow that makes anonymous access possible for some endpoints but not others. Understanding which is which matters for what you can actually scrape.&lt;/p&gt;

&lt;p&gt;This article covers the technical landscape (with real Python code) and points to a hosted scraper if you'd rather skip the maintenance.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Weibo serves
&lt;/h2&gt;

&lt;p&gt;A Weibo post is structured similarly to a tweet but with longer character limits and more structured engagement signals:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Post text&lt;/strong&gt; (140 to 2,000 characters depending on user level)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Repost chain&lt;/strong&gt; — Weibo's quote-tweet equivalent, central to virality tracking&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Engagement metrics&lt;/strong&gt; — &lt;code&gt;attitudes_count&lt;/code&gt; (likes), &lt;code&gt;comments_count&lt;/code&gt;, &lt;code&gt;reposts_count&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Hashtags and mentions&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Geolocation&lt;/strong&gt; if disclosed by user&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Author profile&lt;/strong&gt; — follower count, verification status, verified reason (e.g., "新浪科技 official Weibo")&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Media&lt;/strong&gt; — images, videos&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A Weibo user profile gives you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;User ID (numeric)&lt;/li&gt;
&lt;li&gt;Screen name (display name)&lt;/li&gt;
&lt;li&gt;Description / bio&lt;/li&gt;
&lt;li&gt;Followers / friends counts&lt;/li&gt;
&lt;li&gt;Statuses count (total posts)&lt;/li&gt;
&lt;li&gt;Verification status with reason text — this is gold for identifying official accounts vs personal vs corporate&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For monitoring use cases, the metric that matters most depends on your goal:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Crisis monitoring&lt;/strong&gt;: track &lt;code&gt;comments_count&lt;/code&gt; and repost velocity. A spike in either signals viral attention.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Brand presence&lt;/strong&gt;: track post frequency from verified accounts in your category.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;KOL identification&lt;/strong&gt;: filter by &lt;code&gt;verified=true&lt;/code&gt; + follower count above a threshold.&lt;/li&gt;
&lt;/ul&gt;
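
&lt;p&gt;The KOL filter above can be sketched in a few lines of Python. The field names &lt;code&gt;verified&lt;/code&gt; and &lt;code&gt;followers_count&lt;/code&gt; and the 100,000 threshold are assumptions for illustration; check them against the profile JSON you actually receive.&lt;/p&gt;

```python
from typing import Iterable


def find_kols(profiles: Iterable[dict], min_followers: int = 100_000) -> list:
    """Keep verified profiles at or above the follower threshold, largest first."""
    matches = [
        p for p in profiles
        if p.get("verified") is True and p.get("followers_count", 0) >= min_followers
    ]
    return sorted(matches, key=lambda p: p["followers_count"], reverse=True)
```

&lt;p&gt;Unverified accounts are dropped regardless of size, which keeps large personal accounts out of the result.&lt;/p&gt;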

&lt;h2&gt;
  
  
  The Sina Visitor System
&lt;/h2&gt;

&lt;p&gt;This is the key technical concept for scraping Weibo without a Chinese business license.&lt;/p&gt;

&lt;p&gt;When you visit Weibo without logging in, Sina automatically issues you a "visitor cookie" via what they call the Sina Visitor System (SVS). This cookie lets you access limited public data — specifically:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Hot search / trending topics&lt;/strong&gt;: full access&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Post comments&lt;/strong&gt;: full access for any public post&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Post viewing&lt;/strong&gt;: limited&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For these endpoints, scraping is straightforward — get a visitor cookie, hit the AJAX endpoint, parse JSON.&lt;/p&gt;

&lt;p&gt;What the visitor cookie does NOT give you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Search by keyword&lt;/strong&gt; (returns hot timeline as a fallback instead of true search results)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;User posts beyond profile basics&lt;/strong&gt; (you get the profile, not the user's post history)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For those, you need a real logged-in cookie — specifically the &lt;code&gt;SUB&lt;/code&gt; cookie value from a logged-in browser session. We'll get to that.&lt;/p&gt;

&lt;h2&gt;
  
  
  Approach 1: Build it yourself
&lt;/h2&gt;

&lt;p&gt;The Sina Visitor System flow looks roughly like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;

&lt;span class="c1"&gt;# Step 1: Hit the visitor system to get a tid (temporary ID)
&lt;/span&gt;&lt;span class="n"&gt;visitor_url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://passport.weibo.com/visitor/genvisitor&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;visitor_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cb&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gen_callback&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fp&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;os&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;browser&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Chrome&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fonts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;undefined&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;screenInfo&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1920*1080*24&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;plugins&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="s"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="c1"&gt;# The response is JSONP-wrapped JSON. Strip the wrapper, parse it, and extract tid.
&lt;/span&gt;
&lt;span class="c1"&gt;# Step 2: Use tid to get the SUB visitor cookie
&lt;/span&gt;&lt;span class="n"&gt;incarnate_url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://passport.weibo.com/visitor/visitor&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;incarnate_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;a&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;incarnate&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;t&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;tid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;w&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;c&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;100&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
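&lt;span class="c1"&gt;# The response sets cookies; read SUB and SUBP from resp.cookies.
&lt;/span&gt;&lt;span class="n"&gt;sub&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cookies&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SUB&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;subp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cookies&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SUBP&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
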
&lt;span class="c1"&gt;# Step 3: Use those cookies to call AJAX endpoints
&lt;/span&gt;&lt;span class="n"&gt;hot_search_url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://weibo.com/ajax/side/hotSearch&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hot_search_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cookies&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SUB&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SUBP&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;subp&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="c1"&gt;# data["data"]["realtime"] is the hot search list
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
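
&lt;p&gt;The "strip the wrapper, parse, extract tid" step can be sketched as below. The response shape used here, a callback wrapping an object with a &lt;code&gt;data.tid&lt;/code&gt; field, is an assumption; inspect a live response before relying on it. The helper names are illustrative.&lt;/p&gt;

```python
import json


def strip_jsonp(body: str) -> dict:
    """Return the JSON object inside a JSONP wrapper such as callback({...});"""
    start = body.index("(") + 1  # first opening parenthesis starts the payload
    end = body.rindex(")")       # last closing parenthesis ends it
    return json.loads(body[start:end])


def extract_tid(body: str) -> str:
    """Pull the temporary visitor ID out of the genvisitor response body."""
    # Assumed layout: {"data": {"tid": "..."}}
    return strip_jsonp(body)["data"]["tid"]


# Made-up sample body for illustration, not a captured response.
sample = 'gen_callback({"retcode": 20000000, "data": {"tid": "abc123"}});'
tid = extract_tid(sample)
```

&lt;p&gt;In the flow above you would pass &lt;code&gt;resp.text&lt;/code&gt; from step 1 to &lt;code&gt;extract_tid&lt;/code&gt; and use the result as the &lt;code&gt;t&lt;/code&gt; parameter in step 2.&lt;/p&gt;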



&lt;p&gt;That's the rough shape. In practice you'll handle:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Rate limit responses (HTTP 418, 429) with exponential backoff&lt;/li&gt;
&lt;li&gt;Cookie expiration (visitor cookies last hours, not days)&lt;/li&gt;
&lt;li&gt;AJAX endpoint changes (Weibo periodically reshuffles paths)&lt;/li&gt;
&lt;li&gt;Anti-scraping fingerprint checks (less aggressive than RedNote, but still present)&lt;/li&gt;
&lt;/ul&gt;
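
&lt;p&gt;For the rate-limit case, one possible shape for exponential backoff is sketched below. The helper names, the one-second base delay, and the 60-second cap are assumptions chosen for illustration; &lt;code&gt;send&lt;/code&gt; is any zero-argument callable that returns an object with a &lt;code&gt;status_code&lt;/code&gt; attribute, such as an httpx response.&lt;/p&gt;

```python
import random
import time
from typing import Any, Callable

RETRYABLE_STATUSES = {418, 429}  # the rate-limit codes mentioned above


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before retry `attempt` (0-based): base, 2x, 4x, ... up to cap."""
    return min(cap, base * (2 ** attempt))


def call_with_backoff(
    send: Callable[[], Any],
    max_retries: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call send() until the response is not rate-limited or retries run out."""
    response = send()
    for attempt in range(max_retries):
        if response.status_code not in RETRYABLE_STATUSES:
            break
        # A little random jitter keeps parallel workers from retrying in lockstep.
        sleep(backoff_delay(attempt) + random.uniform(0.0, 0.5))
        response = send()
    return response
```

&lt;p&gt;A call such as &lt;code&gt;call_with_backoff(lambda: httpx.get(hot_search_url, cookies=cookies))&lt;/code&gt; would then retry on 418 and 429 and return the last response either way, leaving it to the caller to decide what to do if the final attempt is still rate-limited.&lt;/p&gt;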

&lt;p&gt;For the keyword-search and user-posts endpoints, you'll need a real &lt;code&gt;SUB&lt;/code&gt; cookie from a logged-in account:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Get SUB from your browser DevTools → Application → Cookies → weibo.com
# Look for the cookie named "SUB"
&lt;/span&gt;&lt;span class="n"&gt;sub_cookie&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SUB=_2A25Fxxxxxx...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://weibo.com/ajax/side/searchAll&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;q&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;人工智能&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="n"&gt;cookies&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SUB&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;sub_cookie&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]},&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



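&lt;p&gt;Logged-in &lt;code&gt;SUB&lt;/code&gt; cookies typically last several days before expiring, depending on Weibo's session policies, so they need refreshing far less often than visitor cookies.&lt;/p&gt;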

&lt;h3&gt;
  
  
  DIY cost breakdown
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;th&gt;Estimate&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Initial setup (visitor system, hot search, comments)&lt;/td&gt;
&lt;td&gt;4-8 hours&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;User session cookie management&lt;/td&gt;
&lt;td&gt;1-2 hours/week&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Maintenance when Weibo changes endpoints&lt;/td&gt;
&lt;td&gt;2-4 hours, every 2-3 months&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Proxies (not needed for most endpoints)&lt;/td&gt;
&lt;td&gt;$0&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Weibo is genuinely the easiest of the major Chinese platforms to scrape if you stay within visitor-system endpoints. RedNote and Bilibili both have more complex auth.&lt;/p&gt;

&lt;h2&gt;
  
  
  Approach 2: Use a hosted scraper
&lt;/h2&gt;

&lt;p&gt;If you don't want to maintain visitor-system handling and cookie management, the &lt;a href="https://apify.com/zhorex/weibo-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/weibo-scraper&lt;/code&gt;&lt;/a&gt; Apify actor handles it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;apify_client&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ApifyClient&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ApifyClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YOUR_APIFY_API_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Hot search (no cookie needed)
&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;actor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;zhorex/weibo-scraper&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run_input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hot_search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;maxResults&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;topic&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;defaultDatasetId&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;span class="nf"&gt;iterate_items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;#&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;rank&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; (heat: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;hotValue&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output format:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"rank"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"人工智能最新突破"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"category"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"科技"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"hotValue"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2847562&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"labelName"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"热"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"isHot"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://s.weibo.com/weibo?q=..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"scrapedAt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-04-25T12:00:00Z"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For brand monitoring, search mode is what you want — though note the search-vs-cookie tradeoff:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Without cookie: returns hot timeline as fallback
&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;actor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;zhorex/weibo-scraper&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run_input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;searchQuery&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CeraVe&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;maxResults&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="c1"&gt;# With cookie: returns true keyword-matched results
&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;actor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;zhorex/weibo-scraper&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run_input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;searchQuery&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CeraVe&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;maxResults&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cookieString&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SUB=your_logged_in_cookie&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
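
&lt;p&gt;If the cookie is optional in your pipeline, one way to avoid duplicating the two calls is a small helper that only adds &lt;code&gt;cookieString&lt;/code&gt; when you have one. This is a sketch, not part of the actor's API: &lt;code&gt;build_search_input&lt;/code&gt; is a hypothetical helper name, and the input keys are the ones used in the calls above.&lt;/p&gt;

```python
def build_search_input(query, max_results=50, cookie=None):
    """Build run_input for search mode; add cookieString only when a cookie is given."""
    run_input = {
        "mode": "search",
        "searchQuery": query,
        "maxResults": max_results,
    }
    if cookie:
        # With a logged-in cookie the actor returns keyword-matched results;
        # without one it falls back to the hot timeline.
        run_input["cookieString"] = cookie
    return run_input


anonymous = build_search_input("CeraVe")
logged_in = build_search_input("CeraVe", cookie="SUB=your_logged_in_cookie")
print("cookieString" in anonymous, "cookieString" in logged_in)
```

&lt;p&gt;Pass the returned dict as &lt;code&gt;run_input&lt;/code&gt; to the same &lt;code&gt;.call()&lt;/code&gt; shown above.&lt;/p&gt;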



&lt;p&gt;The hosted actor handles the visitor system, exponential backoff, and rate limit recovery internally. Pricing: $5 per 1,000 results.&lt;/p&gt;
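
&lt;p&gt;If you go the DIY route, exponential backoff is the piece you will most likely have to write yourself. The sketch below shows the general technique only. It is not the actor's actual implementation, and the names &lt;code&gt;backoff_delays&lt;/code&gt; and &lt;code&gt;call_with_backoff&lt;/code&gt;, the 1-second base, and the 60-second cap are all assumptions chosen for illustration.&lt;/p&gt;

```python
import random
import time


def backoff_delays(max_retries=5, base=1.0, cap=60.0, jitter=0.0):
    """Return the wait in seconds before each retry: base * 2**attempt, capped."""
    delays = []
    for attempt in range(max_retries):
        delay = min(cap, base * 2 ** attempt)
        # Optional jitter spreads retries out so clients do not retry in lockstep.
        delay += random.uniform(0, jitter)
        delays.append(delay)
    return delays


def call_with_backoff(fetch, max_retries=5, base=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds, sleeping longer after each failure."""
    delays = backoff_delays(max_retries, base)
    for attempt in range(max_retries + 1):
        try:
            return fetch()
        except Exception as error:  # a real scraper would catch narrower errors
            if attempt == max_retries:
                raise RuntimeError("gave up after retries") from error
            sleep(delays[attempt])


print(backoff_delays(max_retries=5))
```

&lt;p&gt;With the defaults, the waits double from 1 second up to 16 seconds across five retries. Passing &lt;code&gt;sleep&lt;/code&gt; as a parameter keeps the retry loop easy to test without real delays.&lt;/p&gt;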

&lt;p&gt;Honest stats on the actor right now: 4 paying users, 11 free-tier users, 92.5% success rate, 3,768 result extractions to date. Average response time when something breaks: within a few hours.&lt;/p&gt;

&lt;h2&gt;
  
  
  When DIY vs hosted
&lt;/h2&gt;

&lt;p&gt;DIY makes sense when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You're processing &amp;gt; 1M posts/month (per-result cost adds up)&lt;/li&gt;
&lt;li&gt;You have ops capacity to refresh &lt;code&gt;SUB&lt;/code&gt; cookies regularly&lt;/li&gt;
&lt;li&gt;You need to scrape behind login at scale&lt;/li&gt;
&lt;li&gt;You have specific endpoints not covered by hosted scrapers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Hosted makes sense when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You don't have a dedicated scraper engineer&lt;/li&gt;
&lt;li&gt;Volume is moderate (&amp;lt; 500k posts/month)&lt;/li&gt;
&lt;li&gt;You want the visitor-system handling to be someone else's problem&lt;/li&gt;
&lt;li&gt;You're prototyping and want to validate the use case before committing&lt;/li&gt;
&lt;/ul&gt;
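
&lt;p&gt;To see where the per-result cost starts to matter, it helps to run the numbers at the volumes mentioned in the two lists. A minimal sketch, assuming the flat $5 per 1,000 results quoted above and no volume discounts. What DIY costs you in engineering time and infrastructure is not modeled here; plug in your own figure to compare.&lt;/p&gt;

```python
PRICE_PER_1000_RESULTS = 5.00  # USD, the hosted price quoted above


def hosted_monthly_cost(posts_per_month):
    """USD per month at a flat per-result price."""
    return posts_per_month / 1000 * PRICE_PER_1000_RESULTS


for volume in (50_000, 500_000, 1_000_000):
    print(f"{volume:,} posts/month: ${hosted_monthly_cost(volume):,.2f}")
```

&lt;p&gt;At 500k posts a month that is $2,500, and at 1M it is $5,000, which is roughly where a part-time scraper engineer starts to look competitive.&lt;/p&gt;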

&lt;h2&gt;
  
  
  What you do with the data downstream
&lt;/h2&gt;

&lt;p&gt;Sentiment analysis on Chinese text is the obvious next layer. Off-the-shelf Chinese BERT models work reasonably well for Weibo's discourse style. Weibo posts tend to be more formal than RedNote slang, so general-purpose Chinese sentiment models are more accurate on them (typically 75-85% on neutral/positive/negative classification).&lt;/p&gt;

&lt;p&gt;For brand crisis detection, the signal you usually want is &lt;em&gt;velocity&lt;/em&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>webscraping</category>
      <category>china</category>
      <category>datascience</category>
    </item>
    <item>
      <title>Coin-per-view: the Bilibili metric that beats subscriber count for creator vetting</title>
      <dc:creator>Sami</dc:creator>
      <pubDate>Wed, 29 Apr 2026 17:48:32 +0000</pubDate>
      <link>https://dev.to/sami_8858131362756585e4f4/coin-per-view-the-bilibili-metric-that-beats-subscriber-count-for-creator-vetting-477j</link>
      <guid>https://dev.to/sami_8858131362756585e4f4/coin-per-view-the-bilibili-metric-that-beats-subscriber-count-for-creator-vetting-477j</guid>
      <description>&lt;p&gt;If you've ever sponsored a YouTube creator and been disappointed by the ROI, you've already lived through what subscriber count actually measures: not engagement, not influence, not purchase intent. Just historical clicks on a follow button. Many of those followers stopped opening videos two years ago. Some are inactive accounts. Some followed for a single piece of content that has nothing to do with your brand.&lt;/p&gt;

&lt;p&gt;This is universally true on creator platforms, but it's especially true on Bilibili — China's YouTube. With 300M+ monthly active users skewed Gen Z and millennials, Bilibili is where Chinese creator marketing happens. And Bilibili exposes three engagement signals that YouTube doesn't, which together let you cut through the noise of follower counts and identify creators whose audiences actually engage.&lt;/p&gt;

&lt;p&gt;The single most useful one is &lt;strong&gt;coin-per-view ratio&lt;/strong&gt;. This post explains what it is, why it matters, what threshold to use, and how to compute it for any Chinese creator in a few lines of code.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why follower count is a lying signal
&lt;/h2&gt;

&lt;p&gt;Three reasons follower counts mislead in creator marketing:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Followers are a lagging indicator.&lt;/strong&gt; Someone followed a creator in 2023 because they liked one video. That doesn't tell you whether they still watch in 2026, whether they engage, or whether they trust the creator's recommendations enough to buy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Followers are gameable.&lt;/strong&gt; Not everyone games them, but enough creators do that you can't trust raw counts without other signals. Bot followers, follow-for-follow campaigns, paid follower services. China specifically has a robust market for these.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. The follower-to-engagement ratio varies wildly.&lt;/strong&gt; A creator with 100k followers and 1M average views per video has fundamentally different audience economics than another creator with 100k followers and 5k average views per video. Both have the same "follower count" — the engagement quality is the actual signal.&lt;/p&gt;

&lt;p&gt;This is why every serious creator marketing tool talks about "engagement rate" — which on YouTube is usually computed as (likes + comments) / views. It's better than raw follower count, but on Bilibili you can do meaningfully better.&lt;/p&gt;

&lt;h2&gt;
  
  
  The three Bilibili-native metrics
&lt;/h2&gt;

&lt;p&gt;Bilibili was designed by anime fans for anime fans, and the engagement system reflects values around quality and creator support that YouTube's flat "like" button never captured. Three metrics that come back from any Bilibili video scrape:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Danmaku (弹幕)&lt;/strong&gt;: real-time scrolling comments overlaid on the video as users watch. Think livestream chat, but for pre-recorded video. The danmaku count tells you how often viewers were engaged enough mid-watch to type something. It's a leading indicator of viewing time and attention.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Favorites (收藏)&lt;/strong&gt;: the equivalent of "save for later", closest to saving a video to a playlist on YouTube. Strong long-term value signal: high favorites relative to views means people intend to return to this video. Tutorials, references, and definitive content score high here.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Coins (投币)&lt;/strong&gt;: Bilibili's tipping system. This is the interesting one. Each user gets a small daily allocation of coins (typically 5 per day for active users), and they can "throw" them at videos they want to support. Because coins are scarce by design (you only have a few to spend on any given day), coin counts are a strong genuine-appreciation signal.&lt;/p&gt;

&lt;p&gt;A user gives a coin to a video they love. They give a coin to a creator they want to keep making content. They don't give a coin to a video they passively watched and forgot. The cost is real (relative to the user's daily allocation), so the signal is real.&lt;/p&gt;

&lt;h2&gt;
  
  
  Coin-per-view ratio: the single best signal
&lt;/h2&gt;

&lt;p&gt;If I had to pick one metric to evaluate a Bilibili creator, it would be &lt;strong&gt;median coin-per-view ratio across their last 20-30 videos&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The math is simple:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;coin_per_view&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;coin_count&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;view_count&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;  &lt;span class="c1"&gt;# express as percentage
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What I've found from looking at hundreds of Bilibili creators across categories:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Coin/View %&lt;/th&gt;
&lt;th&gt;Audience quality&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&amp;lt; 0.5%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Passive viewers. Casual scrolling traffic, not engaged.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;0.5% – 1%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Average. Normal Bilibili content, decent audience.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;1% – 2%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Strong. Genuinely engaged audience. Worth sponsoring.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&amp;gt; 2%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Exceptional. Users actively spending limited resources on this content.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Above 2% is rare. It typically indicates either: (a) genuinely high-quality educational/tutorial content that people return to, (b) a creator with a deeply loyal niche audience, or (c) content that struck a strong emotional/cultural nerve.&lt;/p&gt;

&lt;p&gt;For creator vetting, my heuristic is: &lt;strong&gt;if median coin-per-view is below 1%, the audience is more passive than the follower count suggests; sponsorship ROI will probably disappoint.&lt;/strong&gt;&lt;/p&gt;
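
&lt;p&gt;The reason to use the median rather than the mean is that a single viral video can drag an average into a higher band. The numbers below are made up for illustration (nine ordinary uploads plus one outlier), not real creator data.&lt;/p&gt;

```python
from statistics import mean, median

# Hypothetical coin-per-view percentages for one creator's recent videos:
# nine ordinary uploads plus one viral outlier.
ratios = [0.4, 0.5, 0.5, 0.6, 0.6, 0.7, 0.7, 0.8, 0.9, 6.0]

print(f"mean:   {mean(ratios):.2f}%")
print(f"median: {median(ratios):.2f}%")
```

&lt;p&gt;The mean comes out around 1.17%, which would read as "Strong" in the table above, while the median stays around 0.65%, which is "Average". The median is the better description of what a typical sponsored video would see.&lt;/p&gt;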

&lt;h2&gt;
  
  
  How to compute this for any creator
&lt;/h2&gt;

&lt;p&gt;The data you need: a creator's recent videos with their view and coin counts. Bilibili exposes this through their public API — no auth required. You can use the open-source &lt;code&gt;bilibili-api&lt;/code&gt; Python library, or call their &lt;code&gt;/x/space/wbi/arc/search&lt;/code&gt; endpoint directly.&lt;/p&gt;

&lt;p&gt;If you'd rather skip the API integration entirely, I built a hosted scraper on Apify Store: &lt;a href="https://apify.com/zhorex/bilibili-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/bilibili-scraper&lt;/code&gt;&lt;/a&gt;. $5 per 1,000 results, free tier covers ~1,000 results.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;apify_client&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ApifyClient&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;statistics&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;median&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ApifyClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YOUR_APIFY_API_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Get a creator's last 30 videos
# user_id (mid) is the number in their profile URL: space.bilibili.com/{mid}
&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;actor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;zhorex/bilibili-scraper&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run_input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user_videos&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;userIds&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;546195&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;   &lt;span class="c1"&gt;# 老番茄 (a well-known Bilibili gamer)
&lt;/span&gt;    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;maxResults&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="n"&gt;videos&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;defaultDatasetId&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;span class="nf"&gt;iterate_items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;video&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;videos&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Compute coin-per-view ratio per video
&lt;/span&gt;&lt;span class="n"&gt;ratios&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;videos&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;views&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;viewCount&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;views&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="c1"&gt;# skip videos with too few views to be meaningful
&lt;/span&gt;        &lt;span class="k"&gt;continue&lt;/span&gt;
    &lt;span class="n"&gt;coins&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;coinCount&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;ratios&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;coins&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;views&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

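&lt;span class="c1"&gt;# median() and max() raise on an empty list, so stop if no video passed the filter
&lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;ratios&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;SystemExit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;No videos with at least 1,000 views to analyze&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Videos analyzed: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ratios&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;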
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Median coin-per-view: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;median&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ratios&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;%&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Best video coin-per-view: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ratios&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;%&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Categorize
&lt;/span&gt;&lt;span class="n"&gt;median_ratio&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;median&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ratios&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;median_ratio&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mf"&gt;2.0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;label&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;EXCEPTIONAL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;median_ratio&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;label&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;STRONG&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;median_ratio&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;label&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AVERAGE&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;label&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PASSIVE&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Audience quality: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;label&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run this against any Bilibili creator's user ID and you have a concrete answer about audience engagement quality.&lt;/p&gt;

&lt;h2&gt;
  
  
  A workflow for vetting creators at scale
&lt;/h2&gt;

&lt;p&gt;If you're building a creator marketing program for the Chinese market, here is the workflow that has worked for the teams I've seen use this metric:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Gather candidates.&lt;/strong&gt; From competitor sponsorship lists, from category trending, or from agency recommendations. Aim for 30-50 candidates per round.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pull their recent video portfolios.&lt;/strong&gt; Use &lt;code&gt;user_videos&lt;/code&gt; mode to get the last 20-30 videos per creator.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compute aggregate metrics.&lt;/strong&gt; For each creator: median coin-per-view, median favorite-per-view, median danmaku-per-view, view consistency (standard deviation).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Filter on quality threshold.&lt;/strong&gt; Drop anyone with median coin-per-view below 1%. This usually cuts the candidate list by 40-60%.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Manual review of the survivors.&lt;/strong&gt; Watch a sample of their videos. Check for content fit. Evaluate sponsorship history (do their sponsored posts feel native or forced?).&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Negotiate from the qualified shortlist.&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;
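
&lt;p&gt;Steps 3 and 4 can be sketched in a few lines once you have each creator's videos as a list of dicts. This is an illustration under assumptions, not a reference implementation: &lt;code&gt;viewCount&lt;/code&gt; and &lt;code&gt;coinCount&lt;/code&gt; are the field names used in the script above, while &lt;code&gt;favoriteCount&lt;/code&gt; and &lt;code&gt;danmakuCount&lt;/code&gt; are hypothetical names you should check against your actual scraper output. The sample data is invented.&lt;/p&gt;

```python
from statistics import median, pstdev

MIN_VIEWS = 1000          # ignore videos too small to be meaningful
COIN_THRESHOLD_PCT = 1.0  # the quality cutoff from step 4


def summarize(videos):
    """Aggregate one creator's videos into the step-3 metrics (percentages)."""
    usable = [v for v in videos if v["viewCount"] >= MIN_VIEWS]
    if not usable:
        return None

    def median_pct(key):
        return median([v[key] / v["viewCount"] * 100 for v in usable])

    return {
        "videos": len(usable),
        "coin_pct": median_pct("coinCount"),
        "favorite_pct": median_pct("favoriteCount"),  # assumed field name
        "danmaku_pct": median_pct("danmakuCount"),    # assumed field name
        "view_stdev": pstdev([v["viewCount"] for v in usable]),
    }


def shortlist(creators):
    """Keep creators whose median coin-per-view clears the threshold."""
    kept = {}
    for name, videos in creators.items():
        stats = summarize(videos)
        if stats and stats["coin_pct"] >= COIN_THRESHOLD_PCT:
            kept[name] = stats
    return kept


# Invented sample data: two creators, two videos each.
creators = {
    "creator_a": [
        {"viewCount": 100_000, "coinCount": 1_500, "favoriteCount": 2_000, "danmakuCount": 800},
        {"viewCount": 80_000, "coinCount": 1_000, "favoriteCount": 1_200, "danmakuCount": 500},
    ],
    "creator_b": [
        {"viewCount": 200_000, "coinCount": 600, "favoriteCount": 900, "danmakuCount": 300},
        {"viewCount": 150_000, "coinCount": 300, "favoriteCount": 700, "danmakuCount": 200},
    ],
}
print(sorted(shortlist(creators)))
```

&lt;p&gt;Here &lt;code&gt;creator_a&lt;/code&gt; survives with a median of about 1.4% and &lt;code&gt;creator_b&lt;/code&gt; is dropped at about 0.25%. In a real round you would feed this from the &lt;code&gt;user_videos&lt;/code&gt; results and hand the shortlist to step 5.&lt;/p&gt;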

&lt;p&gt;Total cost using a hosted scraper: ~$5-10 in scraping for a 50-creator vetting round (50 creators × 30 videos is 1,500 results, or $7.50 at $5 per 1,000). Compared to agency rates for the same work ($500-2000 per round), the math is obvious once you do it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Cross-platform creator vetting
&lt;/h2&gt;

&lt;p&gt;Bilibili is not the whole story. If you're vetting creators for a comprehensive China presence:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Bilibili&lt;/strong&gt; for video content (gaming, tech, anime, education)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RedNote (Xiaohongshu)&lt;/strong&gt; for product-discovery content (beauty, fashion, lifestyle, food)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Weibo&lt;/strong&gt; for public discourse and broad reach campaigns&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each platform has different engagement signals. Bilibili has coins; RedNote has saves (similarly scarce intent-to-buy signal); Weibo has reposts and verified-account hierarchy. A creator strong on one isn't necessarily strong on others.&lt;/p&gt;

&lt;p&gt;I maintain scrapers for all three on Apify Store under the &lt;a href="https://apify.com/zhorex" rel="noopener noreferrer"&gt;zhorex profile&lt;/a&gt;, with consistent output schemas across the suite. Same pricing model ($5/1000 results), same Apify infrastructure. If you're doing cross-platform creator analytics, the consistency saves integration time.&lt;/p&gt;

&lt;h2&gt;
  
  
  When this approach fails
&lt;/h2&gt;

&lt;p&gt;Two cases where coin-per-view ratio is a misleading signal:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Brand-new creators with very few videos.&lt;/strong&gt; If a creator has uploaded 3 videos and one went viral with high coins, the ratio looks artificially high. Wait until the creator has 15-20 videos before computing a median.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Livestream-focused creators.&lt;/strong&gt; Bilibili lets creators upload archived live streams. Coin economics are different in a livestream context (gifts replace coins). For livestream-heavy creators, you need a different analysis.&lt;/p&gt;

&lt;p&gt;For everyone else, coin-per-view ratio is the single best signal I've found for vetting Bilibili creator quality at scale.&lt;/p&gt;
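
&lt;p&gt;The first failure case is easy to guard against in code: refuse to report a median when the sample is too small. A minimal sketch, where &lt;code&gt;median_coin_per_view&lt;/code&gt; is a hypothetical helper and the cutoff of 15 is simply the lower end of the 15-20 guideline above.&lt;/p&gt;

```python
from statistics import median

MIN_VIDEOS = 15  # lower bound of the 15-20 video guideline


def median_coin_per_view(ratios, min_videos=MIN_VIDEOS):
    """Return the median ratio, or None when the sample is too small to trust."""
    if min_videos > len(ratios):
        return None
    return median(ratios)


print(median_coin_per_view([0.4, 0.6, 9.5]))  # three videos: not enough data
print(median_coin_per_view([1.2] * 20))       # twenty videos: median is usable
```

&lt;p&gt;Returning &lt;code&gt;None&lt;/code&gt; instead of a number forces the caller to treat "not enough data" as its own outcome rather than as a low or high score.&lt;/p&gt;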

&lt;h2&gt;
  
  
  What this won't tell you
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Whether the audience is geographically right for your campaign (need follower demographics, which require auth)&lt;/li&gt;
&lt;li&gt;Whether the creator has done sponsorships before that flopped (need to scrape their content for promo patterns)&lt;/li&gt;
&lt;li&gt;Whether their audience overlaps with your target customer profile (need cross-reference with other platforms)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Treat coin-per-view as the engagement-quality filter. Everything else still requires manual review or additional data sources.&lt;/p&gt;




&lt;p&gt;If you're working on creator marketing for the Chinese market and want to compare notes on what works — drop a comment. I write about Chinese platform analytics (Bilibili, RedNote, Weibo) and the build-vs-buy tradeoffs around them.&lt;/p&gt;

&lt;p&gt;Hosted Bilibili scraper: &lt;a href="https://apify.com/zhorex/bilibili-scraper" rel="noopener noreferrer"&gt;apify.com/zhorex/bilibili-scraper&lt;/a&gt;&lt;br&gt;
Other Chinese platform scrapers in the same suite: &lt;a href="https://apify.com/zhorex/rednote-xiaohongshu-scraper" rel="noopener noreferrer"&gt;RedNote&lt;/a&gt; for product-discovery content, &lt;a href="https://apify.com/zhorex/weibo-scraper" rel="noopener noreferrer"&gt;Weibo&lt;/a&gt; for public discourse.&lt;/p&gt;

</description>
      <category>china</category>
      <category>marketing</category>
      <category>datascience</category>
      <category>analytics</category>
    </item>
    <item>
      <title>The easiest Chinese platform to scrape in Python in 2026: Bilibili in under 30 lines</title>
      <dc:creator>Sami</dc:creator>
      <pubDate>Wed, 29 Apr 2026 17:41:42 +0000</pubDate>
      <link>https://dev.to/sami_8858131362756585e4f4/the-easiest-chinese-platform-to-scrape-in-python-in-2026-bilibili-in-under-30-lines-5216</link>
      <guid>https://dev.to/sami_8858131362756585e4f4/the-easiest-chinese-platform-to-scrape-in-python-in-2026-bilibili-in-under-30-lines-5216</guid>
      <description>&lt;p&gt;If you've ever tried to scrape RedNote (Xiaohongshu), you know the pain. Request signing that rotates monthly, TLS fingerprinting that blocks &lt;code&gt;requests&lt;/code&gt; immediately, residential proxies required, the whole tour. Weibo isn't quite as bad but you still need the Sina Visitor System dance to even hit a public endpoint.&lt;/p&gt;

&lt;p&gt;Bilibili is the outlier. No API key. No browser. No proxy. No request signing rotation worth worrying about. Pure HTTP. Runs in 256MB RAM.&lt;/p&gt;

&lt;p&gt;If you only need to monitor one Chinese platform — say you're tracking a gaming brand launch, or doing creator research, or analyzing tech trends — Bilibili is where you should start. This post walks through how to scrape it from scratch in Python, what data you actually get, and when to switch from DIY to a hosted scraper.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Bilibili is unusually scrape-friendly
&lt;/h2&gt;

&lt;p&gt;Bilibili (哔哩哔哩) is China's YouTube — 300M+ monthly active users, skewed Gen Z and millennials, dominant in anime, gaming, tech, and educational content. From a scraping perspective, three things make it different from RedNote and Weibo:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Internal HTTP APIs are mostly stable.&lt;/strong&gt; Bilibili exposes JSON endpoints for search, video metadata, user info, popular/trending, and comments. Most don't require auth for public content. The endpoints change rarely (months between meaningful updates) compared to RedNote where signing rotates monthly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. No TLS fingerprinting.&lt;/strong&gt; Plain &lt;code&gt;httpx&lt;/code&gt; or &lt;code&gt;requests&lt;/code&gt; works. You don't need &lt;code&gt;curl_cffi&lt;/code&gt; or any Chrome impersonation library to get past anti-bot.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Generous rate limits from non-Chinese IPs.&lt;/strong&gt; I've sustained 1-2 requests per second from a single datacenter IP without getting throttled. RedNote would have banned the IP within minutes at that rate.&lt;/p&gt;

&lt;p&gt;There's one caveat I'll cover below: &lt;strong&gt;comment scraping is throttled from datacenter IPs&lt;/strong&gt;. Everything else works fine.&lt;/p&gt;
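
&lt;p&gt;To stay near the 1-2 requests per second mentioned above, a small throttle helps. This is a minimal sketch, not something Bilibili documents: the &lt;code&gt;Throttle&lt;/code&gt; class and its parameters are hypothetical names, and the 1-second default is just my own observed comfort zone.&lt;/p&gt;

```python
import time


class Throttle:
    """Enforce a minimum gap between consecutive calls (hypothetical helper)."""

    def __init__(self, min_interval: float = 1.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first one

    def wait(self) -> float:
        """Sleep just long enough to respect min_interval; return the time slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
        if delay > 0:
            self._sleep(delay)
        self._last = now + delay
        return delay
```

&lt;p&gt;Call &lt;code&gt;throttle.wait()&lt;/code&gt; before each request. The clock and sleep functions are injectable only so the helper can be tested without real delays.&lt;/p&gt;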

&lt;h2&gt;
  
  
  A 30-line Bilibili scraper
&lt;/h2&gt;

&lt;p&gt;Here's the smallest useful Bilibili scraper I'd actually run in production:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;

&lt;span class="n"&gt;HEADERS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;User-Agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Referer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://www.bilibili.com/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_video_detail&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bvid&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Fetch full metadata for a Bilibili video by BVID.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.bilibili.com/x/web-interface/view&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bvid&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;bvid&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;HEADERS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;raise_for_status&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{})&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_popular&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;category_rid&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;page_size&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Fetch trending videos. category_rid=0 means all categories.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.bilibili.com/x/web-interface/popular&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ps&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;page_size&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;category_rid&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rid&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;category_rid&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;HEADERS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;raise_for_status&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{}).&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;list&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;


&lt;span class="c1"&gt;# Use it
&lt;/span&gt;&lt;span class="n"&gt;trending&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_popular&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;video&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;trending&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;stat&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;video&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stat&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{})&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;video&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;  Views: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;stat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;view&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; | Coins: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;stat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;coin&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; | &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
          &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Favorites: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;stat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;favorite&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; | Danmaku: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;stat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;danmaku&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. No auth, no proxy, no signing. You can paste this into a Python REPL and have working data in 5 seconds.&lt;/p&gt;
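
&lt;p&gt;One thing the scraper above glosses over is application-level errors: a request can come back with HTTP 200 and still carry a failure in the JSON body. The sketch below assumes the usual envelope of &lt;code&gt;code&lt;/code&gt;, &lt;code&gt;message&lt;/code&gt;, and &lt;code&gt;data&lt;/code&gt;, with &lt;code&gt;code&lt;/code&gt; equal to 0 on success. Verify that against a live response before relying on it. &lt;code&gt;unwrap&lt;/code&gt; and &lt;code&gt;BilibiliAPIError&lt;/code&gt; are hypothetical names of my own.&lt;/p&gt;

```python
class BilibiliAPIError(RuntimeError):
    """Raised when the JSON envelope reports a non-zero status code."""


def unwrap(payload: dict) -> dict:
    """Return payload["data"], or raise if the envelope signals an error.

    Assumes a {"code": ..., "message": ..., "data": ...} envelope.
    """
    code = payload.get("code", 0)
    if code != 0:
        raise BilibiliAPIError(f"code={code}: {payload.get('message', '')}")
    # "data" can be present but null, so fall back to an empty dict.
    return payload.get("data") or {}
```

&lt;p&gt;In &lt;code&gt;get_video_detail&lt;/code&gt;, this would replace the final line with &lt;code&gt;return unwrap(response.json())&lt;/code&gt;.&lt;/p&gt;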

&lt;h2&gt;
  
  
  The five things you can scrape
&lt;/h2&gt;

&lt;p&gt;Bilibili's web client exposes endpoints for five modes. Each is useful for different use cases:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Mode&lt;/th&gt;
&lt;th&gt;Endpoint&lt;/th&gt;
&lt;th&gt;Use case&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Search&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/x/web-interface/wbi/search/type&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Find videos by keyword (Chinese or English). Filterable by sort order, duration, date range.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Video detail&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/x/web-interface/view&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Full metadata for a specific video including all engagement metrics&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Comments&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/x/v2/reply/main&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Comments on a video (caveats below)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;User videos&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/x/space/wbi/arc/search&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;All recent uploads from a creator&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Popular&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/x/web-interface/popular&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Trending feed, optionally filtered by category&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The search endpoint requires a small WBI signing scheme, which makes it slightly more involved than the others. The user-videos path also sits under &lt;code&gt;wbi&lt;/code&gt;, so expect the same requirement there. The open-source library &lt;code&gt;bilibili-api&lt;/code&gt; on GitHub handles it if you don't want to reverse-engineer it yourself. For the remaining endpoints, plain HTTP works.&lt;/p&gt;
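
&lt;p&gt;For a sense of what WBI signing involves, here is a rough sketch of the final step as I understand it from community write-ups. Treat every detail as an assumption: the parameter names &lt;code&gt;wts&lt;/code&gt; and &lt;code&gt;w_rid&lt;/code&gt;, the MD5 over the sorted query string, and the idea of a pre-derived "mixin key". Deriving that key from Bilibili's current WBI keys is not shown, and neither is the filtering of a few punctuation characters that some implementations apply. &lt;code&gt;sign_wbi&lt;/code&gt; is a hypothetical helper name.&lt;/p&gt;

```python
import hashlib
import time
import urllib.parse


def sign_wbi(params: dict, mixin_key: str, now=None) -> dict:
    """Return a copy of params with `wts` and `w_rid` added.

    mixin_key is assumed to have been derived already from the current
    WBI keys; that derivation is not shown here.
    """
    signed = dict(params)
    signed["wts"] = int(time.time() if now is None else now)
    # Sort by key so the query string is deterministic.
    ordered = {key: signed[key] for key in sorted(signed)}
    query = urllib.parse.urlencode(ordered)
    # Assumed scheme: w_rid is the MD5 hex digest of the query plus the mixin key.
    signed["w_rid"] = hashlib.md5((query + mixin_key).encode("utf-8")).hexdigest()
    return signed
```

&lt;p&gt;If the signature is rejected, the mixin key derivation is the first thing I'd check, since that is the part most likely to have changed.&lt;/p&gt;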

&lt;h2&gt;
  
  
  What data you actually get back
&lt;/h2&gt;

&lt;p&gt;Each video from Bilibili comes with two categories of engagement metrics:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Standard metrics&lt;/strong&gt; (similar to YouTube):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;View count&lt;/li&gt;
&lt;li&gt;Like count&lt;/li&gt;
&lt;li&gt;Share count&lt;/li&gt;
&lt;li&gt;Reply count (comments)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Bilibili-specific metrics&lt;/strong&gt; (these don't exist on YouTube):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Danmaku count&lt;/strong&gt; (弹幕) — the count of real-time scrolling comments overlaid on the video as users watch&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Coin count&lt;/strong&gt; (投币) — Bilibili's tipping system. Users get a few coins per day and "throw" them at videos. Coins are scarce by design.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Favorite count&lt;/strong&gt; (收藏) — equivalent to "save for later"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These extra metrics carry real signal. A video with 1M views, 50k coins, and 30k favorites is meaningfully different from one with 1M views, 1k coins, and 5k favorites, even though the view counts are identical. The first has an audience engaged enough to spend its scarce daily coins; the second has mostly passive viewers.&lt;/p&gt;

&lt;p&gt;If you're doing creator analytics or content strategy, the engagement-quality signals from coins and favorites are worth the integration effort.&lt;/p&gt;
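
&lt;p&gt;One simple way to put a number on that difference is coins and favorites per view. This is a minimal sketch using the &lt;code&gt;stat&lt;/code&gt; field names from the scraper above; &lt;code&gt;engagement_rates&lt;/code&gt; is a hypothetical helper, and the two inputs are the illustrative videos from the previous paragraph, not real data.&lt;/p&gt;

```python
def engagement_rates(stat: dict) -> dict:
    """Coins and favorites per view, from a video's `stat` dict."""
    views = stat.get("view") or 0
    if views == 0:
        return {"coin_rate": 0.0, "favorite_rate": 0.0}
    return {
        "coin_rate": (stat.get("coin") or 0) / views,
        "favorite_rate": (stat.get("favorite") or 0) / views,
    }


engaged = engagement_rates({"view": 1_000_000, "coin": 50_000, "favorite": 30_000})
passive = engagement_rates({"view": 1_000_000, "coin": 1_000, "favorite": 5_000})
print(f"{engaged['coin_rate']:.1%} vs {passive['coin_rate']:.1%}")  # 5.0% vs 0.1%
```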

&lt;h2&gt;
  
  
  The comments caveat
&lt;/h2&gt;

&lt;p&gt;There's one Bilibili-side restriction worth knowing: &lt;code&gt;/x/v2/reply/main&lt;/code&gt; is throttled when called from datacenter IPs (AWS, GCP, Azure, etc.). You'll get the top ~3 pinned comments per video and then nothing. Full pagination requires either:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Authenticated session cookies, or&lt;/li&gt;
&lt;li&gt;Residential IPs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you need full comments at scale, this is the one part where you'll run into infrastructure cost. Other modes are unaffected.&lt;/p&gt;

&lt;h2&gt;
  
  
  Performance you can expect
&lt;/h2&gt;

&lt;p&gt;Real numbers from running this on Apify's serverless infrastructure (256MB RAM, no proxy, no auth):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Mode&lt;/th&gt;
&lt;th&gt;Input&lt;/th&gt;
&lt;th&gt;Duration&lt;/th&gt;
&lt;th&gt;Throughput&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Search&lt;/td&gt;
&lt;td&gt;max=50&lt;/td&gt;
&lt;td&gt;~7-8 seconds&lt;/td&gt;
&lt;td&gt;~7 items/sec&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Popular&lt;/td&gt;
&lt;td&gt;max=40&lt;/td&gt;
&lt;td&gt;~5 seconds&lt;/td&gt;
&lt;td&gt;~8 items/sec&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Video detail&lt;/td&gt;
&lt;td&gt;10 BVIDs&lt;/td&gt;
&lt;td&gt;~5 seconds&lt;/td&gt;
&lt;td&gt;~2 items/sec (with tag enrichment)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;User videos&lt;/td&gt;
&lt;td&gt;3 users, max=15&lt;/td&gt;
&lt;td&gt;~4 seconds&lt;/td&gt;
&lt;td&gt;~4 items/sec&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Fast enough for monitoring use cases. For heavy bulk extraction (millions of videos) you'd want to parallelize across multiple workers — easy because there's no IP throttling on most endpoints.&lt;/p&gt;
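
&lt;p&gt;A thread pool is usually enough to parallelize this kind of I/O-bound work. The sketch below is one possible shape, not what the hosted actor does: &lt;code&gt;fetch_many&lt;/code&gt; is a hypothetical helper, and the worker count of 4 is an arbitrary starting point.&lt;/p&gt;

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_many(bvids, fetch_one, max_workers: int = 4) -> dict:
    """Run fetch_one(bvid) across a thread pool; map each BVID to its result.

    A failed fetch is stored as the exception instead of aborting the batch.
    """
    bvids = list(bvids)

    def safe_fetch(bvid):
        try:
            return fetch_one(bvid)
        except Exception as exc:  # keep going; record the failure for a later retry
            return exc

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(safe_fetch, bvids))
    return dict(zip(bvids, results))
```

&lt;p&gt;With the scraper from earlier, that would look like &lt;code&gt;fetch_many(["BV1YXDfBUETP"], get_video_detail)&lt;/code&gt;. Keep the total request rate in mind when raising &lt;code&gt;max_workers&lt;/code&gt;.&lt;/p&gt;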

&lt;h2&gt;
  
  
  DIY vs hosted
&lt;/h2&gt;

&lt;p&gt;DIY is genuinely viable for Bilibili. The maintenance burden is low — endpoints don't change much, there's no signing rotation eating your time. If you only need Bilibili and you're already comfortable with Python HTTP, build it yourself.&lt;/p&gt;

&lt;p&gt;But if you're already monitoring multiple Chinese platforms (RedNote + Weibo + Bilibili is the typical full-suite use case), a hosted scraper that handles all three with a consistent output schema is worth the convenience. I built and maintain one on Apify Store: &lt;a href="https://apify.com/zhorex/bilibili-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/bilibili-scraper&lt;/code&gt;&lt;/a&gt;. Honest current state:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;16 users&lt;/li&gt;
&lt;li&gt;9 monthly active&lt;/li&gt;
&lt;li&gt;100% success rate over 2,635 result extractions&lt;/li&gt;
&lt;li&gt;Average issue response: 4.4 hours&lt;/li&gt;
&lt;li&gt;$5 per 1,000 results (Apify free tier covers ~1,000/month at no cost)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It supports all five modes above with consistent output JSON, handles WBI signing for search internally, and is paired with companion scrapers for the other Chinese platforms — &lt;a href="https://apify.com/zhorex/rednote-xiaohongshu-scraper" rel="noopener noreferrer"&gt;RedNote (Xiaohongshu)&lt;/a&gt; and &lt;a href="https://apify.com/zhorex/weibo-scraper" rel="noopener noreferrer"&gt;Weibo&lt;/a&gt;. Same pricing model across the suite.&lt;/p&gt;

&lt;p&gt;Using it from Python:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;apify_client&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ApifyClient&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ApifyClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YOUR_APIFY_API_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;actor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;zhorex/bilibili-scraper&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run_input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;searchQuery&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AI tutorial&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sortOrder&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;click&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;       &lt;span class="c1"&gt;# most-viewed first
&lt;/span&gt;    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;durationFilter&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;medium&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;# 10-30 min videos
&lt;/span&gt;    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;maxResults&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;video&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;defaultDatasetId&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;span class="nf"&gt;iterate_items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;video&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; — &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;video&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;viewCount&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; views, &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;video&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;coinCount&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; coins&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  When to use Bilibili specifically
&lt;/h2&gt;

&lt;p&gt;Even if you're already covering YouTube, Twitter, etc., Bilibili captures things those platforms don't:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Chinese gaming and esports content&lt;/strong&gt; — game launches, walkthroughs, and esports event reactions live here&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tutorial and educational content&lt;/strong&gt; — Knowledge category is huge; replaces YouTube tutorials for Chinese audiences&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anime and otaku culture&lt;/strong&gt; — central hub for the Chinese anime community&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Creator economy in China&lt;/strong&gt; — the coin/favorite metrics give better creator quality signals than subscriber counts (more on that in a follow-up post)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If your use case touches Chinese-market gaming, anime, tech, or education, Bilibili is irreplaceable.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Is scraping Bilibili legal?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Public-data scraping legality varies by jurisdiction. Bilibili's ToS prohibits automated access. The approach in this post only touches public content, the same data any logged-out browser visitor can see. Consult legal counsel for your specific use case. This is not legal advice.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Do I need a Chinese IP?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For most endpoints, no. Bilibili is globally accessible. The exception is comments scraping (covered above). Some licensed video content (anime, dramas) may be geo-restricted but metadata and engagement metrics are accessible from any IP.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's a BVID?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Bilibili's video ID format. Looks like &lt;code&gt;BV1YXDfBUETP&lt;/code&gt;. Replaced the older numeric &lt;code&gt;aid&lt;/code&gt; format. URLs use BVIDs: &lt;code&gt;bilibili.com/video/BV1YXDfBUETP&lt;/code&gt;.&lt;/p&gt;
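
&lt;p&gt;If you're starting from URLs rather than IDs, a regex is enough to pull the BVID out. This sketch assumes every BVID has the same shape as the example above ("BV" followed by 10 alphanumeric characters); &lt;code&gt;extract_bvid&lt;/code&gt; is a hypothetical helper name.&lt;/p&gt;

```python
import re

# "BV" plus 10 alphanumeric characters, matching the shape of the example ID.
BVID_RE = re.compile(r"BV[0-9A-Za-z]{10}")


def extract_bvid(text: str):
    """Return the first BVID-shaped token in a URL or string, else None."""
    match = BVID_RE.search(text)
    return match.group(0) if match else None
```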

&lt;p&gt;&lt;strong&gt;How does this compare to YouTube Data API?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;YouTube offers an official Data API, with quota limits, to any developer with an API key. Bilibili has no equivalent for international developers. For Chinese-market analytics, scraping is the only practical option. The upside: more granular engagement signals (coins, favorites, danmaku) than YouTube exposes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why did &lt;code&gt;bilibili-api&lt;/code&gt; (the Python lib) work for me a year ago and break now?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Probably the WBI signing scheme rotated. Check the library's commit history; if there's been a recent fix, update. If the project looks abandoned, consider switching libraries or going hosted.&lt;/p&gt;




&lt;p&gt;If you're working on Chinese-market analytics, brand monitoring, or creator research and want to compare notes — drop a comment. I write about the build-vs-buy tradeoffs for Chinese platform scraping (RedNote, Weibo, Bilibili).&lt;/p&gt;

&lt;p&gt;Hosted actor: &lt;a href="https://apify.com/zhorex/bilibili-scraper" rel="noopener noreferrer"&gt;apify.com/zhorex/bilibili-scraper&lt;/a&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>webscraping</category>
      <category>china</category>
      <category>datascience</category>
    </item>
    <item>
      <title>How to scrape RedNote (Xiaohongshu) with Python in 2026 — the auth/signing problem and how to handle it</title>
      <dc:creator>Sami</dc:creator>
      <pubDate>Sat, 25 Apr 2026 01:13:23 +0000</pubDate>
      <link>https://dev.to/sami_8858131362756585e4f4/how-to-scrape-rednote-xiaohongshu-with-python-in-2026-the-authsigning-problem-and-how-to-3f9e</link>
      <guid>https://dev.to/sami_8858131362756585e4f4/how-to-scrape-rednote-xiaohongshu-with-python-in-2026-the-authsigning-problem-and-how-to-3f9e</guid>
      <description>&lt;p&gt;RedNote (Xiaohongshu, 小红书, sometimes "Little Red Book" or just XHS) is the platform a lot of Western teams realized they needed to monitor in 2024-2025, when the TikTok regulatory mess in the US sent millions of users — and brand attention — toward Chinese platforms. It's now China's #1 lifestyle and product-discovery network, with 300M+ monthly active users and a search-driven discovery model that makes it different from every other Chinese social platform.&lt;/p&gt;

&lt;p&gt;The problem: there's no official public API. Western teams who try to monitor it usually end up either (a) paying enterprise vendors $20-50k/year for limited China coverage, or (b) trying to scrape it themselves and discovering that RedNote has one of the more aggressive anti-scraping stacks in Chinese social.&lt;/p&gt;

&lt;p&gt;This article walks through the actual technical challenges and shows you both DIY and hosted approaches with real Python code. I've shipped a hosted RedNote scraper on Apify that I'll mention later — but the goal here is for you to understand the problem space well enough to make an informed build-vs-buy decision, not to sell you anything.&lt;/p&gt;

&lt;h2&gt;
  
  
  What RedNote actually serves
&lt;/h2&gt;

&lt;p&gt;Before we go technical: what data does RedNote expose, and what's actually useful?&lt;/p&gt;

&lt;p&gt;A RedNote post is structured roughly like this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Title&lt;/strong&gt; (often very short, sometimes empty)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Body text&lt;/strong&gt; — long-form description with product mentions, hashtags, location tags&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Image carousel&lt;/strong&gt; — 1-9 images. Critical: a non-trivial portion of product info lives in image text overlays, not in the body&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Engagement metrics&lt;/strong&gt; — likes, saves, comments, shares&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Author profile&lt;/strong&gt; — username, avatar, follower/following counts, bio, verification, location&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tags / categories&lt;/strong&gt; — hashtags and platform-assigned categories&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For most monitoring use cases, the metric that matters more than likes is &lt;strong&gt;saves&lt;/strong&gt;. Saves on RedNote are the closest equivalent to "I want to buy this later" — they correlate with purchase intent. Likes on RedNote are casual engagement, similar to Twitter likes.&lt;/p&gt;

&lt;p&gt;Profile data is structured similarly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;User ID, RedNote ID (red ID), nickname, avatar&lt;/li&gt;
&lt;li&gt;Bio / description&lt;/li&gt;
&lt;li&gt;Follower / following counts&lt;/li&gt;
&lt;li&gt;Location, gender, profile tags&lt;/li&gt;
&lt;li&gt;Total likes received across all posts&lt;/li&gt;
&lt;li&gt;Verification status&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The technical challenges (why this is harder than scraping Twitter)
&lt;/h2&gt;

&lt;p&gt;If you've scraped Western social platforms, your default toolkit is probably &lt;code&gt;httpx&lt;/code&gt; or &lt;code&gt;requests&lt;/code&gt; plus maybe a residential proxy. RedNote is going to break each of those defaults.&lt;/p&gt;

&lt;h3&gt;
  
  
  Challenge 1: TLS fingerprinting
&lt;/h3&gt;

&lt;p&gt;RedNote uses TLS fingerprinting (specifically JA3/JA4) to identify and block requests that don't come from real browsers. The &lt;code&gt;requests&lt;/code&gt; library has a Python-specific TLS fingerprint that RedNote's bot-detection layer recognizes immediately.&lt;/p&gt;

&lt;p&gt;The standard fix is to use &lt;code&gt;curl_cffi&lt;/code&gt;, which lets you spoof a Chrome or Safari TLS fingerprint:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;curl_cffi&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;curl_requests&lt;/span&gt;

&lt;span class="c1"&gt;# Spoof Chrome 120's TLS fingerprint
&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;curl_requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://www.xiaohongshu.com/explore&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;impersonate&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;chrome120&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This alone gets you past the first layer of detection.&lt;/p&gt;

&lt;h3&gt;
  
  
  Challenge 2: Request signing
&lt;/h3&gt;

&lt;p&gt;RedNote signs every API request with a value called &lt;code&gt;x-s&lt;/code&gt; (sometimes seen as &lt;code&gt;xs&lt;/code&gt;) plus other parameters like &lt;code&gt;x-t&lt;/code&gt; and &lt;code&gt;x-s-common&lt;/code&gt;. These are computed client-side from a JavaScript function in their web app.&lt;/p&gt;

&lt;p&gt;The signing function changes roughly monthly. When it changes, every scraper using the old signing logic breaks until someone reverse-engineers the new function.&lt;/p&gt;

&lt;p&gt;Here's roughly what you need to do:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Pseudo-code — actual signing logic is more complex
&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate_signing_headers&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url_path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    The actual logic is reverse-engineered from RedNote&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s web client JS.
    This requires reading their obfuscated bundle and reproducing it in Python.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;timestamp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;body_str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;separators&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;body&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;

    &lt;span class="c1"&gt;# Real implementation involves:
&lt;/span&gt;    &lt;span class="c1"&gt;# - Specific input concatenation order
&lt;/span&gt;    &lt;span class="c1"&gt;# - Custom hashing scheme (not standard HMAC)
&lt;/span&gt;    &lt;span class="c1"&gt;# - Several "magic constants" that change when RedNote rotates
&lt;/span&gt;    &lt;span class="c1"&gt;# - Sometimes a captcha-derived token
&lt;/span&gt;
    &lt;span class="n"&gt;raw&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;url=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;url_path&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;&amp;amp;data=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;body_str&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;&amp;amp;t=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;timestamp&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;x_s&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;md5&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;()).&lt;/span&gt;&lt;span class="nf"&gt;hexdigest&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# This is NOT the actual algorithm
&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;x-s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;x_s&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;x-t&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="c1"&gt;# x-s-common is computed separately
&lt;/span&gt;    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The actual signing algorithm is more complex than what I've shown. There are open-source libraries that have reverse-engineered it (&lt;code&gt;xhs-api&lt;/code&gt; and similar on GitHub) — they get you most of the way there, but expect to patch them when RedNote rotates.&lt;/p&gt;

&lt;h3&gt;
  
  
  Challenge 3: IP-level rate limiting and datacenter blocking
&lt;/h3&gt;

&lt;p&gt;RedNote blocks requests from datacenter IPs (AWS, GCP, Azure, DigitalOcean, etc.) within minutes. You need residential proxies, ideally with Chinese geolocation or at least Asia-Pacific.&lt;/p&gt;

&lt;p&gt;Even with residential IPs, there's a per-IP rate limit. Realistic throughput is around 10-20 requests per minute per IP before you start getting 412/418 errors and eventually IP bans.&lt;/p&gt;
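
&lt;p&gt;A minimal sketch of pacing requests per proxy IP, assuming the conservative end of the 10-20 requests per minute range above. &lt;code&gt;PerIpPacer&lt;/code&gt; and &lt;code&gt;backoff_delay&lt;/code&gt; are hypothetical helper names, and the backoff base and cap are illustrative values, not measured thresholds.&lt;/p&gt;

```python
import random
import time


class PerIpPacer:
    """Spaces out requests so each proxy IP stays under a per-minute budget."""

    def __init__(self, max_per_minute=10, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 60.0 / max_per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = {}  # proxy IP -> earliest time of its next request

    def wait(self, ip):
        """Block until `ip` may send again, then reserve its next slot."""
        now = self._clock()
        ready_at = self._next_allowed.get(ip, now)
        if ready_at > now:
            self._sleep(ready_at - now)
            now = ready_at
        self._next_allowed[ip] = now + self.min_interval


def backoff_delay(attempt, base=5.0, cap=300.0):
    """Exponential backoff with full jitter, for use after a throttling response."""
    return random.uniform(0, min(cap, base * 2 ** attempt))
```

&lt;p&gt;Call &lt;code&gt;pacer.wait(ip)&lt;/code&gt; before each request. When a 412 or 418 comes back, sleeping for &lt;code&gt;backoff_delay(attempt)&lt;/code&gt; and moving to another IP is one reasonable reaction.&lt;/p&gt;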

&lt;h3&gt;
  
  
  Challenge 4: SPA / dynamic rendering for some endpoints
&lt;/h3&gt;

&lt;p&gt;Search and the explore feed are loaded via AJAX after initial page load, but a few endpoints (some user pages, certain post types) only render their data in the Vue.js application state. You either need to extract data from the inline &lt;code&gt;&amp;lt;script&amp;gt;&lt;/code&gt; tag (look for &lt;code&gt;window.__INITIAL_STATE__&lt;/code&gt;) or render with Playwright.&lt;/p&gt;
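
&lt;p&gt;A rough sketch of the inline-script route, assuming the page assigns a single object literal to &lt;code&gt;window.__INITIAL_STATE__&lt;/code&gt; and that the literal is valid JSON apart from bare &lt;code&gt;undefined&lt;/code&gt; values. Both points are assumptions to verify against a real page, and &lt;code&gt;extract_initial_state&lt;/code&gt; is a hypothetical helper name.&lt;/p&gt;

```python
import json
import re

STATE_MARKER = "window.__INITIAL_STATE__"


def extract_initial_state(html):
    """Return the object assigned to window.__INITIAL_STATE__, or None if absent."""
    marker_at = html.find(STATE_MARKER)
    if marker_at == -1:
        return None
    start = html.find("{", marker_at)
    if start == -1:
        return None
    # Assumption: the blob is JSON except for bare JavaScript `undefined` values.
    # This replacement is crude and would also rewrite the word inside strings.
    cleaned = re.sub(r"\bundefined\b", "null", html[start:])
    # raw_decode parses one JSON value and ignores whatever follows it,
    # so the closing script tag does not need to be located.
    state, _end = json.JSONDecoder().raw_decode(cleaned)
    return state
```

&lt;p&gt;If the literal turns out not to be JSON-compatible, &lt;code&gt;raw_decode&lt;/code&gt; raises &lt;code&gt;json.JSONDecodeError&lt;/code&gt;, which is a reasonable signal to fall back to Playwright.&lt;/p&gt;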

&lt;h3&gt;
  
  
  Challenge 5: Login walls on certain features
&lt;/h3&gt;

&lt;p&gt;True keyword-filtered search requires login. Without login, you get the explore feed (trending/recommended for your keyword), which is useful but not the same. This is a product limitation rather than a scraping limitation: with cookies from a logged-in session, a scraper gets the same results a logged-in user sees; without them, you accept the explore-feed fallback.&lt;/p&gt;

&lt;h2&gt;
  
  
  Approach 1: Build it yourself
&lt;/h2&gt;

&lt;p&gt;If you have ops capacity to maintain it (someone who can read JavaScript and reverse-engineer signing functions monthly), DIY is feasible. Here's a minimal example using &lt;code&gt;curl_cffi&lt;/code&gt; plus an open-source signing library.&lt;/p&gt;

&lt;p&gt;First install:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;curl_cffi xhs
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note: &lt;code&gt;xhs&lt;/code&gt; is one of several open-source libraries on GitHub that wrap RedNote's API. Check their commit history before depending on one — the abandoned ones break monthly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;xhs&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;XhsClient&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;xhs.exceptions&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DataFetchError&lt;/span&gt;

&lt;span class="c1"&gt;# You need to provide your own cookies and signing function URL
# The 'sign' function comes from the library's reverse-engineered JS
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;sign&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;a1&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;web_session&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Implementation provided by the library
&lt;/span&gt;    &lt;span class="c1"&gt;# When RedNote rotates, you'll need to update this
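&lt;/span&gt;    &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;NotImplementedError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;plug in the signing implementation from the library&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;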

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;XhsClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;cookie&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;abRequestId=...; webBuild=...; xsecappid=xhs-pc-web; ...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;sign&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sign&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Get a user's posts
&lt;/span&gt;    &lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;5cfbc3f10000000018023ebb&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;posts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_user_notes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;post&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;posts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;notes&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[]):&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Title: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;post&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;display_title&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Likes: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;post&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;interact_info&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="si"&gt;{}&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;liked_count&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;---&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;DataFetchError&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;RedNote rejected the request: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Common causes:
&lt;/span&gt;    &lt;span class="c1"&gt;# - Signing function out of date (update from upstream)
&lt;/span&gt;    &lt;span class="c1"&gt;# - Cookie expired (re-login)
&lt;/span&gt;    &lt;span class="c1"&gt;# - IP throttled (rotate proxy)
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The cookie string comes from logging into RedNote in a browser and copying the relevant cookies from DevTools. The cookies expire, typically within a few days, so you'll need to refresh them periodically.&lt;/p&gt;
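
&lt;p&gt;Before a run, it can help to check that the pasted cookie string still contains the values the signing step depends on. The sketch below assumes those are &lt;code&gt;a1&lt;/code&gt; and &lt;code&gt;web_session&lt;/code&gt;, inferred only from the parameters of &lt;code&gt;sign()&lt;/code&gt; in the example above; the helper names are hypothetical.&lt;/p&gt;

```python
def parse_cookie_header(cookie):
    """Split a 'name=value; name=value' string copied from DevTools into a dict."""
    jar = {}
    for part in cookie.split(";"):
        name, sep, value = part.strip().partition("=")
        if sep and name:
            jar[name] = value
    return jar


# Assumption: these are the cookies the signing step depends on, inferred
# from the a1 and web_session parameters of sign() in the example above.
REQUIRED_COOKIES = ("a1", "web_session")


def missing_cookies(cookie, required=REQUIRED_COOKIES):
    """Return the required cookie names that are absent or empty."""
    jar = parse_cookie_header(cookie)
    return [name for name in required if not jar.get(name)]
```

&lt;p&gt;An empty list from &lt;code&gt;missing_cookies()&lt;/code&gt; only shows the names are present; it cannot tell whether the session behind them has expired.&lt;/p&gt;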

&lt;p&gt;Here's the honest cost breakdown for DIY:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Cost component&lt;/th&gt;
&lt;th&gt;Estimate&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Initial setup (researching libraries, getting first scrape working)&lt;/td&gt;
&lt;td&gt;8-16 hours&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Residential proxy (Bright Data, Oxylabs, etc.)&lt;/td&gt;
&lt;td&gt;$50-200/month for moderate volume&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Per-incident maintenance when RedNote rotates&lt;/td&gt;
&lt;td&gt;4-8 hours, 1-2x/month&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Ongoing: cookie refresh, error handling&lt;/td&gt;
&lt;td&gt;1-2 hours/week&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

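&lt;p&gt;If you have a developer whose time is worth $50-100/hour, DIY is around $400-1000/month all-in for moderate scraping volumes, assuming the lower half of the ranges above. A month with two rotations at the upper hourly rate costs considerably more.&lt;/p&gt;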

&lt;h2&gt;
  
  
  Approach 2: Use a hosted scraper
&lt;/h2&gt;

&lt;p&gt;The build-vs-buy math changes if you don't have someone on the team who can read reverse-engineered JavaScript and patch signing logic. Hosted Apify Actors handle that for you.&lt;/p&gt;

&lt;p&gt;Several developers (including me) maintain RedNote scrapers on Apify. Mine is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://apify.com/zhorex/rednote-xiaohongshu-scraper" rel="noopener noreferrer"&gt;apify.com/zhorex/rednote-xiaohongshu-scraper&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;$5 per 1,000 results&lt;/li&gt;
&lt;li&gt;14 paying users currently, 38 on free tier&lt;/li&gt;
&lt;li&gt;Average issue response time: 1.6 hours&lt;/li&gt;
&lt;li&gt;88.8% success rate (the gap is mostly RedNote-side transient errors)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Using it from Python:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;apify_client&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ApifyClient&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ApifyClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YOUR_APIFY_API_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Search
&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;actor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;zhorex/rednote-xiaohongshu-scraper&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run_input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;searchQuery&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;skincare routine&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;maxResults&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;filterByMinLikes&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;  &lt;span class="c1"&gt;# Only return posts with 100+ likes
&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="c1"&gt;# Iterate over results
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;post&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;defaultDatasetId&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;span class="nf"&gt;iterate_items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Title: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;post&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Likes: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;post&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;likes&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Author: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;post&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;author&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;nickname&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;URL: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;post&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;postUrl&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;---&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The output JSON is mostly flat; only &lt;code&gt;author&lt;/code&gt; is a nested object:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"search"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"postId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"69d269310000000023017e07"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"postUrl"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://www.xiaohongshu.com/explore/69d269310000000023017e07"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"normal"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Morning skincare routine for dry skin"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"images"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"https://sns-webpic-qc.xhscdn.com/..."&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"likes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;15234&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"author"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"userId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"575d32285e87e733f0162c0a"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"nickname"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"BeautyQueen"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"avatar"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://sns-avatar-qc.xhscdn.com/..."&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"scrapedAt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-04-25T21:14:30Z"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is one option among several on Apify Store. EasyApi has the most users by volume; OrbitData Labs has a different all-in-one approach. Pricing is roughly the same across them ($5/1000 ± $1). Differences are in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Output schema (some return RedNote's raw nested API response, some flatten it)&lt;/li&gt;
&lt;li&gt;Update frequency (some are abandoned and break for weeks at a time)&lt;/li&gt;
&lt;li&gt;Mode coverage (some only do search; others handle profiles, comments, videos, etc.)&lt;/li&gt;
&lt;li&gt;Issue response time&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're evaluating, run a free-tier test on 2-3 of them with the same input and compare what you get back. The free tier costs you nothing.&lt;/p&gt;
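
&lt;p&gt;One way to compare what comes back is to diff the field names of one sample record per scraper. This is a sketch, not part of any Actor: &lt;code&gt;flatten_keys&lt;/code&gt; and &lt;code&gt;compare_schemas&lt;/code&gt; are hypothetical helpers, and the input is assumed to be one already-downloaded result dict per scraper.&lt;/p&gt;

```python
def flatten_keys(record, prefix=""):
    """Return dotted key paths for every leaf in a nested dict."""
    paths = set()
    for key, value in record.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict) and value:
            paths |= flatten_keys(value, prefix=f"{path}.")
        else:
            paths.add(path)
    return paths


def compare_schemas(samples):
    """Return (fields every scraper has, fields per scraper beyond that set).

    `samples` maps a scraper name to one sample record and must not be empty.
    """
    keysets = {name: flatten_keys(rec) for name, rec in samples.items()}
    shared = set.intersection(*keysets.values())
    extras = {name: sorted(keys - shared) for name, keys in keysets.items()}
    return sorted(shared), extras
```

&lt;p&gt;Fields that show up only in the per-scraper extras are the ones to check against your downstream code before choosing.&lt;/p&gt;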

&lt;h2&gt;
  
  
  When does each approach make sense?
&lt;/h2&gt;

&lt;p&gt;DIY (build it yourself):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You have a Chinese-language ops team and can monitor breakages&lt;/li&gt;
&lt;li&gt;You're processing &amp;gt; 1M posts/month (the per-result cost of hosted starts to add up)&lt;/li&gt;
&lt;li&gt;You need to scrape behind login (which means you need cookies from logged-in accounts you control)&lt;/li&gt;
&lt;li&gt;You have specific data needs that no hosted scraper covers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Hosted Apify actor:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You don't have a dedicated scraper engineer&lt;/li&gt;
&lt;li&gt;Volume is variable or moderate (&amp;lt; 500k posts/month)&lt;/li&gt;
&lt;li&gt;You want to outsource the cat-and-mouse with RedNote's anti-bot updates&lt;/li&gt;
&lt;li&gt;You're prototyping and want to validate the approach before committing&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The middle ground that often makes sense: use a hosted actor for production data flow, build a thin DIY layer for any specific endpoints the hosted version doesn't cover. The hosted scraper handles the maintenance burden on the parts that break most often (search, profile, posts), and you keep custom DIY logic for the edges.&lt;/p&gt;
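
&lt;p&gt;As a rough worked example of the build-vs-buy arithmetic, the sketch below uses only figures quoted in this article: $5 per 1,000 results for hosted and $400-1000/month for DIY. It assumes the DIY cost stays fixed as volume grows, which understates proxy spend at scale, and it ignores free tiers.&lt;/p&gt;

```python
def hosted_cost(results_per_month, price_per_1000=5.0):
    """Monthly cost of a pay-per-result hosted scraper, in USD."""
    return results_per_month / 1000 * price_per_1000


def break_even_results(diy_monthly_cost, price_per_1000=5.0):
    """Monthly result count at which hosted spend equals the DIY budget."""
    return diy_monthly_cost / price_per_1000 * 1000


low = break_even_results(400)    # lower DIY estimate
high = break_even_results(1000)  # upper DIY estimate
print(f"Break-even: {low:,.0f} to {high:,.0f} results per month")
# prints "Break-even: 80,000 to 200,000 results per month"
```

&lt;p&gt;The volume thresholds in the lists above sit well past this naive break-even. That gap is plausibly the margin for DIY costs that grow with volume and for the risk of breakage, though that reading is an interpretation rather than something the numbers prove.&lt;/p&gt;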

&lt;h2&gt;
  
  
  What you do with the data downstream
&lt;/h2&gt;

&lt;p&gt;Scraping is one third of the problem. The other two:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sentiment analysis on Chinese text.&lt;/strong&gt; Off-the-shelf Chinese BERT models (like &lt;code&gt;bert-base-chinese&lt;/code&gt; on Hugging Face) are a starting point, but accuracy varies widely by domain. RedNote slang in particular is largely absent from the training data of general Chinese sentiment models, so if accuracy matters, fine-tuning on labeled RedNote samples gives a significant lift.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Image text extraction.&lt;/strong&gt; A non-trivial portion of product mentions on RedNote live in image text overlays: users frequently show product names in the images rather than in the post body. PaddleOCR is the open-source standard for Chinese OCR. It is slow (~30 seconds per image) but reliable. It adds significant cost to a processing pipeline, but without it you'll miss a measurable percentage of product mentions.&lt;/p&gt;

&lt;p&gt;Both of these are downstream of scraping — solve scraping first, then layer.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Is scraping RedNote legal?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Public-data scraping legality varies by jurisdiction. RedNote's ToS prohibits automated access (as do most platform ToS). The Apify approach (and most public-scraping infrastructure) treats public web pages as accessible, the same way Google's crawler would. You should consult legal counsel for your specific use case. Not legal advice.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How fast can I scrape RedNote?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Realistic sustained throughput per IP is around 10-20 requests per minute before triggering rate limits. With residential proxy rotation and proper backoff, you can scale this horizontally — Apify's actor handles this internally. For DIY, plan for ~1 result per second per IP as a conservative number.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Do I need a Chinese IP?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Not strictly required. You do need residential IPs, though, and Asian residential IPs have notably higher success rates than US or European ones. Datacenter IPs are blocked outright.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's &lt;code&gt;xsec_token&lt;/code&gt; and why does it matter?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When users share posts via the RedNote app or copy URLs, those URLs include an &lt;code&gt;xsec_token&lt;/code&gt; query parameter that authenticates the link request. Some scrapers don't handle URLs with &lt;code&gt;xsec_token&lt;/code&gt; correctly and return errors. If you're scraping URLs collected from real users, make sure your tooling supports this.&lt;/p&gt;
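
&lt;p&gt;A small sketch of handling such URLs with the standard library, assuming the post ID is the last path segment (as in the &lt;code&gt;/explore/...&lt;/code&gt; URLs shown earlier) and the token arrives as a plain query parameter. &lt;code&gt;parse_shared_url&lt;/code&gt; is a hypothetical helper name.&lt;/p&gt;

```python
from urllib.parse import parse_qs, urlsplit


def parse_shared_url(url):
    """Split a shared post URL into its post ID and xsec_token (None if absent)."""
    parts = urlsplit(url)
    # Assumption: the post ID is the last path segment, as in /explore/POST_ID.
    post_id = parts.path.rstrip("/").rsplit("/", 1)[-1]
    # parse_qs URL-decodes values and returns a list per parameter name.
    token = parse_qs(parts.query).get("xsec_token", [None])[0]
    return {"post_id": post_id, "xsec_token": token}
```

&lt;p&gt;Keeping the token next to the post ID, instead of stripping the query string while normalizing URLs, means it is still available if a later request needs it.&lt;/p&gt;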

&lt;p&gt;&lt;strong&gt;Can I scrape video files from RedNote posts?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Yes. Video URLs are returned in the post metadata. Direct download from those URLs works without authentication for public videos.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How often does the request signing change?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Roughly monthly, sometimes more frequently around major RedNote app updates. If you're doing DIY, plan to dedicate 4-8 hours per rotation to update your signing function, or rely on an actively maintained library that pushes updates fast.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's the difference between RedNote and Xiaohongshu?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;They're the same platform. "Xiaohongshu" (小红书) is the Chinese name and means "Little Red Book". "RedNote" is the English brand they pushed during the 2024-2025 TikTok migration to be more accessible to global users. Same app, same data, same API endpoints.&lt;/p&gt;




&lt;p&gt;If you're working on China market intelligence, brand monitoring, or competitive research and want to compare notes — drop me a comment. I write about Chinese platform scraping (Weibo, Bilibili, RedNote) and the build-vs-buy trade-offs around them.&lt;/p&gt;

&lt;p&gt;The actor I mentioned: &lt;a href="https://apify.com/zhorex/rednote-xiaohongshu-scraper" rel="noopener noreferrer"&gt;apify.com/zhorex/rednote-xiaohongshu-scraper&lt;/a&gt;. Free tier covers ~1,000 results, which is enough to validate against your specific use case before committing.&lt;/p&gt;

</description>
      <category>python</category>
      <category>webscraping</category>
      <category>china</category>
      <category>datascience</category>
    </item>
    <item>
      <title>Monitoring the Chinese Social Media Ecosystem: RedNote, Weibo &amp; Bilibili Data Pipeline</title>
      <dc:creator>Sami</dc:creator>
      <pubDate>Thu, 23 Apr 2026 14:56:48 +0000</pubDate>
      <link>https://dev.to/sami_8858131362756585e4f4/monitoring-the-chinese-social-media-ecosystem-rednote-weibo-bilibili-data-pipeline-5dcn</link>
      <guid>https://dev.to/sami_8858131362756585e4f4/monitoring-the-chinese-social-media-ecosystem-rednote-weibo-bilibili-data-pipeline-5dcn</guid>
      <description>&lt;p&gt;If you are doing market research, brand monitoring, competitive intelligence, or academic research on China, no single platform tells the whole story. Chinese internet users spread across specialized platforms the way Western users split between Twitter, Instagram, YouTube, and Reddit — but the Chinese platforms are larger, more fragmented, and harder to access from outside the country.&lt;/p&gt;

&lt;p&gt;This article maps the three most important Chinese social platforms for data collection, explains what each one covers, and shows how to build a unified monitoring pipeline using three Apify Actors that share a common architecture: no browser, no proxy, no API keys. The one exception is Weibo keyword search and user posts, which need a login cookie, as noted in the Weibo section.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Three Pillars of Chinese Social Media Intelligence
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. RedNote / Xiaohongshu (小红书) — Social Commerce &amp;amp; Lifestyle
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt; China's Instagram meets Pinterest. A social commerce platform where users share product reviews, lifestyle content, travel diaries, and beauty routines. Known for its influence on Chinese consumer purchasing decisions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;User base:&lt;/strong&gt; 200M+ monthly active users&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why it matters for research:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Product sentiment and unfiltered consumer reviews&lt;/li&gt;
&lt;li&gt;Trend detection in beauty, fashion, food, travel, and lifestyle&lt;/li&gt;
&lt;li&gt;Influencer (KOL/KOC) discovery for Chinese market campaigns&lt;/li&gt;
&lt;li&gt;Brand perception monitoring among Chinese consumers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Actor:&lt;/strong&gt; &lt;a href="https://apify.com/zhorex/rednote-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/rednote-scraper&lt;/code&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Modes:&lt;/strong&gt; Search posts, user profiles, post comments, trending topics&lt;/p&gt;




&lt;h3&gt;
  
  
  2. Weibo (微博) — Microblogging &amp;amp; Public Opinion
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt; China's Twitter. The platform where Chinese public opinion forms in real time. Breaking news, celebrity drama, government communications, and brand PR all happen on Weibo.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;User base:&lt;/strong&gt; 580M+ monthly active users&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why it matters for research:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Real-time trending topics with heat scores and categories&lt;/li&gt;
&lt;li&gt;Public opinion tracking on policy, brands, and international events&lt;/li&gt;
&lt;li&gt;Celebrity and KOL influence measurement&lt;/li&gt;
&lt;li&gt;Crisis monitoring and brand sentiment during PR events&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Actor:&lt;/strong&gt; &lt;a href="https://apify.com/zhorex/weibo-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/weibo-scraper&lt;/code&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Modes:&lt;/strong&gt; Hot search/trending, post comments, keyword search, user posts&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Auth note:&lt;/strong&gt; Trending topics and post comments work without login. Search and user posts require a Weibo &lt;code&gt;SUB&lt;/code&gt; cookie (easily obtained from a browser session).&lt;/p&gt;




&lt;h3&gt;
  
  
  3. Bilibili (哔哩哔哩) — Video &amp;amp; Creator Analytics
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What it is:&lt;/strong&gt; China's YouTube, with a strong focus on anime, gaming, tech education, and Gen Z culture. Known for its unique &lt;strong&gt;danmaku&lt;/strong&gt; (弹幕) system: scrolling comments that overlay the video in real time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;User base:&lt;/strong&gt; 300M+ monthly active users&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why it matters for research:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Creator analytics with Bilibili-specific metrics (danmaku, coins, favorites)&lt;/li&gt;
&lt;li&gt;Gaming and anime industry monitoring&lt;/li&gt;
&lt;li&gt;Chinese Gen Z content trends and preferences&lt;/li&gt;
&lt;li&gt;Tech and education content analysis&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Actor:&lt;/strong&gt; &lt;a href="https://apify.com/zhorex/bilibili-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/bilibili-scraper&lt;/code&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Modes:&lt;/strong&gt; Search videos, video details, video comments, user/creator videos, popular/trending&lt;/p&gt;




&lt;h2&gt;
  
  
  Platform Comparison
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;RedNote&lt;/th&gt;
&lt;th&gt;Weibo&lt;/th&gt;
&lt;th&gt;Bilibili&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Platform type&lt;/td&gt;
&lt;td&gt;Social commerce&lt;/td&gt;
&lt;td&gt;Microblogging&lt;/td&gt;
&lt;td&gt;Video&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Primary audience&lt;/td&gt;
&lt;td&gt;Women 18-35, consumers&lt;/td&gt;
&lt;td&gt;All demographics&lt;/td&gt;
&lt;td&gt;Gen Z, gamers, students&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Content format&lt;/td&gt;
&lt;td&gt;Photos + short text&lt;/td&gt;
&lt;td&gt;Short posts (tweets)&lt;/td&gt;
&lt;td&gt;Long-form video&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Best for&lt;/td&gt;
&lt;td&gt;Product sentiment, lifestyle trends&lt;/td&gt;
&lt;td&gt;Breaking news, public opinion&lt;/td&gt;
&lt;td&gt;Creator analytics, youth culture&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Unique data&lt;/td&gt;
&lt;td&gt;Purchase intent, product reviews&lt;/td&gt;
&lt;td&gt;Hot search rankings, heat scores&lt;/td&gt;
&lt;td&gt;Danmaku counts, coin tipping&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Auth required&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Partial (trending: no, search: yes)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MAU&lt;/td&gt;
&lt;td&gt;200M+&lt;/td&gt;
&lt;td&gt;580M+&lt;/td&gt;
&lt;td&gt;300M+&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Actor&lt;/td&gt;
&lt;td&gt;&lt;a href="https://apify.com/zhorex/rednote-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;rednote-scraper&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://apify.com/zhorex/weibo-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;weibo-scraper&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;&lt;a href="https://apify.com/zhorex/bilibili-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;bilibili-scraper&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Building a Unified Pipeline
&lt;/h2&gt;

&lt;p&gt;All three actors share common design principles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pure HTTP&lt;/strong&gt; — no browser needed, minimal compute&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;256MB RAM&lt;/strong&gt; — cheap to run&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pay-per-event&lt;/strong&gt; — $5 per 1,000 items for all three&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;JSON/CSV/Excel output&lt;/strong&gt; — same export formats&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Apify integrations&lt;/strong&gt; — Google Sheets, S3, webhooks, Zapier, Make, n8n&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Example: Daily China Brand Monitor
&lt;/h3&gt;

&lt;p&gt;Here is a practical pipeline architecture using all three actors:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1:&lt;/strong&gt; Schedule daily runs for each actor via Apify Schedules.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;RedNote&lt;/strong&gt; — search for your brand name in Chinese:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"search"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"searchQuery"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"你的品牌名"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"maxResults"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Weibo&lt;/strong&gt; — monitor trending mentions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"hot_search"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"maxResults"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Bilibili&lt;/strong&gt; — track video mentions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"search"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"searchQuery"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"你的品牌名"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"maxResults"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
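&lt;p&gt;The three inputs above can also be generated from one place so the brand query and result cap stay in sync. The following Python sketch does only that; the helper name &lt;code&gt;build_daily_inputs&lt;/code&gt; is an assumption for illustration, while the actor IDs and input fields are the ones shown in this article.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def build_daily_inputs(brand_query, max_results=50):
    """Map each actor ID to the run input shown in the JSON examples above."""
    return {
        "zhorex/rednote-scraper": {
            "mode": "search",
            "searchQuery": brand_query,
            "maxResults": max_results,
        },
        "zhorex/weibo-scraper": {
            "mode": "hot_search",
            "maxResults": max_results,
        },
        "zhorex/bilibili-scraper": {
            "mode": "search",
            "searchQuery": brand_query,
            "maxResults": max_results,
        },
    }


inputs = build_daily_inputs("你的品牌名")
for actor_id, run_input in inputs.items():
    print(actor_id, run_input["mode"])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Each entry can then be pasted into an Apify Schedule, or passed to the Apify Python client as &lt;code&gt;client.actor(actor_id).call(run_input=run_input)&lt;/code&gt;.&lt;/p&gt;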



&lt;p&gt;&lt;strong&gt;Step 2:&lt;/strong&gt; Use Apify webhooks to push results to your data warehouse (S3 → Snowflake/BigQuery) or directly to Google Sheets for a lightweight dashboard.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3:&lt;/strong&gt; Set up alerting — use Zapier or Make to send Slack notifications when specific keywords appear in trending topics or when engagement spikes on brand-related content.&lt;/p&gt;




&lt;h2&gt;
  
  
  Use Case Examples
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Brand Entering the Chinese Market
&lt;/h3&gt;

&lt;p&gt;A Western consumer brand preparing to launch in China runs all three actors to understand:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;RedNote:&lt;/strong&gt; What do Chinese consumers say about the product category? What do they value? Which competitor products are reviewed positively?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Weibo:&lt;/strong&gt; Is the brand already discussed in Chinese media? What is the general sentiment?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bilibili:&lt;/strong&gt; Are there video reviews or unboxing content for the brand or competitors?&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Academic Research on Chinese Digital Culture
&lt;/h3&gt;

&lt;p&gt;A researcher studying Chinese Gen Z media consumption uses:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Bilibili:&lt;/strong&gt; Trending video content, danmaku engagement patterns, creator growth trajectories&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Weibo:&lt;/strong&gt; Public discourse topics, hashtag trends over time&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RedNote:&lt;/strong&gt; Lifestyle and consumption trends, product preference signals&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  PR Crisis Monitoring
&lt;/h3&gt;

&lt;p&gt;A multinational company monitors all three platforms for brand mentions after a negative news cycle:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Weibo:&lt;/strong&gt; Real-time trending topics and comment sentiment (the fastest-moving platform)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RedNote:&lt;/strong&gt; Consumer reaction and product boycott signals&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bilibili:&lt;/strong&gt; Video commentary and creator opinion pieces&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Technical Architecture
&lt;/h2&gt;

&lt;p&gt;All three actors share:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Detail&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Runtime&lt;/td&gt;
&lt;td&gt;Pure HTTP (no Playwright/Puppeteer)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Memory&lt;/td&gt;
&lt;td&gt;256MB RAM&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Proxy&lt;/td&gt;
&lt;td&gt;Not required&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API keys&lt;/td&gt;
&lt;td&gt;Not required&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Output&lt;/td&gt;
&lt;td&gt;JSON, CSV, XLSX, JSONL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rate limiting&lt;/td&gt;
&lt;td&gt;Built-in backoff and retry&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Language&lt;/td&gt;
&lt;td&gt;Content in original Simplified Chinese&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Translation&lt;/td&gt;
&lt;td&gt;Pipe through Google Translate, DeepL, or Claude for English&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Pricing
&lt;/h2&gt;

&lt;p&gt;All three actors use the same pricing model:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Volume&lt;/th&gt;
&lt;th&gt;Cost per actor&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1,000 items&lt;/td&gt;
&lt;td&gt;$5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10,000 items&lt;/td&gt;
&lt;td&gt;$50&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;100,000 items&lt;/td&gt;
&lt;td&gt;$500&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Running all three actors daily at 50 items each comes to 150 items/day × 30 days = 4,500 items/month. At $5 per 1,000 items, that is &lt;strong&gt;$22.50/month&lt;/strong&gt; for comprehensive daily China monitoring across three platforms.&lt;/p&gt;
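&lt;p&gt;To estimate the bill for a different schedule, the same arithmetic fits in a few lines of Python. This is a rough sketch that assumes the flat $5 per 1,000 items rate from the table above and ignores any platform credits; the function name is illustrative.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;PRICE_PER_1000_ITEMS = 5.00  # USD, from the pricing table above


def monthly_cost(actors, items_per_run, runs_per_month):
    """Return (total items, cost in USD) for a month of scheduled runs."""
    items = actors * items_per_run * runs_per_month
    return items, items * PRICE_PER_1000_ITEMS / 1000


items, cost = monthly_cost(actors=3, items_per_run=50, runs_per_month=30)
print(items, cost)  # 4500 22.5
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;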

&lt;p&gt;Apify's free plan includes $5 of monthly credits — enough to test all three actors.&lt;/p&gt;




&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Do I need a VPN to scrape Chinese platforms?&lt;/strong&gt;&lt;br&gt;
No. All three actors use public HTTP endpoints that are globally accessible. No VPN or proxy is required.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Is the data in Chinese?&lt;/strong&gt;&lt;br&gt;
Yes. All content is returned in the original Simplified Chinese. For English translations, pipe the output through a translation service.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can I combine data from all three actors?&lt;/strong&gt;&lt;br&gt;
Yes. All three output JSON with similar schemas. Use Apify's dataset API or export to a shared warehouse for unified analysis.&lt;/p&gt;
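&lt;p&gt;One simple way to combine the outputs is to tag every item with its source platform before loading it into a shared table. The Python sketch below assumes you have already downloaded each actor's dataset as a list of dicts; the &lt;code&gt;title&lt;/code&gt; field in the sample records is a made-up placeholder, not a documented schema.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def merge_datasets(datasets):
    """Flatten {platform: [items]} into one list, tagging each item with its platform."""
    merged = []
    for platform, items in datasets.items():
        for item in items:
            # The platform tag is written last so it cannot be overwritten by item fields.
            merged.append({**item, "platform": platform})
    return merged


# Placeholder records: real items carry whatever fields each actor returns.
sample = {
    "rednote": [{"title": "post A"}],
    "weibo": [{"title": "topic B"}],
    "bilibili": [{"title": "video C"}],
}

for row in merge_datasets(sample):
    print(row["platform"], row["title"])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;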

&lt;p&gt;&lt;strong&gt;Is scraping Chinese social media legal?&lt;/strong&gt;&lt;br&gt;
These actors only access publicly available data through public web endpoints. No authentication is bypassed and no private data is accessed. Always review your local laws and each platform's terms of service.&lt;/p&gt;




&lt;h2&gt;
  
  
  Get Started
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;RedNote:&lt;/strong&gt; &lt;a href="https://apify.com/zhorex/rednote-scraper" rel="noopener noreferrer"&gt;apify.com/zhorex/rednote-scraper&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Weibo:&lt;/strong&gt; &lt;a href="https://apify.com/zhorex/weibo-scraper" rel="noopener noreferrer"&gt;apify.com/zhorex/weibo-scraper&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bilibili:&lt;/strong&gt; &lt;a href="https://apify.com/zhorex/bilibili-scraper" rel="noopener noreferrer"&gt;apify.com/zhorex/bilibili-scraper&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Start with one platform, validate the data quality, then expand to all three for comprehensive Chinese social media intelligence.&lt;/p&gt;

&lt;p&gt;Built by &lt;a href="https://apify.com/zhorex" rel="noopener noreferrer"&gt;Zhorex&lt;/a&gt; — the only developer on Apify specializing in Chinese platform intelligence.&lt;/p&gt;

</description>
      <category>webscraping</category>
      <category>china</category>
      <category>api</category>
      <category>marketing</category>
    </item>
    <item>
      <title>Automating Perplexity AI Searches for Content Research and Brand Monitoring</title>
      <dc:creator>Sami</dc:creator>
      <pubDate>Thu, 23 Apr 2026 14:55:14 +0000</pubDate>
      <link>https://dev.to/sami_8858131362756585e4f4/automating-perplexity-ai-searches-for-content-research-and-brand-monitoring-27o5</link>
      <guid>https://dev.to/sami_8858131362756585e4f4/automating-perplexity-ai-searches-for-content-research-and-brand-monitoring-27o5</guid>
      <description>&lt;p&gt;Perplexity AI has become one of the most popular AI-powered search engines, giving users synthesized answers with cited sources instead of a list of blue links. For marketers, content strategists, and SEO professionals, a new discipline has emerged: &lt;strong&gt;Answer Engine Optimization (AEO)&lt;/strong&gt; — the practice of getting your brand mentioned and cited in AI-generated search results.&lt;/p&gt;

&lt;p&gt;The challenge is that monitoring how Perplexity answers queries about your brand, competitors, or industry requires manually searching one query at a time. If you want to track visibility across dozens or hundreds of queries on a regular schedule, you need automation.&lt;/p&gt;

&lt;p&gt;This article walks through how to automate Perplexity AI searches at scale using the &lt;a href="https://apify.com/zhorex/perplexity-ai-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/perplexity-ai-scraper&lt;/code&gt;&lt;/a&gt; Actor on Apify — no Perplexity API key required.&lt;/p&gt;




&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;What:&lt;/strong&gt; Extract AI-generated answers, cited sources, and related questions from Perplexity AI&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;How:&lt;/strong&gt; &lt;a href="https://apify.com/zhorex/perplexity-ai-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/perplexity-ai-scraper&lt;/code&gt;&lt;/a&gt; on Apify&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost:&lt;/strong&gt; $0.02 per query ($20 per 1,000 queries)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No API key:&lt;/strong&gt; Scrapes the public web interface directly via headless browser&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Two modes:&lt;/strong&gt; Full search results, or brand mention monitoring&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Why Automate Perplexity Searches?
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Answer Engine Optimization (AEO)
&lt;/h3&gt;

&lt;p&gt;As more users shift from traditional search results to AI answer engines like Perplexity, ChatGPT, and Google AI Overviews, monitoring your visibility in AI-generated answers is becoming critical. AEO focuses on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Is your brand mentioned&lt;/strong&gt; when users ask about your product category?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What position&lt;/strong&gt; does your brand hold in the AI answer?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Which competitors&lt;/strong&gt; are mentioned alongside you?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What sources&lt;/strong&gt; does Perplexity cite — and is your content among them?&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Content Strategy
&lt;/h3&gt;

&lt;p&gt;Perplexity curates answers from multiple sources and cites them. By analyzing which URLs get cited for your target keywords, you can identify content gaps: if your competitors are cited but you are not, that signals where to invest in content.&lt;/p&gt;

&lt;h3&gt;
  
  
  Competitive Intelligence
&lt;/h3&gt;

&lt;p&gt;Track how Perplexity positions your competitors for your target keywords. Which brands does it recommend? How does it describe them? This gives you a direct view into how AI search engines perceive your market.&lt;/p&gt;




&lt;h2&gt;
  
  
  Two Modes of Operation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Mode 1: Search — Extract Full AI Answers
&lt;/h3&gt;

&lt;p&gt;Submit any query and get the full AI-generated answer with all cited sources.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Input:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"search"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"queries"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"best CRM software for small business 2026"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"top project management tools for startups"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"best CRM software for small business 2026"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"answer"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"For 2026, the best CRM overall is..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sources"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"position"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"The Best CRM Software for 2026"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://www.pcmag.com/picks/the-best-crm-software"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"domain"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pcmag.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"snippet"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Customers are vital to any business..."&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"relatedQuestions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"Zoho CRM vs Salesforce which is better for small businesses"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"Best free or low-cost CRM alternatives for startups 2026"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"totalSources"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"answerLength"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1542&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"searchUrl"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://www.perplexity.ai/search?q=..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"scrapedAt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-04-09T15:00:00+00:00"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Mode 2: Brand Monitor — Track Brand Mentions in AI Answers
&lt;/h3&gt;

&lt;p&gt;Automatically detect whether your brand appears in Perplexity's answer for a set of queries, and see which competitors are mentioned.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Input:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"brand_monitor"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"brandName"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"HubSpot"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"queries"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"best CRM software 2026"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"top email marketing platforms for ecommerce"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"brand"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"HubSpot"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"query"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"best CRM software 2026"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mentioned"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mentionContext"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"HubSpot is the easiest all-around option for teams that want a simple start..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"position"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Mentioned early in the answer (top section)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"competitorsMentioned"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Zoho CRM"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Salesforce"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Pipedrive"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sourcesCount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"scrapedAt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-04-09T15:00:00+00:00"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
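&lt;p&gt;Once you have a batch of these records, a short script can turn them into the numbers you would track over time. The Python sketch below relies only on the &lt;code&gt;query&lt;/code&gt;, &lt;code&gt;mentioned&lt;/code&gt;, and &lt;code&gt;competitorsMentioned&lt;/code&gt; fields from the output above; the function name and the second sample record are invented for illustration.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from collections import Counter


def summarize_brand_monitor(records):
    """Compute mention rate, most frequent competitors, and queries with no mention."""
    total = len(records)
    mentioned = sum(1 for r in records if r.get("mentioned"))
    competitors = Counter(
        name for r in records for name in r.get("competitorsMentioned", [])
    )
    return {
        "mentionRate": mentioned / total if total else 0.0,
        "topCompetitors": competitors.most_common(3),
        "missedQueries": [r["query"] for r in records if not r.get("mentioned")],
    }


records = [
    # Fields taken from the sample output above.
    {
        "query": "best CRM software 2026",
        "mentioned": True,
        "competitorsMentioned": ["Zoho CRM", "Salesforce", "Pipedrive"],
    },
    # Made-up placeholder record, not real Perplexity output.
    {
        "query": "top email marketing platforms for ecommerce",
        "mentioned": False,
        "competitorsMentioned": ["Salesforce"],
    },
]

print(summarize_brand_monitor(records))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;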






&lt;h2&gt;
  
  
  Practical Use Cases
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Weekly AEO Dashboard
&lt;/h3&gt;

&lt;p&gt;Schedule the actor to run weekly with your core keyword list (e.g., "best [your category] software 2026"). Track over time whether your brand is being recommended, and whether your visibility is improving after content investments.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Competitor Monitoring
&lt;/h3&gt;

&lt;p&gt;Run &lt;code&gt;brand_monitor&lt;/code&gt; mode with your top competitors' names. The queries where they appear and you do not are your opportunities.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Content Gap Analysis
&lt;/h3&gt;

&lt;p&gt;For your target queries, look at which sources Perplexity cites. If your content is not cited, study what the cited sources cover that you do not. Use this to prioritize content creation.&lt;/p&gt;
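&lt;p&gt;A first pass at this analysis can be scripted against search-mode results. The Python sketch below counts how often each domain is cited and lists the queries where your own domain is missing. It uses only the &lt;code&gt;query&lt;/code&gt;, &lt;code&gt;sources&lt;/code&gt;, and &lt;code&gt;domain&lt;/code&gt; fields shown in the search output earlier; the function name and &lt;code&gt;example.com&lt;/code&gt; are placeholders.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from collections import Counter


def find_content_gaps(results, own_domain):
    """Return (citation counts per domain, queries where own_domain is not cited)."""
    counts = Counter(
        source["domain"]
        for result in results
        for source in result.get("sources", [])
    )
    gaps = [
        result["query"]
        for result in results
        if own_domain not in {s["domain"] for s in result.get("sources", [])}
    ]
    return counts, gaps


# One record trimmed from the sample search output; example.com stands in for your site.
results = [
    {
        "query": "best CRM software for small business 2026",
        "sources": [{"position": 1, "domain": "pcmag.com"}],
    }
]

counts, gaps = find_content_gaps(results, own_domain="example.com")
print(counts.most_common(5))
print(gaps)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The domains at the top of the count are the pages worth reading first when deciding what your own content is missing.&lt;/p&gt;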

&lt;h3&gt;
  
  
  4. Market Research at Scale
&lt;/h3&gt;

&lt;p&gt;Need AI-curated answers to dozens of research questions? Use &lt;code&gt;search&lt;/code&gt; mode with a list of queries to get structured, sourced answers in bulk, with full attribution for verification.&lt;/p&gt;




&lt;h2&gt;
  
  
  Python and JavaScript Integration
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Python:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;apify_client&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ApifyClient&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ApifyClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YOUR_API_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;actor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;zhorex/perplexity-ai-scraper&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run_input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;brand_monitor&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;brandName&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YourBrand&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;queries&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;best project management tools 2026&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;defaultDatasetId&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;span class="nf"&gt;iterate_items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mentioned&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;competitorsMentioned&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;JavaScript:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;ApifyClient&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;apify-client&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ApifyClient&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;token&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;YOUR_API_TOKEN&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;actor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;zhorex/perplexity-ai-scraper&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;mode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;search&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;queries&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;best CRM software 2026&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;items&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;run&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;defaultDatasetId&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;listItems&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;items&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;answer&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  How It Works Under the Hood
&lt;/h2&gt;

&lt;p&gt;The actor opens Perplexity.ai in a headless Chromium browser (Playwright), navigates to the search URL for each query, waits for the AI to finish generating its streamed answer, then extracts the full answer text, cited sources with URLs, and related questions. Results are pushed to a structured Apify dataset.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Performance:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;~15-30 seconds per query (AI answer generation takes time)&lt;/li&gt;
&lt;li&gt;1024 MB RAM recommended (Playwright + Chromium)&lt;/li&gt;
&lt;li&gt;Sequential queries with 5s delay to avoid rate limiting&lt;/li&gt;
&lt;/ul&gt;
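
&lt;p&gt;These figures make run time easy to estimate before you schedule a large keyword list. The sketch below assumes the 5s delay applies only between consecutive queries and ignores browser start-up time; &lt;code&gt;estimate_runtime_seconds&lt;/code&gt; is a hypothetical helper.&lt;/p&gt;

```python
def estimate_runtime_seconds(num_queries, delay_s=5, per_query_s=(15, 30)):
    """Return a (low, high) estimate in seconds for a sequential run."""
    gaps = max(num_queries - 1, 0)  # assumed: no delay after the last query
    low = num_queries * per_query_s[0] + gaps * delay_s
    high = num_queries * per_query_s[1] + gaps * delay_s
    return low, high


low, high = estimate_runtime_seconds(100)
print(f"100 queries: roughly {low / 60:.0f} to {high / 60:.0f} minutes")
```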




&lt;h2&gt;
  
  
  Perplexity API vs. This Actor
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;Perplexity Sonar API&lt;/th&gt;
&lt;th&gt;This Actor&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;API key required&lt;/td&gt;
&lt;td&gt;Yes (paid subscription)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cost&lt;/td&gt;
&lt;td&gt;Sonar Pro pricing&lt;/td&gt;
&lt;td&gt;$0.02 per query&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Speed&lt;/td&gt;
&lt;td&gt;Fast (direct API)&lt;/td&gt;
&lt;td&gt;~15-30s per query&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sources/citations&lt;/td&gt;
&lt;td&gt;Included&lt;/td&gt;
&lt;td&gt;Included&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Brand monitoring&lt;/td&gt;
&lt;td&gt;Build it yourself&lt;/td&gt;
&lt;td&gt;Built-in mode&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Related questions&lt;/td&gt;
&lt;td&gt;Not included&lt;/td&gt;
&lt;td&gt;Included&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The trade-off is speed: the API is faster, but this actor requires no API key or subscription.&lt;/p&gt;




&lt;h2&gt;
  
  
  Pricing
&lt;/h2&gt;

&lt;p&gt;$0.02 per query ($20 per 1,000 queries). You can test with Apify's free tier, which gives you $5 of monthly usage, enough for 250 queries.&lt;/p&gt;
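
&lt;p&gt;The arithmetic behind those numbers, as a small budgeting sketch. It works in whole cents to avoid floating-point rounding and assumes the per-query price is the only charge; the function names are illustrative.&lt;/p&gt;

```python
PRICE_CENTS_PER_QUERY = 2  # $0.02 per query, as quoted above
FREE_TIER_CENTS = 500      # $5 of monthly free usage, as quoted above


def monthly_cost_cents(queries_per_run, runs_per_month):
    """Total monthly cost in cents for a recurring schedule."""
    return queries_per_run * runs_per_month * PRICE_CENTS_PER_QUERY


def free_tier_queries():
    """Number of queries the free monthly credit covers."""
    return FREE_TIER_CENTS // PRICE_CENTS_PER_QUERY


weekly_cost = monthly_cost_cents(queries_per_run=50, runs_per_month=4)
print(f"50 queries weekly: ${weekly_cost / 100:.2f} per month")
print(f"Free tier covers {free_tier_queries()} queries")
```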




&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;How is this different from the Perplexity API?&lt;/strong&gt;&lt;br&gt;
The official Perplexity API (Sonar) gives programmatic access but requires a paid API key. This Actor scrapes the free public web interface, so you get the answers and sources a regular visitor would see, with no API key required. The trade-off is speed (~15-30s per query).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is AEO?&lt;/strong&gt;&lt;br&gt;
Answer Engine Optimization is the practice of getting your brand mentioned in AI search results. As more users shift to AI search engines, monitoring your visibility in AI-generated answers is becoming as important as traditional SEO.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can I schedule regular monitoring runs?&lt;/strong&gt;&lt;br&gt;
Yes. Use Apify Schedules to run the actor daily, weekly, or at any interval. Combine with webhooks to send Slack or email alerts when your brand visibility changes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Are the answers always the same?&lt;/strong&gt;&lt;br&gt;
No. AI answers are non-deterministic — the same query can produce slightly different answers on different runs. This is a characteristic of AI search, not a limitation of the actor.&lt;/p&gt;




&lt;h2&gt;
  
  
  Get Started
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://apify.com/zhorex/perplexity-ai-scraper" rel="noopener noreferrer"&gt;https://apify.com/zhorex/perplexity-ai-scraper&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Try a single query in search mode to see the output quality, then build out your keyword list for systematic monitoring.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>seo</category>
      <category>marketing</category>
      <category>python</category>
    </item>
    <item>
      <title>Scraping Bilibili Videos and Creators for Market Research in 2026</title>
      <dc:creator>Sami</dc:creator>
      <pubDate>Thu, 23 Apr 2026 14:53:41 +0000</pubDate>
      <link>https://dev.to/sami_8858131362756585e4f4/scraping-bilibili-videos-and-creators-for-market-research-in-2026-4fpg</link>
      <guid>https://dev.to/sami_8858131362756585e4f4/scraping-bilibili-videos-and-creators-for-market-research-in-2026-4fpg</guid>
      <description>&lt;p&gt;Bilibili (哔哩哔哩) is China's premier video platform with 300M+ monthly active users. Think YouTube, but with a younger demographic (Gen Z and millennials), a thriving anime and gaming community, and a unique feature called &lt;strong&gt;danmaku&lt;/strong&gt; (弹幕) — real-time scrolling comments that overlay the video as it plays.&lt;/p&gt;

&lt;p&gt;For anyone doing market research on Chinese youth culture, gaming audiences, tech content, or creator economics, Bilibili is one of the richest data sources available. The problem is that there is no official public Bilibili API for international developers. Bilibili's internal APIs are undocumented, require Chinese phone verification, and change frequently.&lt;/p&gt;

&lt;p&gt;This article shows how to extract Bilibili videos, comments, creator profiles, and trending content using the &lt;a href="https://apify.com/zhorex/bilibili-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/bilibili-scraper&lt;/code&gt;&lt;/a&gt; Actor on Apify — no API key, no browser, no proxy required.&lt;/p&gt;




&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;What:&lt;/strong&gt; Extract videos, comments, creator profiles, and trending content from Bilibili&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;How:&lt;/strong&gt; &lt;a href="https://apify.com/zhorex/bilibili-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/bilibili-scraper&lt;/code&gt;&lt;/a&gt; on Apify — pure HTTP, 256MB RAM&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost:&lt;/strong&gt; $5 per 1,000 items scraped&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auth:&lt;/strong&gt; None required — all endpoints are public&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Unique data:&lt;/strong&gt; Danmaku counts, coin counts (投币), favorites, plus standard engagement metrics&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  What Makes Bilibili Data Unique
&lt;/h2&gt;

&lt;p&gt;Unlike YouTube or TikTok, Bilibili has platform-specific metrics that this actor captures:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Danmaku count (弹幕)&lt;/strong&gt; — live scrolling comments overlaid on the video. High danmaku signals active community engagement, not just passive viewing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Coin count (投币)&lt;/strong&gt; — Bilibili's tipping system where users "throw coins" at creators. A direct signal of audience appreciation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Favorite count (收藏)&lt;/strong&gt; — equivalent to "save" on other platforms&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Standard metrics:&lt;/strong&gt; views, likes, shares, replies&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These metrics together give a much richer picture of content performance than views alone.&lt;/p&gt;




&lt;h2&gt;
  
  
  Five Scraping Modes
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Search Videos
&lt;/h3&gt;

&lt;p&gt;Search by keyword (Chinese or English). Supports sort and filter options.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"search"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"searchQuery"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"人工智能教程"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sortOrder"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pubdate"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"maxResults"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Sort options: &lt;code&gt;totalrank&lt;/code&gt; (relevance), &lt;code&gt;click&lt;/code&gt; (most views), &lt;code&gt;pubdate&lt;/code&gt; (newest), &lt;code&gt;dm&lt;/code&gt; (most danmaku), &lt;code&gt;stow&lt;/code&gt; (most favorites), &lt;code&gt;scores&lt;/code&gt; (most comments).&lt;/p&gt;

&lt;p&gt;Duration filters: &lt;code&gt;short&lt;/code&gt; (&amp;lt;10min), &lt;code&gt;medium&lt;/code&gt; (10-30min), &lt;code&gt;long&lt;/code&gt; (30-60min), &lt;code&gt;verylong&lt;/code&gt; (&amp;gt;60min).&lt;/p&gt;
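
&lt;p&gt;If you want to bucket already-scraped videos the same way on the client side, the ranges above can be sketched as a lookup on the &lt;code&gt;duration&lt;/code&gt; field (in seconds). How the platform treats exact boundaries is not stated, so this sketch assumes a boundary value belongs to the longer bucket; &lt;code&gt;duration_bucket&lt;/code&gt; is a hypothetical helper.&lt;/p&gt;

```python
from bisect import bisect_right

# Upper bounds in seconds for short, medium, and long; anything beyond is verylong.
DURATION_BOUNDS_S = [10 * 60, 30 * 60, 60 * 60]
DURATION_LABELS = ["short", "medium", "long", "verylong"]


def duration_bucket(seconds):
    """Map a video length in seconds to one of the four duration labels."""
    return DURATION_LABELS[bisect_right(DURATION_BOUNDS_S, seconds)]


print(duration_bucket(167))   # a video under ten minutes
print(duration_bucket(2400))  # a 40-minute video
```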

&lt;h3&gt;
  
  
  2. Video Details
&lt;/h3&gt;

&lt;p&gt;Full video info with all engagement metrics and tags.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"video_detail"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"videoUrls"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"https://www.bilibili.com/video/BV1GJ411x7h7"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"BV1xx411c7mD"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
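
&lt;p&gt;Because &lt;code&gt;videoUrls&lt;/code&gt; accepts both full URLs and bare IDs, it can help to normalize a mixed list before submitting it. The sketch below assumes a BV ID is &lt;code&gt;BV&lt;/code&gt; followed by ten alphanumeric characters, which matches the examples in this article; the helper names are hypothetical.&lt;/p&gt;

```python
import re

# Assumed shape of a BV ID, based on the examples in this article.
BVID_PATTERN = re.compile(r"BV[0-9A-Za-z]{10}")


def extract_bvid(video_ref):
    """Return the BV ID contained in a URL or bare ID string."""
    match = BVID_PATTERN.search(video_ref)
    if match is None:
        raise ValueError(f"no BV id found in {video_ref!r}")
    return match.group(0)


def unique_bvids(video_refs):
    """Normalize references to BV IDs, dropping duplicates but keeping order."""
    return list(dict.fromkeys(extract_bvid(ref) for ref in video_refs))


refs = [
    "https://www.bilibili.com/video/BV1GJ411x7h7",
    "BV1xx411c7mD",
    "BV1GJ411x7h7",
]
print(unique_bvids(refs))
```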



&lt;h3&gt;
  
  
  3. Video Comments
&lt;/h3&gt;

&lt;p&gt;Extract comments with author info and likes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"video_comments"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"videoUrls"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"https://www.bilibili.com/video/BV1GJ411x7h7"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"maxComments"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sortComments"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"hot"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Sort options: &lt;code&gt;hot&lt;/code&gt; (top/most-liked), &lt;code&gt;time&lt;/code&gt; (newest), &lt;code&gt;likes&lt;/code&gt; (by like count).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; Bilibili throttles comment pagination from datacenter IPs, returning only top/pinned comments. Full comment pagination requires residential IPs or authenticated sessions. Other modes are unaffected.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Creator/User Videos
&lt;/h3&gt;

&lt;p&gt;Get user profile info plus their recent videos.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user_videos"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"userIds"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"546195"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1340190821"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"maxResults"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Find a user's &lt;code&gt;mid&lt;/code&gt; in their profile URL: &lt;code&gt;space.bilibili.com/{mid}&lt;/code&gt;. Multiple users are processed in parallel (up to 3 concurrent).&lt;/p&gt;
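
&lt;p&gt;A small sketch for pulling the &lt;code&gt;mid&lt;/code&gt; out of a pasted profile URL, so you can build the &lt;code&gt;userIds&lt;/code&gt; list from links. &lt;code&gt;extract_mid&lt;/code&gt; is a hypothetical helper; it returns strings because the input example above passes IDs as strings.&lt;/p&gt;

```python
import re

MID_PATTERN = re.compile(r"space\.bilibili\.com/(\d+)")


def extract_mid(profile_ref):
    """Return the numeric mid, given a profile URL or a bare numeric ID."""
    ref = profile_ref.strip()
    if ref.isdigit():
        return ref
    match = MID_PATTERN.search(ref)
    if match is None:
        raise ValueError(f"no mid found in {profile_ref!r}")
    return match.group(1)


user_ids = [extract_mid(ref) for ref in [
    "https://space.bilibili.com/546195",
    "1340190821",
]]
print(user_ids)
```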

&lt;h3&gt;
  
  
  5. Popular/Trending Videos
&lt;/h3&gt;

&lt;p&gt;Trending videos, filterable by Bilibili category.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"popular"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"category"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"game"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"maxResults"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Available categories: Animation, Music, Dance, Gaming, Knowledge, Tech, Sports, Cars, Life, Food, Animal, Fashion, Entertainment.&lt;/p&gt;




&lt;h2&gt;
  
  
  Output Example
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Video:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"video"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"bvid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"BV1YXDfBUETP"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Example Video Title"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://www.bilibili.com/video/BV1YXDfBUETP"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"duration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;167&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"durationFormatted"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2:47"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"viewCount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1570113&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"likeCount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;182455&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"coinCount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;110535&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"favoriteCount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;63471&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"shareCount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;45918&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"danmakuCount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;7466&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"replyCount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;17276&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"authorName"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Creator Name"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"authorMid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1340190821&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"publishDate"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-04-08T12:00:00+00:00"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"category"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Gaming"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"tags"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"anime"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"action"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"review"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"scrapedAt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-04-10T10:00:00+00:00"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;User Profile:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mid"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;546195&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"老番茄"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"fans"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;20189060&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"archiveCount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;652&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"verified"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"profileUrl"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://space.bilibili.com/546195"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"scrapedAt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-04-10T10:00:00+00:00"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Use Cases for Market Research
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Gaming/anime brand monitoring&lt;/strong&gt; — Track game launches and anime reactions on China's largest anime community&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Content trend analysis&lt;/strong&gt; — Identify trending topics in Chinese youth culture and Gen Z interests&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Creator evaluation&lt;/strong&gt; — Analyze Bilibili KOLs (Key Opinion Leaders) for partnerships and sponsorships using follower counts, engagement ratios, and content frequency&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ad placement research&lt;/strong&gt; — Understand which categories and content types perform best by danmaku density, coin rates, and view-to-engagement ratios&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Academic research&lt;/strong&gt; — Study Chinese digital culture, danmaku behavior, and content consumption patterns&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Product launch monitoring&lt;/strong&gt; — Track brand mentions and competitor content in China&lt;/li&gt;
&lt;/ul&gt;
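
&lt;p&gt;Several of these use cases come down to a few ratios over the fields in the video output. The sketch below shows one possible set of definitions, not a standard: coin rate as coins per view, danmaku density as danmaku per minute of video, and engagement rate as likes, coins, favorites, shares, and replies per view. The sample values are taken from the output example above.&lt;/p&gt;

```python
def engagement_ratios(video):
    """Compute simple per-view and per-minute ratios for one video item."""
    views = video["viewCount"]
    minutes = video["duration"] / 60
    interactions = (
        video["likeCount"] + video["coinCount"] + video["favoriteCount"]
        + video["shareCount"] + video["replyCount"]
    )
    return {
        "coinRate": video["coinCount"] / views if views else 0.0,
        "danmakuPerMinute": video["danmakuCount"] / minutes if minutes else 0.0,
        "engagementRate": interactions / views if views else 0.0,
    }


# Values from the video output example earlier in this article.
sample_video = {
    "duration": 167, "viewCount": 1570113, "likeCount": 182455,
    "coinCount": 110535, "favoriteCount": 63471, "shareCount": 45918,
    "danmakuCount": 7466, "replyCount": 17276,
}

for name, value in engagement_ratios(sample_video).items():
    print(f"{name}: {value:.4f}")
```

&lt;p&gt;Comparing these ratios across creators or categories is usually more informative than comparing raw view counts.&lt;/p&gt;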




&lt;h2&gt;
  
  
  Python and JavaScript Integration
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Python:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;apify_client&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ApifyClient&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ApifyClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YOUR_API_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;actor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;zhorex/bilibili-scraper&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run_input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;searchQuery&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;人工智能教程&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;maxResults&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;defaultDatasetId&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;span class="nf"&gt;iterate_items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;viewCount&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;danmakuCount&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;JavaScript:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;ApifyClient&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;apify-client&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ApifyClient&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;token&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;YOUR_API_TOKEN&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;actor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;zhorex/bilibili-scraper&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;mode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;search&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;searchQuery&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;人工智能教程&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;maxResults&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;items&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;run&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;defaultDatasetId&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;listItems&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="nx"&gt;items&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;forEach&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;item&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;item&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;item&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;viewCount&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Technical Details
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No browser&lt;/strong&gt; — pure HTTP requests to Bilibili's public APIs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No proxy&lt;/strong&gt; — Bilibili is accessible globally (some licensed content may be geo-restricted)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No API key&lt;/strong&gt; — all endpoints are public&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;256MB RAM&lt;/strong&gt; — lightweight and efficient&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Concurrent fetching&lt;/strong&gt; — video_detail: up to 5 in parallel; user_videos and video_comments: up to 3 in parallel&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Pricing
&lt;/h2&gt;

&lt;p&gt;$0.005 per item scraped ($5 per 1,000 results). Start with Apify's free plan, which includes $5 of monthly credits.&lt;/p&gt;




&lt;h2&gt;
  
  
  Part of the Chinese Digital Intelligence Suite
&lt;/h2&gt;

&lt;p&gt;Bilibili covers video and creator analytics. For full China market coverage, combine with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://apify.com/zhorex/weibo-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/weibo-scraper&lt;/code&gt;&lt;/a&gt; — Weibo microblogging, trending topics, public opinion (580M+ MAU)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://apify.com/zhorex/rednote-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/rednote-scraper&lt;/code&gt;&lt;/a&gt; — RedNote/Xiaohongshu social commerce, lifestyle content (200M+ MAU)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All actors: no browser, no proxy, no API keys. Built by &lt;a href="https://apify.com/zhorex" rel="noopener noreferrer"&gt;Zhorex&lt;/a&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  Get Started
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://apify.com/zhorex/bilibili-scraper" rel="noopener noreferrer"&gt;https://apify.com/zhorex/bilibili-scraper&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Try the &lt;code&gt;popular&lt;/code&gt; mode first (no auth needed) to see what is trending on Bilibili right now.&lt;/p&gt;

</description>
      <category>webscraping</category>
      <category>api</category>
      <category>python</category>
      <category>analytics</category>
    </item>
    <item>
      <title>How to Scrape Weibo Without Login in 2026: The Complete Guide</title>
      <dc:creator>Sami</dc:creator>
      <pubDate>Thu, 23 Apr 2026 14:51:50 +0000</pubDate>
      <link>https://dev.to/sami_8858131362756585e4f4/how-to-scrape-weibo-without-login-in-2026-the-complete-guide-4ge2</link>
      <guid>https://dev.to/sami_8858131362756585e4f4/how-to-scrape-weibo-without-login-in-2026-the-complete-guide-4ge2</guid>
      <description>&lt;p&gt;Weibo (微博) is China's dominant microblogging platform — think Twitter meets Instagram, with 580M+ monthly active users. It is where Chinese public opinion forms, brands communicate, celebrities post, and news breaks. Government officials, industry leaders, and major brands all maintain active Weibo accounts.&lt;/p&gt;

&lt;p&gt;For anyone doing China market research, PR monitoring, influencer analysis, or geopolitical tracking, Weibo data is essential. The problem is that there is no official public Weibo API available for international developers. Weibo's developer platform requires a Chinese business license, imposes strict rate limits, and returns limited data.&lt;/p&gt;

&lt;p&gt;This guide walks through how to extract Weibo posts, trending topics, comments, and creator profiles using the &lt;a href="https://apify.com/zhorex/weibo-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/weibo-scraper&lt;/code&gt;&lt;/a&gt; Actor on Apify — no API key, no browser, no VPN required.&lt;/p&gt;




&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;What:&lt;/strong&gt; Extract posts, trending topics, comments, and user profiles from Weibo&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;How:&lt;/strong&gt; &lt;a href="https://apify.com/zhorex/weibo-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/weibo-scraper&lt;/code&gt;&lt;/a&gt; on Apify — pure HTTP, no browser needed&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost:&lt;/strong&gt; $5 per 1,000 items scraped&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auth:&lt;/strong&gt; Trending topics and post comments work without any login. Full keyword search and user posts require a Weibo &lt;code&gt;SUB&lt;/code&gt; cookie; without one, search falls back to hot timeline posts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No VPN needed:&lt;/strong&gt; All endpoints are globally accessible&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Why Weibo Data Matters
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Who&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;PR &amp;amp; Communications&lt;/td&gt;
&lt;td&gt;Track brand mentions in real-time on China's public square&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Market Research&lt;/td&gt;
&lt;td&gt;Monitor what is trending among Chinese consumers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Influencer Marketing&lt;/td&gt;
&lt;td&gt;Find and evaluate KOLs by followers, engagement, verification&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Competitive Intelligence&lt;/td&gt;
&lt;td&gt;Track Chinese competitor announcements and campaigns&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Geopolitical Analysis&lt;/td&gt;
&lt;td&gt;Monitor public discourse on policy and international topics&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Journalism&lt;/td&gt;
&lt;td&gt;Access Chinese public opinion data for reporting&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Academic Research&lt;/td&gt;
&lt;td&gt;Study Chinese social media behavior and trends&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Four Scraping Modes
&lt;/h2&gt;

&lt;p&gt;The actor supports four distinct modes, each targeting a different type of Weibo data:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Hot Search / Trending Topics (no login needed)
&lt;/h3&gt;

&lt;p&gt;Get the real-time pulse of the Chinese internet. Returns trending topics with heat scores and categories.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"hot_search"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"maxResults"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Output:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"rank"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"人工智能最新突破"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"category"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"科技"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"hotValue"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2847562&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"labelName"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"热"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"isHot"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://s.weibo.com/weibo?q=..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"scrapedAt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-04-10T12:00:00Z"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
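
&lt;p&gt;Once trending items are in hand, a common next step is to filter and rank them locally. The snippet below is a minimal sketch that assumes items shaped like the output above. The helper name &lt;code&gt;top_topics&lt;/code&gt; and every sample value except the first item are made up for illustration.&lt;/p&gt;

```python
# Sample items shaped like the hot_search output above.
# Only the first item mirrors the article; the others are invented for the demo.
sample_items = [
    {"rank": 1, "title": "人工智能最新突破", "category": "科技", "hotValue": 2847562, "isHot": True},
    {"rank": 2, "title": "example topic B", "category": "娱乐", "hotValue": 1500000, "isHot": False},
    {"rank": 3, "title": "example topic C", "category": "科技", "hotValue": 900000, "isHot": True},
]


def top_topics(items, category=None, limit=10):
    """Return up to `limit` items, optionally restricted to one category, hottest first."""
    selected = [it for it in items if category is None or it.get("category") == category]
    selected.sort(key=lambda it: it.get("hotValue", 0), reverse=True)
    return selected[:limit]


tech = top_topics(sample_items, category="科技")
print([it["title"] for it in tech])
```

&lt;p&gt;Using &lt;code&gt;.get()&lt;/code&gt; with a default keeps the ranking from failing if an item happens to lack a &lt;code&gt;hotValue&lt;/code&gt; or &lt;code&gt;category&lt;/code&gt; field.&lt;/p&gt;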



&lt;h3&gt;
  
  
  2. Post Comments (no login needed)
&lt;/h3&gt;

&lt;p&gt;Extract comments from specific posts. Provide post IDs (mid) or detail URLs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"post_comments"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"postIds"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"5285773987283226"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"maxComments"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Search Posts (login optional)
&lt;/h3&gt;

&lt;p&gt;Search by keyword. Without cookies, returns hot timeline posts as a fallback. With cookies, searches the full index.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"search"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"searchQuery"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"人工智能"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"maxResults"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"cookieString"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"SUB=your_sub_cookie_value"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. User Posts (login needed for posts)
&lt;/h3&gt;

&lt;p&gt;Get profile info (always works without login) plus posts (requires cookies). Provide numeric user IDs or profile URLs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user_posts"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"userIds"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"1642634100"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"maxResults"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"cookieString"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"SUB=your_sub_cookie_value"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  How to Get Cookies
&lt;/h2&gt;

&lt;p&gt;For the &lt;code&gt;search&lt;/code&gt; and &lt;code&gt;user_posts&lt;/code&gt; modes, you need a Weibo &lt;code&gt;SUB&lt;/code&gt; cookie:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Open &lt;strong&gt;weibo.com&lt;/strong&gt; in your browser and log in&lt;/li&gt;
&lt;li&gt;Open DevTools (F12) → Application → Cookies → weibo.com&lt;/li&gt;
&lt;li&gt;Copy the value of the &lt;strong&gt;SUB&lt;/strong&gt; cookie&lt;/li&gt;
&lt;li&gt;Paste it in the &lt;code&gt;cookieString&lt;/code&gt; field as: &lt;code&gt;SUB=your_value_here&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The cookie typically lasts several days before expiring.&lt;/p&gt;
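
&lt;p&gt;Because the cookie expires, it can help to keep it out of your code and read it from the environment at run time. The following is a rough sketch, not part of the Actor itself: the helper &lt;code&gt;build_run_input&lt;/code&gt; and the environment variable name &lt;code&gt;WEIBO_SUB&lt;/code&gt; are assumptions made for this example, while the input keys match the JSON inputs shown earlier.&lt;/p&gt;

```python
import os


def build_run_input(mode, sub_cookie=None, **options):
    """Assemble an Actor input dict; add cookieString only when a SUB value is available."""
    run_input = {"mode": mode, **options}
    if sub_cookie:
        value = sub_cookie.strip()
        # Accept either the raw cookie value or an already prefixed "SUB=..." string.
        if not value.startswith("SUB="):
            value = "SUB=" + value
        run_input["cookieString"] = value
    return run_input


# WEIBO_SUB is a hypothetical environment variable name chosen for this sketch.
search_input = build_run_input(
    "search",
    sub_cookie=os.environ.get("WEIBO_SUB"),
    searchQuery="人工智能",
    maxResults=50,
)
print(sorted(search_input))
```

&lt;p&gt;When the variable is unset, the input simply omits &lt;code&gt;cookieString&lt;/code&gt;, so the same helper also works for the modes that need no login.&lt;/p&gt;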




&lt;h2&gt;
  
  
  Python and JavaScript Examples
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Python:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;apify_client&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ApifyClient&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ApifyClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YOUR_API_TOKEN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;actor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;zhorex/weibo-scraper&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run_input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mode&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hot_search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;maxResults&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;defaultDatasetId&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]).&lt;/span&gt;&lt;span class="nf"&gt;iterate_items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;JavaScript:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;ApifyClient&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;apify-client&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ApifyClient&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;token&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;YOUR_API_TOKEN&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;run&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;actor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;zhorex/weibo-scraper&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;mode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;hot_search&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;maxResults&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;items&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dataset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;run&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;defaultDatasetId&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;listItems&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="nx"&gt;items&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;forEach&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;item&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;item&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Technical Details
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No browser needed&lt;/strong&gt; — pure HTTP requests using httpx, runs in 256MB RAM&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No VPN needed&lt;/strong&gt; — Weibo endpoints are globally accessible&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automatic session&lt;/strong&gt; — visitor cookies obtained automatically via the Sina Visitor System&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rate-limit handling&lt;/strong&gt; — exponential backoff on 418/429 errors&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Chinese text preserved&lt;/strong&gt; — all content returned as-is in original Simplified Chinese&lt;/li&gt;
&lt;/ul&gt;
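
&lt;p&gt;For readers unfamiliar with the rate-limit handling mentioned above, here is a generic illustration of exponential backoff on 418/429 responses. It is a sketch of the general technique, not the Actor's actual code: the function names, the base delay, the growth factor, and the cap are all assumptions.&lt;/p&gt;

```python
import time

RETRY_STATUSES = (418, 429)


def backoff_delays(attempts, base=1.0, factor=2.0, cap=30.0):
    """Delays in seconds for each retry: base, base*factor, ... capped at `cap`."""
    return [min(cap, base * factor ** i) for i in range(attempts)]


def fetch_with_backoff(fetch, attempts=4, sleep=time.sleep):
    """Call `fetch()` (returns (status, body)); retry on 418/429 with growing delays."""
    status, body = fetch()
    for delay in backoff_delays(attempts):
        if status not in RETRY_STATUSES:
            break
        sleep(delay)
        status, body = fetch()
    return status, body


# Demo with a fake fetcher that is rate-limited twice, then succeeds.
responses = iter([(429, ""), (418, ""), (200, "ok")])
waits = []
result = fetch_with_backoff(lambda: next(responses), sleep=waits.append)
print(result, waits)
```

&lt;p&gt;Passing &lt;code&gt;sleep&lt;/code&gt; as a parameter lets the demo record the delays instead of actually waiting. In real use you would leave the default and supply a &lt;code&gt;fetch&lt;/code&gt; callable that performs the HTTP request.&lt;/p&gt;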




&lt;h2&gt;
  
  
  Pricing
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Volume&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1,000 items&lt;/td&gt;
&lt;td&gt;$5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10,000 items&lt;/td&gt;
&lt;td&gt;$50&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;100,000 items&lt;/td&gt;
&lt;td&gt;$500&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Each scraped item (post, comment, trending topic, or profile) counts as one result. You can start with Apify's free plan, which includes $5 of monthly credits — enough for 1,000 data points.&lt;/p&gt;
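
&lt;p&gt;The arithmetic behind the table can be sketched in a few lines. This assumes the per-item price and free credit quoted above are the only costs involved and that the whole credit goes to this Actor. The function name &lt;code&gt;estimate_cost&lt;/code&gt; is hypothetical.&lt;/p&gt;

```python
PRICE_PER_ITEM_USD = 0.005   # $5 per 1,000 items, as listed above
FREE_MONTHLY_CREDIT_USD = 5  # Apify free plan credit mentioned above


def estimate_cost(items, free_credit=FREE_MONTHLY_CREDIT_USD):
    """Return (gross_cost, amount_due_after_credit) in USD for `items` results."""
    gross = round(items * PRICE_PER_ITEM_USD, 2)
    due = round(max(0.0, gross - free_credit), 2)
    return gross, due


for volume in (1_000, 10_000, 100_000):
    print(volume, estimate_cost(volume))
```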




&lt;h2&gt;
  
  
  Part of the Chinese Digital Intelligence Suite
&lt;/h2&gt;

&lt;p&gt;Weibo covers microblogging and public opinion, but for comprehensive China market intelligence you need the full picture:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Platform&lt;/th&gt;
&lt;th&gt;Users&lt;/th&gt;
&lt;th&gt;What it covers&lt;/th&gt;
&lt;th&gt;Actor&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Weibo&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;580M+ MAU&lt;/td&gt;
&lt;td&gt;Microblogging, trending, celebrity content&lt;/td&gt;
&lt;td&gt;&lt;a href="https://apify.com/zhorex/weibo-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/weibo-scraper&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;RedNote (Xiaohongshu)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;200M+ MAU&lt;/td&gt;
&lt;td&gt;Social commerce, lifestyle, product reviews&lt;/td&gt;
&lt;td&gt;&lt;a href="https://apify.com/zhorex/rednote-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/rednote-scraper&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Bilibili&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;300M+ MAU&lt;/td&gt;
&lt;td&gt;Video content, danmaku, creator analytics&lt;/td&gt;
&lt;td&gt;&lt;a href="https://apify.com/zhorex/bilibili-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/bilibili-scraper&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;All three actors: no browser, no proxy, no API keys. Built by &lt;a href="https://apify.com/zhorex" rel="noopener noreferrer"&gt;Zhorex&lt;/a&gt; — the only developer on Apify specializing in Chinese platform intelligence.&lt;/p&gt;




&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Is there a Weibo API?&lt;/strong&gt;&lt;br&gt;
There is no official public Weibo API available for international developers. Weibo's developer platform requires a Chinese business license and imposes strict rate limits. This scraper is an alternative that does not require a business license.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Do I need a VPN?&lt;/strong&gt;&lt;br&gt;
No. The Weibo endpoints used by this actor are globally accessible without a VPN or proxy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Is the content in Chinese?&lt;/strong&gt;&lt;br&gt;
Yes. Weibo is a Chinese-language platform — all content is returned in the original Simplified Chinese. If you need English translations, pipe the output through a translation API (Google Translate, DeepL, or Claude).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Is scraping Weibo legal?&lt;/strong&gt;&lt;br&gt;
This scraper only accesses publicly available data through Weibo's public web endpoints. It does not bypass authentication or access private/locked accounts. Always review your local laws and Weibo's terms of service.&lt;/p&gt;




&lt;h2&gt;
  
  
  Get Started
&lt;/h2&gt;

&lt;p&gt;The Actor page, full input schema, and a free trial run are at:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://apify.com/zhorex/weibo-scraper" rel="noopener noreferrer"&gt;https://apify.com/zhorex/weibo-scraper&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Start with trending topics (no login needed) to see the data quality, then expand to search and user profiles as needed.&lt;/p&gt;

</description>
      <category>webscraping</category>
      <category>api</category>
      <category>python</category>
      <category>china</category>
    </item>
    <item>
      <title>Scraping G2 Reviews Without Anti-Bot Headaches: A SaaS Competitive Intelligence Pipeline With 41 Fields Per Review</title>
      <dc:creator>Sami</dc:creator>
      <pubDate>Fri, 17 Apr 2026 17:51:52 +0000</pubDate>
      <link>https://dev.to/sami_8858131362756585e4f4/scraping-g2-reviews-without-kasada-headaches-a-saas-competitive-intelligence-pipeline-with-29-16gf</link>
      <guid>https://dev.to/sami_8858131362756585e4f4/scraping-g2-reviews-without-kasada-headaches-a-saas-competitive-intelligence-pipeline-with-29-16gf</guid>
      <description>&lt;p&gt;G2 hosts verified software reviews across thousands of software categories. For anyone doing competitive intelligence, sales prospecting, or product research in the B2B software space, it is one of the highest-signal public datasets on the internet. The problem is that scraping it has become almost comically painful in 2026.&lt;/p&gt;

&lt;p&gt;If you have tried hitting &lt;code&gt;g2.com&lt;/code&gt; with &lt;code&gt;requests&lt;/code&gt; lately, you already know the story. Cloudflare challenges, DataDome fingerprinting, TLS fingerprint checks, behavioral JavaScript puzzles, and invisible CAPTCHAs. Even well-configured headless browsers with residential proxies get flagged within the first dozen pages. The G2 review data is public, but getting to it at scale has turned into a cat-and-mouse game that burns engineering hours and proxy budget in equal measure.&lt;/p&gt;

&lt;p&gt;This article walks through a cleaner path: using the &lt;code&gt;zhorex/g2-reviews-scraper&lt;/code&gt; Actor on Apify to pull structured reviews, 41 fields deep, without running a single browser or rotating a single proxy.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why G2 Data Is Worth the Effort
&lt;/h2&gt;

&lt;p&gt;Before getting into the scraper itself, it helps to articulate what the data is actually good for. G2 reviews are structured, long-form, and written by verified business users. That combination makes them uniquely useful for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Sales intelligence.&lt;/strong&gt; A review that complains about vendor X's integration with Salesforce is a warm lead for vendor Y whose Salesforce integration is its headline feature.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Win/loss analysis.&lt;/strong&gt; Reviews of competitors often name the alternatives the reviewer evaluated. This is free narrative market research.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Feature gap detection.&lt;/strong&gt; Aggregating thousands of "what do you dislike" fields across a category surfaces the roadmap items customers actually care about.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Churn signals.&lt;/strong&gt; Negative sub-ratings on "support" or "ease of setup" for a specific competitor, trended over quarters, predict defection windows.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The value is there. The delivery mechanism is the bottleneck.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem With the Official Path
&lt;/h2&gt;

&lt;p&gt;G2 does offer a paid API, but it is gated behind an enterprise contract, requires a seat license on the G2 side, and typically restricts you to data about your own product and a handful of named competitors. Pulling the full set of reviews for a category like "CRM Software" or "Marketing Automation" across all vendors is not on the menu.&lt;/p&gt;

&lt;p&gt;The community workarounds are worse. DIY scrapers hit Cloudflare and DataDome within minutes. Proxy bills for rotating residential IPs can run into the hundreds or thousands of dollars per month. Every time G2 rolls out a new JS challenge, your Playwright script breaks at 3 AM and you spend a Saturday fingerprinting headers.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the Actor Does
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;g2-reviews-scraper&lt;/code&gt; Actor bypasses all of that by calling G2's public review feed directly. No browser, no proxy rotation, no anti-bot bypass hacks. You give it a product URL or slug and it returns structured JSON.&lt;/p&gt;

&lt;p&gt;Feature set:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Scrape reviews for any G2 product URL or slug&lt;/li&gt;
&lt;li&gt;41 fields per review including sub-ratings, reviewer job title, company size, industry, and verification status&lt;/li&gt;
&lt;li&gt;Pagination handled internally, up to the full review history of a product&lt;/li&gt;
&lt;li&gt;Filters for star rating, date range, and review language&lt;/li&gt;
&lt;li&gt;JSON, CSV, Excel, or JSONL output&lt;/li&gt;
&lt;li&gt;Runs on Apify infrastructure, so no local Node/Python setup required for the scraping itself&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Comparison Table
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Capability&lt;/th&gt;
&lt;th&gt;G2 Official API&lt;/th&gt;
&lt;th&gt;DIY scraper + residential proxies&lt;/th&gt;
&lt;th&gt;&lt;code&gt;zhorex/g2-reviews-scraper&lt;/code&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Access to competitor reviews&lt;/td&gt;
&lt;td&gt;Limited (own product plus a few named competitors)&lt;/td&gt;
&lt;td&gt;Yes, if you can keep it running&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Anti-bot challenge handling&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;td&gt;Your problem, breaks weekly&lt;/td&gt;
&lt;td&gt;Handled, no browser needed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Setup time&lt;/td&gt;
&lt;td&gt;Weeks (contract + provisioning)&lt;/td&gt;
&lt;td&gt;Days to weeks&lt;/td&gt;
&lt;td&gt;Minutes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cost at 10k reviews/month&lt;/td&gt;
&lt;td&gt;Custom enterprise quote&lt;/td&gt;
&lt;td&gt;~$300-600 proxies + eng time&lt;/td&gt;
&lt;td&gt;$50&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sub-ratings included&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Usually not&lt;/td&gt;
&lt;td&gt;Yes (41 fields)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Export formats&lt;/td&gt;
&lt;td&gt;JSON via API&lt;/td&gt;
&lt;td&gt;Whatever you build&lt;/td&gt;
&lt;td&gt;JSON, CSV, XLSX, JSONL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Maintenance burden&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Input Example
&lt;/h2&gt;

&lt;p&gt;A realistic starter config for pulling reviews across three CRM competitors:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"productUrls"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"https://www.g2.com/products/salesforce-sales-cloud/reviews"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"https://www.g2.com/products/hubspot-sales-hub/reviews"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"https://www.g2.com/products/pipedrive/reviews"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"maxReviewsPerProduct"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"minRating"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"maxRating"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"dateFrom"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2024-01-01"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"language"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"en"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"includeSubRatings"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"includeReviewerProfile"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
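
&lt;p&gt;The same config can be assembled in code when the product list changes between runs. What follows is a minimal Python sketch, assuming the input keys shown above. &lt;code&gt;build_run_input&lt;/code&gt; and &lt;code&gt;run_actor&lt;/code&gt; are hypothetical helper names, and the run itself assumes the separately installed &lt;code&gt;apify-client&lt;/code&gt; package, whose &lt;code&gt;call&lt;/code&gt; method is assumed here to return the run as a dict.&lt;/p&gt;

```python
def build_run_input(slugs, max_reviews=500, date_from="2024-01-01"):
    """Build an Actor input dict from bare G2 product slugs.

    The key names mirror the JSON config above; the helper itself is
    illustrative and not part of the Actor.
    """
    return {
        "productUrls": [
            f"https://www.g2.com/products/{slug}/reviews" for slug in slugs
        ],
        "maxReviewsPerProduct": max_reviews,
        "minRating": 1,
        "maxRating": 5,
        "dateFrom": date_from,
        "language": "en",
        "includeSubRatings": True,
        "includeReviewerProfile": True,
    }


def run_actor(token, run_input):
    """Start the Actor, wait for it to finish, and return its dataset items."""
    # Imported lazily so build_run_input works without apify-client installed.
    from apify_client import ApifyClient

    client = ApifyClient(token)
    run = client.actor("zhorex/g2-reviews-scraper").call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())


run_input = build_run_input(
    ["salesforce-sales-cloud", "hubspot-sales-hub", "pipedrive"]
)
print(run_input["productUrls"][0])
```

&lt;p&gt;Calling &lt;code&gt;run_actor&lt;/code&gt; with an Apify API token and this input would return the review items as a list of dicts, ready for the processing steps described later.&lt;/p&gt;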



&lt;h2&gt;
  
  
  Output Example
&lt;/h2&gt;

&lt;p&gt;Here is one review item, trimmed to the fields most people care about. The full object has all 41 fields.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"reviewId"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"g2-8421930"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"productSlug"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"hubspot-sales-hub"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"productName"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"HubSpot Sales Hub"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"reviewTitle"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Great for SMB, feels cramped above 50 reps"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"starRating"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;3.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"subRatings"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"easeOfUse"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;4.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"qualityOfSupport"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;4.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"easeOfSetup"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;4.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"meetsRequirements"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;3.5&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"reviewLikes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Pipeline view is clean, sequences are easy to build, and the free tier got us started without procurement."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"reviewDislikes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Once we hit 60 reps the reporting module struggled. Forecasting is weaker than Salesforce and custom objects are limited."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"recommendations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Fine for teams under 50. Above that, evaluate Salesforce or Dynamics."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"reviewer"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"displayName"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Verified User in Software"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"jobTitle"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"RevOps Manager"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"industry"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Computer Software"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"companySize"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"51-200 employees"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"isVerified"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"publishedAt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2025-11-08T14:22:10Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"language"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"en"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"helpfulCount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"organic"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"reviewUrl"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://www.g2.com/products/hubspot-sales-hub/reviews/hubspot-sales-hub-review-8421930"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Fields include &lt;code&gt;reviewId&lt;/code&gt;, &lt;code&gt;productSlug&lt;/code&gt;, &lt;code&gt;productName&lt;/code&gt;, &lt;code&gt;starRating&lt;/code&gt;, five sub-ratings, &lt;code&gt;reviewLikes&lt;/code&gt;, &lt;code&gt;reviewDislikes&lt;/code&gt;, &lt;code&gt;recommendations&lt;/code&gt;, &lt;code&gt;problemsSolved&lt;/code&gt;, &lt;code&gt;benefitsRealized&lt;/code&gt;, reviewer display name, job title, industry, company size, region, validation status, &lt;code&gt;publishedAt&lt;/code&gt;, &lt;code&gt;updatedAt&lt;/code&gt;, &lt;code&gt;language&lt;/code&gt;, &lt;code&gt;helpfulCount&lt;/code&gt;, &lt;code&gt;source&lt;/code&gt;, &lt;code&gt;incentive&lt;/code&gt; (if the review was incentivized), and the canonical &lt;code&gt;reviewUrl&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Four Real Use Cases
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. SaaS Sales Displacement Plays
&lt;/h3&gt;

&lt;p&gt;A sales team selling a CRM builds a nightly job that pulls all 1- and 2-star reviews for three major competitors. Each review is piped through an LLM that extracts the specific complaint and the reviewer's company. The result is a prioritized outbound list where every lead comes with a documented pain point in their own words. Open rates on personalized sequences built from real G2 complaints routinely run 2-3x generic cold outbound.&lt;/p&gt;
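
&lt;p&gt;The post-processing half of that job can be sketched in a few lines of Python. This is an illustration only: it assumes the run input already set &lt;code&gt;minRating&lt;/code&gt; to 1 and &lt;code&gt;maxRating&lt;/code&gt; to 2, it uses the field names from the output example, the sample reviews are made up, and the LLM extraction step is left out. &lt;code&gt;to_leads&lt;/code&gt; is a hypothetical helper name.&lt;/p&gt;

```python
def to_leads(reviews):
    """Turn low-star review items into lead records, most-upvoted first.

    Assumes the Actor run was already restricted to 1- and 2-star reviews,
    so every item passed in is a complaint.
    """
    leads = [
        {
            "competitor": review["productSlug"],
            "complaint": review.get("reviewDislikes", ""),
            "jobTitle": review.get("reviewer", {}).get("jobTitle"),
            "companySize": review.get("reviewer", {}).get("companySize"),
            "reviewUrl": review.get("reviewUrl"),
            "helpfulCount": review.get("helpfulCount", 0),
        }
        for review in reviews
    ]
    return sorted(leads, key=lambda lead: lead["helpfulCount"], reverse=True)


# Made-up sample items shaped like the output example.
sample = [
    {
        "productSlug": "pipedrive",
        "starRating": 2.0,
        "reviewDislikes": "Reporting is thin.",
        "reviewer": {"jobTitle": "Sales Ops Lead", "companySize": "11-50 employees"},
        "helpfulCount": 3,
    },
    {
        "productSlug": "hubspot-sales-hub",
        "starRating": 1.5,
        "reviewDislikes": "Forecasting broke at scale.",
        "reviewer": {"jobTitle": "RevOps Manager", "companySize": "51-200 employees"},
        "helpfulCount": 12,
    },
]

for lead in to_leads(sample):
    print(lead["competitor"], "|", lead["complaint"])
```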

&lt;h3&gt;
  
  
  2. Category-Level Feature Gap Analysis
&lt;/h3&gt;

&lt;p&gt;A product manager at a marketing automation vendor scrapes every review in the "Marketing Automation" category filed in the last 12 months, thousands of reviews across dozens of products. She clusters the "dislikes" text with embeddings and counts cluster frequency per vendor. The result is a heatmap showing which features are consistent weak spots across the category (great roadmap input) and which are only weak for specific competitors (great competitive collateral).&lt;/p&gt;
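
&lt;p&gt;A rough version of that heatmap can be prototyped before any embedding work. The sketch below swaps the embedding clusters for plain keyword matching, which is a much cruder technique; the theme list and the sample reviews are invented for illustration, and only &lt;code&gt;productSlug&lt;/code&gt; and &lt;code&gt;reviewDislikes&lt;/code&gt; come from the output example.&lt;/p&gt;

```python
from collections import Counter, defaultdict

# Hypothetical theme keywords; a real pipeline would derive themes from
# embedding clusters instead of a hand-written list.
THEMES = {
    "reporting": ("report", "dashboard", "forecast"),
    "integrations": ("integration", "connector", "sync"),
    "pricing": ("price", "pricing", "expensive"),
}


def theme_counts(reviews):
    """Count, per product, how many reviews mention each theme in their dislikes."""
    counts = defaultdict(Counter)
    for review in reviews:
        text = review.get("reviewDislikes", "").lower()
        for theme, keywords in THEMES.items():
            # Substring matching: one hit per review per theme.
            if any(keyword in text for keyword in keywords):
                counts[review["productSlug"]][theme] += 1
    return counts


# Made-up sample items.
sample = [
    {"productSlug": "vendor-a", "reviewDislikes": "Reporting is slow and pricing is steep."},
    {"productSlug": "vendor-a", "reviewDislikes": "The Salesforce integration keeps breaking."},
    {"productSlug": "vendor-b", "reviewDislikes": "Too expensive for small teams."},
]

for product, themes in theme_counts(sample).items():
    print(product, dict(themes))
```

&lt;p&gt;Themes that score high for every vendor point at category-wide gaps; themes that score high for one vendor only are the competitor-specific ones.&lt;/p&gt;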

&lt;h3&gt;
  
  
  3. Churn and Renewal Risk Signals
&lt;/h3&gt;

&lt;p&gt;A customer success team at an observability vendor subscribes to a rolling scrape of their own product's reviews plus the top five competitors. Any new 1- or 2-star review mentioning an integration or feature their product covers gets routed to a Slack channel. It acts as an early-warning system for account risk and a real-time queue of switch-ready prospects.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Private Equity Due Diligence
&lt;/h3&gt;

&lt;p&gt;A PE analyst evaluating a SaaS acquisition scrapes 5 years of G2 history for the target and three comparable vendors. The trend of monthly average star rating, sub-rating deltas, and review volume growth becomes part of the investment memo. This is one of the few ways to reality-check the seller's narrative about product quality and market position.&lt;/p&gt;
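
&lt;p&gt;The monthly rating trend is a small aggregation over &lt;code&gt;publishedAt&lt;/code&gt; and &lt;code&gt;starRating&lt;/code&gt;. A minimal Python sketch, assuming those two fields arrive as in the output example (an ISO 8601 timestamp and a number); the sample values are invented:&lt;/p&gt;

```python
from collections import defaultdict
from statistics import mean


def monthly_average_rating(reviews):
    """Map "YYYY-MM" to the mean starRating of reviews published that month."""
    by_month = defaultdict(list)
    for review in reviews:
        # publishedAt looks like "2025-11-08T14:22:10Z"; the first seven
        # characters are the year and month.
        by_month[review["publishedAt"][:7]].append(review["starRating"])
    return {
        month: round(mean(ratings), 2)
        for month, ratings in sorted(by_month.items())
    }


# Made-up sample items.
sample = [
    {"publishedAt": "2025-10-03T09:00:00Z", "starRating": 4.0},
    {"publishedAt": "2025-10-20T16:30:00Z", "starRating": 3.0},
    {"publishedAt": "2025-11-08T14:22:10Z", "starRating": 3.5},
]

print(monthly_average_rating(sample))
```

&lt;p&gt;Running the same function per vendor gives comparable series for the target and its peers.&lt;/p&gt;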

&lt;h2&gt;
  
  
  Pricing
&lt;/h2&gt;

&lt;p&gt;The Actor is priced at &lt;strong&gt;$0.005 per review&lt;/strong&gt; scraped, billed through Apify. Platform compute usage is negligible because there is no browser.&lt;/p&gt;

&lt;p&gt;Worked examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;1,000 reviews: &lt;strong&gt;$5&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;10,000 reviews (a large product's full history): &lt;strong&gt;$50&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;100,000 reviews (a full category sweep): &lt;strong&gt;$500&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;1,000,000 reviews (multi-category enterprise pull): &lt;strong&gt;$5,000&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;
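
&lt;p&gt;The arithmetic behind those figures is a single multiplication. A small Python sketch for budgeting other volumes, assuming the flat per-review rate above and ignoring Apify platform usage; &lt;code&gt;estimate_cost&lt;/code&gt; is a hypothetical helper name:&lt;/p&gt;

```python
from decimal import Decimal

PRICE_PER_REVIEW = Decimal("0.005")  # USD, the per-review rate quoted above


def estimate_cost(review_count):
    """Return the Actor charge in USD for a given number of reviews."""
    return PRICE_PER_REVIEW * review_count


for count in (1_000, 10_000, 100_000, 1_000_000):
    print(f"{count:,} reviews: ${estimate_cost(count):,.2f}")
```

&lt;p&gt;&lt;code&gt;Decimal&lt;/code&gt; is used instead of floats so the totals stay exact at any volume.&lt;/p&gt;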

&lt;p&gt;Compare that to a DIY build where residential proxies for similar volumes can cost hundreds to thousands of dollars per month, plus engineering time to keep the anti-bot bypass alive. For most teams the break-even point is well under a week.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Is scraping G2 reviews legal?&lt;/strong&gt;&lt;br&gt;
The Actor accesses G2's public Elasticsearch API — no authentication is bypassed and no terms of service are circumvented. That said, how you use and redistribute the data is on you. Do not republish full review text as your own content, and respect GDPR if you process reviewer metadata for EU subjects. Always consult legal counsel for your specific use case.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Do I need proxies or an anti-bot solver?&lt;/strong&gt;&lt;br&gt;
No. The Actor uses G2's public review feed directly and does not trigger Cloudflare or DataDome. You do not need to supply proxies, browser fingerprints, or CAPTCHA solver tokens.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How fresh is the data?&lt;/strong&gt;&lt;br&gt;
Reviews are scraped live at run time. If a review was published five minutes before your run, it will be included. For ongoing monitoring, schedule the Actor hourly or daily via Apify Schedules.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is the rate limit?&lt;/strong&gt;&lt;br&gt;
Practically speaking, you are limited by Apify concurrency and the Actor's internal pacing, not by G2 blocking. Expect roughly 500-1,000 reviews per minute per run. Runs can be parallelized across products.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can I get all 41 fields or is that a premium tier?&lt;/strong&gt;&lt;br&gt;
All fields are included at the flat $0.005 per review rate. There is no feature-gated premium tier.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How do I export to my warehouse?&lt;/strong&gt;&lt;br&gt;
Apify exposes datasets as JSON, CSV, XLSX, JSONL, RSS, and HTML table, and has native integrations for S3, Google Drive, and webhooks. A common pattern is JSONL to S3, then a bulk load into the warehouse: &lt;code&gt;COPY INTO&lt;/code&gt; in Snowflake, or a load job in BigQuery.&lt;/p&gt;
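
&lt;p&gt;Warehouse loaders tend to be easier to work with when each row is flat. A possible pre-load step in Python, assuming review items shaped like the output example; &lt;code&gt;flatten&lt;/code&gt; and &lt;code&gt;write_jsonl&lt;/code&gt; are hypothetical helpers, and the underscore-joined column names are just one naming choice:&lt;/p&gt;

```python
import io
import json


def flatten(review):
    """Lift nested objects (subRatings, reviewer) into prefixed top-level keys."""
    row = {}
    for key, value in review.items():
        if isinstance(value, dict):
            for inner_key, inner_value in value.items():
                row[f"{key}_{inner_key}"] = inner_value
        else:
            row[key] = value
    return row


def write_jsonl(reviews, handle):
    """Write one flattened review per line to an open text handle."""
    for review in reviews:
        handle.write(json.dumps(flatten(review), ensure_ascii=False) + "\n")


# In-memory buffer for the demo; a real run would open a file instead.
buffer = io.StringIO()
write_jsonl(
    [
        {
            "reviewId": "g2-8421930",
            "starRating": 3.5,
            "subRatings": {"easeOfUse": 4.5},
            "reviewer": {"isVerified": True},
        }
    ],
    buffer,
)
print(buffer.getvalue())
```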

&lt;h2&gt;
  
  
  Pair It With Capterra for Full B2B Coverage
&lt;/h2&gt;

&lt;p&gt;G2 skews toward mid-market and enterprise SaaS. Capterra, owned by Gartner, leans more toward SMB and has broader coverage of vertical software (construction, healthcare, legal). For any serious competitive intel project, you want both. The companion Actor &lt;a href="https://apify.com/zhorex/capterra-reviews-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/capterra-reviews-scraper&lt;/code&gt;&lt;/a&gt; uses the same schema philosophy and pairs cleanly with this one in a single pipeline. If you are also tracking sentiment on Chinese-language software or consumer platforms, &lt;a href="https://apify.com/zhorex/weibo-scraper" rel="noopener noreferrer"&gt;&lt;code&gt;zhorex/weibo-scraper&lt;/code&gt;&lt;/a&gt; covers the APAC side.&lt;/p&gt;

&lt;h2&gt;
  
  
  Get Started
&lt;/h2&gt;

&lt;p&gt;The Actor page, full input schema, and a free trial run live at:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://apify.com/zhorex/g2-reviews-scraper" rel="noopener noreferrer"&gt;https://apify.com/zhorex/g2-reviews-scraper&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Drop in a product URL, run it, and you will have a clean JSON dataset in the Apify console in a couple of minutes. No proxy contract, no anti-bot cat-and-mouse, no maintenance bill at 3 AM.&lt;/p&gt;

</description>
      <category>webscraping</category>
      <category>api</category>
      <category>saas</category>
      <category>python</category>
    </item>
    <item>
      <title>Telegram Channel Scraper: Extract Messages, Media &amp; Metadata Without API Keys or Phone Numbers</title>
      <dc:creator>Sami</dc:creator>
      <pubDate>Wed, 15 Apr 2026 17:15:49 +0000</pubDate>
      <link>https://dev.to/sami_8858131362756585e4f4/telegram-channel-scraper-extract-messages-media-metadata-without-api-keys-or-phone-numbers-39a</link>
      <guid>https://dev.to/sami_8858131362756585e4f4/telegram-channel-scraper-extract-messages-media-metadata-without-api-keys-or-phone-numbers-39a</guid>
      <description>&lt;p&gt;&lt;strong&gt;Telegram&lt;/strong&gt; has 950+ million monthly active users and has become the go-to platform for crypto communities, news channels, research groups, and brand communications. But extracting data from &lt;strong&gt;Telegram channels&lt;/strong&gt; programmatically is surprisingly difficult — until now.&lt;/p&gt;

&lt;p&gt;This guide covers why &lt;strong&gt;Telegram scraping&lt;/strong&gt; matters, the technical challenges involved, and how to extract messages, views, reactions, and media from any &lt;strong&gt;public Telegram channel&lt;/strong&gt; without needing API keys, phone numbers, or Telegram bot tokens.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Scrape Telegram Channels?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Telegram&lt;/strong&gt; isn't just a messaging app — it's a broadcasting platform where organizations, communities, and influencers share real-time information. The data inside &lt;strong&gt;Telegram channels&lt;/strong&gt; is valuable for:&lt;/p&gt;

&lt;h3&gt;
  
  
  Business Intelligence
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Crypto &amp;amp; DeFi research&lt;/strong&gt; — Track token announcements, project updates, and community sentiment across hundreds of channels&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Brand monitoring&lt;/strong&gt; — See what's being said about your brand in Telegram communities&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Competitive intelligence&lt;/strong&gt; — Monitor competitor announcements and marketing strategies&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Research &amp;amp; OSINT
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Open-source intelligence (OSINT)&lt;/strong&gt; — Investigate public channels for geopolitical, security, or journalistic research&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Academic research&lt;/strong&gt; — Study information spread, community dynamics, and content patterns&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Market research&lt;/strong&gt; — Analyze consumer conversations and product discussions&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Marketing &amp;amp; Content
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Influencer discovery&lt;/strong&gt; — Find active Telegram channels in your niche with engaged audiences&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Content strategy&lt;/strong&gt; — Analyze which message formats, topics, and posting times drive the most views and engagement&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lead generation&lt;/strong&gt; — Identify potential customers discussing relevant topics&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The Challenge: Why Telegram Scraping Is Hard
&lt;/h2&gt;

&lt;p&gt;If you've tried to build a &lt;strong&gt;Telegram scraper&lt;/strong&gt;, you know the pain:&lt;/p&gt;

&lt;h3&gt;
  
  
  The Official API Requires Authentication
&lt;/h3&gt;

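&lt;p&gt;Reading channel history through Telegram's MTProto API, which Telethon and Pyrogram implement, requires all of the following. The Bot API is not a practical substitute: it needs only a bot token, but a bot sees a channel's posts only after it has been added to that channel.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A &lt;strong&gt;phone number&lt;/strong&gt; to create an account&lt;/li&gt;
&lt;li&gt;An &lt;strong&gt;API ID and API hash&lt;/strong&gt; from my.telegram.org&lt;/li&gt;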
&lt;li&gt;
&lt;strong&gt;Session management&lt;/strong&gt; with 2FA complications&lt;/li&gt;
&lt;li&gt;Compliance with strict rate limits&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For bulk data extraction, you risk &lt;strong&gt;account bans&lt;/strong&gt; if you exceed rate limits or trigger anti-abuse systems.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Telethon/Pyrogram Approach
&lt;/h3&gt;

&lt;p&gt;The most common DIY approach uses Python libraries like Telethon:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;telethon&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;TelegramClient&lt;/span&gt;

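&lt;span class="c1"&gt;# Requires API ID, API hash, AND a phone number
&lt;/span&gt;&lt;span class="c1"&gt;# Placeholder credentials: replace with the values from my.telegram.org
&lt;/span&gt;&lt;span class="n"&gt;api_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;123456&lt;/span&gt;
&lt;span class="n"&gt;api_hash&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;your_api_hash&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;TelegramClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;session&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;api_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;api_hash&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;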

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;scrape_channel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;channel_name&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;start&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;phone&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;+1234567890&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Phone verification required
&lt;/span&gt;    &lt;span class="n"&gt;channel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_entity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;channel_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_messages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;channel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;messages&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Problems with this approach:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You need to &lt;strong&gt;provide a real phone number&lt;/strong&gt; and verify with SMS&lt;/li&gt;
&lt;li&gt;Your account can get &lt;strong&gt;banned&lt;/strong&gt; if you scrape too aggressively&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Session management&lt;/strong&gt; is fragile — sessions expire, require re-auth&lt;/li&gt;
&lt;li&gt;You need to handle &lt;strong&gt;flood wait errors&lt;/strong&gt; and implement backoff logic&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Media downloads&lt;/strong&gt; require additional API calls per file&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  The Easy Way: No API Keys, No Phone Numbers
&lt;/h2&gt;

&lt;p&gt;The &lt;a href="https://apify.com/zhorex/telegram-channel-scraper" rel="noopener noreferrer"&gt;&lt;strong&gt;Telegram Channel Scraper&lt;/strong&gt;&lt;/a&gt; on Apify takes a completely different approach. Instead of using Telegram's authenticated API, it accesses &lt;strong&gt;public channel data&lt;/strong&gt; through Telegram's web preview endpoints — the same ones that power t.me preview pages.&lt;/p&gt;

&lt;h3&gt;
  
  
  What This Means For You
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No API key needed&lt;/strong&gt; — zero authentication required&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No phone number&lt;/strong&gt; — no account registration or SMS verification&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No proxy needed&lt;/strong&gt; — direct HTTP requests work without rotation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No account ban risk&lt;/strong&gt; — doesn't use an authenticated Telegram session&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No browser needed&lt;/strong&gt; — pure HTTP requests, fast and cheap&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  What Data Can You Extract?
&lt;/h3&gt;

&lt;p&gt;From any &lt;strong&gt;public Telegram channel&lt;/strong&gt;, you get:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Data Field&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Message text&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Full message content including formatting&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Views&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Number of views per message&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Reactions&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Emoji reactions with counts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Date &amp;amp; time&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Publication timestamp&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Media URLs&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Links to photos, videos, documents&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Forward info&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Original source if message was forwarded&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Reply info&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Which message it replies to&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Channel metadata&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Name, description, subscriber count, avatar&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  How To Use It
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"channelNames"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"duaborev"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"crypto_news"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"maxMessages"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"includeMedia"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. Enter channel names (or full t.me URLs), set your message limit, and hit Start.&lt;/p&gt;
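
&lt;p&gt;When channel lists come from mixed sources, it can help to normalize them before building the input. A minimal Python sketch, assuming the three input keys shown above; &lt;code&gt;normalize_channel&lt;/code&gt; and &lt;code&gt;build_input&lt;/code&gt; are hypothetical helpers, and the Actor may well accept raw URLs without this step:&lt;/p&gt;

```python
from urllib.parse import urlparse


def normalize_channel(value):
    """Reduce "@name", "name", or a t.me URL to a bare channel username."""
    value = value.strip()
    if "t.me/" in value:
        if "://" not in value:
            value = "https://" + value
        # The path looks like "/name" or "/s/name" (the public preview form).
        parts = [part for part in urlparse(value).path.split("/") if part]
        if parts and parts[0] == "s":
            parts = parts[1:]
        return parts[0] if parts else ""
    return value.lstrip("@")


def build_input(channels, max_messages=100, include_media=True):
    """Build the Actor input, dropping duplicate channels but keeping order."""
    names = [normalize_channel(channel) for channel in channels]
    return {
        "channelNames": list(dict.fromkeys(name for name in names if name)),
        "maxMessages": max_messages,
        "includeMedia": include_media,
    }


print(build_input(["@duaborev", "https://t.me/s/crypto_news", "t.me/crypto_news"]))
```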




&lt;h2&gt;
  
  
  Real-World Use Cases
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Crypto Research Dashboard
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Step 1: List 50 crypto project Telegram channels
Step 2: Schedule daily scraping → extract latest messages
Step 3: Run NLP sentiment analysis on message text
Step 4: Build dashboard tracking project activity and community sentiment
Step 5: Alert when channels mention specific tokens or events
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
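&lt;p&gt;Step 5 can be sketched in a few lines of Python. This is an illustration under assumptions: the &lt;code&gt;channel&lt;/code&gt; and &lt;code&gt;text&lt;/code&gt; keys are hypothetical names for fields in the exported items, and the watchlist is made up. Check the real dataset schema before relying on it.&lt;/p&gt;

```python
import re
from collections import Counter

WATCHLIST = ["BTC", "ETH", "airdrop"]  # hypothetical terms to alert on


def find_mentions(messages: list[dict], terms: list[str]) -> Counter:
    """Count messages per (channel, term) where the term appears as a whole word."""
    patterns = {t: re.compile(r"\b" + re.escape(t) + r"\b", re.IGNORECASE) for t in terms}
    hits = Counter()
    for msg in messages:
        text = msg.get("text") or ""  # media-only posts may have no text
        for term, pattern in patterns.items():
            if pattern.search(text):
                hits[(msg.get("channel"), term)] += 1
    return hits


# Sample items; "channel" and "text" are assumed field names
sample = [
    {"channel": "crypto_news", "text": "BTC breaks resistance, ETH follows"},
    {"channel": "crypto_news", "text": "New airdrop announced for btc holders"},
    {"channel": "duaborev", "text": "Weekly recap, nothing on tokens"},
]

for (channel, term), count in sorted(find_mentions(sample, WATCHLIST).items()):
    print(f"{channel}: {term} mentioned in {count} message(s)")
```

&lt;p&gt;Whole-word matching keeps short tickers from firing on unrelated words. From here, the counts can feed whatever alerting channel you already use.&lt;/p&gt;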



&lt;h3&gt;
  
  
  OSINT Investigation
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Step 1: Identify target public channels
Step 2: Extract full message history with timestamps
Step 3: Analyze posting patterns, forwarded sources, media
Step 4: Map information networks and content origin
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
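&lt;p&gt;For Step 4, one simple way to map content origin is to count how often each channel forwards from each source. The sketch below assumes each exported item has a &lt;code&gt;channel&lt;/code&gt; key and a &lt;code&gt;forwardedFrom&lt;/code&gt; key; both names are hypothetical stand-ins for the forward info the scraper returns, and the sample channels are invented.&lt;/p&gt;

```python
from collections import Counter


def forward_edges(messages: list[dict]) -> Counter:
    """Count (source, channel) pairs for every forwarded message."""
    edges = Counter()
    for msg in messages:
        source = msg.get("forwardedFrom")  # assumed field name; None for original posts
        if source:
            edges[(source, msg["channel"])] += 1
    return edges


# Invented sample data for illustration only
sample = [
    {"channel": "channel_a", "forwardedFrom": "origin_x"},
    {"channel": "channel_a", "forwardedFrom": "origin_x"},
    {"channel": "channel_b", "forwardedFrom": "origin_x"},
    {"channel": "channel_b", "forwardedFrom": "origin_y"},
    {"channel": "channel_b", "forwardedFrom": None},
]

edges = forward_edges(sample)
for (source, channel), count in edges.most_common():
    print(f"{source} -> {channel}: {count} forward(s)")
```

&lt;p&gt;The resulting edge list can be loaded into a graph tool to see which sources feed the most channels.&lt;/p&gt;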



&lt;h3&gt;
  
  
  Brand Monitoring
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Step 1: Search for channels mentioning your brand/product
Step 2: Schedule weekly scraping of relevant channels
Step 3: Track mention frequency, sentiment, and reach (views)
Step 4: Compare share of voice vs competitors
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
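&lt;p&gt;Steps 3 and 4 reduce to counting mentions and summing views per brand. A rough Python sketch, assuming hypothetical &lt;code&gt;text&lt;/code&gt; and &lt;code&gt;views&lt;/code&gt; keys on each exported message and two invented brand names:&lt;/p&gt;

```python
def share_of_voice(messages: list[dict], brands: list[str]) -> dict:
    """Return mentions, total views, and mention share for each brand."""
    totals = {b: {"mentions": 0, "views": 0} for b in brands}
    for msg in messages:
        text = (msg.get("text") or "").lower()
        for brand in brands:
            # Plain substring match: simple, but it will also match inside longer words
            if brand.lower() in text:
                totals[brand]["mentions"] += 1
                totals[brand]["views"] += msg.get("views") or 0
    all_mentions = sum(t["mentions"] for t in totals.values())
    for t in totals.values():
        t["share"] = t["mentions"] / all_mentions if all_mentions else 0.0
    return totals


# Invented sample data; "text" and "views" are assumed field names
sample = [
    {"text": "Acme launches new wallet", "views": 1000},
    {"text": "Is ACME better than Globex?", "views": 500},
    {"text": "acme outage today", "views": 200},
    {"text": "Unrelated post", "views": 50},
]

report = share_of_voice(sample, ["Acme", "Globex"])
for brand, stats in report.items():
    print(brand, stats)
```

&lt;p&gt;Share here is measured by mention count; weighting by views instead is a one-line change if reach matters more than frequency.&lt;/p&gt;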



&lt;h3&gt;
  
  
  Content Marketing Research
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Step 1: Scrape top channels in your niche
Step 2: Analyze which posts get highest views and reactions
Step 3: Identify optimal posting times and content formats
Step 4: Apply insights to your own Telegram channel strategy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
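&lt;p&gt;Step 3 can start as a simple group-by on posting hour. The sketch below assumes each message carries an ISO 8601 &lt;code&gt;date&lt;/code&gt; string and a numeric &lt;code&gt;views&lt;/code&gt; value; both key names are assumptions, and the sample data is invented.&lt;/p&gt;

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def avg_views_by_hour(messages: list[dict]) -> dict:
    """Average views per posting hour, using the hour in each timestamp's own offset."""
    buckets = defaultdict(list)
    for msg in messages:
        posted = datetime.fromisoformat(msg["date"])
        buckets[posted.hour].append(msg["views"])
    return {hour: mean(views) for hour, views in sorted(buckets.items())}


# Invented sample data; "date" and "views" are assumed field names
sample = [
    {"date": "2026-04-01T09:15:00+00:00", "views": 1200},
    {"date": "2026-04-02T09:40:00+00:00", "views": 1800},
    {"date": "2026-04-02T18:05:00+00:00", "views": 4000},
]

by_hour = avg_views_by_hour(sample)
best_hour = max(by_hour, key=by_hour.get)
print(by_hour)
print("Best hour:", best_hour)
```

&lt;p&gt;Older posts have had more time to collect views, so compare posts of similar age before drawing conclusions about timing.&lt;/p&gt;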






&lt;h2&gt;
  
  
  Why This Scraper vs. Alternatives
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Method&lt;/th&gt;
&lt;th&gt;API Key&lt;/th&gt;
&lt;th&gt;Phone #&lt;/th&gt;
&lt;th&gt;Ban Risk&lt;/th&gt;
&lt;th&gt;Speed&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Telethon (DIY)&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Slow (rate limits)&lt;/td&gt;
&lt;td&gt;Free + server costs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Pyrogram (DIY)&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Slow (rate limits)&lt;/td&gt;
&lt;td&gt;Free + server costs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bot API&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;Free + server costs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SaaS tools&lt;/td&gt;
&lt;td&gt;Varies&lt;/td&gt;
&lt;td&gt;Varies&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;$50-200+/mo&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&lt;a href="https://apify.com/zhorex/telegram-channel-scraper" rel="noopener noreferrer"&gt;Telegram Channel Scraper&lt;/a&gt;&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;No&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;No&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;None&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Fast&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$5/1K messages&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Key Advantages
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Zero setup friction&lt;/strong&gt; — No API registration, no phone verification, no OAuth flow. Enter channel names and go.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No ban risk&lt;/strong&gt; — Since it doesn't use an authenticated session, there's no account to ban.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pay per result&lt;/strong&gt; — $0.005 per message ($5 per 1,000). No monthly subscription.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scheduled runs&lt;/strong&gt; — Set up daily/weekly/hourly scraping directly in Apify.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Export anywhere&lt;/strong&gt; — JSON, CSV, Excel, XML. Integrate with Google Sheets, Zapier, Make, n8n.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Pricing
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Volume&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;th&gt;Per-Message&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;100 messages&lt;/td&gt;
&lt;td&gt;$0.50&lt;/td&gt;
&lt;td&gt;$0.005&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1,000 messages&lt;/td&gt;
&lt;td&gt;$5.00&lt;/td&gt;
&lt;td&gt;$0.005&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10,000 messages&lt;/td&gt;
&lt;td&gt;$50.00&lt;/td&gt;
&lt;td&gt;$0.005&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Free Apify trial includes credits to test before committing. No monthly fees.&lt;/p&gt;
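&lt;p&gt;Because pricing is linear, a budget estimate is one multiplication. The sketch below uses the $0.005 rate from the table and assumes, as an upper bound, that &lt;code&gt;maxMessages&lt;/code&gt; applies per channel and that every channel returns the full amount on every run. &lt;code&gt;estimate_cost&lt;/code&gt; is a hypothetical helper.&lt;/p&gt;

```python
from decimal import Decimal

PRICE_PER_MESSAGE = Decimal("0.005")  # USD, from the pricing table


def estimate_cost(channels: int, max_messages: int, runs: int = 1) -> Decimal:
    """Upper-bound cost in USD, assuming each channel returns max_messages per run."""
    return PRICE_PER_MESSAGE * channels * max_messages * runs


one_run = estimate_cost(channels=2, max_messages=100)
daily_month = estimate_cost(channels=50, max_messages=100, runs=30)
print(f"Example input, one run: ${one_run:.2f}")
print(f"50 channels, daily for 30 days: ${daily_month:.2f}")
```

&lt;p&gt;Under those assumptions the two-channel example input costs at most $1.00 per run, while the 50-channel daily schedule from the crypto dashboard use case tops out at $750 over 30 days. Lowering &lt;code&gt;maxMessages&lt;/code&gt; for frequent runs is the main lever.&lt;/p&gt;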




&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Does this scrape private Telegram channels?
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;No.&lt;/strong&gt; This scraper only accesses &lt;strong&gt;publicly available&lt;/strong&gt; Telegram channels — the same content anyone can view at t.me/channelname without a Telegram account.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can I scrape Telegram groups (not channels)?
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;No.&lt;/strong&gt; The scraper currently supports &lt;strong&gt;public channels&lt;/strong&gt; (one-way broadcast) only. Group chat scraping requires authenticated API access and is not supported.&lt;/p&gt;

&lt;h3&gt;
  
  
  Does it download media files?
&lt;/h3&gt;

&lt;p&gt;It extracts &lt;strong&gt;media URLs&lt;/strong&gt; (photos, videos, documents) so you can download them separately. The URLs point to Telegram's CDN and are publicly accessible.&lt;/p&gt;

&lt;h3&gt;
  
  
  How often can I run it?
&lt;/h3&gt;

&lt;p&gt;As often as you need. Schedule runs every hour for near-real-time monitoring, or daily/weekly for trend analysis.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is scraping Telegram channels legal?
&lt;/h3&gt;

&lt;p&gt;This scraper only accesses publicly available content that anyone can view without a Telegram account. It does not bypass authentication or access private data. Always comply with local regulations when using scraped data.&lt;/p&gt;




&lt;h2&gt;
  
  
  Get Started in 60 Seconds
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://apify.com/zhorex/telegram-channel-scraper" rel="noopener noreferrer"&gt;Go to the Telegram Channel Scraper →&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;"Try for free"&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Enter your channel names&lt;/li&gt;
&lt;li&gt;Hit Start → get structured JSON data&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;No API keys. No phone numbers. No account bans. Just clean &lt;strong&gt;Telegram channel data&lt;/strong&gt; ready for analysis.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://apify.com/zhorex/telegram-channel-scraper" rel="noopener noreferrer"&gt;Try it free on Apify →&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Part of the Zhorex scraper suite on Apify. Also check out: &lt;a href="https://apify.com/zhorex/rednote-xiaohongshu-scraper" rel="noopener noreferrer"&gt;RedNote Xiaohongshu Scraper&lt;/a&gt; · &lt;a href="https://apify.com/zhorex/kick-scraper" rel="noopener noreferrer"&gt;Kick.com Analytics&lt;/a&gt; · &lt;a href="https://apify.com/zhorex/g2-reviews-scraper" rel="noopener noreferrer"&gt;G2 Reviews Scraper&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>telegram</category>
      <category>webscraping</category>
      <category>python</category>
      <category>api</category>
    </item>
    <item>
      <title>Kick.com API Alternative: Extract Streamer Analytics, Live Streams &amp; VODs Without Authentication</title>
      <dc:creator>Sami</dc:creator>
      <pubDate>Wed, 15 Apr 2026 17:07:41 +0000</pubDate>
      <link>https://dev.to/sami_8858131362756585e4f4/kickcom-api-alternative-extract-streamer-analytics-live-streams-vods-without-authentication-4900</link>
      <guid>https://dev.to/sami_8858131362756585e4f4/kickcom-api-alternative-extract-streamer-analytics-live-streams-vods-without-authentication-4900</guid>
      <description>&lt;p&gt;Kick.com doesn't offer a public API. But if you need &lt;strong&gt;Kick streamer data&lt;/strong&gt; — profiles, live streams, VODs, clips, or channel rankings — there's a clean, structured way to get it without building your own scraper from scratch.&lt;/p&gt;

&lt;p&gt;This guide covers what data is available from &lt;strong&gt;Kick.com&lt;/strong&gt;, how the platform's internal API works, and the fastest way to extract &lt;strong&gt;Kick streaming analytics&lt;/strong&gt; at scale.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Kick.com Data Is Valuable in 2026
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Kick.com&lt;/strong&gt; has positioned itself as the creator-friendly alternative to Twitch, offering streamers a 95/5 revenue split (vs Twitch's 50/50). This has attracted a wave of major creators, making &lt;strong&gt;Kick&lt;/strong&gt; one of the fastest-growing &lt;strong&gt;live streaming platforms&lt;/strong&gt; globally.&lt;/p&gt;

&lt;p&gt;For businesses and analysts, this growth creates demand for structured &lt;strong&gt;Kick.com data&lt;/strong&gt;:&lt;/p&gt;

&lt;h3&gt;
  
  
  Who Needs Kick Data?
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Influencer marketing agencies&lt;/strong&gt; — Find the right &lt;strong&gt;Kick streamers&lt;/strong&gt; for brand deals based on viewer counts, categories, and engagement patterns&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Esports organizations&lt;/strong&gt; — Track competitive gaming viewership and identify rising talent&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Brand sponsors&lt;/strong&gt; — Monitor ROI of sponsorship deals across &lt;strong&gt;Kick channels&lt;/strong&gt; in real time&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Content analysts&lt;/strong&gt; — Discover which categories, stream formats, and clip styles perform best&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Competitive intelligence teams&lt;/strong&gt; — Compare Kick vs Twitch vs YouTube Gaming performance for specific creators&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  What Data Can You Extract?
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Data Type&lt;/th&gt;
&lt;th&gt;Examples&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Channel profiles&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Follower count, bio, avatar, social links, verified status, creation date&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Live streams&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Current viewers, stream title, category, start time, language, tags&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;VODs&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Past broadcasts with duration, views, category, thumbnail&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Clips&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Clip title, views, likes, duration, creator, video URL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Rankings&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Top channels by viewer count, filtered by category&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  The Problem: No Public Kick API
&lt;/h2&gt;

&lt;p&gt;Unlike Twitch (which has a documented API), &lt;strong&gt;Kick.com&lt;/strong&gt; doesn't offer a public developer API. The platform does use internal API endpoints that return JSON, but:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Endpoints change without notice&lt;/strong&gt; — Kick updates their internal routes regularly&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TLS fingerprinting&lt;/strong&gt; — Standard HTTP clients get blocked; you need browser-level TLS impersonation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rate limiting&lt;/strong&gt; — Too many requests from the same IP get throttled&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No documentation&lt;/strong&gt; — You're reverse-engineering endpoints from browser network traffic&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Building a DIY &lt;strong&gt;Kick scraper&lt;/strong&gt; means maintaining code that breaks every few weeks when Kick updates their internal API structure.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Solution: 4-in-1 Kick.com Analytics Actor
&lt;/h2&gt;

&lt;p&gt;The &lt;a href="https://apify.com/zhorex/kick-scraper" rel="noopener noreferrer"&gt;&lt;strong&gt;Kick.com Streamer &amp;amp; Channel Analytics&lt;/strong&gt;&lt;/a&gt; actor on Apify solves all of this. It handles TLS fingerprinting, rate limiting, and endpoint changes — and delivers clean JSON data through 4 scraping modes.&lt;/p&gt;

&lt;h3&gt;
  
  
  Mode 1: Channel Profiles (&lt;code&gt;channel_details&lt;/code&gt;)
&lt;/h3&gt;

&lt;p&gt;Get complete &lt;strong&gt;Kick streamer&lt;/strong&gt; profiles:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"channel_details"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"channelNames"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"xqc"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"amouranth"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"trainwreckstv"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Returns:&lt;/strong&gt; display name, bio, avatar, banner, follower count, live status, current viewers, category, stream title, verified badge, subscriber badges, social links (Instagram, Twitter, YouTube, Discord, TikTok), and creation date.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use case:&lt;/strong&gt; Build a &lt;strong&gt;Kick influencer database&lt;/strong&gt; with accurate follower counts and cross-platform social links for outreach campaigns.&lt;/p&gt;

&lt;h3&gt;
  
  
  Mode 2: Live Streams (&lt;code&gt;live_streams&lt;/code&gt;)
&lt;/h3&gt;

&lt;p&gt;Discover who's &lt;strong&gt;streaming live on Kick&lt;/strong&gt; right now:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"live_streams"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"category"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"just-chatting"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"minViewers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"maxResults"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sortBy"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"viewers"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Returns:&lt;/strong&gt; channel name, viewer count, stream title, category, start time, thumbnail, language, tags, and maturity rating.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use case:&lt;/strong&gt; Near-real-time &lt;strong&gt;Kick viewership monitoring&lt;/strong&gt;: schedule runs every 5 minutes for a live dashboard of who's streaming and how many viewers they have.&lt;/p&gt;

&lt;h3&gt;
  
  
  Mode 3: VODs &amp;amp; Clips (&lt;code&gt;channel_videos&lt;/code&gt;)
&lt;/h3&gt;

&lt;p&gt;Extract past broadcasts and &lt;strong&gt;Kick clips&lt;/strong&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"channel_videos"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"channelNames"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"xqc"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"videoType"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"clips"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"maxResults"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Returns:&lt;/strong&gt; clip ID, title, duration, views, likes, category, thumbnail, video URL, creator, and creation date.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use case:&lt;/strong&gt; Analyze what content formats get the most views and engagement on &lt;strong&gt;Kick&lt;/strong&gt; — identify optimal clip lengths, trending categories, and viral content patterns.&lt;/p&gt;

&lt;h3&gt;
  
  
  Mode 4: Channel Rankings (&lt;code&gt;top_channels&lt;/code&gt;)
&lt;/h3&gt;

&lt;p&gt;Get a ranked list of &lt;strong&gt;top Kick channels&lt;/strong&gt; by live viewers:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mode"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"top_channels"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sortBy"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"viewers"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"category"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"fortnite"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"maxResults"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Returns:&lt;/strong&gt; rank, channel name, bio, avatar, current viewers, stream title, category, affiliate status, and social links.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use case:&lt;/strong&gt; &lt;strong&gt;Competitive streaming intelligence&lt;/strong&gt; — see who's dominating each category on Kick, and how viewer distribution compares to Twitch.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Actor vs. DIY Scraping
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;DIY Python Scraper&lt;/th&gt;
&lt;th&gt;&lt;a href="https://apify.com/zhorex/kick-scraper" rel="noopener noreferrer"&gt;Kick.com Analytics&lt;/a&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Setup time&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Hours to days&lt;/td&gt;
&lt;td&gt;2 minutes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;TLS fingerprinting&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Must handle yourself&lt;/td&gt;
&lt;td&gt;Built-in&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Rate limit handling&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Your problem&lt;/td&gt;
&lt;td&gt;Handled automatically&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;API change updates&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;You maintain it&lt;/td&gt;
&lt;td&gt;Maintained for you&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;4 scraping modes&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Build each separately&lt;/td&gt;
&lt;td&gt;All included&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Data normalization&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Manual parsing&lt;/td&gt;
&lt;td&gt;Clean JSON output&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Scheduling&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Cron jobs + servers&lt;/td&gt;
&lt;td&gt;Built-in Apify scheduler&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cost&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Server + dev time&lt;/td&gt;
&lt;td&gt;$0.005/result ($5/1K)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Key Technical Advantages
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No API key needed&lt;/strong&gt;: it calls the unauthenticated internal JSON endpoints that Kick.com's own site uses&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No proxy needed&lt;/strong&gt; — direct HTTP requests work without proxy rotation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No browser needed&lt;/strong&gt; — pure HTTP with TLS impersonation, no Playwright/Puppeteer overhead&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lightweight&lt;/strong&gt; — runs on 256 MB RAM&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Structured output&lt;/strong&gt; — clean JSON, CSV, Excel, or XML export&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Real-World Workflow Examples
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Influencer Marketing: Find &amp;amp; Vet Kick Streamers
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. top_channels mode → get top 100 streamers in target category
2. channel_details mode → bulk-pull profiles for all 100
3. Filter by follower count, verified status, social links
4. channel_videos (clips) → check engagement on recent content
5. Build shortlist with data-backed partnership recommendations
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
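&lt;p&gt;Step 3 of this workflow is a plain filter over the profile results. A minimal Python sketch follows; the keys &lt;code&gt;followersCount&lt;/code&gt;, &lt;code&gt;verified&lt;/code&gt;, and &lt;code&gt;socialLinks&lt;/code&gt; are hypothetical names for the fields listed under Mode 1, and the thresholds and sample streamers are invented.&lt;/p&gt;

```python
def shortlist(profiles, min_followers=10_000, require_verified=True, required_socials=("instagram",)):
    """Keep profiles that pass every check, largest following first."""
    picked = []
    for profile in profiles:
        socials = profile.get("socialLinks") or {}
        enough_followers = (profile.get("followersCount") or 0) >= min_followers
        verified_ok = bool(profile.get("verified")) or not require_verified
        socials_ok = all(socials.get(name) for name in required_socials)
        if enough_followers and verified_ok and socials_ok:
            picked.append(profile)
    return sorted(picked, key=lambda p: p["followersCount"], reverse=True)


# Invented sample data; all key names are assumptions about the output schema
insta = {"instagram": "https://instagram.com/example"}
sample = [
    {"name": "streamer_a", "followersCount": 250_000, "verified": True, "socialLinks": insta},
    {"name": "streamer_b", "followersCount": 8_000, "verified": True, "socialLinks": insta},
    {"name": "streamer_c", "followersCount": 90_000, "verified": False, "socialLinks": insta},
    {"name": "streamer_d", "followersCount": 400_000, "verified": True, "socialLinks": {}},
    {"name": "streamer_e", "followersCount": 120_000, "verified": True, "socialLinks": insta},
]

picked_names = [p["name"] for p in shortlist(sample)]
print(picked_names)
```

&lt;p&gt;Keeping each check in its own variable makes it easy to log why a streamer was dropped, which helps when a client asks about a missing name.&lt;/p&gt;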



&lt;h3&gt;
  
  
  Esports: Track Tournament Viewership
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. live_streams mode → filter by game category (e.g., "valorant")
2. Schedule every 5 min during tournament hours
3. Export to Google Sheets via Apify integration
4. Build real-time viewership dashboard
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Content Strategy: Analyze What Works on Kick
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. channel_videos (clips) → extract top clips from 20 channels
2. Analyze: avg views by duration, category, time of day
3. Identify content patterns that drive viral clips
4. Apply insights to your own streaming or client strategy
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
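&lt;p&gt;The "avg views by duration" part of step 2 can be sketched by bucketing clips into fixed-width duration ranges. The &lt;code&gt;duration&lt;/code&gt; (seconds) and &lt;code&gt;views&lt;/code&gt; keys are assumed names for the clip fields listed under Mode 3, and the 15-second bucket width is an arbitrary choice.&lt;/p&gt;

```python
from collections import defaultdict
from statistics import mean


def avg_views_by_duration(clips: list[dict], bucket_seconds: int = 15) -> dict:
    """Average views per duration bucket, e.g. '0-14s', '15-29s'."""
    buckets = defaultdict(list)
    for clip in clips:
        start = (clip["duration"] // bucket_seconds) * bucket_seconds
        buckets[start].append(clip["views"])
    return {
        f"{start}-{start + bucket_seconds - 1}s": mean(views)
        for start, views in sorted(buckets.items())
    }


# Invented sample data; "duration" and "views" are assumed field names
sample = [
    {"duration": 12, "views": 900},
    {"duration": 14, "views": 1100},
    {"duration": 31, "views": 5000},
    {"duration": 58, "views": 300},
]

by_duration = avg_views_by_duration(sample)
print(by_duration)
```

&lt;p&gt;The same grouping works for category or time of day by swapping the bucket key.&lt;/p&gt;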






&lt;h2&gt;
  
  
  Cross-Platform Streaming Intelligence
&lt;/h2&gt;

&lt;p&gt;The same developer also offers a &lt;a href="https://apify.com/zhorex/twitch-scraper" rel="noopener noreferrer"&gt;&lt;strong&gt;Twitch Streamer &amp;amp; Channel Analytics&lt;/strong&gt;&lt;/a&gt; actor with 6 scraping modes. Combined with the Kick actor, you can build a complete &lt;strong&gt;cross-platform streaming analytics&lt;/strong&gt; pipeline:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Compare creator performance across Kick and Twitch&lt;/li&gt;
&lt;li&gt;Track viewer migration between platforms&lt;/li&gt;
&lt;li&gt;Identify creators who are growing faster on one platform vs. another&lt;/li&gt;
&lt;li&gt;Monitor category trends across both ecosystems&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Pricing
&lt;/h2&gt;

&lt;p&gt;Pay-per-result with no monthly fees:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Volume&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1,000 results&lt;/td&gt;
&lt;td&gt;$5.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10,000 results&lt;/td&gt;
&lt;td&gt;$50.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;100,000 results&lt;/td&gt;
&lt;td&gt;$500.00&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Free Apify trial includes credits to test before committing.&lt;/p&gt;
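&lt;p&gt;Frequent schedules add up quickly under per-result pricing, so it is worth estimating before enabling a 5-minute schedule. The sketch below uses the $0.005 rate and assumes every run returns, and is billed for, the full &lt;code&gt;maxResults&lt;/code&gt;; &lt;code&gt;scheduled_cost&lt;/code&gt; is a hypothetical helper.&lt;/p&gt;

```python
from decimal import Decimal

PRICE_PER_RESULT = Decimal("0.005")  # USD, i.e. $5 per 1,000 results


def scheduled_cost(interval_minutes: int, hours: int, max_results: int) -> Decimal:
    """Upper-bound cost in USD for polling at a fixed interval over a time window."""
    runs = (hours * 60) // interval_minutes
    return runs * max_results * PRICE_PER_RESULT


full_day = scheduled_cost(interval_minutes=5, hours=24, max_results=50)
tournament = scheduled_cost(interval_minutes=5, hours=6, max_results=50)
print(f"Every 5 min for 24 h: ${full_day:.2f}")
print(f"Every 5 min for 6 h: ${tournament:.2f}")
```

&lt;p&gt;Under those assumptions, polling 50 results every 5 minutes costs $72 for a full day, or $18 for a 6-hour tournament window. Widening the interval or lowering &lt;code&gt;maxResults&lt;/code&gt; scales the bill down linearly.&lt;/p&gt;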




&lt;h2&gt;
  
  
  Get Started
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;&lt;a href="https://apify.com/zhorex/kick-scraper" rel="noopener noreferrer"&gt;Go to the Kick.com Streamer &amp;amp; Channel Analytics actor →&lt;/a&gt;&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Click &lt;strong&gt;"Try for free"&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Select your mode, enter parameters, hit Start&lt;/li&gt;
&lt;li&gt;Get structured JSON data in seconds&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;No API key. No proxy. No browser. Just clean &lt;strong&gt;Kick.com data&lt;/strong&gt; ready for analysis, dashboards, or integration with your existing tools.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://apify.com/zhorex/kick-scraper" rel="noopener noreferrer"&gt;Try it free on Apify →&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>streaming</category>
      <category>api</category>
      <category>python</category>
      <category>webscraping</category>
    </item>
  </channel>
</rss>
