<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Gary Lee</title>
    <description>The latest articles on DEV Community by Gary Lee (@gary_lee_fa84612d248568a4).</description>
    <link>https://dev.to/gary_lee_fa84612d248568a4</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3959999%2Ffea002fe-2cda-4fb2-8f38-7527d10c9573.png</url>
      <title>DEV Community: Gary Lee</title>
      <link>https://dev.to/gary_lee_fa84612d248568a4</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/gary_lee_fa84612d248568a4"/>
    <language>en</language>
    <item>
      <title>How to Scrape RedNote (Xiaohongshu) Without Coding</title>
      <dc:creator>Gary Lee</dc:creator>
      <pubDate>Mon, 01 Jun 2026 03:27:52 +0000</pubDate>
      <link>https://dev.to/gary_lee_fa84612d248568a4/how-to-scrape-rednote-xiaohongshu-without-coding-b37</link>
      <guid>https://dev.to/gary_lee_fa84612d248568a4/how-to-scrape-rednote-xiaohongshu-without-coding-b37</guid>
      <description>&lt;p&gt;If you've tried to pull data from RedNote — the English name for Xiaohongshu (小红书) — you already know it's one of the harder social platforms to scrape. There's no public API, the mobile and web apps are heavily obfuscated, and most "tutorials" stop at a curl command that breaks within a week.&lt;/p&gt;

&lt;p&gt;This post covers why RedNote is hard to scrape, the three realistic ways to do it, and a no-code path if you don't want to maintain a scraper yourself.&lt;/p&gt;

&lt;p&gt;Why RedNote is harder than TikTok or Instagram&lt;/p&gt;

&lt;p&gt;A few things make Xiaohongshu a pain compared to other platforms:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Signed request headers. Every API call to edith.xiaohongshu.com needs valid x-s, x-t, and x-s-common headers. These are generated by an obfuscated JS function (window._webmsxyw) that changes periodically. Replay a captured header and you get a 461 / sign-error within minutes.&lt;/li&gt;
&lt;li&gt;Aggressive anti-bot. Hit the same endpoint a few times from a datacenter IP and you'll get a sliding-captcha or a silent empty response. Residential proxies + pacing are basically mandatory.&lt;/li&gt;
&lt;li&gt;No official API. Unlike YouTube or (historically) Twitter, there's no developer program. Everything is reverse-engineered from the web/app.&lt;/li&gt;
&lt;li&gt;Fast-moving frontend. The note detail payload structure changes, fields get renamed, and noteId ↔ xsec_token coupling means you often can't fetch a note without a fresh token from the feed it appeared in.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;So the real problem isn't writing the first request; it's keeping it working.&lt;/p&gt;
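
&lt;p&gt;The pacing requirement from the anti-bot point can be sketched as a delay schedule. This is a minimal sketch, assuming capped exponential backoff with full jitter is an acceptable pacing strategy. The name backoff_delays and its default values are illustrative, not tuned for Xiaohongshu, and the code makes no network calls.&lt;/p&gt;

```python
import random


def backoff_delays(attempts, base=2.0, cap=60.0, rng=None):
    """Return one delay (seconds) per attempt: capped exponential backoff with full jitter."""
    rng = rng or random.Random()
    delays = []
    for attempt in range(attempts):
        # The ceiling doubles on every attempt but never exceeds the cap.
        ceiling = min(cap, base * (2 ** attempt))
        # Full jitter: draw uniformly from [0, ceiling] so parallel workers do not sync up.
        delays.append(rng.uniform(0.0, ceiling))
    return delays


if __name__ == "__main__":
    # A seeded generator keeps the printed schedule reproducible between runs.
    schedule = backoff_delays(5, rng=random.Random(42))
    for number, delay in enumerate(schedule, start=1):
        print(f"attempt {number}: wait {delay:.1f}s")
```

&lt;p&gt;A caller would sleep for each delay before the next attempt. The jitter matters as much as the backoff, because evenly spaced requests are themselves an easy pattern to detect.&lt;/p&gt;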

&lt;p&gt;Option 1 — Roll your own (most control, most maintenance)&lt;/p&gt;

&lt;p&gt;The DIY stack usually looks like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A headless browser (Playwright) to log in and grab the signing context, or a reverse-engineered JS signer ported to Python/Node.&lt;/li&gt;
&lt;li&gt;A residential proxy pool with rotation.&lt;/li&gt;
&lt;li&gt;Retry + captcha-handling logic.&lt;/li&gt;
&lt;li&gt;A parser that survives field renames.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This works and gives you full control. The catch: you're now maintaining an anti-bot arms race. Most teams I've seen spend more time fixing the signer after a Xiaohongshu update than using the data. That's fine if scraping is your product, but overkill if you just need the data.&lt;/p&gt;
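
&lt;p&gt;The "parser that survives field renames" item can be sketched as an alias lookup. This is a minimal sketch under stated assumptions: the key names in FIELD_ALIASES are hypothetical examples of how a payload might rename fields, not a verified list of Xiaohongshu field names.&lt;/p&gt;

```python
def first_present(record, candidate_keys, default=None):
    """Return the value of the first candidate key that exists and is not None."""
    for key in candidate_keys:
        if key in record and record[key] is not None:
            return record[key]
    return default


# Hypothetical aliases: each stable output field lists names the payload might use.
FIELD_ALIASES = {
    "note_id": ["note_id", "noteId", "id"],
    "title": ["title", "display_title", "displayTitle"],
    "likes": ["liked_count", "likedCount", "likes"],
}


def normalize_note(raw):
    """Map a raw note payload onto stable field names, tolerating renames."""
    return {field: first_present(raw, keys) for field, keys in FIELD_ALIASES.items()}


if __name__ == "__main__":
    old_shape = {"noteId": "abc123", "title": "Hello", "likedCount": "42"}
    new_shape = {"note_id": "abc123", "display_title": "Hello", "liked_count": "42"}
    print(normalize_note(old_shape))
    print(normalize_note(new_shape))
```

&lt;p&gt;When a rename lands, the fix is one new alias in the table rather than a change to every call site. Missing fields come back as None instead of raising, so one renamed key does not take down the whole run.&lt;/p&gt;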

&lt;p&gt;Option 2 — Generic scraping platforms (Apify, Bright Data)&lt;/p&gt;

&lt;p&gt;Marketplaces like Apify have community "actors" for Xiaohongshu, and Bright Data sells a managed dataset/scraper. This offloads the maintenance.&lt;/p&gt;

&lt;p&gt;Trade-offs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cost. Bright Data in particular gets expensive fast at volume.&lt;/li&gt;
&lt;li&gt;Coverage gaps. Community actors break when Xiaohongshu updates and the fix depends on whoever maintains that actor.&lt;/li&gt;
&lt;li&gt;RedNote coverage specifically is thin. Most actors are TikTok/Instagram-first; Xiaohongshu support tends to lag.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Option 3 — A managed API (no code)&lt;/p&gt;

&lt;p&gt;If you just want clean JSON without running browsers or babysitting a signer, a managed scraping API is the no-code path. You send a profile URL or note ID, you get structured data back. Someone else eats the anti-bot maintenance.&lt;/p&gt;

&lt;p&gt;Things to check before picking one:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Does it actually cover RedNote/Xiaohongshu? Many "social scraping APIs" advertise TikTok + Instagram and quietly omit Xiaohongshu. Test the endpoint you actually need.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Profiles, posts, and comments?&lt;/strong&gt; Comments are where most competitor/audience analysis happens, and they're the first thing cheap APIs drop.&lt;/li&gt;
&lt;li&gt;Output format. You want flat, predictable JSON — not a raw HTML dump you have to parse again.&lt;/li&gt;
&lt;li&gt;Pricing model. Per-request beats per-compute-second for predictable cost.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We build SpiderHubs partly to fill the RedNote gap: one API across TikTok, Instagram, YouTube, Douyin and Xiaohongshu, returning profiles, posts and comments as clean JSON, positioned as an affordable Apify / Bright Data alternative. (Disclosure: I work on it.) But the checklist above applies to whatever you pick.&lt;/p&gt;
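
&lt;p&gt;To make the "flat, predictable JSON" item in the checklist concrete, here is a minimal sketch of flattening a nested response into one level of dotted keys. The sample payload and its field names are made up for illustration and do not reflect any particular API's response.&lt;/p&gt;

```python
import json


def flatten(obj, parent_key="", sep="."):
    """Flatten nested dicts into a single level, joining key paths with sep."""
    flat = {}
    for key, value in obj.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects and merge their flattened keys.
            flat.update(flatten(value, full_key, sep))
        else:
            # Scalars and lists are kept as-is under the joined key.
            flat[full_key] = value
    return flat


if __name__ == "__main__":
    # Hypothetical nested payload, used only to show the shape change.
    nested = {
        "note": {"id": "abc123", "stats": {"likes": 42, "comments": 7}},
        "author": {"name": "example_user"},
    }
    print(json.dumps(flatten(nested), indent=2))
```

&lt;p&gt;If an API already returns this shape, each record maps directly onto a spreadsheet row or a database column set, with no second parsing pass.&lt;/p&gt;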

&lt;p&gt;A no-code workflow if you just need the data once&lt;/p&gt;

&lt;p&gt;You don't always need an API. If it's a one-off pull:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Find the creator/topic feed you care about.&lt;/li&gt;
&lt;li&gt;Use a managed scraper or no-code monitoring tool to pull the latest posts + engagement into a sheet/JSON.&lt;/li&gt;
&lt;li&gt;Set it to re-run daily if you're tracking competitors over time; the daily delta is usually what you actually want, not a one-time dump.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That last point is the real reason most people scrape Xiaohongshu: tracking competitors and trending content over time, not a single snapshot. Whatever route you pick, design for the recurring pull, not the first request.&lt;/p&gt;
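
&lt;p&gt;The daily delta can be sketched as a comparison of two snapshots. This is a minimal sketch, assuming each day's pull is stored as a mapping from post ID to numeric engagement counters; the function name daily_delta and the metric names are illustrative.&lt;/p&gt;

```python
def daily_delta(previous, current):
    """Compare two snapshots keyed by post ID.

    Each snapshot maps post_id to a dict of numeric engagement counters.
    Returns new post IDs, removed post IDs, and per-post counter changes.
    """
    new_posts = sorted(set(current) - set(previous))
    removed_posts = sorted(set(previous) - set(current))
    changed = {}
    for post_id in set(current).intersection(previous):
        diff = {
            metric: current[post_id][metric] - previous[post_id].get(metric, 0)
            for metric in current[post_id]
        }
        # Keep only the metrics that actually moved since the last pull.
        diff = {metric: delta for metric, delta in diff.items() if delta != 0}
        if diff:
            changed[post_id] = diff
    return {"new": new_posts, "removed": removed_posts, "changed": changed}


if __name__ == "__main__":
    # Made-up snapshots for illustration.
    yesterday = {"p1": {"likes": 10, "comments": 2}, "p2": {"likes": 5, "comments": 0}}
    today = {"p1": {"likes": 15, "comments": 2}, "p3": {"likes": 1, "comments": 0}}
    print(daily_delta(yesterday, today))
```

&lt;p&gt;Storing each day's snapshot and diffing it against the previous one keeps the output small, and it surfaces the two things a tracker usually cares about: what was newly posted and what is gaining engagement.&lt;/p&gt;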

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsxttqov7hkjhvzad0ezh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fsxttqov7hkjhvzad0ezh.png" alt=" " width="800" height="441"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;div class="crayons-card c-embed text-styles text-styles--secondary"&gt;
    &lt;div class="c-embed__content"&gt;
        &lt;div class="c-embed__cover"&gt;
          &lt;a href="https://www.spiderhubs.com/" class="c-link align-middle" rel="noopener noreferrer"&gt;
            &lt;img alt="" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.spiderhubs.com%2Fopengraph-image%3F3c6e1c0542bf554e" height="630" class="m-0" width="1200"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="c-embed__body"&gt;
        &lt;h2 class="fs-xl lh-tight"&gt;
          &lt;a href="https://www.spiderhubs.com/" rel="noopener noreferrer" class="c-link"&gt;
            SpiderHubs | Automated Viral-Content Data Monitoring SaaS for Xiaohongshu, Douyin, and TikTok
          &lt;/a&gt;
        &lt;/h2&gt;
          &lt;p class="truncate-at-3"&gt;
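            SpiderHubs is a social media data monitoring SaaS for content creators, brand marketers, and data analysts. Every day it automatically crawls top creators and competitor content on major platforms including Xiaohongshu, Douyin, TikTok, YouTube, Instagram, and X/Twitter, and it supports bulk export of original videos, watermark-free assets, captions, and comments, with zero account risk.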
          &lt;/p&gt;
        &lt;div class="color-secondary fs-s flex items-center"&gt;
            &lt;img alt="favicon" class="c-embed__favicon m-0 mr-2 radius-0" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fwww.spiderhubs.com%2Ffavicon.ico%3Ffavicon.0b3bf435.ico" width="256" height="256"&gt;
          spiderhubs.com
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;


&lt;p&gt;TL;DR&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;RedNote is hard because of signed headers (x-s/x-t), aggressive anti-bot, and no official API.&lt;/li&gt;
&lt;li&gt;DIY = full control + permanent maintenance.&lt;/li&gt;
&lt;li&gt;Apify/Bright Data = less maintenance, but cost + thin Xiaohongshu coverage.&lt;/li&gt;
&lt;li&gt;Managed API = no code; just verify it actually covers Xiaohongshu (profiles + posts + comments) and returns clean JSON.&lt;/li&gt;
&lt;li&gt;Whatever you choose, build for the daily recurring pull, not the one-time request.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;What's your current setup for Xiaohongshu data — DIY signer, Apify, or something else? Curious what's holding up best after their recent updates.&lt;/p&gt;

</description>
      <category>automation</category>
      <category>socialmedia</category>
      <category>tutorial</category>
      <category>webscraping</category>
    </item>
  </channel>
</rss>
