How to Scrape RedNote (Xiaohongshu) Without Coding

#automation #socialmedia #tutorial #webscraping

If you've tried to pull data from RedNote — the English name for Xiaohongshu (小红书) — you already know it's one of the harder social platforms to scrape. There's no public API, the mobile and web apps are heavily obfuscated, and most "tutorials" stop at a curl command that breaks within a week.

This post covers why RedNote is hard to scrape, the three realistic ways to do it, and a no-code path if you don't want to maintain a scraper yourself.

Why RedNote is harder than TikTok or Instagram

A few things make Xiaohongshu a pain compared to other platforms:

Signed request headers. Every API call to edith.xiaohongshu.com needs valid x-s, x-t, and x-s-common headers. These are generated by an obfuscated JS function (window._webmsxyw) that changes periodically. Replay a captured header and you get a 461 / sign-error within minutes.
Aggressive anti-bot. Hit the same endpoint a few times from a datacenter IP and you'll get a sliding-captcha or a silent empty response. Residential proxies + pacing are basically mandatory.
No official API. Unlike YouTube or (historically) Twitter, there's no developer program. Everything is reverse-engineered from the web/app.
Fast-moving frontend. The note detail payload structure changes, fields get renamed, and noteId ↔ xsec_token coupling means you often can't fetch a note without a fresh token from the feed it appeared in.

So the real problem isn't writing the first request; it's keeping it working.
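To make those failure modes concrete, here is a minimal sketch of how a scraper might classify a response before deciding what to do next. The 461 status and the "silent empty response" come from the points above; the `Outcome` enum, the `classify_response` helper, and the assumption that a good payload carries a non-empty `data` object are all hypothetical, not a documented Xiaohongshu contract.

```python
from enum import Enum
from typing import Optional


class Outcome(Enum):
    OK = "ok"
    SIGN_ERROR = "sign_error"    # stale or invalid x-s / x-t headers
    BLOCKED = "blocked"          # captcha challenge or silent empty body
    OTHER_ERROR = "other_error"


def classify_response(status: int, body: Optional[dict]) -> Outcome:
    """Map a raw response onto the failure modes described above.

    Assumption: a successful payload has a non-empty "data" object.
    The real field names may differ and can change without notice.
    """
    if status == 461:
        return Outcome.SIGN_ERROR
    if status != 200:
        return Outcome.OTHER_ERROR
    if not body or not body.get("data"):
        # HTTP 200 with nothing useful in it: treat as a soft block.
        return Outcome.BLOCKED
    return Outcome.OK
```

Separating "re-sign" from "slow down or rotate IP" matters because the two failures need different fixes: a sign error means the signer is stale, while a soft block means the traffic pattern is the problem.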

Option 1 — Roll your own (most control, most maintenance)

The DIY stack usually looks like:

A headless browser (Playwright) to log in and grab the signing context, or a reverse-engineered JS signer ported to Python/Node.
A residential proxy pool with rotation.
Retry + captcha-handling logic.
A parser that survives field renames.

This works, and gives you full control. The catch: you're now maintaining an anti-bot arms race. Most teams I've seen spend more time fixing the signer after a Xiaohongshu update than using the data. Fine if scraping is your product, overkill if you just need the data.
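The last two pieces of that stack (retry logic and a rename-tolerant parser) can be sketched without touching the network. This is an illustration under assumptions, not a working Xiaohongshu client: the field names in `parse_note` are made up for the example, and `fetch` stands in for whatever signed request function you already have.

```python
import random
import time
from typing import Any, Callable, Iterable, Optional


def first_present(record: dict, aliases: Iterable[str], default: Any = None) -> Any:
    """Return the value of the first alias found in record, so a field
    rename only means adding one more alias, not rewriting the parser."""
    for key in aliases:
        if key in record and record[key] is not None:
            return record[key]
    return default


def parse_note(raw: dict) -> dict:
    # Hypothetical field names; the real payload keys are undocumented.
    return {
        "note_id": first_present(raw, ("note_id", "noteId", "id")),
        "title": first_present(raw, ("title", "display_title"), default=""),
        "likes": int(first_present(raw, ("liked_count", "likes"), default=0)),
    }


def fetch_with_retry(
    fetch: Callable[[], Optional[dict]],
    attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Call fetch() until it returns a payload, backing off exponentially
    with a little jitter. fetch() returns None on a failed attempt."""
    for attempt in range(attempts):
        result = fetch()
        if result is not None:
            return result
        if attempt < attempts - 1:
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.5))
    return None
```

Passing `sleep` as a parameter keeps the retry loop testable without real delays; in production you would leave the default.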

Option 2 — Generic scraping platforms (Apify, Bright Data)

Marketplaces like Apify have community "actors" for Xiaohongshu, and Bright Data sells a managed dataset/scraper. This offloads the maintenance.

Trade-offs:

Cost. Bright Data in particular gets expensive fast at volume.
Coverage gaps. Community actors break when Xiaohongshu updates and the fix depends on whoever maintains that actor.
RedNote coverage specifically is thin. Most actors are TikTok/Instagram-first; Xiaohongshu support tends to lag.

Option 3 — A managed API (no code)

If you just want clean JSON without running browsers or babysitting a signer, a managed scraping API is the no-code path. You send a profile URL or note ID, you get structured data back. Someone else eats the anti-bot maintenance.

Things to check before picking one:

Does it actually cover RedNote/Xiaohongshu? Many "social scraping APIs" advertise TikTok + Instagram and quietly omit Xiaohongshu. Test the endpoint you actually need.
Profiles, posts, and comments? Comments are where most competitor/audience analysis happens, and they're the first thing cheap APIs drop.
Output format. You want flat, predictable JSON — not a raw HTML dump you have to parse again.
Pricing model. Per-request beats per-compute-second for predictable cost.

We build SpiderHubs partly to fill the RedNote gap: one API across TikTok, Instagram, YouTube, Douyin and Xiaohongshu, returning profiles, posts and comments as clean JSON, positioned as an affordable Apify / Bright Data alternative. (Disclosure: I work on it.) But the checklist above applies to whatever you pick.
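On the output-format item in that checklist: if a provider hands back nested JSON, a small flattening step is usually enough to get sheet-friendly rows. The sketch below is generic Python, not tied to any provider's schema; the `flatten` helper and the sample keys in the usage note are illustrative only.

```python
from typing import Any


def flatten(obj: dict, prefix: str = "", sep: str = ".") -> dict:
    """Flatten nested dicts into one level with dotted keys.

    Lists are left as values so arrays such as comments stay intact.
    """
    flat: dict = {}
    for key, value in obj.items():
        path = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict) and value:
            flat.update(flatten(value, path, sep))
        else:
            flat[path] = value
    return flat
```

For example, `flatten({"note": {"id": "n1", "stats": {"likes": 3}}})` produces keys `note.id` and `note.stats.likes`, which map directly onto spreadsheet columns.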

A no-code workflow if you just need the data once

You don't always need an API. If it's a one-off pull:

Find the creator/topic feed you care about.
Use a managed scraper or no-code monitoring tool to pull the latest posts + engagement into a sheet/JSON.
Set it to re-run daily if you're tracking competitors over time. The daily delta is usually what you actually want, not a one-time dump.

That last point is the real reason most people scrape Xiaohongshu: tracking competitors and trending content over time, not a single snapshot. Whatever route you pick, design for the recurring pull, not the first request.
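If you do end up with daily snapshots as JSON, the delta itself is a few lines of code. A minimal sketch, assuming each snapshot is a list of records with `note_id` and `likes` keys (hypothetical names; use whatever your export actually contains):

```python
def daily_delta(yesterday: list, today: list) -> dict:
    """Compare two snapshots and report new notes plus like-count changes."""
    previous = {note["note_id"]: note for note in yesterday}

    new_notes = [note for note in today if note["note_id"] not in previous]

    changes = []
    for note in today:
        old = previous.get(note["note_id"])
        if old is None:
            continue  # already reported as a new note
        gained = note["likes"] - old["likes"]
        if gained != 0:
            changes.append({"note_id": note["note_id"], "likes_gained": gained})

    return {"new_notes": new_notes, "engagement_changes": changes}
```

Notes that disappear between snapshots are ignored here; tracking deletions would need a third list built from IDs present only in `yesterday`.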

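SpiderHubs | Automated viral-content data monitoring SaaS for Xiaohongshu, Douyin, and TikTok

SpiderHubs is a social media data monitoring SaaS for content creators, brand marketers, and data analysts: it automatically crawls top creators and competitor content every day across major platforms including Xiaohongshu, Douyin, TikTok, YouTube, Instagram, and X/Twitter, and supports bulk export of original videos, watermark-free assets, captions, and comments, with zero account risk.

spiderhubs.com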

TL;DR

RedNote is hard because of signed headers (x-s/x-t), aggressive anti-bot, and no official API.
DIY = full control + permanent maintenance.
Apify/Bright Data = less maintenance, but cost + thin Xiaohongshu coverage.
Managed API = no code; just verify it actually covers Xiaohongshu (profiles + posts + comments) and returns clean JSON.
Whatever you choose, build for the daily recurring pull, not the one-time request.

What's your current setup for Xiaohongshu data — DIY signer, Apify, or something else? Curious what's holding up best after their recent updates.