Weibo's "hot search" (热搜) is the closest thing China has to a real-time barometer of public attention. It updates every few minutes, ranks topics by an opaque heat score, and is where every news cycle, celebrity scandal, and viral product launch lands first. For brands, agencies, and researchers covering China, this feed is gold — and unlike most of Weibo, it's accessible without a single cookie.
This post is for anyone building a brand-monitoring, sentiment-tracking, or trend-discovery pipeline aimed at China.
Why hot search matters
Weibo (微博) is China's microblogging giant — 580M+ monthly active users. The hot search ranking is determined by Weibo's own engagement signals: a topic earns a spot when search volume, post creation, and engagement spike together within a short window.
That makes hot search a leading indicator for:
- PR crises: a brand mention reaches the top 50 within minutes of a viral video
- Product launches: launches by Apple, Tesla, Xiaomi, etc. typically hit the top 20 within an hour
- Cultural shifts: holiday spikes, generational slang, viral memes
- Geopolitics: state-affiliated topics surface predictably; their ranking velocity tells a story
If you're tracking China for any of these use cases, polling hot search every 5–15 minutes gives you sub-news-cycle response time.
What you actually get
Each hot search row exposes:
- rank (1–50)
- title (the search term itself, in Chinese)
- hotValue — an integer that approximates topical heat
- category (科技 = tech, 娱乐 = entertainment, 时尚 = fashion, etc.)
- labelName — content-moderation labels: 热 (hot), 新 (new), 沸 (boiling), 爆 (exploding)
- isHot flag
- url to the search results page on weibo.com
Sample row:
```json
{
  "rank": 1,
  "title": "人工智能最新突破",
  "category": "科技",
  "hotValue": 2847562,
  "labelName": "热",
  "isHot": true,
  "url": "https://s.weibo.com/weibo?q=%23..."
}
```
A minimal Python pipeline
```python
import time
from datetime import datetime, timezone

from apify_client import ApifyClient

client = ApifyClient("YOUR_APIFY_TOKEN")

def snapshot_hot_search():
    run = client.actor("zhorex/weibo-scraper").call(run_input={
        "mode": "hot_search",
        "maxResults": 50,
    })
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())

# Poll every 10 minutes and dedupe by title
seen = {}
while True:
    snap = snapshot_hot_search()
    ts = datetime.now(timezone.utc).isoformat()
    for row in snap:
        title = row["title"]
        if title not in seen or seen[title]["rank"] != row["rank"]:
            seen[title] = {"rank": row["rank"], "first_seen": ts}
            print(f"[{ts}] rank={row['rank']:>2} {title} heat={row['hotValue']}")
    time.sleep(600)
```
A small loop and you've built a brand-mention monitor.
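To turn that loop into an actual alerting hook, match each snapshot's titles against a watchlist of brand keywords. The helper below is my own sketch (the `match_brands` function and the example keywords are illustrative, not part of the scraper's output):

```python
def match_brands(snapshot, keywords):
    """Return hot-search rows whose title contains any watched keyword.

    snapshot: list of dicts shaped like the hot_search output rows.
    keywords: iterable of brand strings (Chinese or Latin script).
    """
    hits = []
    for row in snapshot:
        matched = [kw for kw in keywords if kw in row["title"]]
        if matched:
            hits.append({"rank": row["rank"], "title": row["title"], "matched": matched})
    return hits

# Example against a fake snapshot
snap = [
    {"rank": 3, "title": "小米新品发布会", "hotValue": 1200000},
    {"rank": 17, "title": "周末天气", "hotValue": 400000},
]
print(match_brands(snap, ["小米", "特斯拉"]))
```

Plug `match_brands(snap, brands)` into the polling loop and route any non-empty result to Slack, email, or wherever your alerts live.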
Common patterns I see customers run
1. Brand watch. Match new hot-search titles against a list of brand keywords. Trigger alerts when a brand name enters top 50.
2. Velocity tracking. Compute the rank-change velocity per topic. Topics that jump from rank 40 → 5 in under 30 minutes are early-warning signals for going viral.
3. Category drift. Track which categories dominate hot search hour-by-hour. Useful for media planning and ad targeting timing.
4. Cross-platform correlation. Pair Weibo hot search with Bilibili trending and RedNote search to detect cross-platform memes early. The platforms are surprisingly correlated 1–6 hours apart.
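Pattern 2, velocity tracking, reduces to comparing two snapshots. A minimal sketch (the `rank_velocity` helper is mine; it assumes the row shape shown earlier):

```python
def rank_velocity(prev, curr, minutes):
    """Ranks climbed per minute for titles present in both snapshots.

    Positive values mean the topic is climbing (its rank number is falling).
    """
    prev_ranks = {r["title"]: r["rank"] for r in prev}
    return {
        r["title"]: (prev_ranks[r["title"]] - r["rank"]) / minutes
        for r in curr
        if r["title"] in prev_ranks
    }

prev = [{"title": "话题A", "rank": 40}]
curr = [{"title": "话题A", "rank": 5}]
print(rank_velocity(prev, curr, 30))  # 话题A climbed 35 ranks in 30 min, ≈ 1.17/min
```

Alert on any topic whose velocity exceeds a threshold you calibrate from your own archive, e.g. more than one rank per minute sustained across two polls.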
Going deeper: posts and comments
Hot search gives you topics. To go deeper into actual conversation, pivot from a hot title to its underlying posts:
```python
# After identifying a hot topic, search posts about it
posts_run = client.actor("zhorex/weibo-scraper").call(run_input={
    "mode": "search",
    "searchQuery": "人工智能最新突破",
    "maxResults": 100,
})
```
That returns post-level data: text, author, like/repost/comment counts, embedded images, and post URLs. Pair with mode: post_comments to harvest reactions.
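Once you have comment rows, a common first pass is surfacing the most-liked reactions. A sketch, with the caveat that the field names (`text`, `likeCount`) are my assumptions about the comment schema; verify them against a real run:

```python
def top_comments(comments, n=3):
    """Pick the n most-liked comment rows.

    Field names (text, likeCount) are assumed, not confirmed against
    the post_comments output schema.
    """
    return sorted(comments, key=lambda c: c.get("likeCount", 0), reverse=True)[:n]

fake = [
    {"text": "太厉害了", "likeCount": 412},
    {"text": "一般般", "likeCount": 8},
    {"text": "沙发"},  # no likeCount key: treated as 0
]
print(top_comments(fake, n=2))
```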
Why a hosted scraper, not raw scraping
Weibo's public web endpoints work without login for most read paths, but they require a visitor session token (Sina Visitor System) and exponential backoff on throttling responses. A naive requests script will either get throttled within a hundred calls or silently pull empty arrays without your code ever noticing.
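If you do roll your own client, the backoff logic that paragraph describes looks roughly like this. It's a generic retry sketch, nothing here is Weibo-specific, and `fetch` stands in for your own request function:

```python
import time

def with_backoff(fetch, max_retries=5, base_delay=1.0):
    """Retry fetch() with exponential backoff on throttled/empty responses.

    fetch is assumed to return a falsy value (None, empty list) when the
    server throttles or returns an empty payload.
    """
    for attempt in range(max_retries):
        result = fetch()
        if result:  # non-empty payload: success
            return result
        time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, 8s, ...
    raise RuntimeError("still throttled after retries")
```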
The Weibo Scraper on Apify handles session bootstrap, throttling, retries, and consistent schema across modes (hot_search, post_comments, search, user_posts). Pure HTTP — no browser, no proxy required.
Pricing is pay-per-event: $0.005 per item. 1,000 items = $5. The free Apify tier covers 1,000 items/month.
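Sizing the bill for a polling pipeline is simple arithmetic. A small helper of my own, using the $0.005/item figure above:

```python
def monthly_cost(items_per_snapshot=50, minutes_between=10,
                 price_per_item=0.005, days=30):
    """Estimated item count and spend for polling at a fixed cadence."""
    snapshots = (60 // minutes_between) * 24 * days
    items = snapshots * items_per_snapshot
    return items, items * price_per_item

items, cost = monthly_cost()
print(items, cost)  # 216000 items, $1080/month at 10-minute polling
```

Dropping to hourly polling cuts that to 36,000 items (~$180/month), which is plenty for trend discovery if you don't need minute-level crisis detection.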
FAQ
Is hot search censored? Some topics are rate-limited or removed by Weibo's moderation. The labelName field hints at moderation state. You'll see topics appear and disappear.
Can I get historical hot search? Not via Weibo directly — they don't expose archives. You build your own archive by snapshotting at intervals.
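Building that archive takes only a few lines: append each snapshot to a JSON Lines file with a capture timestamp. A sketch (the file layout and `captured_at` field name are my choices):

```python
import json
from datetime import datetime, timezone

def archive_snapshot(rows, path="hot_search_archive.jsonl"):
    """Append one hot-search snapshot to a JSONL archive, one row per line."""
    ts = datetime.now(timezone.utc).isoformat()
    with open(path, "a", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps({"captured_at": ts, **row}, ensure_ascii=False) + "\n")
```

JSONL appends are cheap and crash-safe for this volume; rotate the file daily and you can replay any day's ranking history later.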
What about session tokens? They expire periodically. Hosted scrapers refresh them automatically; if you DIY, plan for re-auth.
Is scraping Weibo legal? This accesses publicly visible data. No authentication is bypassed. Always check your local laws and Weibo's ToS.
Building a Chinese intelligence stack?
I maintain the full suite for production pipelines:
- Weibo Scraper — (this one)
- Bilibili Scraper — China's YouTube, 300M MAU
- RedNote (Xiaohongshu) Scraper — lifestyle social
- RedNote Shop Scraper — Xiaohongshu e-commerce
Running 50K+ items per month? I offer custom output schemas, dedicated proxy pools, SLA, and volume pricing. DM me on Apify or open an Issue titled "Enterprise inquiry".
Found a bug? Open an Issue and I usually ship fixes within 48 hours.
A 30-second review on the Apify Store helps other users find this tool. ⭐