agenthustler

Posted on Mar 26 • Edited on Apr 19

How to Scrape TikTok in 2026: Videos, Profiles, and Hashtags

#python #webdev #tutorial #webscraping

TikTok has become one of the most data-rich platforms on the internet. Whether you're tracking trending hashtags, analyzing creator performance, or building a content aggregation tool, scraping TikTok data is a common need in 2026.

But TikTok doesn't make it easy. In this guide, I'll walk you through the challenges and show you working Python code to extract videos, profiles, and hashtag data.

Why Scraping TikTok Is Hard

TikTok's frontend is heavily JavaScript-rendered. A simple requests.get() returns a mostly empty HTML shell, because the actual content loads dynamically through TikTok's internal API calls.

On top of that, TikTok employs aggressive anti-bot measures:

Browser fingerprinting — they check canvas, WebGL, and navigator properties
Rate limiting — rapid requests from the same IP get blocked fast
CAPTCHAs — automated access triggers interactive challenges
Signed API requests — internal API calls require dynamic signatures that change frequently

Approach 1: Using Playwright for Browser-Based Scraping

The most reliable way to scrape TikTok is with a real browser, driven by a tool such as Playwright. The implementation is proprietary and is not included in this post; the ready-made Apify actor covered under "The Easy Way" below is offered in its place.

The technique reads the hydration data that TikTok embeds in the page, so there is no need to parse the rendered DOM.
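To make the hydration-data idea concrete, here's a rough sketch of my own, not the proprietary implementation. It assumes the embedded JSON sits in a script tag whose id is `__UNIVERSAL_DATA_FOR_REHYDRATION__`; TikTok has renamed this before, so treat the id as an assumption and confirm it in your browser's dev tools. The function names are hypothetical.

```python
import json
from html.parser import HTMLParser

# Assumption: id of the <script> tag that holds the embedded JSON.
# Verify it against the live page before relying on it.
HYDRATION_SCRIPT_ID = "__UNIVERSAL_DATA_FOR_REHYDRATION__"


class _ScriptGrabber(HTMLParser):
    """Collects the text inside the <script> tag with a given id."""

    def __init__(self, script_id):
        super().__init__()
        self.script_id = script_id
        self._inside = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("id") == self.script_id:
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self.chunks.append(data)


def extract_hydration_data(html, script_id=HYDRATION_SCRIPT_ID):
    """Return the embedded JSON as a dict, or None if the tag is missing."""
    grabber = _ScriptGrabber(script_id)
    grabber.feed(html)
    raw = "".join(grabber.chunks).strip()
    return json.loads(raw) if raw else None


def fetch_page_html(url):
    """Load a page in headless Chromium and return its HTML."""
    # Imported lazily so the parsing helper works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url, wait_until="domcontentloaded")
        html = page.content()
        browser.close()
    return html
```

You would chain the two as `extract_hydration_data(fetch_page_html(profile_url))` and then explore the returned dict. I'm deliberately not guessing at the key layout inside it, since that changes over time.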

Approach 2: Intercepting TikTok's Internal API

For bulk data collection, intercepting the page's network requests is more efficient. The implementation is likewise proprietary and not included here; see the Apify actor under "The Easy Way" below.
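As a hedged illustration of request interception (again, my own sketch rather than the proprietary code), you can register a Playwright response listener and keep only the JSON responses that look like API calls. The `/api/` path marker is an assumption, as are the function names; check the Network tab to see what the page really requests.

```python
from urllib.parse import urlparse

# Assumption: internal JSON endpoints have "/api/" somewhere in the path.
API_PATH_MARKER = "/api/"


def looks_like_api_call(url, marker=API_PATH_MARKER):
    """True if the URL path (not the query string) contains the marker."""
    return marker in urlparse(url).path


def collect_api_payloads(page_url, scrolls=3):
    """Open a page, scroll a few times, and return the captured JSON payloads."""
    # Imported lazily so the URL filter can be used without Playwright installed.
    from playwright.sync_api import sync_playwright

    captured = []

    def on_response(response):
        if not looks_like_api_call(response.url):
            return
        try:
            captured.append({"url": response.url, "data": response.json()})
        except Exception:
            # Body was not JSON or was no longer available; skip it.
            pass

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.on("response", on_response)
        page.goto(page_url, wait_until="domcontentloaded")
        for _ in range(scrolls):
            page.mouse.wheel(0, 4000)    # scroll down to trigger lazy loading
            page.wait_for_timeout(1500)  # give in-flight requests time to finish
        browser.close()
    return captured
```

Because the browser itself issues the requests, this sidesteps generating the dynamic signatures by hand. It does not sidestep fingerprinting or rate limits.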

The Proxy Problem

Running these scripts from a single IP will get you blocked within minutes. TikTok's anti-bot system tracks request patterns aggressively.

You need residential proxies — IPs that look like real home internet connections. I recommend ThorData for this. Their residential proxy pool works well with TikTok because the IPs rotate automatically and come from real ISP ranges.

Here's how to add proxy support to the Playwright script:

browser = await p.chromium.launch(
    headless=True,
    proxy={
        "server": "http://proxy.thordata.com:9090",
        "username": "your-username",
        "password": "your-password"
    }
)
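If you'd rather not hardcode credentials, one option is to build the proxy settings from environment variables. This is a small sketch; the variable names `PROXY_SERVER`, `PROXY_USERNAME`, and `PROXY_PASSWORD` are my own hypothetical choices, not something ThorData or Playwright defines.

```python
import os


def proxy_settings_from_env(env=os.environ):
    """Build a Playwright-style proxy dict from env vars, or None if unset."""
    server = env.get("PROXY_SERVER")  # hypothetical variable name
    if not server:
        return None
    settings = {"server": server}
    if env.get("PROXY_USERNAME"):
        settings["username"] = env["PROXY_USERNAME"]
        settings["password"] = env.get("PROXY_PASSWORD", "")
    return settings
```

You can then pass `proxy=proxy_settings_from_env()` to `p.chromium.launch(...)`. When no server is configured the function returns None, and the browser launches without a proxy.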

For a simpler approach, ScraperAPI handles the proxy rotation and browser rendering for you: send your URL through their endpoint and get back rendered HTML. The implementation is not included in this post; see the Apify actor under "The Easy Way" below.
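For orientation, here's a minimal sketch of what "sending your URL through their endpoint" can look like. It assumes ScraperAPI's query-parameter interface (`api_key`, `url`, and `render=true` for JavaScript rendering) as I understand it from their public docs; confirm the parameters against the current documentation. The helper names are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: ScraperAPI's GET endpoint; check their docs for the current form.
SCRAPERAPI_ENDPOINT = "https://api.scraperapi.com/"


def build_scraperapi_url(api_key, target_url, render=True):
    """Build the request URL that asks ScraperAPI to fetch target_url."""
    params = {"api_key": api_key, "url": target_url}
    if render:
        params["render"] = "true"  # ask for JavaScript rendering
    return SCRAPERAPI_ENDPOINT + "?" + urlencode(params)


def fetch_rendered_html(api_key, target_url, timeout=70):
    """Fetch target_url through ScraperAPI and return the response body as text."""
    request_url = build_scraperapi_url(api_key, target_url)
    with urlopen(request_url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

The generous timeout is there because rendered requests can take a while. The returned HTML can then be parsed the same way as a page loaded in your own browser.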

Downloading TikTok Videos

Once you have video metadata, downloading the actual video files requires extracting the video URL, which points at TikTok's CDN. The implementation is not included in this post; see the Apify actor under "The Easy Way" below.
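Here's a hedged sketch of the download step only, assuming you've already pulled a direct video URL out of the metadata. Where that URL lives in the metadata isn't covered here. The `Referer` and `User-Agent` headers are assumptions about what the CDN may expect, and `download_video` is a hypothetical helper name.

```python
import shutil
from pathlib import Path
from urllib.request import Request, urlopen

# Assumption: the CDN may reject requests without browser-like headers.
DEFAULT_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/124.0 Safari/537.36"
    ),
    "Referer": "https://www.tiktok.com/",
}


def download_video(video_url, dest_path, headers=None, timeout=30):
    """Stream video_url to dest_path and return the number of bytes written."""
    dest = Path(dest_path)
    dest.parent.mkdir(parents=True, exist_ok=True)
    request = Request(video_url, headers=headers or DEFAULT_HEADERS)
    # Copy in chunks so large videos are never held fully in memory.
    with urlopen(request, timeout=timeout) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out, length=256 * 1024)
    return dest.stat().st_size
```

CDN links tend to expire, so it's safer to download soon after extracting the URL than to store the link for later.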

The Easy Way: Pre-Built TikTok Scraper

Building and maintaining a TikTok scraper is a constant battle against their anti-bot updates. If you need reliable, production-grade scraping, I'd recommend using the TikTok Scraper on Apify. It handles all the browser rendering, proxy rotation, and anti-detection out of the box.

You can run it via the Apify API:

from apify_client import ApifyClient

client = ApifyClient("YOUR_APIFY_TOKEN")

run = client.actor("cryptosignals/tiktok-scraper").call(
    run_input={
        "profiles": ["khaby.lame", "charlidamelio"],
        "hashtags": ["python", "coding"],
        "maxVideos": 50
    }
)

for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

It handles CAPTCHAs, rotates proxies automatically, and returns clean structured data. Pay-per-use pricing means you only pay for what you scrape.

Rate Limiting Best Practices

Whether you build your own scraper or use a tool, follow these practices:

Add random delays of at least 2-5 seconds between requests
Rotate user agents on every request
Use residential proxies — datacenter IPs get flagged instantly
Don't scrape logged-in pages — TikTok monitors authenticated sessions more closely
Cache aggressively — don't re-scrape data you already have
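Three of the practices above (random delays, user-agent rotation, and caching) fit into one small wrapper. This is a sketch under my own assumptions: the user-agent strings are illustrative samples, the cache is a plain in-memory dict, and `fetch` is whatever function you already use to retrieve a URL.

```python
import random
import time

# Illustrative sample strings; swap in a list you maintain yourself.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/124.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.4 Safari/605.1.15",
    "Mozilla/5.0 (X11; Linux x86_64; rv:125.0) Gecko/20100101 Firefox/125.0",
]


def polite_delay(min_s=2.0, max_s=5.0, sleep=time.sleep):
    """Sleep for a random duration in [min_s, max_s] and return it."""
    delay = random.uniform(min_s, max_s)
    sleep(delay)
    return delay


def random_headers():
    """Return request headers with a randomly chosen user agent."""
    return {"User-Agent": random.choice(USER_AGENTS)}


class CachedFetcher:
    """Wraps fetch(url, headers) with a delay, rotating headers, and a cache."""

    def __init__(self, fetch, sleep=time.sleep):
        self._fetch = fetch
        self._sleep = sleep
        self._cache = {}

    def get(self, url):
        if url in self._cache:
            return self._cache[url]  # cache hit: no delay, no request
        polite_delay(sleep=self._sleep)
        body = self._fetch(url, random_headers())
        self._cache[url] = body
        return body
```

The `sleep` parameter exists so you can swap in a no-op while testing. For long-running jobs you'd likely want a cache that persists to disk, which this sketch leaves out.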

Conclusion

Scraping TikTok in 2026 requires a browser-based approach with solid proxy infrastructure. The Playwright-based approaches above will get you started, but for production workloads, consider using residential proxies from ThorData or a managed scraping service like ScraperAPI.

If you want to skip the infrastructure headaches entirely, the TikTok Scraper on Apify handles everything end-to-end.

Happy scraping!

DEV Community