DEV Community

agenthustler

How to Scrape TikTok: Videos, Profiles, and Trending Content

TikTok's rapid growth makes it a prime target for data analysis. This guide covers practical approaches to collecting TikTok data for research and analytics.

The Challenge with TikTok

TikTok has aggressive anti-scraping measures:

  • Heavy JavaScript rendering
  • Device fingerprinting
  • Encrypted API parameters
  • Frequent anti-bot updates

Approach 1: Web Endpoint Data Extraction

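TikTok's web app embeds its page data as JSON inside a script tag with the id __UNIVERSAL_DATA_FOR_REHYDRATION__. You can extract it with a plain HTTP request: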

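import json
import re

import requests


class TikTokScraper:
    BASE_URL = "https://www.tiktok.com"

    def __init__(self):
        # Reuse one session so cookies and headers persist across requests.
        self.session = requests.Session()
        self.session.headers.update({
            "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
                          "AppleWebKit/537.36 (KHTML, like Gecko) "
                          "Chrome/120.0.0.0 Safari/537.36",
            "Referer": "https://www.tiktok.com/",
        })

    def get_user_info(self, username):
        """Return the JSON embedded in a profile page, or None if it is missing."""
        url = f"{self.BASE_URL}/@{username}"
        # A timeout keeps a stalled connection from hanging the scraper.
        response = self.session.get(url, timeout=15)
        if response.status_code != 200:
            return None
        # The page state is serialized into a <script> tag for client-side hydration.
        match = re.search(
            r'<script id="__UNIVERSAL_DATA_FOR_REHYDRATION__"[^>]*>(.*?)</script>',
            response.text,
            re.DOTALL,
        )
        if match is None:
            return None
        try:
            return json.loads(match.group(1))
        except json.JSONDecodeError:
            return None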
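
get_user_info returns the entire hydration payload, so the profile fields still have to be picked out of it. Here is a minimal sketch, assuming the payload nests profile data under `__DEFAULT_SCOPE__`, then `webapp.user-detail`, then `userInfo`. That key path and the field names are assumptions drawn from observed pages, not a documented contract, and they can change without notice. `dig` and `extract_profile` are hypothetical helpers, not part of the scraper above.

```python
from typing import Any, Optional

# ASSUMPTION: this key path is undocumented and may change at any time.
USER_INFO_PATH = ("__DEFAULT_SCOPE__", "webapp.user-detail", "userInfo")


def dig(data: Any, path: tuple) -> Optional[Any]:
    """Follow a sequence of dict keys, returning None if any step is missing."""
    for key in path:
        if not isinstance(data, dict) or key not in data:
            return None
        data = data[key]
    return data


def extract_profile(payload: dict) -> Optional[dict]:
    """Pick a few profile fields out of the payload (field names are assumed)."""
    info = dig(payload, USER_INFO_PATH)
    if not isinstance(info, dict):
        return None
    user = info.get("user") or {}
    stats = info.get("stats") or {}
    return {
        "username": user.get("uniqueId"),
        "nickname": user.get("nickname"),
        "followers": stats.get("followerCount"),
        "video_count": stats.get("videoCount"),
    }
```

Returning None on a missing key, rather than raising, lets the caller treat a changed page layout as a detectable failure instead of a crash.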

Approach 2: Playwright for Dynamic Content

# Implementation is proprietary (that is the moat) and is not published here.
# To skip the build, use our ready-made Apify actor:
# see the CTA below for the link (fpr=yw6md3).
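
The post does not publish its own implementation for this approach. As a rough illustration of the general idea only, the sketch below renders a profile with Playwright's synchronous API and then pulls video links out of the rendered HTML. `render_profile` and `extract_video_links` are hypothetical names, and the selector for video tiles is an assumption about the page markup, not something the post confirms.

```python
import re
from typing import List

# Video pages follow the pattern https://www.tiktok.com/@<username>/video/<id>.
VIDEO_LINK = re.compile(r'href="(https://www\.tiktok\.com/@[\w.]+/video/\d+)')


def render_profile(username: str, timeout_ms: int = 30_000) -> str:
    """Load a profile in headless Chromium and return the rendered HTML."""
    # Imported here so the parsing helper below works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            page.goto(
                f"https://www.tiktok.com/@{username}",
                wait_until="domcontentloaded",
                timeout=timeout_ms,
            )
            # ASSUMPTION: video tiles are anchors whose href contains "/video/".
            page.wait_for_selector('a[href*="/video/"]', timeout=timeout_ms)
            return page.content()
        finally:
            browser.close()


def extract_video_links(html: str) -> List[str]:
    """Return unique video URLs in the order they appear in the page."""
    return list(dict.fromkeys(VIDEO_LINK.findall(html)))
```

If the page shows a challenge instead of the video grid, `wait_for_selector` raises a timeout error, which is a reasonable signal that the request was blocked.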

Scraping Trending Content

# Implementation is proprietary (that is the moat) and is not published here.
# To skip the build, use our ready-made Apify actor:
# see the CTA below for the link (fpr=yw6md3).

Data Storage and Analysis

import csv
import re
from collections import Counter
from datetime import datetime

FIELDNAMES = ["url", "description", "scraped_at"]


def save_tiktok_data(videos, filename="tiktok_data.csv"):
    """Write one CSV row per video, stamping each with the time it was saved."""
    with open(filename, "w", newline="", encoding="utf-8") as f:
        # extrasaction="ignore" drops keys not in FIELDNAMES instead of raising.
        writer = csv.DictWriter(f, fieldnames=FIELDNAMES, extrasaction="ignore")
        writer.writeheader()
        for video in videos:
            # Build a new row so the caller's dicts are not modified.
            writer.writerow({**video, "scraped_at": datetime.now().isoformat()})
    print(f"Saved {len(videos)} videos to {filename}")


def analyze_content_themes(videos):
    """Return the 20 most common hashtags across all video descriptions."""
    all_hashtags = []
    for video in videos:
        desc = video.get("description") or ""
        all_hashtags.extend(re.findall(r"#(\w+)", desc))
    return Counter(all_hashtags).most_common(20)
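
Once the data is on disk, the same hashtag count can be run against a saved file rather than in-memory results. This is a small sketch, assuming the CSV has the `description` column written above. `top_hashtags_from_csv` is a hypothetical helper, and lowercasing tags is a design choice the post does not make.

```python
import csv
import re
from collections import Counter


def top_hashtags_from_csv(filename="tiktok_data.csv", limit=20):
    """Count hashtags in the description column of a previously saved CSV."""
    counts = Counter()
    with open(filename, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            tags = re.findall(r"#(\w+)", row.get("description") or "")
            # Lowercase so "#Cats" and "#cats" count as the same tag.
            counts.update(tag.lower() for tag in tags)
    return counts.most_common(limit)
```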

Handling Anti-Bot Measures

Scraping TikTok at any real volume requires proxy rotation, because repeated requests from a single IP are quickly blocked. ScraperAPI provides JavaScript rendering with automatic proxy rotation. For residential IPs, ThorData is a solid choice.
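
To show what rotation means at the code level, here is a minimal round-robin sketch built on the `proxies` argument of `requests`. It does not use any provider's API. `ProxyRotator` and `fetch_with_rotation` are hypothetical names, and the proxy URLs you pass in are placeholders for whatever your provider issues.

```python
import itertools
from typing import Dict, Iterable


class ProxyRotator:
    """Hand out proxies round-robin in the mapping shape requests expects."""

    def __init__(self, proxy_urls: Iterable[str]):
        urls = list(proxy_urls)
        if not urls:
            raise ValueError("at least one proxy URL is required")
        self._cycle = itertools.cycle(urls)

    def next_proxies(self) -> Dict[str, str]:
        url = next(self._cycle)
        # requests accepts a scheme-to-proxy mapping via its `proxies` argument.
        return {"http": url, "https": url}


def fetch_with_rotation(url: str, rotator: ProxyRotator, attempts: int = 3):
    """Try the URL through successive proxies; return the first 200 response or None."""
    import requests  # imported lazily so the rotator can be used on its own

    for _ in range(attempts):
        try:
            response = requests.get(url, proxies=rotator.next_proxies(), timeout=15)
        except requests.RequestException:
            continue  # dead or blocked proxy: move on to the next one
        if response.status_code == 200:
            return response
    return None
```

Round-robin is the simplest policy. A production setup would more likely drop proxies that fail repeatedly, which this sketch does not attempt.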

Monitor your TikTok scraper's health with ScrapeOps. TikTok changes its defenses frequently, so you'll want immediate alerts when your scraper breaks.
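
Whatever monitoring service you use, the scraper first has to decide what counts as broken. One possible local check, sketched below, treats a fetch as healthy only if it returned HTTP 200 and still contains the hydration marker used in Approach 1. `HealthMonitor` and its threshold are hypothetical, and this is not ScrapeOps's API.

```python
MARKER = "__UNIVERSAL_DATA_FOR_REHYDRATION__"


class HealthMonitor:
    """Track consecutive failed fetches and flag when a threshold is reached."""

    def __init__(self, threshold: int = 5):
        self.threshold = threshold
        self.consecutive_failures = 0

    def record(self, status_code: int, html: str) -> bool:
        """Record one fetch; return True when an alert should fire."""
        ok = status_code == 200 and MARKER in html
        # Any healthy fetch resets the streak; a failure extends it.
        self.consecutive_failures = 0 if ok else self.consecutive_failures + 1
        return self.consecutive_failures >= self.threshold
```

Counting consecutive failures rather than single ones avoids alerting on an isolated bad proxy or a transient network error.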

Ethical Considerations

  • Only scrape publicly available content
  • Never scrape private accounts or DMs
  • Respect rate limits
  • Don't use scraped data for harassment
  • Consider using TikTok's official Research API if you qualify
  • Comply with GDPR and CCPA
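
For the rate-limit point in the list above, a simple way to stay polite is to enforce a minimum gap between requests. The sketch below is one way to do that. `Throttle` is a hypothetical helper, and the interval is an assumption you choose yourself, since the post does not name a specific limit.

```python
import time


class Throttle:
    """Enforce a minimum delay between consecutive requests."""

    def __init__(self, min_interval: float, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self) -> float:
        """Block until the interval has elapsed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
        if delay > 0:
            self._sleep(delay)
        self._last = now + delay
        return delay
```

Call `throttle.wait()` immediately before each request. The clock and sleep functions are injectable only so the behavior can be checked without real waiting.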

Conclusion

TikTok scraping is technically challenging but possible with the right tools. Use browser automation for reliability, rotate proxies for sustainability, and always respect both the platform and its users' privacy.
