How I Built a TikTok Trend Predictor Using Python (With Code)

Olamide Olaniyan

Last month, I predicted 3 viral TikTok sounds before they blew up.

Not luck. Math.

I built a simple Python script that analyzes hashtag growth velocity and sound adoption rates. When something is growing faster than usual, the script flags it.

48 hours later? 2 million videos.

Here's exactly how I built it, with all the code.

The Theory Behind Trend Prediction

Viral content doesn't appear out of nowhere. It follows a pattern:

  1. Seed phase: A few creators use a sound/hashtag
  2. Early adoption: Growth accelerates (10-50% daily)
  3. Viral phase: Exponential growth (100%+ daily)
  4. Peak: Everyone's using it
  5. Decline: Oversaturation

The trick is catching content in phase 2, before phase 3 hits.

The signal? Growth velocity. Not absolute numbers, but rate of change.

A hashtag with 10,000 videos growing at 40% daily is more interesting than one with 1 million videos growing at 2% daily.
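
To put numbers on that (the counts below are just the hypothetical ones from the sentence above), here's the comparison as a few lines of Python:

def daily_growth(today: int, yesterday: int) -> float:
    """Percent change in video count from yesterday to today."""
    return (today - yesterday) / yesterday * 100

# Hypothetical hashtags from the example above
print(daily_growth(14_000, 10_000))        # 40.0 -> small but accelerating
print(daily_growth(1_020_000, 1_000_000))  # 2.0  -> huge but flat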

What We're Building

A Python script that:

  1. Tracks hashtag/sound video counts over time
  2. Calculates growth velocity (% change)
  3. Flags anything with unusual acceleration
  4. Sends alerts for potential trends

Let's build it.

Step 1: Set Up Data Collection

First, we need historical data. We'll track hashtags and sounds daily.

import requests
import json
from datetime import datetime, timedelta
import os

API_KEY = "YOUR_SOCIAVAULT_API_KEY"
BASE_URL = "https://api.sociavault.com/v1/scrape/tiktok"

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json"
}

def get_hashtag_stats(hashtag: str) -> dict:
    """Get current stats for a hashtag."""
    response = requests.get(
        f"{BASE_URL}/hashtag",
        params={"hashtag": hashtag},
        headers=headers
    )
    data = response.json()

    return {
        "hashtag": hashtag,
        "video_count": data["data"]["stats"]["videoCount"],
        "view_count": data["data"]["stats"]["viewCount"],
        "timestamp": datetime.now().isoformat()
    }

def get_sound_stats(sound_id: str) -> dict:
    """Get current stats for a sound."""
    response = requests.get(
        f"{BASE_URL}/music",
        params={"musicId": sound_id},
        headers=headers
    )
    data = response.json()

    return {
        "sound_id": sound_id,
        "title": data["data"]["title"],
        "video_count": data["data"]["stats"]["videoCount"],
        "timestamp": datetime.now().isoformat()
    }
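
A quick way to sanity-check the collector (the hashtag here is just an example, and this assumes the response shape shown above):

stats = get_hashtag_stats("fyp")  # example hashtag
print(f"#{stats['hashtag']}: {stats['video_count']:,} videos, {stats['view_count']:,} views")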

Step 2: Store Historical Data

We need to track changes over time. A simple JSON file works:

DATA_FILE = "trend_history.json"

def load_history() -> dict:
    """Load historical tracking data."""
    if os.path.exists(DATA_FILE):
        with open(DATA_FILE, "r") as f:
            return json.load(f)
    return {"hashtags": {}, "sounds": {}}

def save_history(data: dict):
    """Save tracking data."""
    with open(DATA_FILE, "w") as f:
        json.dump(data, f, indent=2)

def add_data_point(history: dict, item_type: str, item_id: str, stats: dict):
    """Add a new data point to history."""
    collection = history[item_type]

    if item_id not in collection:
        collection[item_id] = []

    collection[item_id].append({
        "video_count": stats["video_count"],
        "timestamp": stats["timestamp"]
    })

    # Keep last 14 days only
    collection[item_id] = collection[item_id][-14:]

    return history
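
Putting Steps 1 and 2 together, one daily snapshot for a single hashtag looks like this (the hashtag is just an example):

history = load_history()
stats = get_hashtag_stats("fyp")  # example hashtag
history = add_data_point(history, "hashtags", "fyp", stats)
save_history(history)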

Step 3: Calculate Growth Velocity

This is where the magic happens. We calculate:

  • Daily growth rate: % change from yesterday
  • Acceleration: Is growth rate increasing?
  • Velocity score: Weighted metric for trend potential

def calculate_growth_metrics(data_points: list) -> dict:
    """Calculate growth velocity and acceleration."""
    if len(data_points) < 2:
        # Not enough history yet; return zeroed metrics with the same keys as below
        return {
            "current_count": data_points[-1]["video_count"] if data_points else 0,
            "daily_growth": 0,
            "weekly_growth": 0,
            "acceleration": 0,
            "velocity_score": 0
        }

    # Get recent data points
    current = data_points[-1]["video_count"]
    yesterday = data_points[-2]["video_count"] if len(data_points) >= 2 else current
    two_days_ago = data_points[-3]["video_count"] if len(data_points) >= 3 else yesterday
    week_ago = data_points[-7]["video_count"] if len(data_points) >= 7 else two_days_ago

    # Calculate daily growth rate
    if yesterday > 0:
        daily_growth = ((current - yesterday) / yesterday) * 100
    else:
        daily_growth = 0

    # Calculate yesterday's growth (for acceleration)
    if two_days_ago > 0:
        yesterday_growth = ((yesterday - two_days_ago) / two_days_ago) * 100
    else:
        yesterday_growth = 0

    # Acceleration = is growth rate increasing?
    acceleration = daily_growth - yesterday_growth

    # Weekly growth for context
    if week_ago > 0:
        weekly_growth = ((current - week_ago) / week_ago) * 100
    else:
        weekly_growth = 0

    # Velocity score (weighted metric)
    # High daily growth + positive acceleration = high score
    velocity_score = (daily_growth * 0.5) + (acceleration * 0.3) + (weekly_growth * 0.2)

    return {
        "current_count": current,
        "daily_growth": round(daily_growth, 2),
        "weekly_growth": round(weekly_growth, 2),
        "acceleration": round(acceleration, 2),
        "velocity_score": round(velocity_score, 2)
    }
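
Here's what the metrics look like on three made-up daily snapshots (the numbers are invented for illustration):

snapshots = [
    {"video_count": 10_000, "timestamp": "2025-01-01T09:00:00"},
    {"video_count": 13_000, "timestamp": "2025-01-02T09:00:00"},
    {"video_count": 18_200, "timestamp": "2025-01-03T09:00:00"},
]
print(calculate_growth_metrics(snapshots))
# daily_growth = 40.0, acceleration = 10.0 (growth rose from 30% to 40%),
# weekly_growth = 82.0 (falls back to the oldest point), velocity_score = 39.4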

Step 4: Identify Potential Trends

Now we flag items that show trend signals:

def identify_trends(history: dict, threshold: float = 25.0) -> list:
    """
    Identify potential trends based on velocity score.

    Threshold of 25 means:
    - ~20% daily growth with positive acceleration, OR
    - ~15% daily growth with strong acceleration
    """
    trends = []

    for item_type in ["hashtags", "sounds"]:
        for item_id, data_points in history[item_type].items():
            metrics = calculate_growth_metrics(data_points)

            if metrics["velocity_score"] >= threshold:
                trends.append({
                    "type": item_type,
                    "id": item_id,
                    "metrics": metrics,
                    "signal_strength": "STRONG" if metrics["velocity_score"] >= 40 else "MODERATE"
                })

    # Sort by velocity score
    trends.sort(key=lambda x: x["metrics"]["velocity_score"], reverse=True)

    return trends
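
Running it over everything we've stored is one call, using the storage helpers from Step 2:

history = load_history()
for trend in identify_trends(history, threshold=25.0)[:5]:
    m = trend["metrics"]
    print(trend["type"], trend["id"], m["velocity_score"], trend["signal_strength"])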

Step 5: Discover New Hashtags to Track

We need to find new hashtags, not just track known ones. Here's how:

def discover_emerging_hashtags(seed_keywords: list) -> list:
    """
    Find new hashtags by searching trending videos 
    and extracting their hashtags.
    """
    discovered = []

    for keyword in seed_keywords:
        response = requests.get(
            f"{BASE_URL}/search",
            params={"keyword": keyword, "limit": 50},
            headers=headers
        )
        videos = response.json()["data"]["videos"]

        for video in videos:
            # Extract hashtags from description
            desc = video.get("description", "")
            hashtags = [word[1:] for word in desc.split() if word.startswith("#")]

            for tag in hashtags:
                if tag not in discovered:
                    discovered.append(tag)

    return discovered

def discover_emerging_sounds(category: str = "trending") -> list:
    """Find sounds that are starting to gain traction."""
    response = requests.get(
        f"{BASE_URL}/trending",
        params={"category": category, "limit": 100},
        headers=headers
    )
    videos = response.json()["data"]["videos"]

    # Extract unique sounds
    sounds = {}
    for video in videos:
        music = video.get("music", {})
        if music.get("id"):
            sound_id = music["id"]
            if sound_id not in sounds:
                sounds[sound_id] = {
                    "id": sound_id,
                    "title": music.get("title", "Unknown"),
                    "author": music.get("author", "Unknown"),
                    "appearances": 0
                }
            sounds[sound_id]["appearances"] += 1

    # Sort by appearances in trending
    return sorted(sounds.values(), key=lambda x: x["appearances"], reverse=True)
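
For example (the seed keywords are the same ones the main script uses later):

new_hashtags = discover_emerging_hashtags(["viral", "trend", "fyp", "challenge", "dance"])
print(f"Discovered {len(new_hashtags)} hashtags, e.g. {new_hashtags[:5]}")

for sound in discover_emerging_sounds()[:5]:
    print(f"{sound['title']} ({sound['appearances']} appearances in trending)")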

Step 6: Put It All Together

Here's the complete daily tracking script:

#!/usr/bin/env python3
"""
TikTok Trend Predictor
Run daily via cron to track and predict viral content.
"""

import requests
import json
from datetime import datetime
import os

# ... (include all functions from above)

def send_alert(trends: list):
    """Send alerts for detected trends (customize this)."""
    if not trends:
        print("No significant trends detected today.")
        return

    print("\n🚨 TREND ALERT 🚨\n")
    print(f"Detected {len(trends)} potential trends:\n")

    for trend in trends[:10]:  # Top 10
        signal = "🔥" if trend["signal_strength"] == "STRONG" else "📈"
        print(f"{signal} {trend['type'][:-1].upper()}: {trend['id']}")
        print(f"   Daily Growth: {trend['metrics']['daily_growth']}%")
        print(f"   Acceleration: {trend['metrics']['acceleration']}%")
        print(f"   Velocity Score: {trend['metrics']['velocity_score']}")
        print(f"   Current Videos: {trend['metrics']['current_count']:,}")
        print()

def main():
    print(f"🔍 TikTok Trend Predictor - {datetime.now().strftime('%Y-%m-%d %H:%M')}")
    print("=" * 50)

    # Load existing history
    history = load_history()

    # Discover new content to track
    print("\n📡 Discovering new hashtags...")
    seed_keywords = ["viral", "trend", "fyp", "challenge", "dance"]
    new_hashtags = discover_emerging_hashtags(seed_keywords)[:20]

    print(f"   Found {len(new_hashtags)} hashtags to track")

    print("\n🎵 Discovering emerging sounds...")
    new_sounds = discover_emerging_sounds()[:20]
    print(f"   Found {len(new_sounds)} sounds to track")

    # Update stats for all tracked items
    print("\n📊 Collecting current stats...")

    # Track hashtags
    all_hashtags = list(history["hashtags"].keys()) + new_hashtags
    all_hashtags = list(set(all_hashtags))[:50]  # Limit to 50

    for hashtag in all_hashtags:
        try:
            stats = get_hashtag_stats(hashtag)
            history = add_data_point(history, "hashtags", hashtag, stats)
            print(f"   ✓ #{hashtag}: {stats['video_count']:,} videos")
        except Exception as e:
            print(f"   ✗ #{hashtag}: Error - {e}")

    # Track sounds
    all_sounds = list(history["sounds"].keys()) + [s["id"] for s in new_sounds]
    all_sounds = list(set(all_sounds))[:50]

    for sound_id in all_sounds:
        try:
            stats = get_sound_stats(sound_id)
            history = add_data_point(history, "sounds", sound_id, stats)
            print(f"   ✓ Sound {sound_id}: {stats['video_count']:,} videos")
        except Exception as e:
            print(f"   ✗ Sound {sound_id}: Error - {e}")

    # Save updated history
    save_history(history)

    # Analyze for trends
    print("\n🔮 Analyzing trends...")
    trends = identify_trends(history, threshold=25.0)

    # Send alerts
    send_alert(trends)

    print("\n✅ Done!")

if __name__ == "__main__":
    main()

Real Results

I've been running this script daily for 3 weeks. Here's what I found:

Predicted successfully:

  • A remix sound that went from 5K to 2M videos in 3 days
  • A dance challenge hashtag before it hit mainstream creators
  • A meme format that spread across niches

Velocity scores that predicted virality:

  • 25-35: Moderate growth, worth watching
  • 35-50: Strong signal, likely to trend
  • 50+: Already viral or about to explode

False positives: About 20%. Some high-velocity items plateau. That's why we track acceleration: if velocity is high but acceleration is negative, the trend is peaking.
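
That heuristic is easy to encode. Here's a small sketch (not part of the script above, and the thresholds are illustrative) that labels where an item sits:

def trend_phase(daily_growth: float, acceleration: float) -> str:
    """Rough phase label: rising trends accelerate, peaking trends decelerate."""
    if daily_growth > 15 and acceleration > 0:
        return "RISING"   # phase 2: worth acting on
    if daily_growth > 15 and acceleration <= 0:
        return "PEAKING"  # still fast, but slowing down
    return "QUIET"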

Optimizing the Algorithm

After testing, here's what improved accuracy:

1. Weight Recent Data More Heavily

def growth_rate(previous: dict, current: dict) -> float:
    """Percent change in video_count between two consecutive data points
    (helper used by weighted_velocity_score below)."""
    prev_count = previous["video_count"]
    if prev_count <= 0:
        return 0
    return ((current["video_count"] - prev_count) / prev_count) * 100

def weighted_velocity_score(data_points: list) -> float:
    """
    Recent growth matters more than historical.
    """
    if len(data_points) < 3:
        return 0

    # Last 24h growth (weight: 0.5)
    day1_growth = growth_rate(data_points[-2], data_points[-1])

    # 24-48h growth (weight: 0.3)
    day2_growth = growth_rate(data_points[-3], data_points[-2])

    # 48-72h growth (weight: 0.2)
    day3_growth = growth_rate(data_points[-4], data_points[-3]) if len(data_points) >= 4 else 0

    return (day1_growth * 0.5) + (day2_growth * 0.3) + (day3_growth * 0.2)

2. Filter Out Established Trends

def is_emerging(video_count: int, daily_growth: float) -> bool:
    """
    Filter out already-viral content.
    We want emerging trends, not established ones.
    """
    # Too big = already viral
    if video_count > 500000:
        return False

    # Too small = not enough signal
    if video_count < 1000:
        return False

    # Sweet spot: 1K-500K videos with high growth
    return daily_growth > 15.0
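
To actually use this filter, gate items before scoring them. A sketch against the functions from Steps 3 and 4 (current_count comes from calculate_growth_metrics):

def identify_emerging_trends(history: dict, threshold: float = 25.0) -> list:
    """Like identify_trends, but only keeps items in the 1K-500K sweet spot."""
    trends = []
    for item_type in ["hashtags", "sounds"]:
        for item_id, data_points in history[item_type].items():
            metrics = calculate_growth_metrics(data_points)
            # Skip items that are already viral or too small to matter
            if not is_emerging(metrics.get("current_count", 0), metrics["daily_growth"]):
                continue
            if metrics["velocity_score"] >= threshold:
                trends.append({"type": item_type, "id": item_id, "metrics": metrics})
    return sorted(trends, key=lambda t: t["metrics"]["velocity_score"], reverse=True)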

3. Cross-Reference Multiple Signals

def trend_confidence(hashtag_velocity: float, sound_velocity: float, creator_adoption: float) -> str:
    """
    Higher confidence when multiple signals align.
    """
    signals = [
        hashtag_velocity > 30,
        sound_velocity > 30,
        creator_adoption > 20  # % of tracked creators using it
    ]

    true_signals = sum(signals)

    if true_signals >= 3:
        return "HIGH"
    elif true_signals >= 2:
        return "MEDIUM"
    else:
        return "LOW"

Setting Up Automated Alerts

Run the script daily with cron:

# Run every day at 9 AM
0 9 * * * /usr/bin/python3 /path/to/trend_predictor.py >> /var/log/trends.log 2>&1

For Slack/Discord alerts, add this:

import requests

def send_slack_alert(trends: list, webhook_url: str):
    """Send trend alerts to Slack."""
    if not trends:
        return

    blocks = [
        {
            "type": "header",
            "text": {"type": "plain_text", "text": "🔥 TikTok Trend Alert"}
        }
    ]

    for trend in trends[:5]:
        blocks.append({
            "type": "section",
            "text": {
                "type": "mrkdwn",
                "text": f"*{trend['id']}*\nGrowth: {trend['metrics']['daily_growth']}% | Score: {trend['metrics']['velocity_score']}"
            }
        })

    requests.post(webhook_url, json={"blocks": blocks})
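
For Discord, the same pattern works with a plain-text content payload instead of Slack blocks (the webhook URL is whatever you create in your server settings):

def send_discord_alert(trends: list, webhook_url: str):
    """Send trend alerts to a Discord webhook."""
    if not trends:
        return

    lines = ["🔥 TikTok Trend Alert"]
    for trend in trends[:5]:
        m = trend["metrics"]
        lines.append(f"{trend['id']}: growth {m['daily_growth']}% | score {m['velocity_score']}")

    requests.post(webhook_url, json={"content": "\n".join(lines)})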

What's Next?

This basic predictor works surprisingly well. To improve it further:

  1. Add creator tracking: Which creators are early adopters? Their choices predict trends (rough sketch after this list).
  2. Sentiment analysis: Are comments positive? Negative sentiment kills trends.
  3. Cross-platform signals: Does the trend exist on Instagram Reels? Multi-platform = bigger potential.
  4. ML model: Train on historical data to improve predictions.
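
As a starting point for #1, creator adoption can be as simple as checking how many creators from a hand-picked watchlist appear in a batch of search results. The watchlist and the author field layout here are assumptions, not part of the script above:

def creator_adoption(videos: list, watched_creators: set) -> float:
    """Percent of watched creators who show up among these videos."""
    if not watched_creators:
        return 0.0
    # Assumes each video dict has an author object like {"author": {"username": "..."}};
    # adjust to the real response shape.
    authors = {v.get("author", {}).get("username") for v in videos}
    return len(authors & watched_creators) / len(watched_creators) * 100

That percentage slots straight into trend_confidence's creator_adoption argument.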

Try It Yourself

Want to build your own trend predictor? Here's what you need:

  1. Data source: SociaVault's TikTok API for hashtag and sound stats
  2. Storage: SQLite or JSON for simplicity, PostgreSQL for scale (SQLite sketch after this list)
  3. Alerting: Slack, Discord, or email webhooks
  4. Patience: Run it for 2 weeks before trusting the results
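
If you outgrow the JSON file, the same history fits in one SQLite table. A minimal sketch (the table and column names are my own choices):

import sqlite3

conn = sqlite3.connect("trend_history.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS data_points (
        item_type   TEXT NOT NULL,   -- 'hashtags' or 'sounds'
        item_id     TEXT NOT NULL,
        video_count INTEGER NOT NULL,
        timestamp   TEXT NOT NULL
    )
""")

def add_data_point_sql(item_type: str, item_id: str, stats: dict):
    """SQLite equivalent of add_data_point from Step 2."""
    conn.execute(
        "INSERT INTO data_points (item_type, item_id, video_count, timestamp) VALUES (?, ?, ?, ?)",
        (item_type, item_id, stats["video_count"], stats["timestamp"]),
    )
    conn.commit()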

The code above is production-ready. Copy it, customize the thresholds, and start predicting.

