Scraping YouTube Channel Data in 2026: Videos, Playlists, and Metadata

YouTube's official Data API has strict quotas: 10,000 units per day, and a single search request costs 100 units. That's roughly 100 searches before you're cut off. For serious competitor research, content audits, or trend analysis, you need a different approach.

Here is how to scrape YouTube channel data at scale in 2026.

Use Cases

  • Competitor research: Track upload frequency, view counts, and engagement across rival channels
  • Content audit: Catalog your channel's performance metrics for strategy planning
  • Trend analysis: Monitor trending topics and formats in your niche
  • Dataset building: Create training datasets for content recommendation models

The Cloud Approach: YouTube Scraper on Apify

The simplest way to extract YouTube data at scale is the YouTube Scraper on Apify. It handles browser rendering, anti-bot measures, and structured output.

It supports 3 modes:

1. Channel Details

Get full channel metadata including subscriber count, total views, description, and links:

input_config = {
    "mode": "channel",
    "channelUrls": [
        "https://youtube.com/@mkbhd",
        "https://youtube.com/@veritasium"
    ]
}
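If you'd rather trigger the actor programmatically than from the console, Apify's REST API has a `run-sync-get-dataset-items` endpoint that starts a run and returns the dataset in one request. A minimal stdlib sketch — the actor ID and token below are placeholders, so substitute the real ones from the actor's page (note actor IDs in the URL use a tilde, e.g. `username~actor-name`):

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"

def build_run_request(actor_id, token, run_input):
    """Build a POST request for Apify's run-sync-get-dataset-items endpoint.

    actor_id uses a tilde separator, e.g. "someuser~youtube-scraper"
    (hypothetical ID -- check the actor's page for the real one).
    """
    url = f"{API_BASE}/acts/{actor_id}/run-sync-get-dataset-items?token={token}"
    data = json.dumps(run_input).encode()
    return urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )

# req = build_run_request("someuser~youtube-scraper", "<APIFY_TOKEN>", input_config)
# with urllib.request.urlopen(req) as resp:
#     items = json.load(resp)  # list of result objects
```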

2. Video List

Extract all videos from a channel with metadata:

input_config = {
    "mode": "videos",
    "channelUrl": "https://youtube.com/@fireship",
    "maxItems": 200,
    "sortBy": "newest"
}

3. Video Search

Search across YouTube and extract results:

input_config = {
    "mode": "search",
    "query": "python web scraping tutorial 2026",
    "maxItems": 50
}
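Overlapping searches often return the same video more than once. Assuming each result carries a `videoId` field (as in the sample output), a quick first-wins dedupe:

```python
def dedupe_videos(items):
    """Drop duplicate results, keeping the first occurrence of each videoId."""
    seen = set()
    unique = []
    for item in items:
        vid = item.get("videoId")
        if vid in seen:
            continue
        seen.add(vid)
        unique.append(item)
    return unique
```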

Sample Output

{
    "videoId": "dQw4w9WgXcQ",
    "title": "Python Web Scraping - Full Course for Beginners",
    "channelName": "CodeAcademy",
    "channelUrl": "https://youtube.com/@codeacademy",
    "views": 1243567,
    "likes": 45200,
    "duration": "PT2H15M",
    "publishedAt": "2026-01-15T14:00:00Z",
    "description": "Learn web scraping with Python...",
    "thumbnailUrl": "https://i.ytimg.com/vi/example/maxresdefault.jpg",
    "tags": ["python", "scraping", "tutorial"],
    "commentCount": 892
}
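Note that `duration` is an ISO 8601 duration string, not seconds. A small helper to normalize it, plus a derived engagement metric (field names assume the sample output above):

```python
import re

def iso8601_duration_to_seconds(value):
    """Convert an ISO 8601 duration like 'PT2H15M' to total seconds."""
    match = re.fullmatch(r"PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?", value)
    if not match:
        raise ValueError(f"unrecognized duration: {value!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds

def like_rate(video):
    """Likes as a fraction of views; 0.0 for unviewed videos."""
    return video["likes"] / video["views"] if video["views"] else 0.0
```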

The DIY Approach: yt-dlp

yt-dlp is the Swiss Army knife of YouTube data extraction. It is best known for downloading videos, but its metadata extraction is just as powerful:

# Get channel metadata as JSON Lines (one object per line)
yt-dlp --dump-json --flat-playlist \
    "https://youtube.com/@fireship/videos" \
    > channel_videos.jsonl

# Print specific fields per video (--print supersedes --dump-json)
yt-dlp --flat-playlist \
    --print "%(title)s | %(view_count)s | %(upload_date)s" \
    "https://youtube.com/@fireship/videos"

Processing with Python

import subprocess
import json

def get_channel_videos(channel_url, max_items=50):
    cmd = [
        "yt-dlp", "--dump-json", "--flat-playlist",
        "--playlist-end", str(max_items),
        f"{channel_url}/videos"
    ]

    result = subprocess.run(cmd, capture_output=True, text=True)
    videos = []

    for line in result.stdout.strip().split("\n"):
        if line:
            videos.append(json.loads(line))

    return videos

# Usage
videos = get_channel_videos("https://youtube.com/@veritasium", 100)
for v in videos:
    print(f"{v['title']} - {v.get('view_count', 'N/A')} views")

Limitations of yt-dlp

  • Slow at scale - processes sequentially, no built-in parallelism
  • IP bans - YouTube will block your IP after heavy use
  • No cloud scaling - runs on your local machine only
  • Maintenance - YouTube changes break it regularly (though the community patches fast)
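The sequential bottleneck is per channel, though: you can fan out across channels with a thread pool while each yt-dlp process stays sequential internally. A minimal sketch — pair it with a `get_channel_videos`-style fetch function, and keep the worker count low or you will hit the IP bans that much faster:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_channels(fetch_fn, channel_urls, max_workers=3):
    """Run a per-channel fetch function concurrently; returns {url: result}."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(fetch_fn, channel_urls)
        return dict(zip(channel_urls, results))
```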

Comparing Approaches

| Feature    | YouTube API        | yt-dlp      | Apify Actor      |
| ---------- | ------------------ | ----------- | ---------------- |
| Quota      | 10K units/day      | Unlimited*  | Pay per compute  |
| Speed      | Fast               | Slow        | Fast (parallel)  |
| Setup      | API key + OAuth    | pip install | Click and run    |
| Data depth | Full               | Full        | Full             |
| Scaling    | Limited            | Manual      | Built-in         |
| Cost       | Free (within quota)| Free        | ~$0.50/1K videos |

*Until you get IP-banned

Handling Scale

For serious extraction (10K+ videos), you need proxy rotation. YouTube is aggressive about blocking scraper IPs. Use residential proxies or a managed solution that handles rotation for you.

The cloud-based YouTube Scraper handles this automatically. It rotates IPs, manages sessions, and retries failed requests.
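If you stay on the DIY path, yt-dlp accepts a `--proxy` flag, so a simple round-robin over a proxy pool gets you basic rotation. A sketch — the proxy URLs are placeholders for your provider's real endpoints:

```python
import itertools

# Placeholder endpoints -- substitute your residential proxy provider's URLs.
PROXIES = [
    "http://user:pass@proxy-1.example.com:8000",
    "http://user:pass@proxy-2.example.com:8000",
]

proxy_pool = itertools.cycle(PROXIES)

def build_cmd(channel_url, max_items=50):
    """yt-dlp command for one channel, using the next proxy in the pool."""
    return [
        "yt-dlp", "--dump-json", "--flat-playlist",
        "--playlist-end", str(max_items),
        "--proxy", next(proxy_pool),
        f"{channel_url}/videos",
    ]
```

Each call to `build_cmd` advances the pool, so consecutive channel fetches go out through different proxies.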

Quick Start: Competitor Analysis Script

def compare_channels(channels_data):
    """Compare YouTube channels by key metrics."""
    for ch in channels_data:
        if not ch["videos"]:
            continue
        avg_views = sum(v["views"] for v in ch["videos"]) / len(ch["videos"])
        print(f"\n{ch['name']}:")
        print(f"  Videos: {len(ch['videos'])}")
        print(f"  Avg views: {avg_views:,.0f}")
        top = max(ch["videos"], key=lambda x: x["views"])
        print(f"  Top video: {top['title']}")
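If you want the metrics back as data rather than printed — easier to sort, rank, or export — a returning variant over the same `videos` shape:

```python
def summarize_channel(name, videos):
    """Aggregate per-channel metrics from a list of video dicts."""
    if not videos:
        return {"name": name, "video_count": 0, "avg_views": 0, "top_video": None}
    views = [v["views"] for v in videos]
    top = max(videos, key=lambda v: v["views"])
    return {
        "name": name,
        "video_count": len(videos),
        "avg_views": sum(views) / len(views),
        "top_video": top["title"],
    }
```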

Conclusion

The YouTube API's quotas make large-scale data extraction impractical. Whether you use yt-dlp for quick local pulls or the YouTube Scraper for production pipelines, scraping gives you flexibility the official API does not.

Start small with yt-dlp, and scale up to cloud-based solutions when you need parallelism and reliability.
