YouTube's official Data API v3 gives you 10,000 quota units per day. A single commentThreads.list request costs 1 unit, so you get at most 10,000 comment pages per day. In practice, any analysis at scale exhausts that quota in minutes.
There's a better way: YouTube's internal InnerTube API, which is what the YouTube website itself uses. No quota limits, and no personal API key to register for — the web client's own static key is embedded in every YouTube page.
What is InnerTube?
InnerTube is YouTube's internal JSON API. Every request your browser makes when loading YouTube — video metadata, comments, search results — goes through InnerTube endpoints at https://www.youtube.com/youtubei/v1/.
These endpoints are technically public (your browser hits them every time you watch a video), but they're undocumented and can change without notice.
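To see how thin the protocol is, here's a minimal sketch of building a /player metadata request (the same pattern the watch page uses). The helper name and the bare-bones context are illustrative; the field names are as observed in browser traffic, not documented anywhere:

```python
import json

INNERTUBE_BASE = "https://www.youtube.com/youtubei/v1"

def build_player_request(video_id: str) -> tuple[str, bytes]:
    """Build the URL and JSON body for an InnerTube /player call."""
    # A minimal browser-like context; real requests carry more fields
    context = {"client": {"clientName": "WEB", "clientVersion": "2.20240101.01.00"}}
    body = json.dumps({"context": context, "videoId": video_id}).encode()
    return f"{INNERTUBE_BASE}/player", body

url, body = build_player_request("dQw4w9WgXcQ")
# POST this body with Content-Type: application/json and you get back
# the same JSON the watch page consumes (title, formats, and so on).
```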
Getting the required context
InnerTube requests need a context object that mimics a real browser client. You can capture one by opening YouTube in Chrome with DevTools open (Network tab, filter for youtubei) and inspecting any request.
The static values that work as of April 2026:
INNERTUBE_CONTEXT = {
    "client": {
        "clientName": "WEB",
        "clientVersion": "2.20240101.01.00",
        "hl": "en",
        "gl": "US",
        "userAgent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    }
}

INNERTUBE_API_KEY = "AIzaSyAO_FJ2SlqU8Q4STEHLGCilw_Y9_11qcW8"
Fetching comments for a video
import requests

def get_youtube_comments(video_id: str, max_comments: int = 1000) -> list:
    """Fetch YouTube comments using the InnerTube API."""
    url = "https://www.youtube.com/youtubei/v1/next"
    params = {"key": INNERTUBE_API_KEY}
    headers = {
        "Content-Type": "application/json",
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
    }

    # Initial request
    payload = {
        "context": INNERTUBE_CONTEXT,
        "videoId": video_id,
        "params": "Eg0SCDMjYEiDAhAC",  # comment sort = top comments
    }
    response = requests.post(url, json=payload, params=params, headers=headers, timeout=30)
    response.raise_for_status()
    data = response.json()

    # Parse the initial response
    comments = _extract_comments(data)
    continuation_token = _get_continuation_token(data)

    # Paginate until we have enough comments or the tokens run out
    while continuation_token and len(comments) < max_comments:
        payload = {
            "context": INNERTUBE_CONTEXT,
            "continuation": continuation_token,
        }
        response = requests.post(url, json=payload, params=params, headers=headers, timeout=30)
        response.raise_for_status()
        data = response.json()

        new_comments = _extract_comments(data)
        if not new_comments:
            break
        comments.extend(new_comments)
        continuation_token = _get_continuation_token(data)

    return comments[:max_comments]
Parsing the response
InnerTube responses are deeply nested. The comment data lives in a renderer path:
def _extract_comments(data: dict) -> list:
    """Extract comment objects from an InnerTube response."""
    comments = []
    # Navigate the response tree: comments live under engagementPanels
    for panel in data.get("engagementPanels", []):
        try:
            # Comments are in the engagementPanelSectionListRenderer
            items = (panel
                     .get("engagementPanelSectionListRenderer", {})
                     .get("content", {})
                     .get("sectionListRenderer", {})
                     .get("contents", []))
            for item in items:
                comment_thread = (item
                                  .get("itemSectionRenderer", {})
                                  .get("contents", []))
                for c in comment_thread:
                    renderer = c.get("commentThreadRenderer", {})
                    comment = renderer.get("comment", {}).get("commentRenderer", {})
                    if comment:
                        text_runs = comment.get("contentText", {}).get("runs", [])
                        text = "".join(r.get("text", "") for r in text_runs)
                        comments.append({
                            "id": comment.get("commentId"),
                            "author": comment.get("authorText", {}).get("simpleText", ""),
                            "text": text,
                            "likes": _parse_count(comment.get("voteCount", {}).get("simpleText", "0")),
                            "published": comment.get("publishedTimeText", {}).get("simpleText", ""),
                            "is_reply": False,
                        })
        except (KeyError, TypeError):
            continue
    return comments
def _get_continuation_token(data: dict) -> str | None:
    """Extract the continuation token for pagination."""
    try:
        for panel in data.get("engagementPanels", []):
            conts = (panel
                     .get("engagementPanelSectionListRenderer", {})
                     .get("content", {})
                     .get("sectionListRenderer", {})
                     .get("continuations", []))
            for cont in conts:
                token = (cont
                         .get("nextContinuationData", {})
                         .get("continuation"))
                if token:
                    return token
    except (KeyError, TypeError):
        pass
    return None

def _parse_count(text: str) -> int:
    """Parse YouTube count strings like '1.2K' or '4.5M'."""
    text = text.strip().replace(",", "")
    if text.endswith("K"):
        return int(float(text[:-1]) * 1_000)
    if text.endswith("M"):
        return int(float(text[:-1]) * 1_000_000)
    try:
        return int(text)
    except ValueError:
        return 0
Usage
# Get the top 500 comments for a video
comments = get_youtube_comments("dQw4w9WgXcQ", max_comments=500)

for comment in comments[:5]:
    print(f"{comment['author']}: {comment['text'][:100]}")
    print(f"  Likes: {comment['likes']} | Posted: {comment['published']}")
    print()
Alternative: Use the Apify scraper
If you don't want to maintain the InnerTube parsing logic (which breaks when YouTube updates its response format), there's a pre-built actor that handles this:
The YouTube Comment Scraper on Apify handles the InnerTube parsing, rate limiting, and rotation automatically. Input a video URL or list, get structured JSON output.
Handling rate limits
InnerTube doesn't have explicit rate limits but will start returning empty responses if you hammer it. Practical limits from testing:
- ~100 requests/minute per IP before throttling
- ~500 comment pages per session before needing a fresh session
For high-volume extraction, rotate residential IPs and add 1-2 second delays between requests.
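A simple client-side throttle keeps you comfortably under that observed ceiling. A minimal sketch — the class name and interval values are illustrative choices, not anything YouTube documents:

```python
import random
import time

class Throttle:
    """Enforce a minimum delay between requests, with random jitter."""

    def __init__(self, min_interval: float = 1.0, jitter: float = 1.0):
        self.min_interval = min_interval
        self.jitter = jitter
        self._last = 0.0  # monotonic timestamp of the previous request

    def wait(self) -> float:
        """Sleep until the next request is allowed; return seconds slept."""
        delay = self.min_interval + random.uniform(0, self.jitter)
        remaining = (self._last + delay) - time.monotonic()
        if remaining > 0:
            time.sleep(remaining)
        self._last = time.monotonic()
        return max(0.0, remaining)

throttle = Throttle(min_interval=1.0, jitter=1.0)  # 1-2 s between requests
# Call throttle.wait() before each requests.post(...) in the pagination loop.
```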
What this gives you
The InnerTube approach returns the same data as the official API (comment text, author, likes, published date, reply counts) with no daily quota cap. For most analysis tasks — sentiment analysis, spam detection, competitor research, audience research — this is everything you need.
The tradeoff: the response structure changes periodically without notice. Budget 1-2 hours per year maintaining the parser.
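Most of that maintenance is updating renderer paths. One way to shrink the cost is to centralize the paths behind a small generic getter (a plain helper, not part of any YouTube API), so a format change becomes a one-line edit:

```python
def deep_get(obj, *path, default=None):
    """Walk nested dicts/lists by key or index, returning default on any miss."""
    for step in path:
        if isinstance(obj, dict):
            obj = obj.get(step)
        elif isinstance(obj, list) and isinstance(step, int) and -len(obj) <= step < len(obj):
            obj = obj[step]
        else:
            return default
        if obj is None:
            return default
    return obj

# Keep the fragile renderer paths in one place:
COMMENT_TEXT_PATH = ("comment", "commentRenderer", "contentText", "runs")
```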
Skip the maintenance overhead
If you'd rather not deal with parser maintenance, the Apify Scrapers Bundle ($29) includes a pre-built YouTube comment scraper that handles all of this automatically.
One-time purchase. Documented. Production-ready.