Most articles about YouTube's InnerTube API show you the basics: hit /youtubei/v1/next, get comments back. Simple.
Except YouTube changed the format. If you built a scraper 6 months ago, it's probably broken now.
I spent a weekend figuring out what changed. Here's the breakdown.
The old format (RIP)
The old InnerTube response was straightforward. Comments lived inside commentRenderer objects:
```json
{
  "commentRenderer": {
    "commentId": "abc123",
    "contentText": { "runs": [{ "text": "Great video!" }] },
    "authorText": { "simpleText": "John" },
    "voteCount": { "simpleText": "42" }
  }
}
```
Parse the onResponseReceivedEndpoints array, loop through continuationItems, extract commentRenderer. Done.
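In code, that whole pipeline was one pass over the response. A sketch of the old extraction, assuming `data` is the parsed body of a /youtubei/v1/next response (in the old format, top-level comments nested under `commentThreadRenderer.comment`, so both shapes are checked):

```javascript
// Sketch of the old-format extraction: walk the continuation items and
// pull fields straight off each commentRenderer.
function extractOldFormatComments(data) {
  const comments = [];
  for (const endpoint of data?.onResponseReceivedEndpoints || []) {
    const items = endpoint?.appendContinuationItemsAction?.continuationItems
      || endpoint?.reloadContinuationItemsCommand?.continuationItems
      || [];
    for (const item of items) {
      // Top-level comments nested under commentThreadRenderer.comment;
      // replies arrived as bare commentRenderer items
      const r = item.commentThreadRenderer?.comment?.commentRenderer
        || item.commentRenderer;
      if (!r) continue;
      comments.push({
        commentId: r.commentId,
        text: r.contentText?.runs?.map(x => x.text).join('') || '',
        author: r.authorText?.simpleText || '',
        voteCount: r.voteCount?.simpleText || '0',
      });
    }
  }
  return comments;
}
```

Everything lived on one object, so one pass was enough. That is what broke.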
What YouTube switched to
YouTube now uses a ViewModel + mutations architecture. Comments are split across two locations in the response:
1. The ViewModel (reference only — no actual data):
```json
{
  "commentThreadRenderer": {
    "commentViewModel": {
      "commentViewModel": {
        "commentKey": "comment_entity_abc123"
      }
    }
  }
}
```
2. The mutations (actual comment data, buried deep):
```json
{
  "frameworkUpdates": {
    "entityBatchUpdate": {
      "mutations": [
        {
          "entityKey": "comment_entity_abc123",
          "payload": {
            "commentEntityPayload": {
              "properties": {
                "commentId": "abc123",
                "content": { "content": "Great video!" },
                "publishedTime": "2 days ago"
              },
              "author": {
                "displayName": "John",
                "channelId": "UC..."
              },
              "toolbar": {
                "likeCountNotliked": "42",
                "replyCount": "3"
              }
            }
          }
        }
      ]
    }
  }
}
```
The ViewModel gives you a commentKey. You match it against the entityKey in mutations to get the actual data. It's basically a foreign key join across two parts of the same JSON response.
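Conceptually the resolution step is just index-then-lookup. A minimal sketch, assuming `response` is the parsed /next body:

```javascript
// Index every mutation by its entityKey, then resolve each thread's
// commentKey against that index. Pure lookup, no network.
function joinCommentsToMutations(response) {
  const mutations = response?.frameworkUpdates
    ?.entityBatchUpdate?.mutations || [];
  const byKey = new Map(
    mutations.map((m) => [m.entityKey, m.payload?.commentEntityPayload])
  );
  const resolved = [];
  for (const endpoint of response?.onResponseReceivedEndpoints || []) {
    const items = endpoint?.appendContinuationItemsAction
      ?.continuationItems || [];
    for (const item of items) {
      const key = item.commentThreadRenderer?.commentViewModel
        ?.commentViewModel?.commentKey;
      const payload = key ? byKey.get(key) : undefined;
      if (payload) resolved.push(payload);
    }
  }
  return resolved;
}
```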
Building the comment map
The first step is indexing all mutations into a lookup map:
```javascript
function buildCommentMap(mutations) {
  // Like counts arrive as strings ("42", "1.2K", "3M"), so normalize them
  const parseLike = (s) => {
    s = String(s).replace(/,/g, '');
    if (s.endsWith('K')) return Math.round(parseFloat(s) * 1000);
    if (s.endsWith('M')) return Math.round(parseFloat(s) * 1000000);
    return parseInt(s, 10) || 0;
  };
  const map = new Map();
  for (const m of mutations) {
    const p = m.payload?.commentEntityPayload;
    if (!p) continue;
    map.set(m.entityKey, {
      commentId: p.properties?.commentId || '',
      text: p.properties?.content?.content || '',
      author: p.author?.displayName || '',
      authorChannelId: p.author?.channelId || '',
      likeCount: parseLike(p.toolbar?.likeCountNotliked || '0'),
      replyCount: parseLike(p.toolbar?.replyCount || '0'),
      publishedTime: p.properties?.publishedTime || '',
    });
  }
  return map;
}
```
One gotcha: like counts come as strings like "1.2K" or "3M", not integers. You need to parse those yourself.
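Pulled out on its own, the parser has to handle the suffixed and comma-grouped variants. A sketch, assuming English-locale formatting (other locales abbreviate differently):

```javascript
// Parse YouTube's abbreviated count strings into integers.
// Assumes English-locale formatting: "1.2K", "3M", "1,234".
function parseAbbreviatedCount(s) {
  s = String(s).trim().replace(/,/g, '');
  if (/K$/i.test(s)) return Math.round(parseFloat(s) * 1000);
  if (/M$/i.test(s)) return Math.round(parseFloat(s) * 1000000);
  if (/B$/i.test(s)) return Math.round(parseFloat(s) * 1000000000);
  return parseInt(s, 10) || 0;
}
```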
Continuation tokens: two different systems
YouTube uses continuation tokens for pagination, but comments have two separate token types:
Top-level pagination — loads the next batch of comments:
```javascript
// Found in continuationItemRenderer at the end of each batch
const nextToken = item.continuationItemRenderer
  ?.continuationEndpoint?.continuationCommand?.token;
```
Reply threads — loads replies under a specific comment:
```javascript
// Found inside commentThreadRenderer.replies
const replyToken = cr.replies?.commentRepliesRenderer?.contents
  ?.find(rc => rc.continuationItemRenderer)
  ?.continuationItemRenderer?.continuationEndpoint
  ?.continuationCommand?.token;
```
Same /next endpoint, same request format, completely different response structure. Top-level returns commentThreadRenderer items. Reply tokens return either commentViewModel (new format) or commentRenderer (old format) — yes, replies can still use the old format even when top-level comments use the new one.
```javascript
async function fetchReplies(replyToken) {
  const replies = [];
  let token = replyToken;
  while (token) {
    const data = await fetchNextBatch(token);
    const mutations = data?.frameworkUpdates
      ?.entityBatchUpdate?.mutations || [];
    const commentMap = buildCommentMap(mutations);
    let nextToken = null;
    const endpoints = data?.onResponseReceivedEndpoints || [];
    for (const endpoint of endpoints) {
      const items = endpoint?.appendContinuationItemsAction
        ?.continuationItems || [];
      for (const item of items) {
        // Old format reply
        if (item.commentRenderer) {
          const r = item.commentRenderer;
          replies.push({
            commentId: r.commentId,
            text: r.contentText?.runs
              ?.map(x => x.text).join('') || '',
            author: r.authorText?.simpleText || '',
          });
        }
        // New format reply: the commentKey may sit directly on the
        // ViewModel or one level deeper, so check both
        if (item.commentViewModel) {
          const vm = item.commentViewModel;
          const key = vm.commentKey || vm.commentViewModel?.commentKey;
          const reply = key ? commentMap.get(key) : null;
          if (reply) replies.push({ ...reply });
        }
        // Next page of replies (the token has two possible locations)
        if (item.continuationItemRenderer) {
          nextToken = item.continuationItemRenderer
            ?.button?.buttonRenderer?.command
            ?.continuationCommand?.token
            || item.continuationItemRenderer
            ?.continuationEndpoint
            ?.continuationCommand?.token;
        }
      }
    }
    token = nextToken;
  }
  return replies;
}
```
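fetchReplies leans on a fetchNextBatch helper that hasn't been shown. A minimal version might look like this; the clientVersion string is an assumption (YouTube accepts a range of recent WEB versions, and the live one can be read from ytcfg in the page source), and some setups still append the INNERTUBE_API_KEY as a query parameter:

```javascript
const INNERTUBE_NEXT_URL = 'https://www.youtube.com/youtubei/v1/next';

// Build the request body for a continuation call. clientVersion is an
// assumed recent WEB version, not a pinned requirement.
function buildNextRequestBody(token) {
  return {
    context: {
      client: {
        clientName: 'WEB',
        clientVersion: '2.20240101.00.00',
        hl: 'en',
        gl: 'US',
      },
    },
    continuation: token,
  };
}

async function fetchNextBatch(token) {
  const response = await fetch(INNERTUBE_NEXT_URL, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildNextRequestBody(token)),
  });
  return response.json();
}
```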
Getting the initial continuation token
Before you can paginate, you need the first token. It's not in the API response — it's embedded in the HTML page inside ytInitialData:
```javascript
function parseInlineJson(html, varName) {
  for (const prefix of [`var ${varName} = `, `${varName} = `]) {
    let idx = html.indexOf(prefix);
    if (idx === -1) continue;
    idx += prefix.length;
    // Walk the braces to find the end of the object literal, ignoring
    // braces that appear inside JSON string values
    let depth = 0;
    let inString = false;
    for (let i = idx; i < html.length; i++) {
      const ch = html[i];
      if (inString) {
        if (ch === '\\') i++; // skip the escaped character
        else if (ch === '"') inString = false;
      } else if (ch === '"') {
        inString = true;
      } else if (ch === '{') {
        depth++;
      } else if (ch === '}') {
        depth--;
        if (depth === 0) {
          return JSON.parse(html.slice(idx, i + 1));
        }
      }
    }
  }
  return null;
}
```
```javascript
// Fetch the watch page, then extract ytInitialData from the HTML
const res = await fetch(`https://www.youtube.com/watch?v=${videoId}`);
const html = await res.text();
const initialData = parseInlineJson(html, 'ytInitialData');
```
The token hides in itemSectionRenderer → continuationItemRenderer → continuationEndpoint → continuationCommand → token. There are at least two places it can appear, so you need to check both:
```javascript
function getCommentsContinuationToken(initialData) {
  const contents = initialData?.contents
    ?.twoColumnWatchNextResults?.results?.results?.contents || [];
  for (const content of contents) {
    // Method 1: inside itemSectionRenderer contents
    const inner = content?.itemSectionRenderer?.contents || [];
    for (const ic of inner) {
      const token = ic?.continuationItemRenderer
        ?.continuationEndpoint?.continuationCommand?.token;
      if (token) return token;
    }
    // Method 2: legacy continuations array
    const continuations = content?.itemSectionRenderer?.continuations;
    if (continuations?.[0]) {
      return continuations[0]?.nextContinuationData?.continuation
        || continuations[0]?.reloadContinuationData?.continuation;
    }
  }
  return null;
}
```
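With the initial token in hand, the top-level loop mirrors the reply loop. A sketch: `buildCommentMap` is repeated here in compact form so the block stands alone, and the `reloadContinuationItemsCommand` branch (which I've seen on first pages) is an assumption worth checking against live responses:

```javascript
// Compact copy of buildCommentMap from earlier, so this sketch runs
// on its own.
function buildCommentMap(mutations) {
  const map = new Map();
  for (const m of mutations) {
    const p = m.payload?.commentEntityPayload;
    if (!p) continue;
    map.set(m.entityKey, {
      commentId: p.properties?.commentId || '',
      text: p.properties?.content?.content || '',
      author: p.author?.displayName || '',
    });
  }
  return map;
}

// Pull one page of top-level comments plus the next-page token out of
// a parsed /next response.
function extractTopLevelBatch(data) {
  const commentMap = buildCommentMap(
    data?.frameworkUpdates?.entityBatchUpdate?.mutations || []
  );
  const comments = [];
  let nextToken = null;
  for (const endpoint of data?.onResponseReceivedEndpoints || []) {
    const items = endpoint?.reloadContinuationItemsCommand?.continuationItems
      || endpoint?.appendContinuationItemsAction?.continuationItems
      || [];
    for (const item of items) {
      const key = item.commentThreadRenderer?.commentViewModel
        ?.commentViewModel?.commentKey;
      const comment = key ? commentMap.get(key) : undefined;
      if (comment) comments.push(comment);
      const token = item.continuationItemRenderer
        ?.continuationEndpoint?.continuationCommand?.token;
      if (token) nextToken = token;
    }
  }
  return { comments, nextToken };
}
```

Call it in a loop, feeding each `nextToken` back into the /next request until it comes back null.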
The LOGIN_REQUIRED trap
Some videos return LOGIN_REQUIRED from the web page even though they're public. The fix: fall back to the ANDROID InnerTube client for metadata, while still using the web client for comments.
```javascript
const INNERTUBE_PLAYER_URL = 'https://www.youtube.com/youtubei/v1/player';
const ANDROID_UA = 'com.google.android.youtube/20.10.38 (Linux; U; Android 14)';

async function fetchMetadataViaAndroid(videoId) {
  const response = await fetch(INNERTUBE_PLAYER_URL, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'User-Agent': ANDROID_UA,
    },
    body: JSON.stringify({
      context: {
        client: {
          clientName: 'ANDROID',
          clientVersion: '20.10.38',
        },
      },
      videoId,
    }),
  });
  return response.json();
}
```
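The signal to fall back is the playabilityStatus.status field on the player response. A sketch of the decision logic; the web-client fetcher is passed in as a parameter since only the ANDROID one is shown above:

```javascript
// True when the web player response demands sign-in for a public video.
function needsAndroidFallback(playerResponse) {
  return playerResponse?.playabilityStatus?.status === 'LOGIN_REQUIRED';
}

// Hypothetical wrapper: try the web client first, fall back to ANDROID.
async function fetchVideoMetadata(videoId, fetchViaWeb, fetchViaAndroid) {
  const web = await fetchViaWeb(videoId);
  return needsAndroidFallback(web) ? fetchViaAndroid(videoId) : web;
}
```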
Performance
With this pure HTTP approach (no browser, no Puppeteer):
- ~50,000 comments extracted in about 15 minutes
- 128MB memory — vs 4GB+ for Puppeteer-based scrapers
- No headless Chrome startup time
The bottleneck isn't your code — it's YouTube's rate limiting. Keep requests reasonable.
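In practice that means spacing out continuation calls. A minimal throttle wrapper; the 500 ms default is a guess, not a measured limit:

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Wrap an async function so consecutive calls are serialized with a
// fixed delay between them.
function throttled(fn, delayMs = 500) {
  let last = Promise.resolve();
  return (...args) => {
    const run = last.then(() => sleep(delayMs)).then(() => fn(...args));
    last = run.catch(() => {}); // keep the chain alive after failures
    return run;
  };
}
```

Wrap fetchNextBatch once, e.g. `const politeFetch = throttled(fetchNextBatch)`, and use it everywhere.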
Wrapping up
YouTube's shift to ViewModel + mutations is likely part of a larger frontend migration. Expect more endpoints to follow this pattern. The key takeaway: always check frameworkUpdates.entityBatchUpdate.mutations — if the data isn't in the renderer anymore, it's probably there.
If you want a ready-to-use version without implementing all of this yourself, I packaged it into scrapers on Apify:
- 🎬 YouTube Comments Scraper — handles all the edge cases above
- 📝 YouTube Transcript Scraper — $0.80/1K videos, pure HTTP
- 📺 YouTube Channel Scraper
- 🎵 YouTube Shorts Scraper
All run on 128MB, no browser needed. Full toolkit here.