Olamide Olaniyan

Posted on May 9

What YouTube Comment Replies Reveal Before User Interviews Do

#ai #tutorial #javascript #webdev

Most product teams stop at top-level comments.

That is usually where the obvious reactions live.

Replies are where people explain themselves.

That is a much better place to look if you care about:

objections
feature requests
product confusion
pricing resistance
onboarding friction
competitor comparisons

I still like user interviews. I still run them. I would never tell you YouTube comments should replace talking to real customers.

But if you are trying to find language, recurring problems, and messy real-world buyer questions before you book ten calls, YouTube comment replies are a very underused source.

This post shows you how I use them for product research, how to pull them with JavaScript and Python, and when this approach is smarter than interviews, manual reading, or the official API.

Why Replies Matter More Than Top-Level Comments

Top-level comments are often performative.

People praise the creator. Joke about the video. Drop a quick take. Argue with the premise.

Replies are different.

Replies are where someone says:

"I tried this and got stuck at the migration step"
"This sounds good until you have to bring in a team"
"We switched away from this because the pricing got weird"
"Does this work if you need approval flows too?"

That is product language.

Not polished landing-page language. Not survey language. Not customer-success-call language.

Real language.

That is why I keep coming back to replies.

The Research Question Comes First

Do not start by scraping thousands of replies and hoping insight appears.

Start with a question.

Examples:

What objections keep showing up around this category?
What edge cases do people care about most?
What makes buyers hesitate?
What pain points do competitor users describe in their own words?

If you have that question first, the reply threads become much easier to read.
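
One way to make the question concrete before pulling any data is to write it down as themes and trigger terms. The sketch below is only an illustration: the `THEMES` mapping, its term lists, and the `tag_reply` helper are assumptions made up for this example, not part of any library. It matches terms at the start of a word, so "api" does not fire on "rapid", and it lets one reply carry several themes.

```python
import re

# Hypothetical mapping from research themes to trigger terms (assumed for illustration).
# Terms are matched as word prefixes, so "integrat" covers "integration" and "integrating".
THEMES = {
    "pricing": ["too expensive", "pricing", "price", "cost"],
    "integration": ["integrat", "api", "sync", "connect"],
    "onboarding": ["confus", "complicated", "setup", "stuck"],
    "competitor": ["alternative", "instead of", "switched"],
}

# Compile one pattern per theme; \b anchors each term to the start of a word.
PATTERNS = {
    theme: re.compile(r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")", re.IGNORECASE)
    for theme, terms in THEMES.items()
}


def tag_reply(text):
    """Return every theme whose terms appear in the reply, in THEMES order."""
    return [theme for theme, pattern in PATTERNS.items() if pattern.search(text or "")]


print(tag_reply("We switched away from this because the pricing got weird"))
print(tag_reply("That was a rapid walkthrough"))
```

Prefix matching is still rough ("cost" will also match "costume"), so treat the tags as a reading aid rather than a measurement.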

JavaScript Version: Pull Comments, Then Pull Replies

With YouTube, the workflow is straightforward.

First fetch top-level comments for a video. Then look for comments that include a repliesContinuationToken. That token is the key to the reply thread.

// Requires Node 18+ (global fetch) and a SOCIAVAULT_API_KEY environment variable
const headers = {
  'X-API-Key': process.env.SOCIAVAULT_API_KEY,
};

async function fetchJson(url) {
  const response = await fetch(url, { headers });
  if (!response.ok) {
    throw new Error(`Request failed with ${response.status}`);
  }
  return response.json();
}

async function getCommentThreads(videoUrl) {
  const commentsJson = await fetchJson(
    `https://api.sociavault.com/v1/scrape/youtube/video/comments?url=${encodeURIComponent(videoUrl)}&order=top`
  );

  const comments = Object.values(commentsJson.data?.comments || {});
  const threads = [];

  for (const comment of comments) {
    const token = comment.repliesContinuationToken;
    if (!token) continue;

    const repliesJson = await fetchJson(
      `https://api.sociavault.com/v1/scrape/youtube/video/comment-replies?continuationToken=${encodeURIComponent(token)}`
    );

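    // Normalize replies to an array whether the API returns a list or an id-keyed object
    const rawReplies = repliesJson.data?.comments || repliesJson.data || {};
    threads.push({
      topLevelComment: comment.content || comment.text,
      replies: Object.values(rawReplies),
    });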
  }

  return threads;
}

function classifyReply(text) {
  const value = (text || '').toLowerCase();

  if (/(too expensive|pricing|cost|price)/.test(value)) return 'pricing';
  if (/(integrat|api|sync|connect)/.test(value)) return 'integration';
  if (/(confus|hard|complicated|setup)/.test(value)) return 'onboarding';
  if (/(competitor|alternative|instead of|switched from)/.test(value)) return 'competitor';
  return 'other';
}

async function researchVideo(videoUrl) {
  const threads = await getCommentThreads(videoUrl);
  const summary = {};

  for (const thread of threads) {
    for (const reply of thread.replies) {
      const text = reply.content || reply.text || '';
      const label = classifyReply(text);
      summary[label] = summary[label] || [];
      summary[label].push(text);
    }
  }

  return summary;
}

researchVideo('https://www.youtube.com/watch?v=dQw4w9WgXcQ')
  .then(summary => console.log(summary))
  .catch(error => console.error(error));

That already gives you something useful: a rough thematic breakdown of reply language for one video.

Do that across five or ten relevant videos and patterns start showing up fast.

If you want a simpler way to get the public YouTube data without managing the collection layer, SociaVault is what I would use here.

Python Version: Same Workflow, Easy to Batch

Python is great for this kind of batch research workflow.

import os
import requests


HEADERS = {'X-API-Key': os.environ['SOCIAVAULT_API_KEY']}


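def fetch_json(url, params=None):
    # Pass query values via `params` so requests URL-encodes them
    response = requests.get(url, headers=HEADERS, params=params, timeout=30)
    response.raise_for_status()
    return response.json()


def as_list(value):
    # Accept either a list or an id-keyed dict, mirroring Object.values in the JS version
    if isinstance(value, dict):
        return list(value.values())
    return value if isinstance(value, list) else []


def get_comment_threads(video_url):
    comments_json = fetch_json(
        'https://api.sociavault.com/v1/scrape/youtube/video/comments',
        params={'url': video_url, 'order': 'top'},
    )

    comments = as_list((comments_json.get('data') or {}).get('comments'))
    threads = []

    for comment in comments:
        token = comment.get('repliesContinuationToken')
        if not token:
            continue

        replies_json = fetch_json(
            'https://api.sociavault.com/v1/scrape/youtube/video/comment-replies',
            params={'continuationToken': token},
        )

        # Replies may sit under data.comments or directly under data
        data = replies_json.get('data') or []
        replies = data.get('comments') if isinstance(data, dict) else data

        threads.append({
            'topLevelComment': comment.get('content') or comment.get('text'),
            'replies': as_list(replies),
        })

    return threads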


def classify_reply(text):
    value = (text or '').lower()

    if any(term in value for term in ['too expensive', 'pricing', 'cost', 'price']):
        return 'pricing'
    if any(term in value for term in ['integrat', 'api', 'sync', 'connect']):
        return 'integration'
    if any(term in value for term in ['confus', 'hard', 'complicated', 'setup']):
        return 'onboarding'
    if any(term in value for term in ['competitor', 'alternative', 'instead of', 'switched from']):
        return 'competitor'
    return 'other'


def research_video(video_url):
    threads = get_comment_threads(video_url)
    summary = {}

    for thread in threads:
        for reply in thread['replies']:
            text = reply.get('content') or reply.get('text') or ''
            label = classify_reply(text)
            summary.setdefault(label, []).append(text)

    return summary


summary = research_video('https://www.youtube.com/watch?v=dQw4w9WgXcQ')
for label, examples in summary.items():
    print(f'\n## {label.upper()}')
    for example in examples[:5]:
        print('-', example)

The nice thing about the Python version is that it is very easy to turn into a weekly batch job that analyzes several videos in one pass.
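
A batch pass over several videos could look roughly like the sketch below. It assumes a `research_video`-style callable that returns `{label: [reply_text, ...]}`; the `research_many` helper and the `fake_research` stand-in are hypothetical names introduced here so the example runs without network access.

```python
from collections import defaultdict


def research_many(video_urls, research_fn):
    """Merge per-video summaries into one {label: [(video_url, text), ...]} mapping.

    `research_fn` is any callable that takes a video URL and returns
    {label: [reply_text, ...]}.
    """
    combined = defaultdict(list)
    failures = {}

    for url in video_urls:
        try:
            summary = research_fn(url)
        except Exception as error:  # keep the batch going if one video fails
            failures[url] = str(error)
            continue
        for label, texts in summary.items():
            # Keep the source URL next to each quote so it can be traced back later.
            combined[label].extend((url, text) for text in texts)

    return dict(combined), failures


# Stand-in for a real fetch so the sketch runs offline (hypothetical data).
def fake_research(url):
    if url.endswith("bad"):
        raise RuntimeError("request failed")
    return {"pricing": [f"pricing reply from {url}"]}


combined, failures = research_many(["video-a", "video-b", "video-bad"], fake_research)
print(combined)
print(failures)
```

In a real run you would pass the actual fetching function in place of `fake_research` and schedule the script with whatever job runner you already use.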

What To Do With the Reply Data

This is the part people often skip.

Do not just collect replies.

Turn them into something reusable.

My usual workflow is:

1. pick 5-10 relevant videos in the category
2. pull top-level comments
3. fetch reply threads where discussion actually happened
4. tag replies by theme
5. save exact phrases, not just summaries

That fifth step matters a lot.

You want the real wording.

If ten people say some version of:

looks useful, but this feels too complicated for a small team

that phrase is often more valuable than a neat internal note like "SMB onboarding concerns exist."

The messy language is usually the useful language.
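
To keep the exact wording reusable, one option is to store each reply verbatim next to its theme and source video. This is a minimal sketch under assumptions: the record fields (`theme`, `quote`, `video`, `collected`), the JSON Lines format, and the helper names are choices made for this example, and the date in the sample call is a placeholder.

```python
import json
from datetime import date


def to_records(summary, video_url, collected_on=None):
    """Flatten {label: [text, ...]} into one record per reply, keeping the text verbatim."""
    collected_on = collected_on or date.today().isoformat()
    return [
        {"theme": label, "quote": text, "video": video_url, "collected": collected_on}
        for label, texts in summary.items()
        for text in texts
        if text.strip()  # skip empty replies, but never rewrite the wording
    ]


def append_jsonl(records, path):
    """Append records to a JSON Lines file, one reply per line."""
    with open(path, "a", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")


# Sample data only; the date is a placeholder.
sample = {"onboarding": ["looks useful, but this feels too complicated for a small team", "  "]}
records = to_records(
    sample,
    "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
    collected_on="2025-01-01",
)
print(json.dumps(records, indent=2))
```

Calling `append_jsonl(records, "reply_quotes.jsonl")` after each run builds up a file you can search later when writing copy or interview guides.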

Where This Is Better Than Interviews

This is better than user interviews when:

you need fast raw language
you are early and need hypothesis fuel
you want to study a competitor category before reaching out to users
you want public objections at scale

It is especially useful before writing:

landing page copy
pricing FAQs
comparison pages
ad hooks
onboarding messaging

Where Interviews Still Win

This is where I want to be very clear.

YouTube replies are not a replacement for talking to users.

Interviews are still much better when you need:

follow-up questions
segment-specific nuance
willingness-to-pay signals
workflow details that people do not write publicly
actual product validation

Think of reply analysis as the thing that helps you ask better interview questions.

Not the thing that replaces interviews.

Honest Alternatives

There are three realistic options here.

The official YouTube API

Good if you are already comfortable inside Google's ecosystem and only need YouTube.
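
For comparison, here is roughly what the official route looks like. The sketch assumes the YouTube Data API v3 `commentThreads.list` endpoint with `part=snippet,replies` and an API key from your own Google Cloud project; the helper names are made up for this example. Note that the replies embedded in each thread can be a subset, so a thread whose `totalReplyCount` is larger than the replies returned needs a follow-up call to `comments.list` with `parentId`.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_URL = "https://www.googleapis.com/youtube/v3/commentThreads"


def build_url(video_id, api_key, page_token=None):
    """Build a commentThreads.list request URL for one page of results."""
    params = {
        "part": "snippet,replies",
        "videoId": video_id,
        "maxResults": 100,
        "textFormat": "plainText",
        "key": api_key,
    }
    if page_token:
        params["pageToken"] = page_token
    return f"{API_URL}?{urlencode(params)}"


def extract_threads(payload):
    """Keep only threads that have replies, with the text of each reply."""
    threads = []
    for item in payload.get("items", []):
        snippet = item["snippet"]
        replies = item.get("replies", {}).get("comments", [])
        if not replies:
            continue
        threads.append({
            "topLevelComment": snippet["topLevelComment"]["snippet"]["textDisplay"],
            "totalReplyCount": snippet.get("totalReplyCount", len(replies)),
            "replies": [reply["snippet"]["textDisplay"] for reply in replies],
        })
    return threads


def fetch_threads(video_id, api_key):
    """Follow nextPageToken until every page of threads has been read."""
    threads, page_token = [], None
    while True:
        with urlopen(build_url(video_id, api_key, page_token), timeout=30) as response:
            payload = json.load(response)
        threads.extend(extract_threads(payload))
        page_token = payload.get("nextPageToken")
        if not page_token:
            return threads
```

Something like `fetch_threads("dQw4w9WgXcQ", api_key)` would return the same thread shape used earlier in this post, so the tagging step does not need to change.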

Manual reading

Good for quick one-off research.

Bad if you want something repeatable or cross-video.

Public data workflow with SociaVault

Best when you want a simpler API flow and expect this research to become part of a larger stack.

That is usually where I land, because I rarely want a one-platform-only solution for very long.

Final Take

Top-level comments tell you the reaction.

Replies tell you the reasoning.

That is why I keep using them for product research, competitor analysis, copywriting, and objection mining.

If you want a practical way to turn YouTube reply threads into reusable research, SociaVault gives you the public data layer so you can spend your time on analysis instead of collection.

Start small. Pick five videos. Tag the replies. Save the exact phrases.

You will usually get better language out of that exercise than out of a week of guessing.
