Introducing CommentIQ: On-Device AI for YouTube Comment Analysis

#showdev #webdev #javascript #ai

CommentIQ is a Chrome extension that reads the comments section of any YouTube video and runs sentiment analysis, topic clustering, and question extraction — entirely on your device using Gemini Nano. No API calls, no backend, nothing leaves the browser. This is a writeup of the parts that actually gave us trouble.

Getting comments without an API

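The YouTube Data API can return comments, but it needs an API key and is subject to a quota, which means either running a backend or embedding a key in the extension. Neither fits a tool that is supposed to run entirely on-device, so we read comments from the page itself.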

YouTube is built on Polymer with custom elements: ytd-comment-renderer, ytd-comment-thread-renderer and friends. They're not plain HTML. The text you want sits inside a #content-text element nested several layers deep in each renderer. Querying for it is straightforward once you know the selector; the hard part is knowing when to query.
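A minimal sketch of the extraction step, assuming the renderer and #content-text names above still match the live page. The function name extractCommentTexts and the root parameter are ours, for illustration only; taking the root as an argument keeps the function easy to exercise against a stub.

```javascript
// Collect the visible text of every comment under `root`.
// `root` is anything with querySelectorAll (normally `document`).
function extractCommentTexts(root) {
  const nodes = root.querySelectorAll('ytd-comment-thread-renderer #content-text');
  return Array.from(nodes)
    .map((node) => (node.textContent || '').trim())
    // Threads that have not rendered yet have empty text; skip them.
    .filter((text) => text.length > 0);
}
```

In the content script this would be called as extractCommentTexts(document) once the threads have rendered.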

Comments don't exist in the DOM when the page loads. YouTube lazy-loads them as the user scrolls toward the bottom of the page. We initially tried waiting for a fixed delay after navigation — fragile, obviously — before switching to a MutationObserver watching ytd-comments:

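// Watch the comments container if it is already attached, otherwise the whole body.
const root = document.querySelector('ytd-comments') ?? document.body;
const observer = new MutationObserver(() => {
  const threads = document.querySelectorAll('ytd-comment-thread-renderer');
  if (threads.length > 0) {
    observer.disconnect(); // stop once the first batch of threads appears
    extractAndAnalyze(threads);
  }
});
observer.observe(root, { childList: true, subtree: true });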

This worked until we hit YouTube's intersection-observer-based lazy render: threads exist in the DOM but their inner content isn't populated until they scroll into view. We ended up triggering a programmatic scroll to force the first batch to render, then immediately scrolling back. Hacky, but reliable.
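The scroll nudge could look roughly like the sketch below. The function name, the win parameter, and the short settleMs pause before scrolling back are our assumptions; the version described above scrolls back immediately, so treat the delay as a tunable rather than part of the original approach.

```javascript
// Scroll the comments container into view to trigger YouTube's lazy render,
// then restore the user's original scroll position. `win` is normally `window`.
async function nudgeCommentsIntoView(win, { settleMs = 300 } = {}) {
  const originalY = win.scrollY;
  const target = win.document.querySelector('ytd-comments');
  if (!target) return false; // comments container not attached yet

  target.scrollIntoView({ block: 'start' });
  // Assumption: give the intersection observer a moment to fire before jumping back.
  await new Promise((resolve) => setTimeout(resolve, settleMs));
  win.scrollTo(0, originalY);
  return true;
}
```

Returning false when the container is missing lets the caller retry from the MutationObserver callback instead of scrolling blindly.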

YouTube is a SPA and it will surprise you

The bigger navigation problem: YouTube never does a full page load between videos. It's a single-page app using the History API. Your content script loads once per tab and keeps running as the user clicks from video to video.

chrome.webNavigation.onHistoryStateUpdated fires for each video navigation, but it is only available in the extension's background context, so the event has to be forwarded to the content script, and it fires before the new page's DOM is ready. We had to debounce and re-run the observer setup on each navigation event, making sure to cancel any in-flight analysis from the previous video first.
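One way to combine the debounce with cancellation is sketched here using AbortController. The names createNavigationHandler and analyze, and the 500 ms default, are illustrative assumptions rather than the extension's actual code.

```javascript
// Returns a handler to call on every navigation event. It waits `delayMs`
// for the burst of events to settle, aborts the previous run, then starts
// `analyze(videoId, signal)` for the latest video only.
function createNavigationHandler(analyze, delayMs = 500) {
  let timer = null;
  let controller = null;

  return function onNavigate(videoId) {
    clearTimeout(timer);
    if (controller) controller.abort(); // cancel any in-flight analysis
    controller = new AbortController();
    const { signal } = controller;
    timer = setTimeout(() => analyze(videoId, signal), delayMs);
  };
}
```

The analysis code is expected to check signal.aborted between steps (or pass the signal on to anything that accepts one) so a stale run stops writing results for a video the user has already left.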

We also ran into the extension popup losing its connection to the content script mid-session. The pattern that worked: keep a port open between popup and content script, detect onDisconnect, and show a "reload the page" nudge rather than silently failing.
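A sketch of the popup side of that pattern, assuming the content script listens with chrome.runtime.onConnect. The port name, the function name, and the wording of the message are placeholders of ours.

```javascript
// Popup side: open a long-lived port to the content script in `tabId`.
// `tabsApi` is normally `chrome.tabs`; `onLost` updates the popup UI.
function connectToContentScript(tabsApi, tabId, onLost) {
  const port = tabsApi.connect(tabId, { name: 'commentiq-popup' }); // name is illustrative
  port.onDisconnect.addListener(() => {
    // Fires when the other end goes away, or when no listener existed at all.
    onLost('Lost connection to the page. Reload the tab and try again.');
  });
  return port;
}
```

Passing chrome.tabs in as an argument is only there to make the function testable; in the popup it would be called as connectToContentScript(chrome.tabs, tab.id, showNudge).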

Feeding Gemini Nano without hitting the context limit

Chrome's built-in Prompt API (via window.ai.languageModel) has a limited per-session context window. A popular video can have thousands of comments; sending all of them at once overflows it immediately.

We first check token capacity before each request:

const session = await window.ai.languageModel.create();
const available = session.tokensLeft;

Then we batch comments into chunks that each fit comfortably under the limit, run analysis on each chunk, and merge results. For sentiment this means averaging scores across chunks. For topic clustering we run a second summarisation pass over the chunk outputs rather than the raw comments.
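The batching and the sentiment merge can be sketched as two small pure functions. Two assumptions to flag: the 4-characters-per-token estimate is a rough heuristic of ours, not something the Prompt API guarantees, and the merge weights each chunk by its comment count, which is a variation on the plain average described above.

```javascript
// Rough token estimate. The 4-characters-per-token ratio is an assumption.
const estimateTokens = (text) => Math.ceil(text.length / 4);

// Greedily pack comments into chunks whose estimated size stays under `budget`.
function chunkComments(comments, budget) {
  const chunks = [];
  let current = [];
  let used = 0;
  for (const comment of comments) {
    const cost = estimateTokens(comment);
    if (current.length > 0 && used + cost > budget) {
      chunks.push(current);
      current = [];
      used = 0;
    }
    current.push(comment); // an oversized comment still gets a chunk of its own
    used += cost;
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}

// Average per-chunk sentiment, weighting each chunk by its comment count.
// Each entry looks like { count, sentiment: { positive, neutral, negative } }.
function mergeSentiment(results) {
  const total = results.reduce((sum, r) => sum + r.count, 0);
  const merged = { positive: 0, neutral: 0, negative: 0 };
  if (total === 0) return merged;
  for (const r of results) {
    for (const key of Object.keys(merged)) {
      merged[key] += (r.sentiment[key] * r.count) / total;
    }
  }
  return merged;
}
```

Weighting by count keeps a short final chunk from skewing the overall score; with equally sized chunks it reduces to the plain average.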

The model also needs to be downloaded before first use. window.ai.languageModel.capabilities() resolves to an object whose available field is 'readily', 'after-download', or 'no'. When it's 'after-download' we show a progress indicator and poll until it flips to 'readily'. When it's 'no', typically because the device doesn't meet the hardware requirements, we surface a clear message rather than failing silently.
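The availability handling can be sketched as a small polling loop. The helper names, the 2-second interval, and the message strings are our own; getCapabilities stands in for a call to window.ai.languageModel.capabilities().

```javascript
// Map the availability value to what the UI should show.
function describeAvailability(available) {
  switch (available) {
    case 'readily':
      return { ready: true, message: '' };
    case 'after-download':
      return { ready: false, message: 'Downloading the on-device model...' };
    default:
      return { ready: false, message: 'This device cannot run the on-device model.' };
  }
}

// Poll `getCapabilities` until the model is ready or known to be unsupported.
// Resolves to true when ready, false when the device cannot run the model.
async function waitForModel(getCapabilities, onStatus, intervalMs = 2000) {
  for (;;) {
    const { available } = await getCapabilities();
    const status = describeAvailability(available);
    onStatus(status);
    if (status.ready) return true;
    if (available !== 'after-download') return false;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

A caller might use it as waitForModel(() => window.ai.languageModel.capabilities(), renderStatus), where renderStatus is whatever updates the popup.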

Structuring the output

Gemini Nano returns free text, and nothing guarantees it will be valid JSON. We prompt for structured output explicitly and then parse it, with a fallback for when the model goes off-script:

const prompt = `
Analyze these YouTube comments and return ONLY valid JSON with this shape:
{ "sentiment": { "positive": 0-100, "neutral": 0-100, "negative": 0-100 },
  "topics": ["string", ...],
  "questions": ["string", ...],
  "ideas": ["string", ...] }

Comments:
${batch}
`;
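// Keep the raw text so the fallback has something to work with.
const rawOutput = await session.prompt(prompt);
let result;
try {
  result = JSON.parse(rawOutput);
} catch {
  result = fallbackParse(rawOutput);
}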

The fallback uses regex to pull numbers and lists out of whatever text the model returned. It covers about 95% of the cases where strict JSON parsing fails.
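The original fallbackParse isn't shown here, so the following is a hypothetical version of the same idea, not the shipped code. It first retries JSON.parse on the outermost {...} span, which handles JSON wrapped in prose or code fences, and only then scrapes numbers and quoted list items with regex.

```javascript
// Hypothetical fallback for model output that is not strict JSON.
function fallbackParse(raw) {
  const start = raw.indexOf('{');
  const end = raw.lastIndexOf('}');
  if (start !== -1 && end > start) {
    try {
      return JSON.parse(raw.slice(start, end + 1));
    } catch {
      // Not valid JSON even after trimming; fall through to regex scraping.
    }
  }

  // First 1-3 digit number within a few non-digit characters of the label.
  const score = (label) => {
    const match = raw.match(new RegExp(`${label}\\D{0,10}(\\d{1,3})`, 'i'));
    return match ? Math.min(100, Number(match[1])) : 0;
  };
  // Quoted strings inside `label: [ ... ]`, tolerating a missing closing bracket.
  const list = (label) => {
    const match = raw.match(new RegExp(`${label}"?\\s*:\\s*\\[([^\\]]*)`, 'i'));
    if (!match) return [];
    return Array.from(match[1].matchAll(/"([^"]*)"/g), (m) => m[1]);
  };

  return {
    sentiment: {
      positive: score('positive'),
      neutral: score('neutral'),
      negative: score('negative'),
    },
    topics: list('topics'),
    questions: list('questions'),
    ideas: list('ideas'),
  };
}
```

Missing fields come back as 0 or an empty list rather than throwing, so the merge step downstream always receives the same shape.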

Shipping without a backend

Like Margin, we wanted zero server infrastructure. Everything is stored in chrome.storage.local — analysis results are cached per video ID so re-opening the extension on the same video is instant.
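A per-video cache along these lines is sketched below, assuming the promise-based chrome.storage.local available to Manifest V3 extensions. The key format, the savedAt field, and the function names are illustrative choices of ours.

```javascript
// Key format is illustrative; any stable per-video key works.
const cacheKey = (videoId) => `analysis:${videoId}`;

// `storage` is normally `chrome.storage.local`.
// Returns the cached result if present, otherwise runs `analyze` and stores it.
async function getOrAnalyze(storage, videoId, analyze) {
  const key = cacheKey(videoId);
  const stored = await storage.get(key);
  if (stored[key]) return stored[key].result;

  const result = await analyze(videoId);
  await storage.set({ [key]: { result, savedAt: Date.now() } });
  return result;
}
```

Storing a timestamp next to the result leaves room for expiring old entries later, since comment sections keep changing after the first analysis.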

The only outbound request CommentIQ ever makes is to Lemon Squeezy's public License API for key validation on the Pro tier. No secret is embedded in the extension; the API is designed to be called client-side. The license check runs once at activation and the result is cached locally — subsequent launches don't hit the network at all.
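The activate-once-then-cache flow might look like the sketch below. The endpoint URL, the license_key and instance_name form fields, and the activated response field are our recollection of Lemon Squeezy's License API and should be checked against its documentation; the storage key and function name are hypothetical.

```javascript
// Assumed endpoint; verify against the Lemon Squeezy License API docs.
const ACTIVATE_URL = 'https://api.lemonsqueezy.com/v1/licenses/activate';

// `fetchFn` is normally `fetch`; `storage` is normally `chrome.storage.local`.
async function activateLicense(fetchFn, storage, licenseKey, instanceName) {
  const cached = await storage.get('license');
  // Later launches return here without touching the network.
  if (cached.license && cached.license.activated) return cached.license;

  const response = await fetchFn(ACTIVATE_URL, {
    method: 'POST',
    headers: { Accept: 'application/json' },
    // Field names are assumptions, see the note above.
    body: new URLSearchParams({ license_key: licenseKey, instance_name: instanceName }),
  });
  const data = await response.json();
  const license = { activated: Boolean(data.activated), checkedAt: Date.now() };
  if (license.activated) await storage.set({ license });
  return license;
}
```

Only successful activations are cached, so a mistyped key can be retried without clearing storage.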

For the Chrome Web Store review we had to document this data flow explicitly. The reviewers flagged the activeTab and scripting permissions as potentially broad; the response was a clear explanation in the privacy policy that we only read from ytd-comment-renderer elements on youtube.com/* and never transmit that data.

What we'd do differently

The multi-pass chunking adds latency that's noticeable on videos with dense comment sections. A streaming UI — showing results as each chunk completes rather than waiting for the full merge — would hide this better. It's on the list.
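A sketch of how that streaming variant could be structured, using an async generator that yields a running merge after every chunk. This is not implemented in the extension; analyzeChunk and merge are placeholders for the real per-chunk analysis and merge steps.

```javascript
// Yields a progress update with the merged result so far after each chunk,
// instead of returning one final result at the end.
async function* analyzeIncrementally(chunks, analyzeChunk, merge) {
  const partials = [];
  for (let i = 0; i < chunks.length; i += 1) {
    partials.push(await analyzeChunk(chunks[i]));
    yield { completed: i + 1, total: chunks.length, result: merge(partials) };
  }
}
```

The popup would consume it with a for await loop and re-render on each update, so the first results appear after one chunk rather than after the whole merge.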

The other thing: we underestimated how much YouTube's DOM changes between A/B test variants. The comment container selector broke twice during development because YouTube was testing different layouts. A more defensive selector strategy with fallbacks would save debugging time.
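A defensive lookup could be as simple as trying an ordered list of selectors and reporting which one matched. Only the first selector below comes from this writeup; the other two are illustrative guesses at alternate layouts, not selectors we have verified.

```javascript
// Ordered from most to least specific.
const THREAD_SELECTORS = [
  'ytd-comment-thread-renderer',
  'ytd-comments #contents > *', // illustrative fallback
  '#comments #contents > *', // illustrative fallback
];

// Try each selector in turn and return the first non-empty match,
// along with the selector that produced it.
function queryWithFallbacks(root, selectors = THREAD_SELECTORS) {
  for (const selector of selectors) {
    const found = root.querySelectorAll(selector);
    if (found.length > 0) return { selector, elements: Array.from(found) };
  }
  return { selector: null, elements: [] };
}
```

Returning the matching selector makes it easy to log when the primary one stops working, which is usually the first sign that a new layout variant is being tested.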

CommentIQ is live on the Chrome Web Store — free to install, no account required. Questions or feedback welcome in the comments.