TLDR: The default 10,000 unit/day quota will burn through in fewer than 30 naive user sessions. Three tricks pulled my per-user cost down ~50× and let me ship TubeVocab on the free tier.
When I started building TubeVocab — an ESL learning tool that turns any YouTube video into a clickable, interactive transcript for vocabulary learning — I assumed the YouTube Data API v3 would be the cheap, easy part. "It's Google. It scales. The free tier is generous." That kind of gut feeling.
I was wrong. The free tier is generous, but only if you understand how quota math actually works. Most public tutorials skip this. Here's what I learned the hard way.
The quota arithmetic nobody puts in the quickstart
Default daily quota: 10,000 units. Sounds like a lot.
Then you start reading the cost table and realize:
- `search.list` — 100 units per call. That's how you find a video by query.
- `videos.list` — 1 unit per call. That's how you fetch metadata once you have an ID.
- `captions.list` — 50 units. Lists the caption tracks available for a video.
- `captions.download` — 200 units. The actual subtitle data.
If your user-facing flow is "search a YouTube channel → pick a video → load subtitles → render the interactive player," you're looking at roughly 100 + 1 + 50 + 200 = 351 units per single user session. The 10,000 free units evaporate in 28 sessions/day.
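The arithmetic, spelled out (unit costs from the table above):

```javascript
// Per-call quota costs from the YouTube Data API v3 cost table
const COST = { searchList: 100, videosList: 1, captionsList: 50, captionsDownload: 200 };

// Naive "search → metadata → list captions → download captions" session
const perSession =
  COST.searchList + COST.videosList + COST.captionsList + COST.captionsDownload;

const DAILY_QUOTA = 10_000;
const sessionsPerDay = Math.floor(DAILY_QUOTA / perSession);

console.log(perSession);     // 351 units per session
console.log(sessionsPerDay); // 28 sessions before the quota runs out
```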
That's not a side project. That's a 30-DAU launch and you're paying for quota expansion the next morning.
Three tricks that cut my per-user cost ~50×
1. Don't use search.list for known IDs
This sounds obvious in hindsight but it took me a week to see. If a user pastes a YouTube URL, the video ID is right there in the URL. Parse it. Skip search.list entirely.
```javascript
// Bad: 100 units per pasted URL
const searchResult = await youtube.search.list({ q: pastedUrl, type: 'video', part: 'snippet' });

// Good: 0 units — regex the video ID straight out of the URL
const id = pastedUrl.match(/(?:v=|youtu\.be\/)([\w-]{11})/)?.[1];
const videoResult = await youtube.videos.list({ id, part: 'snippet,contentDetails' }); // 1 unit
```
This one change took the average pasted-URL flow from 351 units → 251 units.
2. Skip the official captions.* endpoints entirely
The captions.download endpoint costs 200 units per video AND requires OAuth (the user has to be the video owner). For non-owner subtitle access — i.e. the actual ESL use case — you need a different path.
The trick: YouTube serves the auto-generated and uploader-provided subtitles through an undocumented but stable XML endpoint that doesn't count against your quota at all. You can get the timed transcript via https://video.google.com/timedtext?lang=en&v=VIDEO_ID, parse the XML, and you're done. 0 quota units.
(Caveat: this endpoint is undocumented, so it can break. I have a fallback path that uses youtube-transcript-api style scraping. The combined approach gets ~95% subtitle hit rate without touching the official caption quota.)
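The parsing side is tiny. Here's a sketch of the zero-quota path — `parseTimedText` and `fetchTranscript` are names I'm using for illustration, and the XML shape (`<text start="…" dur="…">`) is what the endpoint has historically returned, not anything documented, so treat it as fragile:

```javascript
// Parse the timedtext XML into { start, dur, text } cues.
// Shape assumed: <text start="0.5" dur="2.1">escaped cue text</text>
function parseTimedText(xml) {
  const cues = [];
  const re = /<text start="([\d.]+)"(?: dur="([\d.]+)")?[^>]*>([\s\S]*?)<\/text>/g;
  let m;
  while ((m = re.exec(xml)) !== null) {
    cues.push({
      start: parseFloat(m[1]),
      dur: m[2] ? parseFloat(m[2]) : 0,
      // Un-escape the XML entities in cue text (&amp; last, so
      // double-escaped sequences like &amp;lt; aren't over-decoded)
      text: m[3]
        .replace(/&lt;/g, '<')
        .replace(/&gt;/g, '>')
        .replace(/&#39;/g, "'")
        .replace(/&quot;/g, '"')
        .replace(/&amp;/g, '&'),
    });
  }
  return cues;
}

async function fetchTranscript(videoId, lang = 'en') {
  const res = await fetch(`https://video.google.com/timedtext?lang=${lang}&v=${videoId}`);
  const xml = await res.text();
  return parseTimedText(xml); // empty array ⇒ fall through to the scraping path
}
```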
After this, my "load subtitles" cost dropped from 250 → 1 unit per session.
3. Cache aggressively at the video-ID level
Every time someone watches a video on TubeVocab, the metadata + subtitle + thumbnail set is the same until the video itself changes. I run a per-video-ID cache (just SQLite — overkill is fine) with no expiry. Subsequent views of the same video cost zero quota, regardless of how many users watch it.
Once I had ~500 popular videos cached, my marginal cost per session was effectively zero. The quota is now spent only on first-time-seen videos.
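The cache itself is nothing clever — here's its shape, sketched with an in-memory `Map` for brevity (the real thing is the same keying logic over SQLite, and `fetcher` stands in for whatever quota-costing API calls you wrap):

```javascript
// Per-video-ID cache with no expiry: the first request pays quota,
// every later request — from any user — is free.
const cache = new Map();

async function getVideoData(videoId, fetcher) {
  if (cache.has(videoId)) return cache.get(videoId); // 0 quota units
  const data = await fetcher(videoId); // pays quota once per video, ever
  cache.set(videoId, data);
  return data;
}
```

The "no expiry" part is a deliberate design choice: video metadata and subtitles almost never change after upload, so staleness is a non-problem compared to quota.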
What actually shipped
After these three optimizations:
- Average new-video session: ~2 units (videos.list + occasional fallback)
- Average cached-video session: 0 units
- Daily ceiling on the free tier: ~5,000 unique new videos/day before I'd need to start budgeting
That's enough headroom for the foreseeable lifetime of a side project.
If you're building anything in the YouTube + content-analysis space — vocabulary tools, accessibility, search, analytics — the playbook is roughly: assume search.list is poison, route around captions.*, and cache by video ID forever. The free tier becomes more than generous once you stop fighting it.
For context: I built TubeVocab using exactly this stack — it's a click-to-flashcard ESL tool that turns any YouTube video into vocabulary practice. The quota math was the single most underestimated technical risk of the whole project. Hope this saves someone a week.