Kelvin

Posted on Mar 18

How to Extract YouTube Transcripts for Your AI App (No API Key Needed)

#ai #api #rag #showdev

If you're building an AI app that needs to process video content — RAG pipelines, summarizers, content analyzers — you've probably hit the same wall I did: getting transcripts from YouTube is way harder than it should be.

YouTube's official Data API requires OAuth setup, caps usage at a default quota of 10,000 units/day, and doesn't return transcripts directly. You'd need to chain multiple endpoints together just to get subtitle text. For a simple "give me the transcript" use case, that's overkill.

I built a lightweight API that solves this in one request.

What It Does

The YouTube Transcript API extracts transcripts from any YouTube video that has captions, with no YouTube API key, no OAuth, and no quota headaches. It works with both manual captions and auto-generated subtitles.

Seven endpoints, one API key:

/transcript — Timestamped segments in JSON
/transcript/text — Plain text, perfect for feeding into LLMs
/transcript/srt — Standard SRT subtitle format
/transcript/vtt — WebVTT format
/languages — List all available subtitle languages
/video-info — Video metadata (title, author, duration, views, thumbnail)
/batch — Fetch up to 5 transcripts in one request

All endpoints accept full YouTube URLs — no need to parse video IDs yourself.

Quick Start: Get a Transcript in One Request

Python

import requests

response = requests.get(
    "https://youtube-transcript-api14.p.rapidapi.com/transcript/text",
    params={"videoId": "dQw4w9WgXcQ"},
    headers={
        "X-RapidAPI-Key": "YOUR_API_KEY",
        "X-RapidAPI-Host": "youtube-transcript-api14.p.rapidapi.com"
    }
)

transcript = response.text
# Feed this directly into your LLM

JavaScript

const response = await fetch(
  'https://youtube-transcript-api14.p.rapidapi.com/transcript/text?videoId=dQw4w9WgXcQ',
  {
    headers: {
      'X-RapidAPI-Key': 'YOUR_API_KEY',
      'X-RapidAPI-Host': 'youtube-transcript-api14.p.rapidapi.com'
    }
  }
);

const transcript = await response.text();

That's it. No OAuth flow, no token refresh, no quota management.
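
For anything beyond a quick test, it probably helps to wrap that call so HTTP errors surface instead of being passed along as "transcript" text. Here is a minimal sketch, assuming the same endpoint and headers as the quick start. The function name `fetch_transcript_text`, the `session` parameter, and the 10-second timeout are my own illustrative choices, not part of the API:

```python
API_HOST = "youtube-transcript-api14.p.rapidapi.com"


def fetch_transcript_text(video_id, api_key, session=None, timeout=10):
    """Return the plain-text transcript for one video, or raise on an HTTP error.

    `session` is any object with a requests-style .get() method; passing one in
    lets you reuse connections (or substitute a fake in tests).
    """
    if session is None:
        import requests  # third-party; only needed for real network calls
        session = requests.Session()

    response = session.get(
        f"https://{API_HOST}/transcript/text",
        params={"videoId": video_id},
        headers={"X-RapidAPI-Key": api_key, "X-RapidAPI-Host": API_HOST},
        timeout=timeout,  # hypothetical default; tune for your app
    )
    # Raise on 4xx/5xx so an error body is never mistaken for a transcript.
    response.raise_for_status()
    return response.text
```

Calling `fetch_transcript_text("dQw4w9WgXcQ", "YOUR_API_KEY")` should then either return the transcript or raise, which is usually easier to handle than checking the text for error messages.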

Use Case: Build a Video Q&A Bot

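Here's a practical example: a script that grabs a video's metadata and transcript, then builds an LLM prompt from them. This one asks for a summary; for a Q&A bot, swap the instruction for the user's question: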

import requests

API_KEY = "YOUR_RAPIDAPI_KEY"
HEADERS = {
    "X-RapidAPI-Key": API_KEY,
    "X-RapidAPI-Host": "youtube-transcript-api14.p.rapidapi.com"
}

# Step 1: Get video info
info = requests.get(
    "https://youtube-transcript-api14.p.rapidapi.com/video-info",
    params={"videoId": "dQw4w9WgXcQ"},
    headers=HEADERS
).json()

print(f"Title: {info['title']}")
print(f"Author: {info['author']}")

# Step 2: Get plain text transcript
transcript = requests.get(
    "https://youtube-transcript-api14.p.rapidapi.com/transcript/text",
    params={"videoId": "dQw4w9WgXcQ"},
    headers=HEADERS
).text

# Step 3: Feed into your LLM
prompt = f"""Summarize this video transcript in 3 bullet points:

{transcript}"""

# Pass `prompt` to OpenAI, Claude, Gemini, or any LLM
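
Long videos can produce transcripts that exceed an LLM's context window, so in a RAG pipeline you would typically split the text before embedding or prompting. The sketch below is one simple way to do that. The function name `chunk_transcript` and the default sizes are assumptions for illustration, and word counts are only a rough stand-in for model tokens:

```python
def chunk_transcript(text, max_words=800, overlap=100):
    """Split text into word-based chunks; consecutive chunks share `overlap` words."""
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")

    words = text.split()
    step = max_words - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this chunk already reaches the end of the transcript
    return chunks
```

Each chunk can then be summarized or embedded on its own. The overlap keeps a sentence that straddles a boundary visible in both neighboring chunks.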

Use Case: Batch Process a Playlist

Need transcripts from multiple videos? The /batch endpoint handles up to 5 at once:

response = requests.get(
    "https://youtube-transcript-api14.p.rapidapi.com/batch",
    params={
        "videoIds": "dQw4w9WgXcQ,jNQXAC9IVRw,9bZkp7q19f0",
        "lang": "en"
    },
    headers=HEADERS
)

results = response.json()
for result in results:
    if result.get("segments"):
        print(f"Video {result['videoId']}: {len(result['segments'])} segments")
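
Most playlists have more than 5 videos, so the IDs need to be split into groups before calling /batch. A minimal sketch, assuming you have already collected the playlist's video IDs some other way (none of the endpoints listed above returns playlist contents). The helper name `batch_params` is hypothetical:

```python
def batch_params(video_ids, lang="en", batch_size=5):
    """Yield one `params` dict per /batch request, each covering up to batch_size IDs."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(video_ids), batch_size):
        group = video_ids[start:start + batch_size]
        # Same comma-separated format as the `videoIds` parameter shown above.
        yield {"videoIds": ",".join(group), "lang": lang}
```

Each yielded dict can be passed as `params=` to the same `requests.get` call shown above, one request per group.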

Use Case: Download Subtitles for Video Editing

If you need subtitle files instead of raw text:

# Get SRT format
srt = requests.get(
    "https://youtube-transcript-api14.p.rapidapi.com/transcript/srt",
    params={"videoId": "dQw4w9WgXcQ"},
    headers=HEADERS
).text

with open("subtitles.srt", "w", encoding="utf-8") as f:  # explicit UTF-8 so non-ASCII subtitles save correctly on any OS
    f.write(srt)

# Works with VLC, Premiere Pro, Final Cut, DaVinci Resolve
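
Before importing a subtitle file into an editor, it can be useful to sanity-check it. The sketch below parses standard SRT text (cue number, `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing line, then text) into tuples. It assumes the endpoint returns conventional SRT, and `parse_srt` is a name I made up for this example:

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:03,500".
_TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def _seconds(hours, minutes, seconds, millis):
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def parse_srt(srt_text):
    """Return a list of (start_seconds, end_seconds, text) cues from SRT text."""
    cues = []
    # Cues are separated by blank lines; normalize Windows line endings first.
    blocks = re.split(r"\n\s*\n", srt_text.replace("\r\n", "\n").strip())
    for block in blocks:
        lines = block.split("\n")
        for i, line in enumerate(lines):
            match = _TIMING.search(line)
            if match:
                parts = match.groups()
                text = " ".join(part.strip() for part in lines[i + 1:])
                cues.append((_seconds(*parts[:4]), _seconds(*parts[4:]), text))
                break  # one timing line per cue
    return cues
```

For example, `len(parse_srt(srt))` gives a quick cue count, and the last cue's end time should land close to the video's duration.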

Pricing

There's a free tier to test with:

Plan	Price	Requests
Basic	Free	100/month
Pro	$5/month	3,000/month
Ultra	$15/month	20,000/month

When to Use This vs. youtube-transcript-api (Python Library)

The popular youtube-transcript-api Python library is great for local scripts. But if you're building a production app, you'll run into issues:

Rate limiting: YouTube can block your server's IP once traffic reaches roughly 100 requests/minute
Maintenance: YouTube changes its frontend regularly, which breaks scrapers until they are patched
Deployment: The library is one more dependency to install, pin, and keep updated in your stack

An API handles all of this for you — proxy rotation, YouTube changes, uptime — so you can focus on your app logic.

DEV Community