DEV Community

Walnut Serv
Walnut Serv

Posted on

How to Fetch YouTube Transcripts for AI Summarization and RAG

If you're building AI apps that summarize YouTube videos, power RAG over video content, or generate subtitles, you need reliable transcript text — not browser scraping that breaks every other week.

This guide shows a simple REST approach that returns JSON with timestamps, plain text, or raw timed cues.

Why transcripts matter for AI

  • Summarization — feed transcript text to GPT/Claude instead of sending video
  • RAG — chunk timed cues into a vector database for semantic search
  • Accessibility — build caption tools without manual SRT editing
  • Content indexing — search across a channel's spoken content

Quick start with curl

curl "https://get-youtube-transcript.p.rapidapi.com/transcript?video_id=jNQXAC9IVRw&format=json" \
  -H "X-RapidAPI-Key: YOUR_KEY" \
  -H "X-RapidAPI-Host: get-youtube-transcript.p.rapidapi.com"
Enter fullscreen mode Exit fullscreen mode

Replace YOUR_KEY with a key from the YouTube Transcript API on RapidAPI. The Basic plan includes 100 free requests/month.

Python example

import requests

API_URL = "https://get-youtube-transcript.p.rapidapi.com/transcript"
headers = {
    "X-RapidAPI-Key": "YOUR_KEY",
    "X-RapidAPI-Host": "get-youtube-transcript.p.rapidapi.com",
}
params = {"video_id": "jNQXAC9IVRw", "format": "json"}

data = requests.get(API_URL, headers=headers, params=params, timeout=60).json()

for cue in data["transcript"]:
    print(f"[{cue['start']:.1f}s] {cue['text']}")
Enter fullscreen mode Exit fullscreen mode

Response formats

format Use case
json LLM pipelines, metadata + timestamps
text Simple summarization
raw Subtitle/SRT workflows

Parameters

  • video_id — 11-char YouTube ID or pass url with full YouTube link
  • languages — comma-separated codes, e.g. en,pt

Production tip

For batch jobs (indexing hundreds of videos), upgrade to the Ultra plan on RapidAPI — 100k requests/month at $9. Test in the playground first with any public video ID.


Disclosure: I built this API. Feedback welcome — especially on languages, latency, and batch endpoints.

Try it: https://rapidapi.com/wrt/api/get-youtube-transcript

Top comments (0)