DEV Community

Bask

SubDownload — plug YouTube into Claude, Cursor, and ChatGPT via MCP

TL;DR — SubDownload turns any YouTube video into searchable text: transcripts, AI summaries, a personal knowledge base, plus an MCP server, REST API, and Agent Skills so Claude / Cursor / ChatGPT / 40+ MCP clients can read YouTube directly. Try it — free, no signup needed for the basic tool.


The shape problem with YouTube

I run my dev work alongside Claude, Cursor, and a handful of agent loops. Every input source I rely on — code, docs, papers, RSS, my own notes — pipes into an agent. Except YouTube.

I watch a lot of long-form content for learning: founder interviews, conference talks, infra deep-dives, podcasts. The best stuff often lives in 90-minute videos. And every minute of that content is locked behind playback. My agent can't see any of it.

The standard advice is "watch less, be selective." Wrong frame. I want to consume more of this content, just not by sitting through every minute in real time, unable to do anything else.

The actual problem isn't quantity. It's that videos exist in only one shape: playback. They're not searchable, not summarizable on demand, not addressable by an agent.

What SubDownload does

It changes the shape:

  • Transcripts for any YouTube video, including ones without official captions (an AI speech-recognition fallback covers Loom recordings, conference talks, and niche uploads: the half my workflow used to lose)
  • AI summary in seconds — read the gist of a 90-minute talk in 60 seconds, decide whether it gets a real watch
  • Personal knowledge base — every video you process lands in a searchable library you own, with tags, favorites, cross-video query
  • MCP server at api.subdownload.com/mcp — plug into Claude Desktop, Cursor, ChatGPT, or any of the 40+ MCP-capable clients
  • REST API + full OpenAPI 3.1 if you want to roll your own scripts
  • Agent Skills: npx @subdown/skill@latest for one-command install

Why MCP first

I shipped MCP before a Chrome extension or a polished SaaS UI because it's the cleanest way for an agent to read videos on demand. The agent says "search this channel, fetch this transcript, summarize this part" — done. No copy-paste, no wrapper UI per host, no "open YouTube in a tab and switch back."
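
For a sense of what "fetch this transcript" looks like on the wire, here is a minimal sketch of the JSON-RPC 2.0 message an MCP client sends to invoke a tool. The `tools/call` method and the `name` / `arguments` fields come from the MCP specification; the tool name `get_transcript` and its `video_url` argument are hypothetical placeholders, not SubDownload's documented tool names. The real ones are whatever the server reports in its `tools/list` response.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids should be unique within a session


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Hypothetical tool name and argument, used only for illustration.
request = tool_call(
    "get_transcript",
    {"video_url": "https://www.youtube.com/watch?v=VIDEO_ID"},
)
print(json.dumps(request, indent=2))
```

In normal use the MCP client builds and sends this message itself, so nobody writes it by hand.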

OAuth 2.1 with Dynamic Client Registration (RFC 7591) is supported, so adding the server to Claude Desktop is one config block plus a one-time browser auth — then it stays connected.
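
Most clients handle this handshake automatically, but a rough sketch of the two pieces involved may help: the client metadata sent in an RFC 7591 registration request, and the PKCE pair (RFC 7636, S256 method) that OAuth 2.1 requires for the authorization code flow. The field names are from those RFCs. The client name, redirect URI, and the choice of grant types and auth method are assumptions for illustration, not values confirmed for SubDownload's server.

```python
import base64
import hashlib
import secrets


def registration_request(client_name: str, redirect_uri: str) -> dict:
    """Client metadata for an RFC 7591 registration request (POSTed as JSON)."""
    return {
        "client_name": client_name,
        "redirect_uris": [redirect_uri],
        "grant_types": ["authorization_code", "refresh_token"],  # assumed
        "response_types": ["code"],
        "token_endpoint_auth_method": "none",  # assumed: public client, no secret
    }


def pkce_pair() -> tuple[str, str]:
    """Return (code_verifier, code_challenge) using the S256 method."""
    verifier = secrets.token_urlsafe(32)  # 43 URL-safe characters
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    # Base64url without padding, as RFC 7636 specifies.
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge


# Hypothetical client name and loopback redirect URI.
metadata = registration_request("my-mcp-client", "http://127.0.0.1:8765/callback")
verifier, challenge = pkce_pair()
```

The registration endpoint itself is not hard-coded here: a client would normally discover it from the authorization server's metadata document.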

What this looks like in practice

A few uses I lean on weekly:

  • Find a half-remembered phrase from an interview I watched last month. I ask Claude, and in 30 seconds it lands me on the exact clip.
  • Triage a 90-minute talk. Claude reads the transcript, gives me a timestamped outline, I decide which 10 minutes to actually watch.
  • Cross-video synthesis. "What do these five founder interviews say about pricing?" — agent reads multiple transcripts, returns a comparison.
  • Caption-less videos. Loom screencasts, brand-new uploads, niche channels — AI transcription fills the gap automatically.
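
The first use case, finding a phrase and jumping to the clip, comes down to a search over timestamped transcript segments. A minimal sketch, assuming a transcript is available as a list of (start time, text) segments: the `Segment` shape and the sample lines are made up for illustration and are not SubDownload's actual response format. The `t=<seconds>s` parameter is the standard way to deep-link into a YouTube video.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds from the start of the video
    text: str


def find_phrase(segments: list[Segment], phrase: str) -> list[Segment]:
    """Case-insensitive substring match over transcript segments."""
    needle = phrase.lower()
    return [s for s in segments if needle in s.text.lower()]


def clip_url(video_id: str, start: float) -> str:
    """YouTube watch URL that starts playback at the given second."""
    return f"https://www.youtube.com/watch?v={video_id}&t={int(start)}s"


# Invented sample data; a real transcript would have many more segments.
segments = [
    Segment(0.0, "Welcome back to the show."),
    Segment(754.2, "We doubled the price and churn did not move."),
]
hits = find_phrase(segments, "doubled the price")
for hit in hits:
    print(clip_url("VIDEO_ID", hit.start))
```

A plain substring match misses phrases that straddle two segments and anything remembered only approximately, which is where handing the transcript to an agent helps more than a script.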

Try it

Free to try, no signup needed for the basic tool: subdownload.com

For MCP with Claude Desktop / Cursor / ChatGPT, sign in once for an API key and follow your client's MCP setup docs. (Intentionally not pasting a JSON config — MCP client formats keep evolving and any snippet I share will go stale within weeks.)

What I want to hear

  • If you're already running an agent loop, what video → text gap have you hit?
  • Anyone running a homemade MCP server for a video host I haven't covered? Always curious about edge cases.
  • If something breaks, ping me — I'm the only one shipping this.
