Every developer has that one side project that starts as a weekend hack and turns into something real. For me, it was a tool to bulk-extract YouTube subtitles.
It started with a simple frustration: I needed transcripts from an entire YouTube playlist for a research project. YouTube lets you copy transcripts one video at a time. I had 200 videos. That wasn't going to work.
So I built YTVidHub — a web app that extracts subtitles from YouTube videos, playlists, and entire channels in bulk. Paste a URL, pick your format (SRT, VTT, or TXT), and download everything as a ZIP.
Here's what the journey looked like from a technical perspective.
The Tech Stack
I wanted something that could ship fast, handle server-side rendering for SEO, and scale without DevOps headaches. Here's what I landed on:
- Framework: Next.js 16 (App Router)
- Language: TypeScript
- Styling: Tailwind CSS
- Database: PostgreSQL + Prisma ORM
- Auth: JWT-based (custom, no third-party provider)
- AI Features: DeepSeek API for video summarization
- Payments: Stripe
- Hosting: Vercel
- i18n: next-intl (4 languages)
Why Next.js?
Two reasons: SEO and developer experience.
A subtitle downloader tool lives and dies by organic search traffic. Server-side rendering was non-negotiable. Next.js App Router gave me that plus React Server Components, which meant I could fetch data on the server without shipping unnecessary JavaScript to the client.
The file-based routing also made it easy to create dozens of SEO landing pages without a complex routing setup.
Why not use a third-party auth provider?
Cost. As a solo developer bootstrapping with zero funding, every dollar matters. Auth0 and Clerk are great, but their free tiers have limits that would bite me as I scale. A simple JWT + Google OAuth flow took me a day to implement and costs nothing to run.
```typescript
// Simplified auth flow
const token = jwt.sign(
  { userId: user.id, email: user.email },
  process.env.JWT_SECRET,
  { expiresIn: '30d' }
);
```
Is it as feature-rich as Auth0? No. Does it work perfectly for my use case? Yes.
The Hard Part: Subtitle Extraction at Scale
The core technical challenge wasn't building the UI — it was reliably extracting subtitles from YouTube at scale.
Challenge 1: YouTube doesn't want you scraping subtitles
YouTube's official Data API doesn't provide subtitle content. You can get video metadata, but not the actual transcript text. The timedtext endpoint that serves subtitles to the YouTube player is undocumented and changes without notice.
My approach: I reverse-engineered the subtitle delivery mechanism and built a resilient fetcher that handles:
- Multiple subtitle tracks per video (manual + auto-generated)
- Language detection and selection
- Rate limiting with exponential backoff
- Graceful fallbacks when subtitles aren't available
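The rate-limiting piece deserves a closer look. This is a minimal sketch of the retry logic, not the production fetcher; the helper names and the specific delays are illustrative:

```typescript
// Delay before retry `attempt` (0-indexed), doubling each time but
// capped so a burst of 429s can't stall a job for minutes.
function backoffDelay(attempt: number, baseMs = 500, capMs = 8_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

async function fetchWithBackoff(
  url: string,
  maxAttempts = 5,
): Promise<string | null> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const res = await fetch(url);
    if (res.ok) return res.text();
    // Retry only on 429 or 5xx; other statuses are permanent failures.
    if (res.status !== 429 && res.status < 500) return null;
    await new Promise((resolve) => setTimeout(resolve, backoffDelay(attempt)));
  }
  return null; // graceful fallback: caller marks the video "no subtitles"
}
```

Returning `null` instead of throwing lets a bulk job skip one bad video without aborting the other 499.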
The key insight was that YouTube serves subtitles as XML, which needs to be parsed and converted to standard formats:
```typescript
// YouTube serves subtitles as XML like this:
// <text start="1.23" dur="4.5">Hello world</text>
function xmlToSrt(xml: string): string {
  const entries = parseXmlEntries(xml);
  return entries.map((entry, i) => {
    const start = formatTimestamp(entry.start);
    const end = formatTimestamp(entry.start + entry.duration);
    return `${i + 1}\n${start} --> ${end}\n${entry.text}\n`;
  }).join('\n');
}
```
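The `formatTimestamp` helper does the fiddly part: SRT timestamps use a comma before the milliseconds (`00:00:01,230`). This is one plausible implementation, not the production code:

```typescript
// Convert fractional seconds to an SRT timestamp: HH:MM:SS,mmm.
// Note the comma separator; VTT uses a period instead.
function formatTimestamp(seconds: number): string {
  const ms = Math.round(seconds * 1000);
  const pad = (n: number, width = 2) => String(n).padStart(width, "0");
  const h = Math.floor(ms / 3_600_000);
  const m = Math.floor(ms / 60_000) % 60;
  const s = Math.floor(ms / 1000) % 60;
  return `${pad(h)}:${pad(m)}:${pad(s)},${pad(ms % 1000, 3)}`;
}
```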
Challenge 2: Playlist expansion
A YouTube playlist URL doesn't give you all the video IDs upfront. You need to paginate through the playlist API, which returns 50 items at a time. For a channel with 2,000+ videos, that's 40+ API calls just to get the video list.
I built a streaming pipeline that starts checking subtitles as soon as the first batch of video IDs arrives, rather than waiting for the full list. This makes the UX feel much faster — users see progress immediately.
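The streaming shape maps naturally onto an async generator. This is a sketch under assumptions: `fetchPage` stands in for the real paginated playlist call, which returns up to 50 items plus an optional next-page token:

```typescript
// One page of playlist results, as returned by the (assumed) playlist API.
type Page = { videoIds: string[]; nextToken?: string };
type PageFetcher = (token?: string) => Promise<Page>;

// Yields video IDs as soon as each page arrives, so subtitle checks can
// start on batch 1 while batch 2 is still in flight.
async function* expandPlaylist(fetchPage: PageFetcher): AsyncGenerator<string> {
  let token: string | undefined;
  do {
    const page = await fetchPage(token);
    yield* page.videoIds;
    token = page.nextToken;
  } while (token);
}
```

Consumers just `for await` over the generator, and the first IDs show up after one round trip instead of forty.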
Challenge 3: Bulk downloads without killing the server
When a user requests subtitles for 500 videos, you can't process them all synchronously. I implemented a queue-based system:
- User submits playlist URL
- Server expands the playlist and returns video list
- Client requests subtitles in batches of 10
- Each batch is processed in parallel on the server
- Results are streamed back to the client
- Final ZIP is assembled client-side
This keeps server memory usage flat regardless of playlist size.
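The batching loop itself is simple. Here's a generic sketch (the batch size of 10 is from the flow above; `worker` stands in for the per-video subtitle request):

```typescript
// Split a list into fixed-size batches.
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Run `worker` over all items: each batch in parallel, batches in
// sequence, so at most `batchSize` requests are in flight at once.
async function processInBatches<T, R>(
  items: T[],
  batchSize: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = [];
  for (const batch of chunk(items, batchSize)) {
    results.push(...(await Promise.all(batch.map(worker))));
  }
  return results;
}
```

Sequential batches are the whole trick: `Promise.all` over 500 videos at once would spike memory and trip rate limits, while 50 sequential batches of 10 keep both flat.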
SEO: The Make-or-Break for Tool Sites
For a free tool targeting organic traffic, SEO isn't a nice-to-have — it's the entire growth strategy. Here's what I did:
Programmatic SEO pages
I created dedicated landing pages for different search intents:
- `/youtube-subtitle-downloader` — targets the generic search
- `/bulk-youtube-subtitle-downloader` — targets the bulk/power-user search
- `/what-is-an-srt-file` — targets informational queries
- `/guide/*` — long-form content for long-tail keywords
Each page has unique content, structured data (JSON-LD), and proper meta tags. Next.js makes this straightforward with generateMetadata:
```typescript
export async function generateMetadata({ params }): Promise<Metadata> {
  const { locale } = await params;
  const t = await getTranslations({ locale, namespace: 'metadata' });
  return {
    title: t('title'),
    description: t('description'),
    alternates: {
      canonical: buildCanonicalUrl({ locale, pathname: '' }),
      languages: {
        'en': buildCanonicalUrl({ locale: 'en', pathname: '' }),
        'es': buildCanonicalUrl({ locale: 'es', pathname: '' }),
        // ...
      },
    },
  };
}
```
Internationalization for traffic
Adding support for Spanish, German, and Korean multiplied my addressable search volume by ~3x with relatively little effort. next-intl handles the routing and message loading, and I used AI to help with translations (then had native speakers review them).
What I got wrong
I initially created too many pages targeting similar keywords. Google got confused about which page to rank, and none of them ranked well. This is called "keyword cannibalization" — a lesson I learned the hard way. I'm now consolidating pages and using 301 redirects to focus authority.
Monetization: The Credits Model
I went with a credits-based system instead of a traditional subscription:
- Free tier: 5 credits per day (1 credit = 1 video subtitle download)
- Daily reward: +3 bonus credits for returning daily
- Paid plans: Credit packs via Stripe for heavy users
Why credits instead of subscriptions? Two reasons:
- Lower barrier to entry. Users can try the tool without committing to a monthly fee. Most people only need subtitles occasionally.
- Natural upsell. When someone hits their limit mid-workflow, the friction of buying credits is much lower than subscribing. They're already invested in the task.
Stripe integration with Next.js was surprisingly smooth:
```typescript
const session = await stripe.checkout.sessions.create({
  line_items: [{ price: priceId, quantity: 1 }],
  mode: 'payment',
  success_url: `${origin}/pricing?success=true`,
  cancel_url: `${origin}/pricing?canceled=true`,
  metadata: { userId, credits: creditAmount },
});
```
The webhook handles credit allocation after payment confirmation. Simple, reliable, no subscription management complexity.
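One detail worth calling out: Stripe can deliver the same webhook event more than once, so credit allocation must be idempotent. This sketch shows the guard logic with in-memory stand-ins (a `Set` for processed event IDs, a `Map` for balances); in a real app both would live in the database inside one transaction:

```typescript
// Shape of the checkout event after signature verification (simplified).
type CheckoutEvent = {
  id: string;
  metadata: { userId: string; credits: string };
};

// Returns true if credits were granted, false if the event was a
// duplicate delivery or carried invalid metadata.
function applyCredits(
  event: CheckoutEvent,
  processed: Set<string>,
  balances: Map<string, number>,
): boolean {
  if (processed.has(event.id)) return false; // duplicate delivery: no-op
  const credits = Number(event.metadata.credits);
  if (!Number.isFinite(credits) || credits <= 0) return false;
  processed.add(event.id);
  const { userId } = event.metadata;
  balances.set(userId, (balances.get(userId) ?? 0) + credits);
  return true;
}
```

Recording the event ID before (or atomically with) the balance update is what prevents a retried webhook from granting credits twice.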
Adding AI: The Unexpected Differentiator
The original product was just a subtitle downloader. But I noticed users were downloading transcripts to feed them into ChatGPT for summaries. So I built it in.
Now users can paste a YouTube URL and get:
- An AI-generated summary of the video
- Key points extraction
- A mind map of the content structure
I used DeepSeek's API for this — it's significantly cheaper than OpenAI for long-context tasks, and the quality is excellent for summarization. The cost per summary is roughly $0.002, which makes it viable even on the free tier.
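For the curious, DeepSeek exposes an OpenAI-compatible chat completions endpoint, so the call is a plain HTTP POST. This is a sketch, not the production code: the prompt wording, truncation limit, and model choice here are illustrative:

```typescript
// Build the summarization prompt, truncating very long transcripts so
// the request stays within the model's context window.
function buildSummaryPrompt(transcript: string, maxChars = 60_000): string {
  const body =
    transcript.length > maxChars
      ? transcript.slice(0, maxChars) + "\n[transcript truncated]"
      : transcript;
  return `Summarize this video transcript as 5 bullet points:\n\n${body}`;
}

async function summarize(transcript: string, apiKey: string): Promise<string> {
  const res = await fetch("https://api.deepseek.com/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({
      model: "deepseek-chat",
      messages: [{ role: "user", content: buildSummaryPrompt(transcript) }],
    }),
  });
  const data = await res.json();
  return data.choices[0].message.content;
}
```

Because the endpoint is OpenAI-compatible, swapping providers later is a one-line base-URL change rather than a rewrite.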
This feature turned out to be the biggest differentiator. There are dozens of YouTube subtitle downloaders, but very few that combine bulk extraction with AI analysis.
Lessons Learned After 3 Months
What worked
- Solving my own problem. I built this because I needed it. That meant I understood the user deeply from day one.
- Next.js + Vercel. Zero DevOps overhead. I deploy with `git push` and never think about servers.
- Credits model. Lower friction than subscriptions for a tool that people use sporadically.
- Internationalization early. Adding i18n from the start was much easier than retrofitting it later.
What I'd do differently
- Start with fewer pages. I created 30+ SEO pages before validating which keywords actually had traffic. Should have started with 3-5 and expanded based on data.
- Build in public sooner. I spent 2 months building in silence. Sharing progress on Twitter and dev communities from week 1 would have built an audience before launch.
- Focus on backlinks from day one. On-page SEO is necessary but not sufficient. Without quality backlinks, Google won't rank you no matter how good your content is.
Numbers (honest)
- Monthly visitors: ~600 (growing slowly)
- Paying users: A handful
- Revenue: Enough for coffee, not rent
- Time invested: ~400 hours over 3 months
Is it a success? Not yet. But it's a real product that real people use, and the foundation is solid. The growth curve for SEO-driven tools is slow at first and compounds over time. I'm playing the long game.
If You're Building a Similar Tool
A few tactical tips:
Use Next.js App Router if SEO matters. The metadata API and server components are a huge advantage over client-side frameworks.
Don't over-engineer auth. JWT + OAuth is enough for most indie projects. You can always migrate to a provider later.
Ship the MVP in 2 weeks, then iterate. My biggest regret is spending too long on features nobody asked for.
Track everything from day one. Google Analytics + Search Console + Microsoft Clarity. You can't improve what you can't measure.
Write content that helps people. The best SEO strategy is genuinely useful content. Google is surprisingly good at detecting fluff.
If you're working on a side project or building in public, I'd love to connect. Drop a comment about what you're building — I read every one.
You can check out the tool at ytvidhub.com if you ever need YouTube subtitles in bulk.