DEV Community: Nic Bars

Building a YouTube Transcript Extraction Service: From Idea to 10K+ Monthly Users

Nic Bars — Thu, 10 Jul 2025 23:08:33 +0000

Hey developers! 👋

A few months ago, I got frustrated paying $50/month for simple YouTube transcript extraction tools. As a developer, I thought "how hard can this be?" - famous last words, right?

Turns out, building a robust YouTube transcript service taught me more about web scraping, API rate limits, and user experience than I expected. Here's how I built it from scratch.

The Problem I Was Solving

Most transcript tools either:

Cost way too much ($30-50/month)
Have terrible UX with ads everywhere
Don't preserve timestamps
Can't handle different languages
Break when YouTube changes their structure

I wanted something clean, fast, and free. So I built YouTubeNavigator.com

Tech Stack Overview

Frontend:

Next.js 14 (App Router)
TypeScript
Tailwind CSS
React Hook Form

Backend:

Next.js API Routes
Node.js
YouTube Transcript API
Vercel for deployment

Key Libraries:

npm install youtube-transcript
npm install get-video-id
npm install react-youtube
npm install react-hot-toast

Step 1: Understanding YouTube's Transcript System

YouTube stores transcripts in a specific format that's not immediately obvious. Here's what I learned:

// Basic transcript fetching
import { YoutubeTranscript } from 'youtube-transcript';

async function getTranscript(videoId) {
  try {
    const transcript = await YoutubeTranscript.fetchTranscript(videoId);
    return transcript;
  } catch (error) {
    console.error('Transcript fetch failed:', error);
    throw new Error('No transcript available');
  }
}

The tricky part? YouTube has multiple transcript formats:

Auto-generated captions
Manual captions
Different languages
Various quality levels
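
Choosing between these tracks can be sketched as a small pure function. This is an illustration rather than the code running on the site: the `CaptionTrack` shape and the `pickCaptionTrack` name are assumptions, and a real library will expose its own track type.

```typescript
// Hypothetical shape for a caption track; real libraries expose their own types.
interface CaptionTrack {
  languageCode: string;     // e.g. "en", "es", "pt-BR"
  isAutoGenerated: boolean; // true for speech-recognition captions
}

// For each preferred language, in order, a manual track wins over an
// auto-generated one. If no preferred language is present, fall back to
// any manual track, then to the first track in the list.
function pickCaptionTrack(
  tracks: CaptionTrack[],
  preferredLanguages: string[] = ['en'],
): CaptionTrack | undefined {
  // Compare only the primary subtag so "en-GB" matches a preference of "en".
  const primary = (code: string) => code.toLowerCase().split('-')[0];

  for (const lang of preferredLanguages) {
    const matches = tracks.filter(t => primary(t.languageCode) === primary(lang));
    const manual = matches.find(t => !t.isAutoGenerated);
    if (manual) return manual;
    if (matches.length > 0) return matches[0];
  }
  return tracks.find(t => !t.isAutoGenerated) ?? tracks[0];
}
```

With an auto-generated `en` track and a manual `en-GB` track available, a preference of `['en']` selects the manual `en-GB` one.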

Step 2: Building the API Endpoint

Here's my main API route that handles the heavy lifting:

// app/api/fetch-transcript/route.ts
import { NextRequest, NextResponse } from 'next/server';
import { YoutubeTranscript } from 'youtube-transcript';
import getVideoId from 'get-video-id';

export async function GET(request: NextRequest) {
  const { searchParams } = new URL(request.url);
  const url = searchParams.get('url');
  const includeTimestamps = searchParams.get('timestamps') === 'true';

  if (!url) {
    return NextResponse.json({ error: 'URL is required' }, { status: 400 });
  }

  try {
    // Extract video ID from various YouTube URL formats
    const videoData = getVideoId(url);
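    if (!videoData.id) {
      // A malformed URL is a client error, so answer 400 instead of falling through to the 500 handler
      return NextResponse.json({ error: 'Invalid YouTube URL' }, { status: 400 });
    }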

    // Fetch transcript with error handling
    const transcriptData = await YoutubeTranscript.fetchTranscript(videoData.id, {
      lang: 'en', // Default to English, but we can handle multiple languages
    });

    if (includeTimestamps) {
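      // Format with timestamps preserved. This assumes `offset` is in milliseconds;
      // check the installed youtube-transcript version, since the unit may differ.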
      const formattedTranscript = transcriptData.map(item => ({
        timestamp: formatTime(item.offset),
        text: item.text,
        startTimeMs: item.offset
      }));

      return NextResponse.json({
        transcript: formattedTranscript.map(item => 
          `${item.timestamp} ${item.text}`
        ).join('\n'),
        segments: formattedTranscript
      });
    }

    // Plain text version
    const plainText = transcriptData.map(item => item.text).join(' ');

    return NextResponse.json({ transcript: plainText });

  } catch (error) {
    console.error('Transcript extraction failed:', error);
    return NextResponse.json(
      { error: 'Failed to extract transcript. Video may not have captions.' },
      { status: 500 }
    );
  }
}

function formatTime(milliseconds: number): string {
  const totalSeconds = Math.floor(milliseconds / 1000);
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  return `${minutes}:${seconds.toString().padStart(2, '0')}`;
}
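
For videos longer than an hour, a minutes-only timestamp such as `75:03` gets hard to read. One possible variant, sketched here under the assumption that offsets arrive in milliseconds (the `formatTimestamp` name is made up for this example), switches to `h:mm:ss` once the hour mark is passed:

```typescript
// Format a millisecond offset as m:ss, or h:mm:ss once the offset passes one hour.
function formatTimestamp(milliseconds: number): string {
  // Clamp negatives to zero and drop the fractional second.
  const totalSeconds = Math.max(0, Math.floor(milliseconds / 1000));
  const hours = Math.floor(totalSeconds / 3600);
  const minutes = Math.floor((totalSeconds % 3600) / 60);
  const seconds = totalSeconds % 60;
  const ss = seconds.toString().padStart(2, '0');

  if (hours === 0) return `${minutes}:${ss}`;
  return `${hours}:${minutes.toString().padStart(2, '0')}:${ss}`;
}
```

For example, `formatTimestamp(65000)` returns `1:05` and `formatTimestamp(3725000)` returns `1:02:05`.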

Step 3: Frontend Implementation

The frontend needed to be dead simple. Here's the core component:

// components/TranscriptExtractor.tsx
'use client';

import { useState, type FormEvent } from 'react';
import { toast } from 'react-hot-toast';

export default function TranscriptExtractor() {
  const [url, setUrl] = useState('');
  const [transcript, setTranscript] = useState('');
  const [loading, setLoading] = useState(false);

  const handleSubmit = async (e: FormEvent<HTMLFormElement>) => {
    e.preventDefault();
    if (!url) return;

    setLoading(true);
    try {
      const response = await fetch(
        `/api/fetch-transcript?url=${encodeURIComponent(url)}&timestamps=true`
      );

      if (!response.ok) {
        const error = await response.json();
        throw new Error(error.error);
      }

      const data = await response.json();
      setTranscript(data.transcript);
      toast.success('Transcript extracted successfully!');

    } catch (error) {
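      // `error` is typed as unknown in strict TypeScript, so narrow it before reading the message
      toast.error(error instanceof Error ? error.message : 'Something went wrong');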
    } finally {
      setLoading(false);
    }
  };

  return (
    <div className="max-w-4xl mx-auto p-6">
      <form onSubmit={handleSubmit} className="mb-8">
        <div className="flex gap-4">
          <input
            type="url"
            value={url}
            onChange={(e) => setUrl(e.target.value)}
            placeholder="Paste YouTube URL here..."
            className="flex-1 px-4 py-2 border rounded-lg"
            required
          />
          <button
            type="submit"
            disabled={loading}
            className="px-6 py-2 bg-blue-600 text-white rounded-lg disabled:opacity-50"
          >
            {loading ? 'Extracting...' : 'Get Transcript'}
          </button>
        </div>
      </form>

      {transcript && (
        <div className="bg-gray-50 p-6 rounded-lg">
          <pre className="whitespace-pre-wrap text-sm">
            {transcript}
          </pre>
        </div>
      )}
    </div>
  );
}

Step 4: Adding Advanced Features

Interactive Timestamps

One feature that sets my transcript tool apart is clickable timestamps that jump to video positions:

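// Enhanced transcript display with video integration
import { useState } from 'react';
import YouTube from 'react-youtube';
// Assumption: PlayIcon comes from an icon package such as @heroicons/react
import { PlayIcon } from '@heroicons/react/24/solid';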

function InteractiveTranscript({ segments, videoId }) {
  const [player, setPlayer] = useState(null);

  const seekToTime = (timeInSeconds) => {
    if (player) {
      player.seekTo(timeInSeconds, true);
      player.playVideo();
    }
  };

  return (
    <div className="grid grid-cols-1 lg:grid-cols-2 gap-6">
      {/* YouTube Player */}
      <div>
        <YouTube
          videoId={videoId}
          onReady={(event) => setPlayer(event.target)}
          opts={{
            height: '400',
            width: '100%',
            playerVars: { autoplay: 0, controls: 1 }
          }}
        />
      </div>

      {/* Interactive Transcript */}
      <div className="max-h-96 overflow-y-auto">
        {segments.map((segment, index) => (
          <div key={index} className="mb-4 group">
            <button
              onClick={() => seekToTime(Math.floor(segment.startTimeMs / 1000))}
              className="text-blue-600 hover:text-blue-800 font-mono text-sm mb-1 flex items-center gap-2"
            >
              <PlayIcon className="w-3 h-3 opacity-0 group-hover:opacity-100" />
              {segment.timestamp}
            </button>
            <p className="text-gray-800 text-sm ml-5">
              {segment.text}
            </p>
          </div>
        ))}
      </div>
    </div>
  );
}

Multiple Export Formats

Users wanted different formats, so I added SRT subtitle generation:

function generateSRT(segments) {
  return segments.map((segment, index) => {
    const startTime = formatTimeForSRT(segment.startTimeMs);
    const endTime = formatTimeForSRT(segment.startTimeMs + 3000); // 3 second duration

    return `${index + 1}
${startTime} --> ${endTime}
${segment.text}
`;
  }).join('\n');
}

function formatTimeForSRT(milliseconds) {
  const totalSeconds = Math.floor(milliseconds / 1000);
  const hours = Math.floor(totalSeconds / 3600);
  const minutes = Math.floor((totalSeconds % 3600) / 60);
  const seconds = totalSeconds % 60;
  const ms = milliseconds % 1000;

  return `${hours.toString().padStart(2, '0')}:${minutes.toString().padStart(2, '0')}:${seconds.toString().padStart(2, '0')},${ms.toString().padStart(3, '0')}`;
}
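
A fixed 3-second duration can make consecutive cues overlap when segments start less than three seconds apart. A possible refinement, shown as a sketch only, ends each cue at the next segment's start, capped at a maximum duration. The `Segment` type, `toSrtTime`, and `generateSrtWithoutOverlap` are names invented for this example, and the segments are assumed to be sorted by start time.

```typescript
interface Segment {
  startTimeMs: number;
  text: string;
}

// Format milliseconds as the SRT timestamp HH:MM:SS,mmm.
function toSrtTime(ms: number): string {
  const whole = Math.max(0, Math.round(ms));
  const pad = (n: number, width: number) => n.toString().padStart(width, '0');
  const hours = Math.floor(whole / 3600000);
  const minutes = Math.floor((whole % 3600000) / 60000);
  const seconds = Math.floor((whole % 60000) / 1000);
  return `${pad(hours, 2)}:${pad(minutes, 2)}:${pad(seconds, 2)},${pad(whole % 1000, 3)}`;
}

// End each cue at the next cue's start, capped at maxDurationMs,
// so consecutive cues never overlap. Assumes segments are sorted by start time.
function generateSrtWithoutOverlap(segments: Segment[], maxDurationMs = 3000): string {
  return segments
    .map((segment, index) => {
      const next = segments[index + 1];
      const cappedEnd = segment.startTimeMs + maxDurationMs;
      const end = next ? Math.min(next.startTimeMs, cappedEnd) : cappedEnd;
      return `${index + 1}\n${toSrtTime(segment.startTimeMs)} --> ${toSrtTime(end)}\n${segment.text}\n`;
    })
    .join('\n');
}
```

If the transcript items carry a per-segment duration, passing that through instead of a cap would be more faithful to the original captions.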

Step 5: Handling Edge Cases

Real-world usage taught me about edge cases:

URL Validation

function isValidYouTubeUrl(url) {
  const patterns = [
    /^https?:\/\/(www\.)?youtube\.com\/watch\?v=[\w-]+/,
    /^https?:\/\/youtu\.be\/[\w-]+/,
    /^https?:\/\/(www\.)?youtube\.com\/embed\/[\w-]+/
  ];

  return patterns.some(pattern => pattern.test(url));
}
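
The patterns above miss a few common shapes, such as `watch?feature=share&v=...`, `m.youtube.com`, and `/shorts/` links. An alternative sketch parses the URL instead of pattern-matching the whole string. The host list, the `extractVideoId` name, and the assumption that video IDs are 11 characters long are illustrative choices, not guarantees from YouTube.

```typescript
// Assumed set of hosts to accept; extend as needed.
const YOUTUBE_HOSTS = new Set([
  'youtube.com',
  'www.youtube.com',
  'm.youtube.com',
  'music.youtube.com',
]);

// Assumption: video IDs are 11 characters of letters, digits, "_" or "-".
const VIDEO_ID = /^[\w-]{11}$/;

function extractVideoId(input: string): string | null {
  let url: URL;
  try {
    url = new URL(input.trim());
  } catch {
    return null; // not a parseable absolute URL
  }
  if (url.protocol !== 'http:' && url.protocol !== 'https:') return null;

  const host = url.hostname.toLowerCase();
  let candidate: string | null = null;

  if (host === 'youtu.be') {
    // Short links carry the ID as the first path segment.
    candidate = url.pathname.split('/')[1] ?? null;
  } else if (YOUTUBE_HOSTS.has(host)) {
    const [, first, second] = url.pathname.split('/');
    if (first === 'watch') {
      // searchParams finds "v" wherever it sits in the query string.
      candidate = url.searchParams.get('v');
    } else if (first === 'embed' || first === 'shorts' || first === 'live') {
      candidate = second ?? null;
    }
  }
  return candidate !== null && VIDEO_ID.test(candidate) ? candidate : null;
}
```

Returning the ID rather than a boolean lets the same function serve as both validator and extractor.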

Rate Limiting

// Simple in-memory rate limiting
const rateLimiter = new Map();

function checkRateLimit(ip) {
  const now = Date.now();
  const windowMs = 60 * 1000; // 1 minute
  const maxRequests = 10;

  if (!rateLimiter.has(ip)) {
    rateLimiter.set(ip, { count: 1, resetTime: now + windowMs });
    return true;
  }

  const limit = rateLimiter.get(ip);
  if (now > limit.resetTime) {
    limit.count = 1;
    limit.resetTime = now + windowMs;
    return true;
  }

  if (limit.count >= maxRequests) {
    return false;
  }

  limit.count++;
  return true;
}
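
One thing to watch with a plain `Map` is that entries for IPs that never return stay in memory forever. Below is a sketch of the same fixed-window idea with a `sweep` step that drops expired windows. The `FixedWindowLimiter` class and its method names are invented for this example, and the clock is injectable so the behaviour can be checked without waiting. Like any in-memory limiter, it only counts requests seen by the current server process.

```typescript
interface WindowEntry {
  count: number;
  resetTime: number; // epoch ms when the current window ends
}

class FixedWindowLimiter {
  private entries = new Map<string, WindowEntry>();

  constructor(
    private maxRequests = 10,
    private windowMs = 60000,
  ) {}

  // Returns true if the request is allowed. `now` defaults to the real clock.
  allow(key: string, now: number = Date.now()): boolean {
    const entry = this.entries.get(key);
    if (!entry || now > entry.resetTime) {
      // First request, or the previous window has ended: start a new window.
      this.entries.set(key, { count: 1, resetTime: now + this.windowMs });
      return true;
    }
    if (entry.count >= this.maxRequests) return false;
    entry.count++;
    return true;
  }

  // Drop expired windows so the map does not grow without bound.
  // Returns how many entries were removed.
  sweep(now: number = Date.now()): number {
    let removed = 0;
    this.entries.forEach((entry, key) => {
      if (now > entry.resetTime) {
        this.entries.delete(key);
        removed++;
      }
    });
    return removed;
  }

  get size(): number {
    return this.entries.size;
  }
}
```

Calling `sweep()` on a timer, or every N requests, keeps memory use proportional to the number of recently active clients.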

Step 6: Performance Optimizations

Caching Strategy

// Simple in-memory cache for transcripts (per server instance, not shared the way Redis would be)
const cache = new Map();
const CACHE_TTL = 24 * 60 * 60 * 1000; // 24 hours

function getCachedTranscript(videoId) {
  const cached = cache.get(videoId);
  if (cached && Date.now() - cached.timestamp < CACHE_TTL) {
    return cached.data;
  }
  return null;
}

function setCachedTranscript(videoId, data) {
  cache.set(videoId, {
    data,
    timestamp: Date.now()
  });
}
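
An unbounded `Map` keeps every transcript it has ever seen, including expired ones. A bounded variant can be sketched as follows. The `TranscriptCache` class is a made-up name for this example, `maxEntries` is assumed to be at least 1, and eviction relies on `Map` iterating keys in insertion order.

```typescript
class TranscriptCache<T> {
  private entries = new Map<string, { data: T; storedAt: number }>();

  constructor(private ttlMs: number, private maxEntries: number) {}

  // Returns the cached value, or null if it is missing or older than the TTL.
  get(videoId: string, now: number = Date.now()): T | null {
    const entry = this.entries.get(videoId);
    if (!entry) return null;
    if (now - entry.storedAt >= this.ttlMs) {
      this.entries.delete(videoId); // expired: free the slot
      return null;
    }
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(videoId);
    this.entries.set(videoId, entry);
    return entry.data;
  }

  set(videoId: string, data: T, now: number = Date.now()): void {
    this.entries.delete(videoId);
    if (this.entries.size >= this.maxEntries) {
      // Map iterates in insertion order, so the first key is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(videoId, { data, storedAt: now });
  }

  get size(): number {
    return this.entries.size;
  }
}
```

Something like `new TranscriptCache<string>(24 * 60 * 60 * 1000, 500)` would keep at most 500 transcripts for up to 24 hours each; the limit of 500 is an arbitrary example value.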

Lazy Loading Components

// Dynamic imports for better performance
import dynamic from 'next/dynamic';

const TranscriptResult = dynamic(() => import('./TranscriptResult'), {
  ssr: false,
  loading: () => <div>Loading transcript...</div>
});

Step 7: SEO and Discoverability

Since I wanted the YouTube transcript tool to rank well, I focused on SEO:

// SEO-optimized metadata
export const metadata = {
  title: 'Free YouTube Transcript Extractor - Download Video Transcripts',
  description: 'Extract YouTube video transcripts for free. Download as TXT, SRT with timestamps. No signup required.',
  keywords: 'youtube transcript, video transcript, subtitle extractor, free transcript tool',
  openGraph: {
    title: 'YouTube Transcript Extractor',
    description: 'Free tool to extract and download YouTube video transcripts',
    url: 'https://youtubenavigator.com/youtube-transcript',
  }
};

Lessons Learned

Start Simple: My first version was just a form and text area. Added features based on user feedback.
Error Handling is Critical: YouTube's API can be unpredictable. Robust error handling saved me countless support requests.
Performance Matters: Caching reduced API calls by 80% and improved response times significantly.
User Experience Wins: The interactive timestamps feature got more positive feedback than anything else.
SEO Takes Time: It took 3 months to start ranking for "YouTube transcript" keywords.

Current Stats

After 6 months, the transcript extraction service now handles:

10,000+ monthly active users
500,000+ transcripts extracted
99.2% uptime
Average response time: 1.2 seconds

What's Next?

I'm working on:

Multi-language transcript support
Batch processing for multiple videos
API access for developers
Integration with popular note-taking apps

Try It Yourself

Want to see it in action? Check out the YouTube Transcript Extractor. It's completely free, with no signup required.

The code and concepts I've shared here should give you a solid foundation for building your own transcript service. The key is starting simple and iterating based on real user needs.

Building this YouTube transcript service taught me that sometimes the best products come from solving your own problems. What developer tool will you build next?

Have questions about implementing any of these features? Drop them in the comments! I'm always happy to help fellow developers build cool stuff.

Tags: #webdev #javascript #nextjs #youtube #api #typescript #react

How I Built a YouTube to MP3/MP4 Converter Web Service

Nic Bars — Thu, 17 Oct 2024 19:30:34 +0000

As an avid tech enthusiast and a believer in creating user-friendly tools, I set out to create something that many people often search for—a simple and efficient YouTube video-to-audio/video converter. After hours of brainstorming, coding, and some dead ends, I successfully built a web service that lets you convert YouTube videos into MP3 or MP4 formats, making it easier than ever to download your favorite content. Sounds simple enough, right? Well, it wasn't all smooth sailing. Here's a deep dive into the architecture, challenges, and technology choices behind the project.

The Problem I Wanted to Solve

We all know YouTube is an amazing platform with loads of great content, but sometimes you just need to download a video for offline use or convert it into an audio file (maybe for a podcast or offline listening). However, not all solutions available online provide a seamless experience. Many of them are riddled with annoying ads or worse—malware. So, I wanted to create something different, something clean, efficient, and trustworthy, without all the hassle.

And that’s how the idea for ytb2mp4.com was born. A straightforward, no-nonsense web service to convert YouTube videos into MP3/MP4 formats.

Technologies Used

When deciding on the tech stack, I needed something that was not only scalable but also robust enough to handle large amounts of requests efficiently. Here's what I used to build the service:

Frontend: Built using Next.js and React, with a sprinkle of TailwindCSS for styling. Next.js gave me the flexibility to handle both static and server-rendered content, which was important for SEO (search engine optimization) and user experience.

Backend: Node.js with Express was my go-to choice for the backend. Node.js is lightweight, easy to use, and has a massive ecosystem of packages. It allowed me to keep the backend simple yet powerful, handling API requests and managing conversions efficiently.

YouTube Data Extraction: This is where things got tricky. Initially, I planned to use libraries like yt-dlp or pytube for extracting and converting YouTube data. Unfortunately, YouTube’s restrictions block many open-source libraries from making reliable requests. After several attempts with these libraries, I switched to youtubei.js, which, when paired with cookie-based authentication, allows me to bypass these restrictions effectively. More on this later.

Deployment: I used Vercel for hosting the frontend and Fly.io for the Node.js API server. Both services offer easy deployment pipelines, and their free tiers are perfect for projects like this.

High-Level Architecture

Keeping it simple but scalable. I wanted the architecture to be as simple as possible, without compromising scalability or performance. Here’s a high-level breakdown:

Frontend
The frontend is where the user interacts with the service. I chose Next.js because of its flexibility to serve both static and server-side content. It also supports API routes, which allow you to build out your entire web app with a single framework. For styling, I opted for TailwindCSS—its utility-first approach made it super easy to design a clean, responsive interface quickly.

Backend
The Node.js/Express backend handles the core logic of the application. Users submit a YouTube URL, and the backend takes care of fetching the relevant data, converting it, and offering a download link. Here's where the backend gets interesting.

I initially tried using the yt-dlp and pytube libraries to directly interact with YouTube’s data. However, YouTube has gotten pretty smart at blocking requests coming from such libraries. After trying different approaches and receiving multiple “403 Forbidden” errors, I had to pivot.

API Integration with youtubei.js and Node.js for YouTube Data Extraction

Switching to a custom solution using Node.js and youtubei.js has been a significant improvement. While yt-dlp and pytube posed challenges due to frequent changes in YouTube’s API, youtubei.js provides a robust alternative for data extraction by using cookie authentication to maintain stable access to YouTube’s data.

Here's the updated flow:

User Submission: When a user submits a YouTube URL, the backend extracts the video ID.
Data Fetching with youtubei.js: The backend utilizes youtubei.js to retrieve all necessary video details—such as title, duration, and available formats—through authenticated requests. This approach bypasses common restrictions and delivers reliable results.
User Selection: Based on the user’s preference (MP3 or MP4), the Node.js backend uses youtubei.js to download the media and prepare the corresponding file.
Download Link: Once the file is ready, a download link is generated and sent back to the Next.js client.

This shift to youtubei.js has allowed for more consistent data extraction without the need for external API providers.
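
The "User Selection" step in the flow above can be illustrated with a small selection function. This is a sketch under stated assumptions: `MediaFormat` is a hypothetical, simplified shape and not the format type that youtubei.js returns, and `chooseFormat` is a made-up name. It only considers streams that already carry both audio and video for MP4, and it does not cover transcoding the chosen audio stream to MP3.

```typescript
// Hypothetical, simplified description of one downloadable stream.
interface MediaFormat {
  mimeType: string;  // e.g. "audio/webm", "video/mp4"
  hasAudio: boolean;
  hasVideo: boolean;
  bitrate: number;   // bits per second
  height?: number;   // pixels, video streams only
}

type Target = 'mp3' | 'mp4';

function chooseFormat(formats: MediaFormat[], target: Target): MediaFormat | undefined {
  if (target === 'mp3') {
    // Audio-only streams avoid downloading video data that would be discarded.
    return formats
      .filter(f => f.hasAudio && !f.hasVideo)
      .sort((a, b) => b.bitrate - a.bitrate)[0];
  }
  // For MP4, prefer the tallest stream that carries both audio and video,
  // breaking ties by bitrate.
  return formats
    .filter(f => f.hasAudio && f.hasVideo)
    .sort((a, b) => (b.height ?? 0) - (a.height ?? 0) || b.bitrate - a.bitrate)[0];
}
```

The function returns `undefined` when no stream matches, which the caller can turn into a clear error for the user.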

Other Challenges and How I Solved Them

File Naming and Audio-Only Files
One common complaint with such services is the weird file names generated after downloading. I tackled this by using the video’s title for the file name—making it easier for users to organize their downloads. Another issue was ensuring that the MP3 files didn't contain any video streams. I took care to handle this with correct post-processing during the conversion.
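
Turning a video title into a file name needs some cleanup, since titles can contain characters that file systems reject. A minimal sketch follows; `toSafeFileName` and the 100-character default limit are assumptions made for this example.

```typescript
function toSafeFileName(title: string, extension: 'mp3' | 'mp4', maxLength = 100): string {
  const base = title
    .replace(/[\\/:*?"<>|]/g, ' ')   // characters Windows forbids in file names
    .replace(/\s+/g, ' ')            // collapse runs of whitespace, including newlines
    .replace(/[\u0000-\u001f]/g, '') // strip any remaining control characters
    .trim()
    .slice(0, maxLength)
    .replace(/[. ]+$/, '');          // Windows rejects trailing dots and spaces

  // Fall back to a generic name if nothing usable is left.
  return `${base || 'download'}.${extension}`;
}
```

For example, `toSafeFileName('AC/DC: Back In Black?', 'mp3')` returns `AC DC Back In Black.mp3`.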

YouTube Blocking Requests
YouTube has implemented various measures to restrict access to its data, which blocks many popular open-source libraries like yt-dlp and pytube from making consistent requests. After a lot of trial and error, I discovered that the key to reliable YouTube data access lay in authenticating requests with cookies.

Initially, I relied on RapidAPI to work around these limitations, but transitioning to a direct, cookie-authenticated approach with youtubei.js allowed me to avoid third-party services and gain more control over the request process.

How Cookie Authentication Works with YouTube

To bypass YouTube’s blocks, you can use session cookies from an active YouTube account. By including these cookies in each request, we can mimic a legitimate browser session, which provides reliable access to video data while staying compliant with YouTube’s access rules.

Including cookies in this way allows youtubei.js to make requests that look like they’re coming from a logged-in user, bypassing many of YouTube’s automated blocks on anonymous or bot-like traffic.

Cross-Origin Issues
When interacting with different APIs and services, CORS (Cross-Origin Resource Sharing) issues are quite common. I made sure to configure proper CORS policies in both the frontend and backend to avoid errors when fetching and downloading data from different origins.
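
On the API server, a CORS policy boils down to deciding which response headers to send for a given request origin. The allow-list approach can be sketched as a pure function. `corsHeadersFor` is a name invented for this example, and the particular methods and headers listed are assumptions to adjust for a real API.

```typescript
// Build CORS response headers for a request, given an allow-list of origins.
function corsHeadersFor(
  origin: string | undefined,
  allowedOrigins: string[],
): Record<string, string> {
  // Always vary on Origin so caches do not reuse a response across origins.
  const headers: Record<string, string> = { Vary: 'Origin' };

  if (origin && allowedOrigins.includes(origin)) {
    // Echo the specific origin instead of "*" so only known frontends are allowed.
    headers['Access-Control-Allow-Origin'] = origin;
    headers['Access-Control-Allow-Methods'] = 'GET, POST, OPTIONS';
    headers['Access-Control-Allow-Headers'] = 'Content-Type';
    // Lets browser code read the download's file name from the response.
    headers['Access-Control-Expose-Headers'] = 'Content-Disposition';
  }
  return headers;
}
```

Requests from origins outside the list get no `Access-Control-Allow-Origin` header, so the browser blocks the response from being read by that page.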

Why Use ytb2mp4.com?
There are plenty of reasons why my service stands out among others:

Free: 100% free to use, with no hidden costs or annoying ads.
High Quality: Offers the best available MP3 and MP4 quality, including 4K video downloads.
No Annoying Ads: Unlike many other services, I’ve made it a point to keep the platform ad-free. All I ask is for a small donation if you like the service!
Easy to Use: The interface is clean and straightforward. Even if you’re not tech-savvy, you’ll find the service easy to use.

If you want a hassle-free experience converting your favorite YouTube videos into MP3 or MP4 formats, check out ytb2mp4.com today. I’m continuously working on improving the service, and I’d love to hear your feedback!

Final Thoughts
Creating ytb2mp4.com was both a challenge and a joy. From figuring out how to bypass YouTube's restrictions to fine-tuning the user experience, this project pushed me to my limits, but it was worth every minute. I hope you find it as useful as I do, and I’m excited to keep improving it as more users come on board!

Feel free to try out the service and let me know what you think!