DEV Community

Olamide Olaniyan

Build a YouTube Video Summarizer in 15 Minutes (Next.js + OpenAI)

We've all been there. You see a 2-hour podcast titled "The Future of AI is Here". You want the insights, but you don't have 2 hours.

Sure, there are tools that do this. But why pay $20/month when you can build your own in 15 minutes?

In this tutorial, we'll build a YouTube Video Summarizer that:

  1. Takes a YouTube URL.
  2. Extracts the full transcript (even if captions are auto-generated).
  3. Uses GPT-4o-mini to summarize it into bullet points.

The Stack

  • Frontend: Next.js 14 (App Router)
  • AI: OpenAI API (GPT-4o-mini)
  • Data: SociaVault API (YouTube Transcript Extraction)

Why not just use youtube-dl?

You could use youtube-dl or ytdl-core, but YouTube frequently changes its page internals and rate-limits server-side requests. If you deploy a ytdl-core app to Vercel, its requests will likely be blocked almost immediately, because they come from IP addresses that YouTube has already flagged.

We'll use SociaVault because it handles proxies and IP rotation for us.

Step 1: Setup

Create a new Next.js app:

npx create-next-app@latest yt-summarizer
cd yt-summarizer
npm install openai

Get your API keys:

  1. OpenAI Key: platform.openai.com
  2. SociaVault Key: sociavault.com (Free tier works fine)

Add them to .env.local:

OPENAI_API_KEY=sk-...
SOCIAVAULT_API_KEY=sv_...
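
A missing or misspelled key in .env.local usually only shows up later as a confusing 401 or 500. As a minimal sketch (the `requireEnv` helper is hypothetical and not part of the original tutorial), you could fail fast with a clear message instead:

```typescript
// Hypothetical helper: returns the variable's value or throws a descriptive error.
// The env object is passed in explicitly so the function is easy to test.
function requireEnv(
  name: string,
  env: Record<string, string | undefined>
): string {
  const value = env[name]?.trim();
  if (!value) {
    throw new Error(`Missing environment variable: ${name}`);
  }
  return value;
}

// Example usage inside a server-side file (never in client components):
// const sociaVaultKey = requireEnv("SOCIAVAULT_API_KEY", process.env);
```

Passing `process.env` in as an argument is a design choice for testability; reading it directly inside the helper would work just as well.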

Step 2: The Backend Route

We need a server-side route so the API keys never reach the browser. Create app/api/summarize/route.ts.

This route does two things:

  1. Fetches the transcript.
  2. Sends it to OpenAI.

Here is the full route:
import { NextResponse } from "next/server";
import OpenAI from "openai";

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

export async function POST(req: Request) {
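  // Fall back to an empty object if the body is not valid JSON
  const { videoUrl } = await req.json().catch(() => ({}));

  // 1. Extract the video ID (only handles watch URLs containing "v=")
  const videoId =
    typeof videoUrl === "string" ? videoUrl.split("v=")[1]?.split("&")[0] : undefined;
  if (!videoId) return NextResponse.json({ error: "Invalid URL" }, { status: 400 });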

  try {
    // 2. Get Transcript from SociaVault
    const transcriptRes = await fetch(
      `https://api.sociavault.com/api/v1/youtube/video/${videoId}/transcript`,
      {
        headers: { "x-api-key": process.env.SOCIAVAULT_API_KEY! },
      }
    );

    if (!transcriptRes.ok) throw new Error("Failed to fetch transcript");

    const data = await transcriptRes.json();

    // Combine segments into one text block
    const fullText = data.transcript
      .map((item: any) => item.text)
      .join(" ")
      .slice(0, 15000); // Keep only the first 15,000 characters (roughly 4k tokens) to cap cost; longer transcripts are truncated

    // 3. Summarize with OpenAI
    const completion = await openai.chat.completions.create({
      model: "gpt-4o-mini",
      messages: [
        {
          role: "system",
          content: "You are a helpful assistant. Summarize the following YouTube video transcript into 5 key bullet points. Be concise.",
        },
        { role: "user", content: fullText },
      ],
    });

    return NextResponse.json({ 
      summary: completion.choices[0].message.content 
    });

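  } catch (error) {
    // Log the real cause on the server; keep the client-facing message generic
    console.error("Summarize failed:", error);
    return NextResponse.json({ error: "Something went wrong" }, { status: 500 });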
  }
}
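
The route above only recognizes URLs that contain `v=`, so short links such as youtu.be/... are rejected as invalid. Here is a sketch of a more tolerant parser built on the standard `URL` class. The `extractVideoId` name is hypothetical, and the 11-character ID check reflects how YouTube IDs look today rather than a documented guarantee:

```typescript
// Hypothetical helper, not part of the original route.
// Returns the video ID for common YouTube URL shapes, or null if none matches.
function extractVideoId(input: string): string | null {
  let url: URL;
  try {
    url = new URL(input.trim());
  } catch {
    return null; // not a parseable absolute URL
  }

  // Treat www.youtube.com and m.youtube.com like youtube.com
  const host = url.hostname.replace(/^(www\.|m\.)/, "");
  let id: string | null = null;

  if (host === "youtu.be") {
    // Short links: https://youtu.be/VIDEO_ID
    id = url.pathname.split("/")[1] ?? null;
  } else if (host === "youtube.com") {
    if (url.pathname === "/watch") {
      // Standard links: https://www.youtube.com/watch?v=VIDEO_ID
      id = url.searchParams.get("v");
    } else {
      // Shorts, embeds, and live links: /shorts/VIDEO_ID, /embed/VIDEO_ID, /live/VIDEO_ID
      const match = url.pathname.match(/^\/(shorts|embed|live)\/([^/]+)/);
      id = match?.[2] ?? null;
    }
  }

  // Assumption: IDs are 11 characters drawn from [A-Za-z0-9_-]
  return id && /^[A-Za-z0-9_-]{11}$/.test(id) ? id : null;
}
```

In the route, `const videoId = extractVideoId(videoUrl)` could stand in for the `split` call, with the existing `if (!videoId)` check left unchanged.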

Step 3: The Frontend

Now add a simple UI to take the input. Replace the contents of app/page.tsx with:

"use client";
import { useState } from "react";

export default function Home() {
  const [url, setUrl] = useState("");
  const [summary, setSummary] = useState("");
  const [loading, setLoading] = useState(false);

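  const handleSummarize = async () => {
    setLoading(true);
    try {
      const res = await fetch("/api/summarize", {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ videoUrl: url }),
      });
      const data = await res.json();
      // Show the API's error message instead of an empty box when the request fails
      setSummary(res.ok ? data.summary : `Error: ${data.error}`);
    } catch {
      setSummary("Error: could not reach the server.");
    } finally {
      // Always re-enable the button, even after a failure
      setLoading(false);
    }
  };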

  return (
    <main className="min-h-screen flex flex-col items-center justify-center p-24 bg-gray-50">
      <div className="max-w-2xl w-full space-y-8">
        <h1 className="text-4xl font-bold text-center text-gray-900">
          📺 YouTube Summarizer
        </h1>

        <div className="flex gap-4">
          <input
            type="text"
            placeholder="Paste YouTube URL..."
            className="flex-1 p-4 rounded-lg border border-gray-300"
            value={url}
            onChange={(e) => setUrl(e.target.value)}
          />
          <button
            onClick={handleSummarize}
            disabled={loading}
            className="bg-blue-600 text-white px-8 py-4 rounded-lg hover:bg-blue-700 disabled:opacity-50"
          >
            {loading ? "Thinking..." : "Summarize"}
          </button>
        </div>

        {summary && (
          <div className="bg-white p-8 rounded-xl shadow-sm border prose">
            <h3 className="text-xl font-semibold mb-4">Summary</h3>
            <div className="whitespace-pre-wrap">{summary}</div>
          </div>
        )}
      </div>
    </main>
  );
}
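
The page renders the summary as one pre-wrapped text block. If you would rather render a real list, here is a small sketch of splitting the model's output into items. The `parseBullets` helper is hypothetical, and it assumes the model returns one bullet per line, which the prompt encourages but does not guarantee:

```typescript
// Hypothetical helper: turns "- One\n- Two" style output into ["One", "Two"].
function parseBullets(summary: string): string[] {
  return summary
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    // Strip a leading "-", "*", "•", or "1." / "1)" style marker
    .map((line) => line.replace(/^(?:[-*•]|\d+[.)])\s+/, ""));
}
```

The resulting array can be mapped to `<li>` elements inside a `<ul>` in place of the `whitespace-pre-wrap` div.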

Testing It Out

Run npm run dev, open http://localhost:3000, and paste a YouTube URL.

I tested it on a 45-minute Lex Fridman podcast.
Result: It extracted the transcript in ~2 seconds and generated a summary in ~5 seconds.
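
One caveat: the route keeps only the first 15,000 characters of the transcript, and a 45-minute conversation is likely several times longer than that, so the summary mostly reflects the opening of the episode. A common workaround is to summarize the transcript in chunks and then summarize the partial summaries. The sketch below is one way to do that; `chunkText` and `summarizeLong` are hypothetical names, and `summarize` stands for whatever function wraps your OpenAI call:

```typescript
// Splits text into chunks of at most maxChars, breaking on whitespace.
// A single word longer than maxChars becomes its own oversized chunk.
function chunkText(text: string, maxChars: number): string[] {
  if (maxChars <= 0) throw new Error("maxChars must be positive");
  const chunks: string[] = [];
  let current = "";
  for (const word of text.split(/\s+/).filter(Boolean)) {
    // Start a new chunk when adding this word would exceed the limit
    if (current && current.length + 1 + word.length > maxChars) {
      chunks.push(current);
      current = word;
    } else {
      current = current ? `${current} ${word}` : word;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}

// Summarizes each chunk, then summarizes the combined partial summaries.
// `summarize` is supplied by the caller (for example, a wrapper around the chat completion).
async function summarizeLong(
  text: string,
  summarize: (input: string) => Promise<string>,
  maxChars = 15000
): Promise<string> {
  const chunks = chunkText(text, maxChars);
  if (chunks.length <= 1) return summarize(text);
  const partials = await Promise.all(chunks.map((chunk) => summarize(chunk)));
  return summarize(partials.join("\n"));
}
```

This trades cost and latency for coverage: a transcript split into N chunks makes N + 1 model calls instead of one.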

Total cost per run?

  • SociaVault: Free tier covers 50 requests.
  • OpenAI: ~$0.01 per summary.
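
You can sanity-check the OpenAI figure with simple arithmetic. The sketch below assumes roughly 4 characters per token for English text, which is a rule of thumb rather than a real tokenizer, and takes the per-million-token prices as parameters so you can plug in the current rates from OpenAI's pricing page. The `estimateCostUSD` name is hypothetical:

```typescript
// Rough cost estimate for one summary.
// Assumption: ~4 characters per token (an approximation, not a tokenizer).
function estimateCostUSD(
  inputChars: number,
  outputTokens: number,
  inputUsdPerMTok: number, // price per 1M input tokens, from the provider's pricing page
  outputUsdPerMTok: number // price per 1M output tokens
): number {
  const inputTokens = Math.ceil(inputChars / 4);
  return (inputTokens * inputUsdPerMTok + outputTokens * outputUsdPerMTok) / 1_000_000;
}

// Example with placeholder rates (NOT real prices): 15,000 input characters,
// 300 output tokens, $1 per 1M input tokens, $2 per 1M output tokens.
const exampleCost = estimateCostUSD(15000, 300, 1, 2);
```

Because the route caps the input at 15,000 characters, the input side of the bill is bounded no matter how long the video is.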

Next Steps

You can take this further:

  • Chat with Video: Store the transcript in a vector DB (such as Pinecone) and use RAG to ask questions like "What did he say about aliens?".
  • Timestamp Linking: The API returns timestamps for every sentence. You could make the summary bullets clickable to jump to that part of the video.
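
For the timestamp idea, YouTube watch URLs accept a `t` parameter in seconds, so each bullet can link to a moment in the video. Here is a rough sketch. The `TranscriptSegment` shape, in particular a `start` field in seconds, is an assumption (the route above only uses `text`), so check the actual API response before relying on it:

```typescript
// Hypothetical segment shape; only `text` appears in the original route.
interface TranscriptSegment {
  text: string;
  start: number; // assumed: offset from the start of the video, in seconds
}

// Builds a watch URL that starts playback at the given offset.
function timestampUrl(videoId: string, startSeconds: number): string {
  const t = Math.max(0, Math.floor(startSeconds));
  return `https://www.youtube.com/watch?v=${encodeURIComponent(videoId)}&t=${t}s`;
}

// Formats seconds as m:ss, or h:mm:ss for offsets of an hour or more.
function formatTimestamp(startSeconds: number): string {
  const total = Math.max(0, Math.floor(startSeconds));
  const h = Math.floor(total / 3600);
  const m = Math.floor((total % 3600) / 60);
  const s = total % 60;
  const ss = String(s).padStart(2, "0");
  return h > 0 ? `${h}:${String(m).padStart(2, "0")}:${ss}` : `${m}:${ss}`;
}

// Combines both helpers into the label and href for one clickable segment.
function linkForSegment(videoId: string, segment: TranscriptSegment) {
  return {
    label: formatTimestamp(segment.start),
    href: timestampUrl(videoId, segment.start),
  };
}
```

To connect bullets to segments, you would also need the model to cite which part of the transcript each bullet came from, for example by including the segment offsets in the prompt.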

Happy building!
