DEV Community

Ramandeep Singh

TL;DW: Summarize YouTube Videos with OpenAI!

Building a YouTube Video Summarizer with Python and OpenAI

In today's fast-paced world, efficiently extracting key information from lengthy YouTube videos can be invaluable. In this tutorial, we'll build a Python application that generates concise summaries of YouTube videos using only their URLs. We'll leverage the OpenAI API for summarization and the youtube-transcript-api for transcript extraction.

Prerequisites

Before we begin, ensure you have the following:

  • Python 3.7+ installed on your system.
  • An API key from OpenAI. You can obtain one by signing up at OpenAI.
  • The following Python packages:
    • openai
    • youtube-transcript-api

You can install the required packages using pip:

pip install openai youtube-transcript-api

Application Overview

Our application will:

  1. Extract the Video ID from a given YouTube URL.
  2. Retrieve the Transcript of the video using the youtube-transcript-api.
  3. Summarize the Transcript using the OpenAI API.

Step 1: Extracting the Video ID

First, we'll extract the unique video ID from the YouTube URL. This ID is essential for fetching the video's transcript.

import re

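def get_videoId(youtube_link):
    """
    Extracts the 11-character YouTube video ID from the given URL.
    Returns -1 if no ID is found.
    """
    # Match the ID after "v=" or "youtu.be/" and stop before any
    # extra parameters such as "&t=30s".
    pattern = r"(?:v=|youtu\.be/)([\w-]{11})"
    match = re.search(pattern, youtube_link)
    if match:
        return match.group(1)
    return -1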
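
As an alternative to a regular expression, the ID can also be pulled out with the standard library's URL parser. The following is a minimal sketch, not part of the original script: the function name `extract_video_id` and the list of accepted hosts and path prefixes are assumptions chosen for illustration.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Hosts this sketch accepts; the list is an assumption and may be incomplete.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID found in a YouTube URL, or None if there is none."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()

    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host in YOUTUBE_HOSTS:
        if parsed.path == "/watch":
            # Standard links carry the ID in the "v" query parameter.
            candidate = parse_qs(parsed.query).get("v", [""])[0]
        else:
            # Links such as /embed/<id> or /shorts/<id>.
            parts = parsed.path.strip("/").split("/")
            is_known_prefix = len(parts) >= 2 and parts[0] in ("embed", "shorts")
            candidate = parts[1] if is_known_prefix else ""
    else:
        candidate = ""

    return candidate or None


print(extract_video_id("https://www.youtube.com/watch?v=dQw4w9WgXcQ&t=42s"))
```

Unlike get_videoId, this variant returns None on failure rather than -1, so the calling code would need to check for None instead.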

Step 2: Retrieving the Transcript

With the video ID, we can fetch the video's transcript. The youtube-transcript-api simplifies this process by handling YouTube's internal API calls.

from youtube_transcript_api import YouTubeTranscriptApi

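def get_transcript(videoid):
    """
    Retrieves the transcript for the given YouTube video ID
    and returns it as a single string.
    """
    # The call returns a list of segments, each a dict with
    # 'text', 'start', and 'duration' keys.
    # Note: get_transcript is the static-method interface of older
    # releases of youtube-transcript-api; check the docs for the
    # version you have installed.
    transcript = YouTubeTranscriptApi.get_transcript(videoid)
    return " ".join(item['text'] for item in transcript)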
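
Raw caption segments often contain line breaks and non-speech cues such as [Music], which add noise to the text sent for summarization. The sketch below shows one way to clean them before joining. The helper name `segments_to_text` and the sample data are made up for illustration, and stripping everything in square brackets is an assumption that may also remove legitimate bracketed speech.

```python
import re
from typing import Dict, Iterable

# Matches bracketed cues such as "[Music]" or "[Applause]".
_CUE = re.compile(r"\[[^\]]*\]")


def segments_to_text(segments: Iterable[Dict]) -> str:
    """Join transcript segments into one string, dropping cues and extra whitespace."""
    parts = []
    for segment in segments:
        text = _CUE.sub(" ", segment["text"])
        # Collapse newlines and repeated spaces into single spaces.
        text = " ".join(text.split())
        if text:
            parts.append(text)
    return " ".join(parts)


# Hypothetical segments in the list-of-dicts shape described above.
sample = [
    {"text": "[Music]", "start": 0.0, "duration": 2.0},
    {"text": "welcome back\nto the channel", "start": 2.0, "duration": 3.1},
    {"text": "today we  look at [Applause] regex", "start": 5.1, "duration": 2.4},
]
print(segments_to_text(sample))
# → welcome back to the channel today we look at regex
```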

Step 3: Summarizing the Transcript

We'll use OpenAI's language model to generate a summary of the transcript. Ensure your OpenAI API key is set as an environment variable for security.

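import os

from openai import OpenAI

# Read the API key from the OPENAI_API_KEY environment variable
# instead of hard-coding it in the source.
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])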

def summary_extraction(transcription):
    """
    Returns a concise summary of the given transcript text.
    """
    response = client.chat.completions.create(
        model="gpt-4",
        temperature=0,
        messages=[
            {
                "role": "system",
                "content": "You are a highly skilled AI trained in language comprehension and summarization. I would like you to read the following text and summarize it into a concise abstract paragraph. Aim to retain the most important points, providing a coherent and readable summary that could help a person understand the main points of the discussion without needing to read the entire text. Please avoid unnecessary details or tangential points."
            },
            {
                "role": "user",
                "content": transcription
            }
        ]
    )
    return response.choices[0].message.content
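
Transcripts of long videos can exceed what the model accepts in a single request. One common workaround is to split the text into chunks, summarize each chunk, and then summarize the combined results. The sketch below covers only the splitting step. The name `chunk_text` and the default budget of 8000 characters are assumptions; a character count is only a rough stand-in for the model's token limit.

```python
from typing import List


def chunk_text(text: str, max_chars: int = 8000) -> List[str]:
    """Split text on whitespace into chunks of at most max_chars characters.

    A single word longer than max_chars is kept whole in its own chunk.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")

    chunks: List[str] = []
    current: List[str] = []
    current_len = 0

    for word in text.split():
        # Account for the joining space when the chunk already has words.
        extra = len(word) + (1 if current else 0)
        if current and current_len + extra > max_chars:
            chunks.append(" ".join(current))
            current = [word]
            current_len = len(word)
        else:
            current.append(word)
            current_len += extra

    if current:
        chunks.append(" ".join(current))
    return chunks


print(chunk_text("aa bb cc dd", max_chars=5))
# → ['aa bb', 'cc dd']
```

Each chunk could then be passed to summary_extraction, with the partial summaries joined and summarized once more to produce the final result.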

Putting It All Together

Let's combine these functions into a cohesive script that takes a YouTube URL and outputs a summary.

def start():
    youtube_link = input("Enter the YouTube video link: ")
    videoid = get_videoId(youtube_link)
    if videoid == -1:
        print("Please enter a valid YouTube video link")
        print("*******************************************************")
        start()
    else:
        # Fetch the transcript with the get_transcript function from Step 2
        transcription = get_transcript(videoid)
        if len(transcription) > 0:
            print("*******************************************************")
            print("Video Summary")
            print("*******************************************************")
            print(summary_extraction(transcription))

start()
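
The start function calls itself whenever the link is invalid, which adds a stack frame for every retry. A loop with a retry limit is one way to avoid that. The sketch below is an assumption-laden variant, not the original code: the name `ask_for_video_id`, the `read` parameter (which lets the prompt be replaced in tests), and the limit of three attempts are all hypothetical.

```python
from typing import Callable, Optional


def ask_for_video_id(
    parse: Callable[[str], object],
    read: Callable[[str], str] = input,
    max_attempts: int = 3,
) -> Optional[object]:
    """Prompt until parse() yields a video ID or the attempts run out."""
    for _ in range(max_attempts):
        video_id = parse(read("Enter the YouTube video link: "))
        # Treat both None and -1 as "no ID found".
        if video_id not in (None, -1):
            return video_id
        print("Please enter a valid YouTube video link")
    return None
```

With the functions from this tutorial, the call would look like ask_for_video_id(get_videoId), followed by a check for None before fetching the transcript.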

Running the Application

  1. Set your OpenAI API key as an environment variable:
   export OPENAI_API_KEY='your-api-key-here'
  2. Run the script:
   python app.py
  3. Input the YouTube URL when prompted.

Note that the script as shown does not ask for a summary language. To get summaries in another language, adjust the system prompt in summary_extraction.

The application will display a concise summary of the video's content.
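
If the environment variable is missing, the script fails only when the client is created, and the error can be hard to read. A small check at startup gives a clearer message. This is a minimal sketch; the helper name `load_api_key` and its `env` parameter are assumptions added for illustration.

```python
import os
from typing import Mapping, Optional


def load_api_key(
    env: Optional[Mapping[str, str]] = None,
    name: str = "OPENAI_API_KEY",
) -> str:
    """Return the API key from the environment, or raise a clear error."""
    source = os.environ if env is None else env
    key = source.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the script.")
    return key
```

The returned value can be passed to the client as OpenAI(api_key=load_api_key()).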

Conclusion

You've now built a Python application that extracts and summarizes YouTube video transcripts using AI. This tool can save time and provide quick insights into video content without the need to watch the entire video.

For more details and the complete code, visit the GitHub repository.

Note: Ensure you comply with YouTube's terms of service and OpenAI's usage policies when using this application.
