Building a YouTube Video Summarizer with Python and OpenAI
Pulling the key points out of a long YouTube video takes time. In this tutorial, we'll build a Python application that generates a concise summary of a YouTube video from nothing but its URL. We'll use the youtube-transcript-api package to extract the transcript and the OpenAI API to summarize it.
Prerequisites
Before we begin, ensure you have the following:
- Python 3.7+ installed on your system.
- An API key from OpenAI. You can obtain one by signing up at OpenAI.
- The following Python packages:
  - openai
  - youtube-transcript-api
You can install the required packages using pip:
pip install openai youtube-transcript-api
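Both packages have changed their public interfaces between major versions, so it can help to confirm what pip actually installed. The sketch below is a small check, assuming Python 3.8 or newer (where importlib.metadata is part of the standard library); the helper name installed_version is hypothetical. The `from openai import OpenAI` import used later in this tutorial requires openai 1.0 or newer.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Report the versions of the two packages this tutorial depends on.
for name in ("openai", "youtube-transcript-api"):
    print(name, installed_version(name) or "not installed")
```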
Application Overview
Our application will:
- Extract the Video ID from a given YouTube URL.
- Retrieve the Transcript of the video using the youtube-transcript-api.
- Summarize the Transcript using the OpenAI API.
Step 1: Extracting the Video ID
First, we'll extract the unique video ID from the YouTube URL. This ID is essential for fetching the video's transcript.
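import re

def get_videoId(youtube_link):
    """
    Extracts the YouTube video ID from the given URL.
    Returns -1 if the URL has no "v=" query parameter.
    """
    # Capture the value of "v" and stop at "&" or "#", so extra
    # parameters such as "&t=42s" are not included in the ID.
    pattern = r"(?<=[?&]v=)[^&#]+"
    match = re.search(pattern, youtube_link)
    if match:
        return match.group()
    else:
        return -1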
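The regular expression above only recognizes links that carry the ID in a "v=" query parameter. YouTube links also come in other shapes, such as youtu.be short links and /embed/, /shorts/ or /live/ paths. The following is a sketch of a more tolerant extractor built on the standard library's urllib.parse; the function name extract_video_id and the list of accepted hosts are assumptions for illustration, and it returns None rather than -1 when nothing is found.

```python
from urllib.parse import urlparse, parse_qs


def extract_video_id(url):
    """Return the video ID from common YouTube URL shapes, or None."""
    # Note: the URL needs a scheme (https://...) for the host to be parsed.
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    # Short links: https://youtu.be/<id>
    if host == "youtu.be":
        candidate = parsed.path.lstrip("/").split("/")[0]
        return candidate or None

    if host in ("youtube.com", "m.youtube.com", "music.youtube.com"):
        # Standard watch links: https://www.youtube.com/watch?v=<id>
        if parsed.path == "/watch":
            values = parse_qs(parsed.query).get("v")
            return values[0] if values else None
        # Path-based links: /embed/<id>, /shorts/<id>, /live/<id>
        parts = parsed.path.strip("/").split("/")
        if len(parts) >= 2 and parts[0] in ("embed", "shorts", "live"):
            return parts[1] or None

    return None


print(extract_video_id("https://www.youtube.com/watch?v=dQw4w9WgXcQ&t=42s"))
print(extract_video_id("https://youtu.be/dQw4w9WgXcQ?t=1"))
```

Parsing the URL instead of pattern-matching the raw string means the query parameters can appear in any order, and links from unrelated hosts are rejected instead of being passed on to the transcript step.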
Step 2: Retrieving the Transcript
With the video ID, we can fetch the video's transcript. The youtube-transcript-api package simplifies this process by handling YouTube's internal API calls.
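```python
from youtube_transcript_api import YouTubeTranscriptApi

def get_transcript(videoid):
    """
    Retrieves the transcript for the given YouTube video ID.
    """
    # get_transcript() is the static helper from youtube-transcript-api
    # releases before 1.0; newer releases may expose
    # YouTubeTranscriptApi().fetch(videoid) instead.
    transcript = YouTubeTranscriptApi.get_transcript(videoid)
    # Each item is a dict with a 'text' key; join them into one string.
    return " ".join(item['text'] for item in transcript)
```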
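Auto-generated captions often contain line breaks inside a segment and bracketed cues such as "[Music]", which add noise to the text sent for summarization. The sketch below shows one way to clean the segments before joining them. It assumes each segment is a dict with 'text', 'start' and 'duration' keys, as in the list iterated over above; the function name segments_to_text and the sample data are made up for illustration.

```python
import re


def segments_to_text(segments):
    """Join transcript segments into one cleaned-up string."""
    pieces = []
    for segment in segments:
        text = segment["text"]
        # Drop bracketed cues such as "[Music]" or "[Applause]".
        text = re.sub(r"\[[^\]]*\]", " ", text)
        pieces.append(text)
    # Collapse newlines and repeated spaces into single spaces.
    return re.sub(r"\s+", " ", " ".join(pieces)).strip()


# Hypothetical sample in the shape of a transcript segment list.
sample = [
    {"text": "[Music]", "start": 0.0, "duration": 2.0},
    {"text": "welcome back\nto the channel", "start": 2.0, "duration": 3.5},
    {"text": "today we cover  regular expressions", "start": 5.5, "duration": 4.0},
]
print(segments_to_text(sample))
# prints: welcome back to the channel today we cover regular expressions
```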
Step 3: Summarizing the Transcript
We'll use OpenAI's language model to generate a summary of the transcript. Ensure your OpenAI API key is set as an environment variable for security.
```python
import os

from openai import OpenAI

# Read the OpenAI API key from the OPENAI_API_KEY environment variable
client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))

def summary_extraction(transcription):
    """
    Returns a summary of the given YouTube video transcript.
    """
    response = client.chat.completions.create(
        model="gpt-4",
        temperature=0,
        messages=[
            {
                "role": "system",
                "content": "You are a highly skilled AI trained in language comprehension and summarization. I would like you to read the following text and summarize it into a concise abstract paragraph. Aim to retain the most important points, providing a coherent and readable summary that could help a person understand the main points of the discussion without needing to read the entire text. Please avoid unnecessary details or tangential points."
            },
            {
                "role": "user",
                "content": transcription
            }
        ]
    )
    return response.choices[0].message.content
```
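Long videos can produce transcripts that exceed the model's context window, in which case a single request with the whole transcript fails. A common workaround is to summarize the transcript in chunks and then summarize the partial summaries. The sketch below illustrates that idea under stated assumptions: the names split_into_chunks and summarize_long_text are hypothetical, the 12000-character budget is an arbitrary placeholder rather than a measured limit, and summarize stands for any function that maps text to a summary (such as summary_extraction).

```python
def split_into_chunks(text, max_chars=12000):
    """Split text into word-aligned chunks of at most max_chars characters.

    A single word longer than max_chars becomes its own oversized chunk.
    """
    chunks, current, length = [], [], 0
    for word in text.split():
        # +1 accounts for the space joining this word to the previous one.
        extra = len(word) + (1 if current else 0)
        if current and length + extra > max_chars:
            chunks.append(" ".join(current))
            current, length = [], 0
            extra = len(word)
        current.append(word)
        length += extra
    if current:
        chunks.append(" ".join(current))
    return chunks


def summarize_long_text(text, summarize, max_chars=12000):
    """Summarize each chunk, then summarize the combined partial summaries."""
    chunks = split_into_chunks(text, max_chars)
    if not chunks:
        return ""
    if len(chunks) == 1:
        return summarize(chunks[0])
    partial = [summarize(chunk) for chunk in chunks]
    return summarize(" ".join(partial))


# Stand-in summarizer for demonstration: keeps the first word of its input.
print(summarize_long_text("aa bb cc dd", lambda t: t.split()[0], max_chars=5))
```

In the tutorial's script this would be called as summarize_long_text(transcription, summary_extraction). For very long videos the combined partial summaries can still be too large, so a fuller version would repeat the reduction step until the text fits.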
Putting It All Together
Let's combine these functions into a cohesive script that takes a YouTube URL and outputs a summary.
def start():
    # Keep prompting until the link contains a video ID.
    while True:
        youtube_link = input("Enter the youtube video link: ")
        videoid = get_videoId(youtube_link)
        if videoid != -1:
            break
        print("Please enter correct Youtube video link")
        print("*******************************************************")

    transcription = get_transcript(videoid)
    if len(transcription) > 0:
        print("*******************************************************")
        print("Video Summary")
        print("*******************************************************")
        print(summary_extraction(transcription))

start()
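Because start() mixes prompting, network calls and printing, it is hard to test without a real video and an API key. One possible alternative, sketched here as an assumption rather than as part of the original script, is to pass the three steps in as functions so the control flow can be exercised with stand-ins. The name summarize_video and its parameters are hypothetical.

```python
def summarize_video(url, extract_id, fetch_transcript, summarize):
    """Run the three steps in order and return the summary text."""
    video_id = extract_id(url)
    # -1 is the "not found" value returned by get_videoId in this tutorial.
    if video_id == -1:
        raise ValueError("no video ID found in URL")
    transcript = fetch_transcript(video_id)
    if not transcript:
        raise ValueError("transcript is empty")
    return summarize(transcript)


# Demonstration with stand-in functions instead of real network calls.
demo = summarize_video(
    "https://www.youtube.com/watch?v=abc",
    extract_id=lambda url: url.split("v=")[1],
    fetch_transcript=lambda video_id: "transcript for " + video_id,
    summarize=lambda text: text.upper(),
)
print(demo)  # prints: TRANSCRIPT FOR ABC
```

With the tutorial's own functions the call would be summarize_video(link, get_videoId, get_transcript, summary_extraction).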
Running the Application
- Set your OpenAI API key as an environment variable:
export OPENAI_API_KEY='your-api-key-here'
- Run the script:
python app.py
- Input the YouTube URL when prompted.
- The listing above prompts only for the URL. Its system prompt does not name an output language, so choosing a summary language would require extending the prompt.
The application will display a concise summary of the video's content.
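A language option could be added by building the chat messages from a parameter instead of hard-coding them. The sketch below is one hypothetical way to do that; build_messages, BASE_PROMPT and the shortened prompt text are illustrative assumptions, not the code from the repository.

```python
# Shortened stand-in for the tutorial's system prompt.
BASE_PROMPT = (
    "You are a highly skilled AI trained in language comprehension and "
    "summarization. Summarize the following text into a concise abstract "
    "paragraph."
)


def build_messages(transcription, language="English"):
    """Build the chat messages, asking for the summary in `language`."""
    system_prompt = f"{BASE_PROMPT} Write the summary in {language}."
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": transcription},
    ]


messages = build_messages("transcript text goes here", language="Spanish")
print(messages[0]["content"])
```

The returned list can be passed as the messages argument of client.chat.completions.create in summary_extraction, with the language read from an extra input() prompt that defaults to English.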
Conclusion
You've now built a Python application that extracts and summarizes YouTube video transcripts using AI. This tool can save time and provide quick insights into video content without the need to watch the entire video.
For more details and the complete code, visit the GitHub repository.
Note: Ensure you comply with YouTube's terms of service and OpenAI's usage policies when using this application.