Karan Kulshestha

Posted on Oct 19, 2022

Using GPT-3 and Whisper to Generate a Summary of a YouTube Video in Any Language 😀

#openai #streamlit #gpt3 #whisperai

In this post, you'll learn how to use Whisper and GPT-3 to generate a short summary of a YouTube video in any language. You can see a demo video here.

Technologies Required

Streamlit (Building Webapp)
Whisper (OpenAI speech recognition model)
OpenAI GPT-3 API (API Key for using this service)
Python

Setup Required

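Install Whisper, Streamlit, pytube, and the OpenAI client using these commands

pip install git+https://github.com/openai/whisper.git
pip install streamlit
# the code below uses openai.Completion, which was removed in openai 1.0
pip install pytube "openai<1.0"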

Install FFmpeg (Whisper uses it to read audio files)

# on Ubuntu or Debian
sudo apt update && sudo apt install ffmpeg

# on Arch Linux
sudo pacman -S ffmpeg

# on macOS using Homebrew (https://brew.sh/)
brew install ffmpeg

# on Windows using Chocolatey (https://chocolatey.org/)
choco install ffmpeg

# on Windows using Scoop (https://scoop.sh/)
scoop install ffmpeg

Start Writing Code

Start by importing the packages in a Python file

import streamlit as st
import openai
from pytube import YouTube 
import whisper

Set up the OpenAI API service

openai.organization = ""
openai.api_key = 'sk-*****kpeKyzuPRIT3Bl***************' # replace with your own key
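Hard-coding the key in the source file makes it easy to leak by accident. A minimal sketch of one alternative, assuming the key is stored in an environment variable named OPENAI_API_KEY (the variable name and the load_api_key helper are illustrative choices, not part of the original code):

```python
import os


def load_api_key(env_var="OPENAI_API_KEY"):
    """Return the API key stored in the given environment variable.

    Raises a RuntimeError with a readable message if the variable is unset,
    so the app fails early instead of on the first API call.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key


# In the app you would then write (assumes `import openai` as above):
# openai.api_key = load_api_key()
```

With this in place, the key never has to appear in the file you share or commit.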

Building the UI of the web app using Streamlit

with st.container():
    st.header("Youtube Summary")
    st.title("Get the summary of any YouTube video in any language")

Taking the URL of a video and transcribing it using Whisper

yt = YouTube(text_input)                   ## text_input holds the URL pasted by the user
## YouTube serves no mp3 streams, so pick an audio-only mp4 stream instead
stream = yt.streams.filter(only_audio=True, file_extension="mp4").first()
stream.download(filename="audio.mp4")      ## download the audio into the current folder
model = whisper.load_model("base")         ## load whisper model
result = model.transcribe("audio.mp4")     ## start transcribing
content = result["text"]                   ## store text
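Besides the full text, the dictionary returned by model.transcribe also carries the detected language and a list of timed segments. The sketch below formats those segments as timestamped lines. Note that format_timestamp and format_segments are hypothetical helpers, and the sample dictionary is made-up data shaped like Whisper's output, not a real transcription.

```python
def format_timestamp(seconds):
    """Convert a number of seconds into an MM:SS string."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def format_segments(result):
    """Return one 'MM:SS text' line per segment of a Whisper result dict."""
    lines = []
    for segment in result.get("segments", []):
        start = format_timestamp(segment["start"])
        lines.append(f"{start} {segment['text'].strip()}")
    return lines


# Made-up sample with the same keys Whisper uses ("language", "segments").
sample = {
    "text": " Hello everyone. Welcome back.",
    "language": "en",
    "segments": [
        {"start": 0.0, "end": 2.4, "text": " Hello everyone."},
        {"start": 62.5, "end": 64.0, "text": " Welcome back."},
    ],
}

for line in format_segments(sample):
    print(line)
```

In the app, the same helper could be called on the real result to show where in the video each sentence was spoken.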

Generate the summary of the transcription using the OpenAI API

tldr_tag = "\n\nTl;dr"              ## tag that tells the GPT engine where the text ends
response = openai.Completion.create(
    engine="text-davinci-002",
    prompt=content + tldr_tag,      ## call the API to get a summary from the GPT engine
    temperature=0.3,
    max_tokens=200,
    top_p=1.0,
    frequency_penalty=0,
    presence_penalty=0,
)
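A long video can produce a transcript that is larger than the model's context window, in which case the request fails. One possible workaround, sketched here under stated assumptions, is to cut the transcript into word-limited chunks and summarise each chunk separately. The split_into_chunks and build_prompt helpers and the 1500-word limit are illustrative choices, and a word count is only a rough proxy for the token count the API actually enforces.

```python
TLDR_TAG = "\n\nTl;dr"  # same marker the post appends to the prompt


def split_into_chunks(text, max_words=1500):
    """Split text into consecutive chunks of at most max_words words."""
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def build_prompt(chunk):
    """Append the Tl;dr marker so the model knows where the text ends."""
    return chunk + TLDR_TAG


transcript = "word " * 3200          # stand-in for result["text"]
prompts = [build_prompt(c) for c in split_into_chunks(transcript)]
print(len(prompts))                  # one prompt per API call
```

Each prompt would then be sent in its own API call, and the partial summaries joined or summarised once more.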

Finally Display the Results

st.subheader("Here is your summary!")
st.write(response["choices"][0]["text"])   ## finally inject result to webapp using streamlit
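The completion may come back with surrounding whitespace or a leading colon left over from the Tl;dr marker, and the choices list could in principle be empty. A small sketch of a defensive way to read the text, where clean_summary is a hypothetical helper and a plain dictionary stands in for the API response:

```python
def clean_summary(response):
    """Pull the first completion out of a response dict and tidy it up."""
    choices = response.get("choices", [])
    if not choices:
        return ""
    text = choices[0].get("text", "")
    # Drop surrounding whitespace and a leading ':' left over from "Tl;dr:".
    return text.strip().lstrip(":").strip()


# Made-up response with the same keys the post reads from.
fake_response = {"choices": [{"text": ": The video explains Whisper.\n"}]}
print(clean_summary(fake_response))
```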

Complete Source Code

import streamlit as st
import openai
from pytube import YouTube 
import whisper

openai.organization = ""
openai.api_key = 'sk-yjfA0s****************1zOhM****lXM'

with st.container():
    st.header("Youtube Summary")
    st.title("Get the summary of any YouTube video in any language")


## input url of video ##

with st.container():
    st.write("---")
    text_input = st.text_input(
        "Please paste the url of the video 👇",
        placeholder="paste the url",                 # taking url of a YT video
    )

    if text_input:
        try:
            with st.spinner('Wait for it...'):   ## streamlit loader
                tldr_tag = "\n\nTl;dr"           ## tag that tells the GPT engine where the text ends
                yt = YouTube(text_input)         ## pass the URL to pytube for downloading the audio
                ## YouTube serves no mp3 streams, so pick an audio-only mp4 stream instead
                stream = yt.streams.filter(only_audio=True, file_extension="mp4").first()
                stream.download(filename="audio.mp4")      ## download the audio and save it as audio.mp4 in the same folder
                model = whisper.load_model("base")         ## load whisper model
                result = model.transcribe("audio.mp4")     ## transcribe the audio into text
                content = result["text"]                   ## store the text in the content var
                st.write(content)
                ## call the API to generate the summary of the transcribed text stored in content
                response = openai.Completion.create(
                    engine="text-davinci-002",
                    prompt=content + tldr_tag,
                    temperature=0.3,
                    max_tokens=200,
                    top_p=1.0,
                    frequency_penalty=0,
                    presence_penalty=0,
                )
                st.subheader("Here is your summary!")
                st.write(response["choices"][0]["text"])   ## finally show the returned text in the web app
            st.success('Done!')
        except Exception as error:
            ## show the failure in the app instead of only printing to the console
            st.error(f"Something went wrong: {error}")
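Before handing the pasted text to pytube, it can help to check that it looks like a YouTube link, so that a typo produces a clear message rather than a download error. The sketch below is one way to do that with the standard library only. The extract_video_id helper is hypothetical, and it assumes only the two most common URL shapes (youtube.com/watch?v=... and youtu.be/...); other forms such as embed or Shorts links are not handled.

```python
from urllib.parse import urlparse, parse_qs


def extract_video_id(url):
    """Return the video id from a YouTube URL, or None if none is found."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if host == "youtu.be":
        # Short links keep the id in the path: https://youtu.be/<id>
        return parsed.path.lstrip("/") or None
    if host in ("youtube.com", "m.youtube.com"):
        # Regular links keep it in the query string: /watch?v=<id>
        ids = parse_qs(parsed.query).get("v")
        return ids[0] if ids else None
    return None
```

In the app, a None result could be reported with st.warning before any download is attempted.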

I hope you like it. Please give me your feedback.

My GitHub: link
You can connect with me here: karankulx@gmail.com

DEV Community
