In this post, you'll learn how to use Whisper and GPT-3 to generate a short summary of a YouTube video in any language. You can see a demo video here.
Technologies Required
Streamlit (for building the web app)
Whisper (OpenAI speech recognition model)
OpenAI GPT-3 API (API Key for using this service)
Python
Setup Required
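- Install Whisper, Streamlit, and the other two packages the code imports (openai and pytube) using these commands
pip install git+https://github.com/openai/whisper.git
pip install streamlit
pip install "openai<1.0" pytube  # the code below uses openai.Completion, which was removed in openai 1.0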
- Install FFmpeg
# on Ubuntu or Debian
sudo apt update && sudo apt install ffmpeg
# on Arch Linux
sudo pacman -S ffmpeg
# on MacOS using Homebrew (https://brew.sh/)
brew install ffmpeg
# on Windows using Chocolatey (https://chocolatey.org/)
choco install ffmpeg
# on Windows using Scoop (https://scoop.sh/)
scoop install ffmpeg
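Whisper calls the ffmpeg executable to decode audio, so it can help to confirm that Python can find it before going further. The snippet below is a minimal sketch using only the standard library; the helper name `ffmpeg_available` is hypothetical and not part of Whisper or Streamlit.

```python
import shutil


def ffmpeg_available(executable: str = "ffmpeg") -> bool:
    """Return True if the given executable can be found on the PATH."""
    # shutil.which returns the full path to the executable, or None if it is missing.
    return shutil.which(executable) is not None


if ffmpeg_available():
    print("ffmpeg found")
else:
    print("ffmpeg not found; install it with one of the commands above")
```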
Start Writing Code
- Import the packages in your Python file
import streamlit as st
import openai
from pytube import YouTube
import whisper
- Set up the OpenAI API service
openai.organization = ""
openai.api_key = 'sk-*****kpeKyzuPRIT3Bl***************' # replace with your own key
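Hardcoding the key in the source file makes it easy to leak by accident. One common alternative is to read it from an environment variable. This is only a sketch and not how the post itself does it: the helper name `load_api_key` is hypothetical, and the variable name `OPENAI_API_KEY` is an assumed convention you can change.

```python
import os


def load_api_key(var_name: str = "OPENAI_API_KEY") -> str:
    """Read the API key from an environment variable and fail early if it is missing."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {var_name} environment variable before starting the app"
        )
    return key
```

In the app, `openai.api_key = load_api_key()` would then take the place of the hardcoded string.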
- Build the UI of the web app using Streamlit
with st.container():
    st.header("Youtube Summary")
    st.title("Get the summary of any YouTube video in any language")
- Take the URL of a video and transcribe it using Whisper
yt = YouTube(text_input)  ## pass the input URL
stream = yt.streams.get_by_itag(139)  ## itag 139 is an audio-only stream (AAC in an MP4 container)
stream.download('', "audio.mp3")  ## download the audio; ffmpeg detects the real format despite the .mp3 name
model = whisper.load_model("base")  ## load the Whisper model
result = model.transcribe("audio.mp3")  ## start transcribing
content = result["text"]  ## store the transcribed text
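Besides `text`, the dictionary returned by `transcribe` also carries a detected `language` and a `segments` list whose entries have `start`, `end` (in seconds), and `text` fields. The sketch below formats those segments as timestamped lines. It runs on a small hand-written dictionary standing in for a real result, and the helper names `format_timestamp` and `format_segments` are hypothetical.

```python
def format_timestamp(seconds: float) -> str:
    """Format a number of seconds as MM:SS."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def format_segments(result: dict) -> str:
    """Turn the segments of a Whisper-style result into timestamped lines."""
    lines = []
    for segment in result.get("segments", []):
        start = format_timestamp(segment["start"])
        lines.append(f"[{start}] {segment['text'].strip()}")
    return "\n".join(lines)


# Hand-written stand-in for the output of model.transcribe(...)
sample_result = {
    "text": " Hello everyone. Welcome back.",
    "language": "en",
    "segments": [
        {"start": 0.0, "end": 2.4, "text": " Hello everyone."},
        {"start": 62.5, "end": 65.0, "text": " Welcome back."},
    ],
}
print(format_segments(sample_result))
```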
- Generate the summary of the transcription using the OpenAI API
tldr_tag = "\n\nTl;dr"  ## tag that tells the GPT engine where the text ends
response = openai.Completion.create(  ## call the API to get the summary from the GPT engine
    engine="text-davinci-002",
    prompt=content + tldr_tag,
    temperature=0.3,
    max_tokens=200,
    top_p=1.0,
    frequency_penalty=0,
    presence_penalty=0,
)
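One practical limit: the whole transcript is sent as a single prompt, so a long video can exceed the model's context window. A possible workaround is to split the transcript into pieces, summarize each piece, and then summarize the combined summaries. The sketch below shows only the splitting step. The helper name `split_transcript` and the 8000-character budget are assumptions for illustration; the real limit is counted in tokens and depends on the model and the language. It breaks on whitespace, so it suits languages that separate words with spaces.

```python
from typing import List


def split_transcript(text: str, max_chars: int = 8000) -> List[str]:
    """Split text on whitespace into chunks of about max_chars characters.

    A single word longer than max_chars is kept whole in its own chunk.
    """
    chunks: List[str] = []
    current: List[str] = []
    current_len = 0
    for word in text.split():
        # The extra 1 accounts for the space joining this word to the previous one.
        extra = len(word) + (1 if current else 0)
        if current and current_len + extra > max_chars:
            # The current chunk is full: store it and start a new one.
            chunks.append(" ".join(current))
            current, current_len = [], 0
            extra = len(word)
        current.append(word)
        current_len += extra
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each chunk could then be passed as `prompt=chunk + tldr_tag` in place of the full `content`.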
- Finally, display the results
st.subheader("Here is your summary!")
st.write(response["choices"][0]["text"])  ## finally, inject the result into the web app using Streamlit
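Because the prompt ends with "Tl;dr", the returned text may begin with a colon or stray whitespace. The sketch below tidies the text before it is displayed. It is tested here on a plain dictionary shaped like the response above, and the helper name `extract_summary` is hypothetical.

```python
def extract_summary(response: dict) -> str:
    """Pull the first completion's text out of a response-shaped dict and tidy it."""
    choices = response.get("choices") or []
    if not choices:
        return ""
    text = choices[0].get("text", "")
    # Drop surrounding whitespace and a leading colon left over from the "Tl;dr" tag.
    return text.strip().lstrip(":").strip()


# Hand-written stand-in for an API response
fake_response = {"choices": [{"text": ": The video explains how Whisper works.\n"}]}
print(extract_summary(fake_response))
```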
Complete Source Code
import streamlit as st
import openai
from pytube import YouTube
import whisper

openai.organization = ""
openai.api_key = 'sk-yjfA0s****************1zOhM****lXM'  # replace with your own key

with st.container():
    st.header("Youtube Summary")
    st.title("Get the summary of any YouTube video in any language")

## input url of video ##
with st.container():
    st.write("---")
    text_input = st.text_input(
        "Please paste the url of the video 👇",
        placeholder="paste the url",  # taking the URL of a YouTube video
    )

if text_input:
    try:
        with st.spinner('Wait for it...'):  ## Streamlit loader
            tldr_tag = "\n\nTl;dr"  ## tag that tells the GPT engine where the text ends
            yt = YouTube(text_input)  ## pass the URL to pytube to download the audio
            stream = yt.streams.get_by_itag(139)  ## audio-only stream
            stream.download('', "audio.mp3")  ## download the audio and save it as audio.mp3 in the same folder
            model = whisper.load_model("base")  ## load the Whisper model
            result = model.transcribe("audio.mp3")  ## transcribe the audio into text
            content = result["text"]  ## store the text in the content variable
            st.write(content)
            response = openai.Completion.create(  ## call the API to summarize the transcribed text stored in content
                engine="text-davinci-002",
                prompt=content + tldr_tag,
                temperature=0.3,
                max_tokens=200,
                top_p=1.0,
                frequency_penalty=0,
                presence_penalty=0,
            )
            st.subheader("Here is your summary!")
            st.write(response["choices"][0]["text"])  ## finally, inject the response text into the web app
        st.success('Done!')
    except Exception as exc:  ## the download, transcription, or API call failed
        st.error(f"Something went wrong: {exc}")
I hope you like it. Please give me feedback.
My GitHub: link
You can connect with me here: karankulx@gmail.com