David

Video Content Aggregator - MoviePy 🎬

Starters 🏁

Welcome back to another installment of the "Project" series, where I delve into the technical components of the projects I've undertaken. I apologize for the delay between posts; the past few weeks have been hectic with school, co-op applications, and life in general (haha).
This week, I'll discuss a personal project built with MoviePy, the YouTube API, and the OpenAI API. If we haven't met before, please refer to my introductory post. 👇

Project Overview (Video Content Aggregator) 🙆

As a personal project, I built a tool called the "Video Content Aggregator". Combining the YouTube, OpenAI, and Pexels APIs with MoviePy, it automates the creation of dynamic video content in the form of YouTube Shorts. Here's a quick overview of the technologies used:

  • APIs: YouTube, OpenAI, Pexels
  • Language: Python
  • Video Editing: MoviePy

Code review 🔍

Requirements ✔️

The code relies on several libraries, including moviepy, google-api-python-client, openai, and gtts. Make sure these libraries and their dependencies are installed.
Ensure you have the required API keys and that the corresponding environment variables are set.
Ensure ffmpeg is installed, as moviepy depends on it to read and write video.
You can check the entire list of requirements in requirements.txt.
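Because a missing key or a missing ffmpeg binary otherwise only surfaces mid-run, a small startup check can fail fast. This is a sketch under my own naming assumptions; the variable names are illustrative and may not match the project's environment_variables.py:

```python
import os
import shutil

# Key names assumed for illustration; match them to your .env file.
REQUIRED_KEYS = ["OPENAI_API_KEY", "YOUTUBE_API_KEY", "PEXELS_API_KEY"]

def check_environment():
    # Fail fast on missing API keys rather than partway through the pipeline
    missing = [key for key in REQUIRED_KEYS if not os.environ.get(key)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    # moviepy shells out to ffmpeg, so verify it is on PATH
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
```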

OpenAI API 🤖

The code interacts with the OpenAI API three times per run: once to generate a fact, once to generate a title based on that fact, and once to extract the fact's main subject noun.
These three pieces (fact, title, and subject noun) drive the rest of the pipeline: fetching video data, aggregation, generation, and the upload to YouTube.

import openai  # legacy (pre-1.0) OpenAI SDK interface; expects OPENAI_API_KEY to be set

def generate_fact():
    prompt = "Give me around 75 words based on an interesting fact."
    response = openai.Completion.create(
        engine="text-davinci-003", prompt=prompt, max_tokens=200)
    return response.choices[0].text.strip()

def generate_title(fact):
    prompt = f"Based on the generated fact, {fact}, return a short title for the video."
    response = openai.Completion.create(
        engine="text-davinci-003", prompt=prompt, max_tokens=30)
    video_title = response.choices[0].text.strip()
    return video_title

def generate_subject_noun(fact):
    prompt = f"Based on the generated fact, {fact}, return a main subject noun."
    response = openai.Completion.create(
        engine="text-davinci-003", prompt=prompt, max_tokens=30)
    return response.choices[0].text.strip()

Fetch Video API Calls 🎥

The code fetches videos from two sources: Pexels and YouTube.
Each fetch function returns up to three videos matching a keyword derived from the generated fact via generate_subject_noun(fact). The snippets below are simple smoke tests for each source:

def test_fetch_pexels_videos():
    videos = fetch_pexels_videos("earth")
    assert videos, "Failed to fetch videos from Pexels"

def test_fetch_youtube_videos():
    videos = fetch_youtube_videos("earth")
    assert videos, "Failed to fetch videos from YouTube"

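For reference, a fetch function against the Pexels video search endpoint might look roughly like this. It's a sketch using the `requests` library; the project's actual implementation, key handling, and error handling may differ:

```python
import os
import requests

def fetch_pexels_videos(keyword, per_page=3):
    # Pexels authenticates with the raw API key in the Authorization header
    response = requests.get(
        "https://api.pexels.com/videos/search",
        headers={"Authorization": os.environ["PEXELS_API_KEY"]},
        params={"query": keyword, "per_page": per_page},
        timeout=10,
    )
    response.raise_for_status()
    # Each entry carries metadata plus downloadable video_files variants
    return response.json().get("videos", [])
```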

Aggregate and Generate Videos 🎬

The code processes the fetched videos and compiles them into a single clip, adding subtitles and adjusting their lengths. The videos are resized and cropped.
Audio generated with Google TTS (gTTS) serves as the narration track for the video.

import os
import textwrap
from io import BytesIO

from gtts import gTTS
from pytube import YouTube
from moviepy.editor import (AudioFileClip, CompositeVideoClip, TextClip,
                            VideoFileClip, concatenate_videoclips)

def get_tts_audio_clip(text):
    tts = gTTS(text, lang='en', tld='com.au', slow=False)
    audio_bytes = BytesIO()
    tts.write_to_fp(audio_bytes)
    audio_bytes.seek(0)
    temp_audio_filename = os.path.join(OUTPUT_FOLDER, "temp_audio.mp3")
    with open(temp_audio_filename, "wb") as f:
        f.write(audio_bytes.read())
    audio_clip = AudioFileClip(temp_audio_filename)
    return audio_clip

def process_youtube_videos(youtube_videos, audio_clip_duration):
    video_clips = []
    target_duration = audio_clip_duration / 3

    for video_info in youtube_videos:
        video_id = video_info.get('id', {}).get('videoId')
        if video_id:
            youtube_video_url = f"https://www.youtube.com/watch?v={video_id}"
            yt = YouTube(youtube_video_url)
            video_stream = yt.streams.filter(
                progressive=True, file_extension="mp4").order_by("resolution").desc().first()
            video_filename = os.path.join(
                OUTPUT_FOLDER, f"youtube_video_{video_id}.mp4")
            video_stream.download(output_path=OUTPUT_FOLDER,
                                  filename=os.path.basename(video_filename))
            video_clip = VideoFileClip(video_filename).subclip(5)

            if video_clip.duration < target_duration:
                loop_count = int(target_duration // video_clip.duration) + 1
                video_clip = concatenate_videoclips([video_clip] * loop_count)

            video_clip = video_clip.set_duration(target_duration)
            video_clips.append(video_clip)
    return video_clips

def generate_subtitles(fact, final_video_duration):
    fact_parts = textwrap.wrap(fact, width=40)
    subs = []
    interval_duration = 2.95
    start_time = 0
    for part in fact_parts:
        end_time = min(start_time + interval_duration, final_video_duration)
        subs.append(((start_time, end_time), part))
        start_time = end_time
    return subs


def annotate_video_with_subtitles(video, subtitles):
    def annotate(clip, txt, txt_color="white", fontsize=50, font="Xolonium-Bold"):
        txtclip = TextClip(txt, fontsize=fontsize, color=txt_color,
                           font=font, bg_color="black").set_duration(clip.duration)
        txtclip = txtclip.set_position(
            ("center", "center")).set_duration(clip.duration)
        cvc = CompositeVideoClip([clip, txtclip])
        return cvc

    annotated_clips = [annotate(video.subclip(from_t, min(
        to_t, video.duration)), txt) for (from_t, to_t), txt in subtitles]
    return concatenate_videoclips(annotated_clips)
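One refinement worth considering: generate_subtitles uses a fixed 2.95 s interval, which can leave the final chunks squeezed against the end of the clip. A variant (my own suggestion, not the project's code) spreads the wrapped chunks evenly across the clip's duration:

```python
import textwrap

def generate_even_subtitles(fact, final_video_duration, width=40):
    # Divide the clip's duration evenly among the wrapped text chunks
    parts = textwrap.wrap(fact, width=width)
    interval = final_video_duration / len(parts)
    return [((i * interval, (i + 1) * interval), part)
            for i, part in enumerate(parts)]
```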

Upload to YouTube 🍿

The code authenticates with the YouTube Data API v3 via OAuth and uploads the finished video.
The video is uploaded as private with a scheduled publish time, using a title derived from the filename and a predefined description.

def authenticate_youtube():
    os.environ["OAUTHLIB_INSECURE_TRANSPORT"] = "1"

    flow = google_auth_oauthlib.flow.InstalledAppFlow.from_client_secrets_file(
        CLIENT_SECRETS_FILE, OAUTH_SCOPE)
    credentials = flow.run_local_server(port=0)

    youtube = googleapiclient.discovery.build(
        API_SERVICE_NAME, API_VERSION, credentials=credentials)
    return youtube


def upload_video_to_youtube(youtube, file_path, title, description):
    request = youtube.videos().insert(
        part="snippet,status",
        body={
            "snippet": {
                "categoryId": "22",
                "description": description,
                "title": title
            },
            "status": {
                "privacyStatus": "private",
                "selfDeclaredMadeForKids": False,
                "publishAt": "2023-08-24T00:00:00.0Z"
            }
        },
        media_body=MediaFileUpload(
            file_path, mimetype='video/mp4', resumable=True)
    )
    return request.execute()
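The hard-coded publishAt above will eventually point to the past, and scheduling only takes effect while privacyStatus is private. A small helper can compute the timestamp instead; this is a suggestion, not the repo's code:

```python
from datetime import datetime, timedelta, timezone

def schedule_time(hours_from_now=24):
    # RFC 3339 UTC timestamp in the shape the YouTube Data API expects
    publish_at = datetime.now(timezone.utc) + timedelta(hours=hours_from_now)
    return publish_at.strftime("%Y-%m-%dT%H:%M:%S.0Z")
```

The returned string can be dropped into the `"publishAt"` field of the upload body in place of the literal date.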

Considerations 🧐

I should prioritize restructuring the current linear code to be more modular, breaking tasks down into smaller functions. Handling potential errors is essential, especially in cases where video fetching might yield no results or if there's an unexpected response from the OpenAI API. It's also crucial to manage rate limits and any potential hiccups when interfacing with third-party APIs. Given the concurrent nature of the video processing, I must ensure all resources, like video clips, are appropriately closed after use. Lastly, implementing error handling mechanisms, particularly try-except blocks, in the critical sections will be essential to maintain the stability of the program.
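As a concrete starting point for the error-handling ideas above, a generic retry wrapper (illustrative, not from the repo) covers transient API failures and rate-limit hiccups:

```python
import time

def with_retries(fn, attempts=3, base_delay=1.0):
    # Retry transient failures with exponential backoff between attempts
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts; let the caller handle it
            time.sleep(base_delay * (2 ** attempt))
```

Wrapping calls such as `with_retries(generate_fact)` or the fetch functions keeps a single flaky request from killing a whole run; video clips should still be closed in `finally` blocks regardless.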

Conclusion ⛳️

Thank you for taking the time to read, and I would appreciate any suggestions to improve the quality of my blog. Stay tuned for future posts. 🤠
Check out the YouTube channel!

Check out the project repository embedded below! 👇

gdcho / vc_aggregator

Generate engaging video content utilizing the YouTube API, OpenAI API, and MoviePy. Harness the power of AI and Python automation to craft dynamic YouTube Shorts 🎬

Table of Contents
  1. Technology used
  2. Getting started
  3. File Contents of folder
  4. Learn More
  5. References


Video Content Aggregator

Video Content Generator with YouTube API, OpenAI API, and MoviePy. Create dynamic video content with a vc aggregator
Explore the docs »

View Clips · Report Bug · Request Feature

Technology used

Python · OpenAI API · YouTube API · Pexels API · Google Cloud Platform · MoviePy · FFmpeg · Google Auth

Getting Started

  1. Clone the repo

    git clone https://github.com/gdcho/algo_v
  2. Obtain API keys from YouTube, OpenAI, and Pexels and save them in .env file

  3. Install Python requirements

    pip install -r requirements.txt
  4. Obtain OAuth Client Secret from Google Cloud Platform and create yt_client_secret.json

  5. Run the python script

     python3 main.py

File Contents of folder

📦
├── README.md
├── __pycache__
│   ├── aggregate_fv2.cpython-311.pyc
│   └── upload_yt.cpython-311.pyc
├── aggregate_fv2.py
├── environment_variables.py
├── img
│   ├── logo.png
│   └── vca.png
├── main.py
├── output_folder
│   └── note.txt
├── requirements.txt
├── test_api
│   ├── gpt_prompt.py
│   └── youtube_video_data.py
├── test_script
│   └── test_aggregate.py
├── upload_yt.py
└── yt_client_secret.json
…
