DEV Community

Hr MuKi
Hr MuKi

Posted on

I Automated My Entire YouTube Channel with Python — Here's the Architecture

I run multiple YouTube channels that are almost entirely automated using Python. Content generation, editing, thumbnail creation, metadata optimization, scheduling, uploading — all handled by scripts. Here's how the whole system works.

The Problem

Running a YouTube channel manually is a full-time job. Script writing, recording/generating video, editing, creating thumbnails, writing titles/descriptions, tagging, scheduling, uploading, cross-posting to TikTok and Instagram. For ONE video. I needed to publish 3-4 shorts per day across multiple channels.

The Architecture

┌─────────────────┐
│  Content Engine  │ ← Generates scripts, selects footage
├─────────────────┤
│  Video Pipeline  │ ← FFmpeg + moviepy for assembly
├─────────────────┤
│  Thumbnail Gen   │ ← Pillow + custom templates
├─────────────────┤
│  Metadata Layer  │ ← Title/desc/tags optimization
├─────────────────┤
│  Upload Daemon   │ ← Scheduled via LaunchAgents (macOS)
├─────────────────┤
│  Analytics       │ ← Weekly performance reports
└─────────────────┘
Enter fullscreen mode Exit fullscreen mode

Content Engine

Each channel has a "content profile" — a JSON config defining the niche, tone, format, hooks, and source material. The engine takes a topic, pulls relevant data/footage from APIs (Pexels for stock video, ElevenLabs for voiceover), and assembles a script.

Key libraries: requests, openai, json, pathlib

Video Pipeline

This is where most of the engineering went. The pipeline:

  1. Takes the script and splits it into timed segments
  2. Matches each segment to footage (keyword-based search + relevance scoring)
  3. Generates voiceover via ElevenLabs API
  4. Assembles everything with moviepy — footage clips, voiceover, captions, background music
  5. Exports at 1080x1920 (9:16) for Shorts or 1920x1080 for long-form

The caption system was the hardest part. I built a custom word-level subtitle renderer that highlights the current word — similar to what CapCut does but fully programmatic.

class CaptionRenderer:
    def __init__(self, font_path, font_size=48, highlight_color="#FFD700"):
        self.font = ImageFont.truetype(font_path, font_size)
        self.highlight_color = highlight_color

    def render_frame(self, words, current_index, frame_size):
        img = Image.new('RGBA', frame_size, (0, 0, 0, 0))
        draw = ImageDraw.Draw(img)
        # Word-level highlighting with bounce animation
        for i, word in enumerate(words):
            color = self.highlight_color if i == current_index else "#FFFFFF"
            draw.text((x, y), word, font=self.font, fill=color)
        return img
Enter fullscreen mode Exit fullscreen mode

Upload Daemon

macOS LaunchAgents trigger uploads at scheduled times. Each channel has its own plist file with the schedule. The upload script uses YouTube's Data API v3.

The tricky part was handling quota limits. YouTube gives you 10,000 units/day. A video upload costs 1,600 units. So with multiple channels, I had to implement a queue system that respects the daily quota.

class UploadQueue:
    def __init__(self, daily_quota=10000, upload_cost=1600):
        self.remaining = daily_quota
        self.queue = PriorityQueue()  # priority by schedule time

    def can_upload(self):
        return self.remaining >= self.upload_cost
Enter fullscreen mode Exit fullscreen mode

Thumbnail Generation

Each channel has thumbnail templates (Pillow-based). The system:

  1. Selects a key frame from the video
  2. Applies the channel's template (text overlay, borders, color scheme)
  3. Generates 2-3 variants for A/B testing

Results

  • 22 LaunchAgents running daily across all channels
  • ~15-20 videos generated and uploaded per day
  • Average production time per video: 3-8 minutes (compute time, not human time)
  • Human involvement: ~30 min/day reviewing outputs and tweaking configs

Lessons Learned

  1. FFmpeg is king. moviepy is great for prototyping but for production I had to drop down to raw FFmpeg commands for performance.

  2. Rate limiting everything. YouTube API, ElevenLabs, Pexels — they all have rate limits. Build a centralized rate limiter from day one.

  3. Idempotent uploads. Network failures happen. Every upload job has a unique ID and the system checks if it's already been uploaded before retrying.

  4. Content quality > quantity. YouTube's algorithm punishes low-retention content. I scaled back to only publish videos that pass a minimum quality threshold.


I documented the entire architecture, every script template, the scheduling system, and the automation patterns in a toolkit: YouTube Automation Toolkit — $7 if you want to build something similar.

I also have guides for trading bot development, strategy testing, and a full Python trader's bundle.

Happy to answer questions about specific parts of the pipeline!

Top comments (0)