nils44344

Posted on May 23

How I Built a Free, Self-Hosted Pipeline That Auto-Generates Faceless YouTube Shorts

#python #ai #opensource #showdev

Every "AI YouTube" tutorial ends the same way: sign up for ChatGPT Plus, then ElevenLabs, then Pictory, then n8n Cloud. Add it up and you're paying $75–100/month before you've made a single video — let alone a single dollar.

I didn't want a subscription stack. I wanted something that ran on my own machine, used free tiers and local models, and that I actually owned. So I built it, and I just open-sourced it under MIT.

It's called FreeFaceless, and it takes one command to go from nothing to an uploaded Short:

script → voiceover → captions → b-roll → assembled video → YouTube upload

Repo: https://github.com/nils44344/FreeFaceless

Here's how each stage works — and the one bug that cost me an evening.

The orchestration

The whole thing is a linear pipeline. Here's the heart of it (trimmed):

def run_once(publish_at=None, upload_to_youtube=True):
    data = script.generate()                          # 1. Groq writes the script
    voice_mp3 = voice.synth(data["full_text"], ...)   # 2. edge-tts voiceover
    words = captions.transcribe_words(voice_mp3)      # 3. local Whisper timing
    scenes = visuals.fetch_for_scenes(data["scenes"]) # 4. Pexels b-roll
    ass = captions.write_ass(words, ...)              # 5. caption file
    final = assemble.build(scenes, voice_mp3, ass, ...) # 6. ffmpeg
    if upload_to_youtube:
        upload.upload_video(final, data["title"], ...)  # 7. YouTube Data API

Every stage is its own module, and everything is driven by a single config.yaml, so changing the niche, voice, or caption style is a config edit, not a code change.

1. Script generation — Groq (free tier)

Groq's free tier serves Llama 3.3 70B fast, and it's OpenAI-compatible, so the official openai SDK works by just pointing the base URL at Groq:

from openai import OpenAI
client = OpenAI(api_key=GROQ_API_KEY, base_url="https://api.groq.com/openai/v1")

resp = client.chat.completions.create(
    model="llama-3.3-70b-versatile",
    response_format={"type": "json_object"},  # JSON mode: the reply is a single valid JSON object
    messages=[{"role": "system", "content": SYSTEM_PROMPT}, ...],
)

The prompt asks for a hook + 4–6 facts + a CTA, returned as JSON with per-scene visual_query strings I can feed straight to stock search. JSON mode means no fragile regex parsing.
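JSON mode guarantees valid JSON, not a particular shape, so it can help to check the fields before later stages rely on them. A minimal sketch, assuming the payload carries the title, full_text, and scenes keys that run_once reads, plus a visual_query per scene; the parse_script helper and the sample payload are hypothetical, not FreeFaceless code:

```python
import json

REQUIRED_KEYS = ("title", "full_text", "scenes")  # key names read by run_once

def parse_script(raw: str) -> dict:
    """Parse the model's JSON reply and fail early if a field is missing."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on malformed JSON
    missing = [k for k in REQUIRED_KEYS if k not in data]
    if missing:
        raise ValueError(f"script is missing keys: {missing}")
    for i, scene in enumerate(data["scenes"]):
        if not scene.get("visual_query"):
            raise ValueError(f"scene {i} has no visual_query")
    return data

# Hypothetical payload, only to show the assumed shape.
sample = '{"title": "3 Ocean Facts", "full_text": "Did you know...", "scenes": [{"visual_query": "deep ocean"}]}'
script_data = parse_script(sample)
```

With the openai SDK call shown earlier, the raw string would come from resp.choices[0].message.content.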

2. Voiceover — edge-tts (free, no key)

edge-tts exposes Microsoft's neural voices for free, no API key:

import asyncio
import edge_tts

communicate = edge_tts.Communicate(text, "en-US-ChristopherNeural", rate="-12%")
asyncio.run(communicate.save("voice.mp3"))  # save() is a coroutine, so run it in an event loop

The quality is genuinely good enough for faceless content, and there are dozens of voices/accents to match the niche.

3. Word-level captions — faster-whisper (local)

This is the part most paid tools charge per minute for. faster-whisper runs locally on CPU and gives word-level timestamps, which I turn into karaoke-style captions:

from faster_whisper import WhisperModel
model = WhisperModel("base", device="cpu", compute_type="int8")
segments, _ = model.transcribe("voice.mp3", word_timestamps=True)

Then I write an ASS subtitle file, 3 words at a time, in a big bold style — the look every Shorts channel uses. (FreeFaceless ships the open-licensed Anton font so it works out of the box.)
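The 3-words-at-a-time grouping can be sketched roughly as follows. This is an illustration rather than the project's write_ass: it assumes words arrive as (text, start, end) tuples, that the ASS header uses the standard Events field order, and that a style named "Default" exists. It emits plain chunks only, with no per-word karaoke highlighting. The ass_time and dialogue_lines names are hypothetical.

```python
def ass_time(seconds: float) -> str:
    """Format seconds as the H:MM:SS.cc timestamp that ASS uses."""
    cs = int(round(seconds * 100))  # total centiseconds
    h, rem = divmod(cs, 360000)
    m, rem = divmod(rem, 6000)
    s, cs = divmod(rem, 100)
    return f"{h}:{m:02d}:{s:02d}.{cs:02d}"

def dialogue_lines(words, chunk_size=3, style="Default"):
    """Group (text, start, end) tuples into chunks and emit ASS Dialogue lines."""
    lines = []
    for i in range(0, len(words), chunk_size):
        chunk = words[i:i + chunk_size]
        text = " ".join(w[0].strip() for w in chunk)
        start, end = chunk[0][1], chunk[-1][2]  # first word's start, last word's end
        lines.append(f"Dialogue: 0,{ass_time(start)},{ass_time(end)},{style},,0,0,0,,{text}")
    return lines

# With faster-whisper, the tuples could come from:
#   words = [(w.word, w.start, w.end) for seg in segments for w in seg.words]
words = [("Did", 0.0, 0.2), ("you", 0.2, 0.35), ("know", 0.35, 0.7), ("octopuses", 0.7, 1.3)]
lines = dialogue_lines(words)
```

Each chunk stays on screen from its first word's start to its last word's end, so the captions track the voiceover without any manual timing.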

4. B-roll — Pexels (free API)

Each scene's visual_query becomes a Pexels Videos search, pulling vertical clips. Free API, generous limits.
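A rough sketch of what such a search could look like using only the standard library. The endpoint, the Authorization header, the orientation parameter, and the video_files fields follow the public Pexels API; the helper names and the "closest to 1920 tall" selection rule are assumptions for illustration, not necessarily what FreeFaceless does:

```python
import json
import urllib.parse
import urllib.request

PEXELS_SEARCH_URL = "https://api.pexels.com/videos/search"

def build_search_request(query: str, api_key: str, per_page: int = 5) -> urllib.request.Request:
    """Build a Pexels video search request restricted to portrait clips."""
    params = urllib.parse.urlencode(
        {"query": query, "orientation": "portrait", "per_page": per_page}
    )
    # Pexels expects the API key in the Authorization header.
    return urllib.request.Request(f"{PEXELS_SEARCH_URL}?{params}", headers={"Authorization": api_key})

def pick_vertical_file(video: dict, target_height: int = 1920):
    """Return the portrait rendition whose height is closest to the target, or None."""
    portrait = [
        f for f in video.get("video_files", [])
        if (f.get("height") or 0) > (f.get("width") or 0)
    ]
    if not portrait:
        return None
    return min(portrait, key=lambda f: abs(f["height"] - target_height))

def search(query: str, api_key: str) -> list:
    """Run the search and return the best portrait file per hit (performs a network call)."""
    with urllib.request.urlopen(build_search_request(query, api_key), timeout=30) as resp:
        videos = json.load(resp).get("videos", [])
    return [f for f in (pick_vertical_file(v) for v in videos) if f]
```

Each returned entry has a link field pointing at the downloadable file. Filtering on height > width is a second guard in case a hit includes landscape renditions.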

5. Assembly — ffmpeg

ffmpeg crops every clip to 1080×1920, concatenates them to match the voiceover length, overlays the audio, and burns in the captions:

"-vf", f"subtitles='{ass_path}':fontsdir='{fonts_dir}'"

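One thing to watch with the subtitles filter on Windows: the drive-letter colon and the backslashes in a path collide with ffmpeg's filter syntax, where ':' separates options. A commonly used workaround is to switch to forward slashes and escape the colon. The sketch below does that and also shows one way to express the 1080×1920 crop. It is an assumption-laden illustration, not the project's assemble module: the helper names are hypothetical, it folds the crop and the captions into a single chain for brevity, and it does not handle paths that contain single quotes, so verify it against your own ffmpeg build.

```python
def escape_filter_path(path: str) -> str:
    """Make a path usable inside a single-quoted ffmpeg filter argument."""
    # Forward slashes avoid backslash escaping; the colon is escaped because
    # ':' separates filter options.
    return path.replace("\\", "/").replace(":", "\\:")

def build_vf(ass_path: str, fonts_dir: str, width: int = 1080, height: int = 1920) -> str:
    """Scale to cover the frame, center-crop to it, then burn in the captions."""
    cover = f"scale={width}:{height}:force_original_aspect_ratio=increase,crop={width}:{height}"
    subs = f"subtitles='{escape_filter_path(ass_path)}':fontsdir='{escape_filter_path(fonts_dir)}'"
    return f"{cover},{subs}"

vf = build_vf(r"C:\work\captions.ass", r"C:\work\fonts")
# The result would be passed to ffmpeg as: [..., "-vf", vf, ...]
```

Scaling with force_original_aspect_ratio=increase before cropping means the clip always covers the full frame, so landscape or odd-sized footage is cropped rather than letterboxed.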
6. Upload — YouTube Data API

It uses the OAuth desktop flow: the token is cached after the first browser login, and every later run refreshes it silently. Both immediate and scheduled publishing are supported.
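For scheduled publishing, the YouTube Data API only honours status.publishAt when the video is uploaded as private; YouTube then makes it public at that time. A minimal sketch of the request body under that rule. The build_video_body helper is hypothetical, and defaulting immediate uploads to "public" is an assumption, not necessarily the project's behaviour:

```python
from datetime import datetime, timezone
from typing import Optional

def build_video_body(title: str, description: str = "",
                     publish_at: Optional[datetime] = None) -> dict:
    """Build a videos.insert request body for part="snippet,status"."""
    body = {
        "snippet": {"title": title, "description": description},
        "status": {"privacyStatus": "public"},  # assumed default for immediate publishing
    }
    if publish_at is not None:
        if publish_at.tzinfo is None:
            raise ValueError("publish_at must be timezone-aware")
        # Scheduling requires a private video; publishAt is an ISO 8601 timestamp.
        body["status"] = {
            "privacyStatus": "private",
            "publishAt": publish_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        }
    return body

scheduled = build_video_body(
    "3 Ocean Facts",
    publish_at=datetime(2030, 1, 2, 15, 0, tzinfo=timezone.utc),
)
```

With google-api-python-client, a body like this would typically go to youtube.videos().insert(part="snippet,status", body=..., media_body=...).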

The bug that cost me an evening: SSL on Windows

On my machine, every HTTPS call died with CERTIFICATE_VERIFY_FAILED. The culprit: antivirus doing TLS interception with a custom root cert that the certifi bundle (which HTTP clients such as requests and httpx use by default) doesn't know about. The fix is one import, before any network client is built:

import truststore
truststore.inject_into_ssl()  # use the OS cert store instead of certifi

If you build anything network-heavy on Windows, keep this in your back pocket.
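If the same code also has to run where truststore isn't installed, a guarded variant is one option. This is a sketch, not the project's code: use_os_trust_store is a hypothetical helper, and the version check reflects truststore's Python 3.10+ requirement.

```python
import sys

def use_os_trust_store() -> bool:
    """Route TLS verification through the OS certificate store when possible."""
    if sys.version_info < (3, 10):
        return False  # truststore requires Python 3.10 or newer
    try:
        import truststore  # third-party package: pip install truststore
    except ImportError:
        return False  # fall back to the default certificate handling
    truststore.inject_into_ssl()
    return True

# Call this before constructing any HTTP client.
injected = use_os_trust_store()
```

Logging the return value at startup makes it obvious which certificate store a given run is using.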

Honest limitations

Free tiers are rate-limited. This is built for one channel on a normal schedule, not bulk farms. Push it hard and you'll hit limits.
Windows-first. The Python core runs anywhere; the helper scripts are PowerShell. Cross-platform PRs very welcome.
It's a production tool, not a money machine. It automates making videos. Views and revenue depend on your content and the algorithm — no tool changes that.
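On the rate-limit point: occasional throttling on a free tier can often be absorbed by retrying with exponential backoff. A generic sketch, not taken from the repo. The with_backoff helper and the is_rate_limited predicate are hypothetical, and how a rate-limit error is detected depends on the client library in use:

```python
import random
import time

def with_backoff(call, is_rate_limited, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Retry `call` with exponential backoff and jitter while it raises rate-limit errors."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception as exc:
            # Re-raise anything that is not a rate limit, or the final failed attempt.
            if not is_rate_limited(exc) or attempt == max_attempts - 1:
                raise
            # Wait 1x, 2x, 4x ... the base delay, plus up to 25% random jitter.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.25))
```

The sleep parameter exists so the delay can be swapped out in tests. Backoff only smooths over short bursts; it does not raise a daily quota.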

Try it / contribute

The repo has a full setup guide (including the Google OAuth walkthrough, which is the only fiddly part):

https://github.com/nils44344/FreeFaceless

If it's useful, a star helps other people find it — and I'd genuinely love feedback, especially on making the setup smoother for non-developers and getting it running on macOS/Linux.

DEV Community