Building a System That Automates YouTube Post-Production

#webdev #ai #startup #youtube

Most creator tools today focus on analytics and suggestions.

We wanted to explore something different:

What if YouTube post-production could actually be automated?

We’ve been building Growati, a system that helps creators generate personalised titles, descriptions, thumbnails, and chapters for YouTube videos.

One of the hardest engineering problems was video understanding.

Instead of treating a video as plain text, our system analyses it frame by frame to extract:

key moments
best frames
visual context

The goal is metadata that reflects what actually happens in the video, rather than generic AI-generated text.
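
To make the frame-by-frame idea concrete, here is a minimal sketch in TypeScript. It is not Growati's actual implementation. It assumes frames have already been decoded into small grayscale pixel arrays, and the `Frame` type, the `findKeyMoments` and `pickBestFrame` helpers, and the threshold of 50 are all hypothetical names and values chosen for illustration. Key moments are approximated as large changes between consecutive frames, and the "best" frame as the one with the highest pixel variance.

```typescript
// Hypothetical frame type: a timestamp plus grayscale pixel intensities (0-255).
interface Frame {
  timeSec: number;
  pixels: number[];
}

// Mean absolute per-pixel difference between two frames of equal size.
function frameDiff(a: Frame, b: Frame): number {
  if (a.pixels.length !== b.pixels.length || a.pixels.length === 0) {
    throw new Error("frames must be non-empty and the same size");
  }
  let total = 0;
  for (let i = 0; i < a.pixels.length; i++) {
    total += Math.abs(a.pixels[i] - b.pixels[i]);
  }
  return total / a.pixels.length;
}

// Pixel variance, used here as a crude proxy for visible detail (an assumption).
function contrast(frame: Frame): number {
  const n = frame.pixels.length;
  const mean = frame.pixels.reduce((sum, p) => sum + p, 0) / n;
  return frame.pixels.reduce((sum, p) => sum + (p - mean) * (p - mean), 0) / n;
}

// Timestamps where the picture changes by more than `threshold` versus the previous frame.
function findKeyMoments(frames: Frame[], threshold: number): number[] {
  const moments: number[] = [];
  for (let i = 1; i < frames.length; i++) {
    if (frameDiff(frames[i - 1], frames[i]) > threshold) {
      moments.push(frames[i].timeSec);
    }
  }
  return moments;
}

// The highest-contrast frame, as a thumbnail candidate; undefined for an empty list.
function pickBestFrame(frames: Frame[]): Frame | undefined {
  let best: Frame | undefined;
  let bestScore = -Infinity;
  for (const frame of frames) {
    const score = contrast(frame);
    if (score > bestScore) {
      best = frame;
      bestScore = score;
    }
  }
  return best;
}

// Made-up sample data: a dark scene that cuts to a brighter one at 2 seconds.
const sampleFrames: Frame[] = [
  { timeSec: 0, pixels: [10, 10, 10, 10] },
  { timeSec: 1, pixels: [12, 10, 11, 10] },
  { timeSec: 2, pixels: [200, 20, 180, 30] },
  { timeSec: 3, pixels: [198, 22, 181, 29] },
];

const keyMoments = findKeyMoments(sampleFrames, 50);
const bestFrame = pickBestFrame(sampleFrames);
console.log(keyMoments, bestFrame?.timeSec);
```

A real system would likely work on sampled frames from a decoder and use richer signals than pixel differences, but the shape is the same: score frames, then keep the timestamps and frames that stand out.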

Another challenge was orchestration.

A single automation flow may involve:

frame extraction
metadata generation
thumbnail generation
scoring
updates

We ended up using a queue-based architecture with BullMQ to handle automation reliably.
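
As a rough illustration of why a queue helps here, the sketch below chains the stages listed above and retries a stage that fails. It is a dependency-free, in-memory stand-in written for this post, not BullMQ code and not Growati's pipeline: BullMQ is Redis-backed and asynchronous, while this version is synchronous so it can run anywhere. The `PipelineQueue` class, the stage names, and the retry limit of 3 are assumptions made for the example.

```typescript
// Stage names mirror the flow listed above; the exact identifiers are hypothetical.
type Stage =
  | "frame-extraction"
  | "metadata-generation"
  | "thumbnail-generation"
  | "scoring"
  | "update";

const STAGES: Stage[] = [
  "frame-extraction",
  "metadata-generation",
  "thumbnail-generation",
  "scoring",
  "update",
];

interface Job {
  videoId: string;
  stage: Stage;
  attempts: number;
}

// A handler does the work for one stage and throws if it fails.
type Handler = (videoId: string) => void;

class PipelineQueue {
  private pending: Job[] = [];
  private handlers: Record<Stage, Handler>;
  private maxAttempts: number;
  readonly completed: string[] = [];
  readonly failed: string[] = [];

  constructor(handlers: Record<Stage, Handler>, maxAttempts: number) {
    this.handlers = handlers;
    this.maxAttempts = maxAttempts;
  }

  // Start a video at the first stage.
  enqueueVideo(videoId: string): void {
    this.pending.push({ videoId, stage: "frame-extraction", attempts: 0 });
  }

  // Process jobs in FIFO order until none are left.
  drain(): void {
    while (this.pending.length > 0) {
      const job = this.pending.shift() as Job;
      job.attempts += 1;
      try {
        this.handlers[job.stage](job.videoId);
      } catch {
        if (job.attempts < this.maxAttempts) {
          this.pending.push(job); // re-queue the same stage for another attempt
        } else {
          this.failed.push(`${job.videoId}:${job.stage}`);
        }
        continue;
      }
      this.completed.push(`${job.videoId}:${job.stage}`);
      // On success, enqueue the next stage (if any) for the same video.
      const nextIndex = STAGES.indexOf(job.stage) + 1;
      if (nextIndex < STAGES.length) {
        this.pending.push({ videoId: job.videoId, stage: STAGES[nextIndex], attempts: 0 });
      }
    }
  }
}

// Demo: thumbnail generation fails once (simulated), then succeeds on retry.
let thumbnailCalls = 0;
const handlers: Record<Stage, Handler> = {
  "frame-extraction": () => {},
  "metadata-generation": () => {},
  "thumbnail-generation": () => {
    thumbnailCalls += 1;
    if (thumbnailCalls === 1) {
      throw new Error("simulated transient failure");
    }
  },
  scoring: () => {},
  update: () => {},
};

const queue = new PipelineQueue(handlers, 3);
queue.enqueueVideo("video-1");
queue.drain();
console.log(queue.completed, queue.failed);
```

The point of the design is that a failure in one stage only re-runs that stage: the video does not restart from frame extraction, and a stage that keeps failing ends up in a failed list instead of blocking everything behind it.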

One thing we learned quickly while talking to creators:

Many educational creators and podcasters are overwhelmed by post-production work after uploading.

Most of them are not looking for more dashboards.

They want less manual work.

That insight changed how we approached the product.

Right now, Growati is still early:

10+ beta users
500+ synced videos
50+ metadata updates processed

We’re launching on Product Hunt on 28 May.

Top comments (1)

Harjot Singh • May 31

Post-production is the perfect automation target because it's the repetitive, time-eating tail of every video - cut silences, generate captions, chapter markers, thumbnails, descriptions, reformat for shorts - work that's mechanical enough to automate but currently done by hand every single time. A pipeline that chains those steps is real leverage for a solo creator, and the AI-assisted parts (transcription -> captions/chapters/description) are exactly where it shines because they're high-volume and roughly-right-is-fine. The win is the assembly line, not any single step.

The one design note from building pipelines like this: keep a human-approval gate on the public-facing outputs (title, description, thumbnail) even if everything else is fully automated, because those are the parts where a confidently-wrong AI output costs you views or credibility, and they're cheap to glance at. Automate the tedious 90%, gate the 10% that's customer-facing. That propose-then-approve split is core to how I build Moonshift, the thing I work on - a multi-agent pipeline that takes a prompt to a deployed SaaS, where agents do the work and a gate sits before anything that matters ships. Multi-model routing keeps a build ~$3 flat, first run free no card. Genuinely useful build. Is it fully hands-off, or do you review the AI-generated titles/descriptions before publish? That's the line I'd keep a human on.