Ben Utting

Posted on • Originally published at ctrlaltautomate.com

I Built a YouTube Content Pipeline That Turns Trend Signals Into Ready-to-Film Scripts

A client came to me on Upwork with a problem I'd seen before. He runs multiple YouTube channels across different niches and was spending 10 to 15 hours a week just on research: scrolling Reddit for trending topics, checking competitor uploads, scanning Google Trends, then manually writing scripts and tracking everything in a spreadsheet.

His publishing cadence had stalled. Not because he couldn't film or edit. Because the research and scripting bottleneck ate the time he needed for production. He wanted a system that would handle the entire front half of his content workflow: find what's trending, filter out ideas he'd already covered, score them for potential, write first-draft scripts, and sync everything to a Notion content calendar. Automatically, on a schedule.

That's what I built for him. Here's how it works.

What the client was doing before

Every other day, he'd open Reddit in one tab, YouTube trending in another, and Google Trends in a third. For each of his niches, he'd manually scan 3 to 5 subreddits, check what his competitors had uploaded in the last 48 hours, and look for keyword spikes. He'd copy anything promising into a spreadsheet, then decide which ideas were worth scripting.

The scripting was another few hours. Write the hook, outline the sections, draft the full script, write a YouTube description, come up with thumbnail text. Then check whether he'd already covered something similar three months ago (he often had, and wouldn't realise until halfway through the script).

The whole process worked, but it didn't scale. Adding a new niche meant doubling the research time. And the quality was inconsistent because some days he'd rush the research to get to filming.

The system I delivered

Seven n8n sub-workflows wired into a master pipeline. Runs on a cron schedule (default: every two days at 8 AM) or manually with one click. The client configures his niches in a single JSON file: subreddits, competitor channel IDs, keywords, scoring preferences. Everything else is automated.

```
Trigger --> Research --> Dedup --> Scoring --> SEO --> Script --> QA --> Notion Sync
```

Four AI agents. One Notion workspace. About $1 per run in API costs.

Research agent

The first stage pulls signals from three sources in parallel:

Reddit: For each subreddit in the niche config, it fetches the top 25 hot posts from the past week via the public JSON API. No authentication needed. For a niche tracking three subreddits, that's 75 posts scanned per run.
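Outside of n8n, that fetch is roughly the following; in the actual build it's an HTTP Request node run once per subreddit, and the user agent and error handling here are mine:

```typescript
// Minimal sketch of the Reddit pull, assuming the public JSON listing endpoint.
interface RedditPost {
  title: string;
  score: number;
  permalink: string;
  created_utc: number;
}

async function fetchHotPosts(subreddit: string, limit = 25): Promise<RedditPost[]> {
  const url = `https://www.reddit.com/r/${subreddit}/hot.json?limit=${limit}`;
  const res = await fetch(url, { headers: { "User-Agent": "content-pipeline/1.0" } });
  if (!res.ok) throw new Error(`Reddit returned ${res.status} for r/${subreddit}`);
  const listing = await res.json();
  // Each child in the listing wraps the post payload in a `data` field.
  return listing.data.children.map((c: { data: RedditPost }) => c.data);
}
```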

YouTube Data API v3: Two types of calls. First, a keyword search for trending videos in the last 48 hours. Second, the 5 most recent uploads from each competitor channel. This catches both breakout trends and direct competitor moves.
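Here's roughly what those two calls look like against the raw API. The keyword, channel ID, and maxResults values below are placeholders, not the client's config:

```typescript
// Sketch of the two YouTube Data API v3 search calls, assuming an API key in YT_KEY.
const YT_SEARCH = "https://www.googleapis.com/youtube/v3/search";

async function ytSearch(params: Record<string, string>) {
  const qs = new URLSearchParams({ part: "snippet", type: "video", key: process.env.YT_KEY!, ...params });
  const res = await fetch(`${YT_SEARCH}?${qs}`);
  return (await res.json()).items ?? [];
}

// 1. Keyword search for videos published in the last 48 hours, sorted by view count.
const since = new Date(Date.now() - 48 * 60 * 60 * 1000).toISOString();
const breakouts = await ytSearch({ q: "osint tools", publishedAfter: since, order: "viewCount", maxResults: "10" });

// 2. The five most recent uploads from one competitor channel (placeholder ID).
const competitorUploads = await ytSearch({ channelId: "UC_PLACEHOLDER", order: "date", maxResults: "5" });
```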

Google Trends (via SerpAPI): Interest-over-time data for each keyword, with trend direction calculated: rising, stable, or declining. A keyword spiking 200% in the last week is a signal worth surfacing.
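The direction logic is simple to sketch. This version compares the recent half of the interest-over-time series against the earlier half; the 20% thresholds are my illustration, not the exact tuning in the workflow:

```typescript
// Derive a trend direction from a series of interest-over-time values.
type TrendDirection = "rising" | "stable" | "declining";

function trendDirection(values: number[]): TrendDirection {
  if (values.length < 4) return "stable";
  const half = Math.floor(values.length / 2);
  const avg = (xs: number[]) => xs.reduce((a, b) => a + b, 0) / xs.length;
  const earlier = avg(values.slice(0, half));
  const recent = avg(values.slice(half));
  if (earlier === 0) return recent > 0 ? "rising" : "stable";
  const change = (recent - earlier) / earlier;
  if (change > 0.2) return "rising";
  if (change < -0.2) return "declining";
  return "stable";
}
```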

All three branches merge into a single normalised array. Every signal has the same shape: source, title, metric value, URL, timestamp, niche.
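In TypeScript terms, that shape is something like this (field names illustrative):

```typescript
// The normalised shape every signal is mapped to before dedup and scoring.
interface Signal {
  source: "reddit" | "youtube" | "google_trends";
  title: string;
  metric: number;    // upvotes, view count, or trend interest score
  url: string;
  timestamp: string; // ISO 8601
  niche: string;
}
```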

Deduplication layer

This was the stage the client cared about most. Before the system scores anything, it queries his Notion ideas database for everything he's published or drafted in the last 90 days. GPT-4o-mini compares new signal titles against past titles for semantic similarity, not just exact matches. "Top 10 OSINT Tools 2025" and "Best OSINT Tools This Year" get correctly flagged as the same idea.

Cost: about $0.02 per run. It's the cheapest part of the pipeline and solves the problem he kept hitting manually: starting a script only to realise he'd already covered the same angle.
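A minimal sketch of that comparison call, assuming a plain chat completions request; the real prompt lives in its own Markdown file and is more detailed than this:

```typescript
// Semantic dedup: ask GPT-4o-mini whether a new title matches any past title.
async function isDuplicate(newTitle: string, pastTitles: string[]): Promise<boolean> {
  const res = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    },
    body: JSON.stringify({
      model: "gpt-4o-mini",
      temperature: 0,
      messages: [{
        role: "user",
        content:
          `Is the video idea "${newTitle}" substantially the same topic as any of these past titles?\n` +
          pastTitles.map((t) => `- ${t}`).join("\n") +
          `\nAnswer with only YES or NO.`,
      }],
    }),
  });
  const data = await res.json();
  return data.choices[0].message.content.trim().toUpperCase().startsWith("YES");
}
```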

Scoring engine

GPT-4o scores each surviving signal on three dimensions, 0 to 100:

  • Virality: Shareability, emotional hook, controversy, broad appeal.
  • Search potential: Evergreen search volume, keyword clarity, competition level.
  • Niche relevance: How core the topic is to the existing subscriber base.

The scoring prompt includes a calibrated rubric so the model produces consistent scores across runs. A weighted formula combines the three: virality at 0.4, search potential at 0.35, niche relevance at 0.25. The client can adjust these weights depending on whether he's optimising for growth or SEO on a particular channel.

Ideas below his threshold (set to 65) get dropped. The top 5 per niche advance.
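The combination step itself is just a weighted sum. Sketched out with the default weights:

```typescript
// Weighted combination of the three dimensions, using the defaults described above.
interface DimensionScores { virality: number; search: number; relevance: number; } // each 0-100

function weightedScore(
  s: DimensionScores,
  w = { virality: 0.4, search: 0.35, relevance: 0.25 },
): number {
  return s.virality * w.virality + s.search * w.search + s.relevance * w.relevance;
}

// e.g. virality 80, search 60, relevance 70 -> 32 + 21 + 17.5 = 70.5, which clears the 65 cutoff
```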

SEO enrichment

For each surviving idea, SerpAPI searches Google to pull competitive landscape data: what's already ranking, related searches, total results. GPT-4o uses that context to refine the title for search, generate primary and secondary keywords, YouTube tags, and thumbnail text suggestions.

The title refinement is where the value shows up. The model sees what's currently ranking and suggests a title that targets the same keyword but differentiates from existing content.
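A rough sketch of the SerpAPI lookup, assuming the standard Google engine response fields:

```typescript
// Pull the competitive landscape for one refined keyword from SerpAPI.
async function serpContext(query: string) {
  const qs = new URLSearchParams({ engine: "google", q: query, api_key: process.env.SERPAPI_KEY! });
  const res = await fetch(`https://serpapi.com/search.json?${qs}`);
  const data = await res.json();
  return {
    rankingTitles: (data.organic_results ?? []).slice(0, 5).map((r: { title: string }) => r.title),
    relatedSearches: (data.related_searches ?? []).map((r: { query: string }) => r.query),
    totalResults: data.search_information?.total_results,
  };
}
```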

Script writer

GPT-4o generates a full structured script for each idea:

  • Intro (100 to 150 words): Hook and context. The prompt explicitly blocks "In this video" openers and filler phrases.
  • Sections (4 to 8 for long-form): Each with a heading that doubles as a YouTube chapter marker.
  • Outro (50 to 100 words): Specific call to action, not a generic "like and subscribe."
  • Description block: YouTube description with SEO keywords and timestamp placeholders.
  • Thumbnail text: Three short options for text overlays.

The prompt includes a channel_voice parameter the client configures per niche so the scripts match each channel's tone. Word count and estimated runtime get calculated automatically.
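Word count is trivial, and the runtime estimate just divides by a speaking rate. The 140 words per minute here is my placeholder, not necessarily what the pipeline uses:

```typescript
// Word count plus an estimated runtime for a generated script.
function scriptStats(script: string, wordsPerMinute = 140) {
  const wordCount = script.trim().split(/\s+/).length;
  const totalSeconds = Math.round((wordCount / wordsPerMinute) * 60);
  return {
    wordCount,
    estimatedRuntime: `${Math.floor(totalSeconds / 60)}m ${totalSeconds % 60}s`,
  };
}
```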

QA agent

A second GPT-4o call reviews every script against a rubric: hook strength (30%), pacing and pattern interrupts (30%), SEO keyword density (25%), and whether a specific CTA is present (15%). If the QA score drops below 70, the script gets one automatic revision pass.

The QA agent also flags factual claims that need manual verification before filming. Things like "90% of breaches start with phishing" appearing without a source.
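The gate itself reduces to a weighted sum over that rubric; extracting the four sub-scores from the model's structured response is the part not shown here:

```typescript
// QA gate: weighted rubric score and the revision trigger described above.
interface QAScores { hook: number; pacing: number; seo: number; cta: number; } // each 0-100

function qaScore(s: QAScores): number {
  return s.hook * 0.3 + s.pacing * 0.3 + s.seo * 0.25 + s.cta * 0.15;
}

function needsRevision(s: QAScores, threshold = 70): boolean {
  return qaScore(s) < threshold; // below 70 triggers one automatic revision pass
}
```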

Notion sync

The final stage creates three linked pages in the client's Notion workspace:

  1. Ideas database: Title, niche, score, hook, format, source signals, status.
  2. Scripts database: Full script text, word count, estimated runtime, QA score, linked to the idea.
  3. Calendar database: For long-form ideas, a publish date calculated from the most recent calendar entry plus a configurable gap (the client uses 3 days).
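The date logic in that third step is simple enough to show directly. The 3-day default matches the client's setting; the Notion property names on either side of this would depend on his database schema:

```typescript
// Publish-date calculation: most recent calendar entry plus a configurable gap.
function nextPublishDate(lastEntryDate: string, gapDays = 3): string {
  const next = new Date(lastEntryDate);
  next.setDate(next.getDate() + gapDays);
  return next.toISOString().slice(0, 10); // e.g. "2025-07-04"
}
```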

After each run, a Slack notification sends a summary: how many ideas were generated and how many scripts are ready for review.

What it costs to run

For a typical run across 2 niches, producing about 10 scored ideas and 10 scripts:

| Service | Cost per run |
| --- | --- |
| GPT-4o (scoring, scripting, QA) | $0.80 to $1.20 |
| GPT-4o-mini (dedup) | $0.02 |
| SerpAPI (10 keyword lookups) | Free tier |
| YouTube Data API | Free tier |
| Reddit JSON API | Free |
| Notion API | Free |
| Total | ~$1 per run |

At every-other-day cadence, that's about $15 per month. The client was spending 10 to 15 hours a week on the manual version of this. Even at minimum wage, the ROI is obvious.

What the client does now

He opens Notion every other morning and reviews 5 to 10 scored ideas with first-draft scripts already written. He picks the ones he wants to film, tweaks the scripts if needed, and goes straight to production. Research and scripting went from 10 to 15 hours a week to about 30 minutes of reviewing and editing what the pipeline produced.

His publishing frequency went up. His consistency went up. And he stopped accidentally remaking videos he'd already published.

The build details

The whole system runs on self-hosted n8n with a Postgres backend (Docker Compose included). Each of the seven stages is a standalone sub-workflow that can be imported, tested, and modified independently. The prompts are separate Markdown files, not hardcoded in the workflow JSON, so the client can tune them without touching the automation logic.

Config is a single JSON file per niche: subreddits, competitor channels, keywords, scoring weights, thresholds. Adding a new niche takes about 2 minutes.
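Roughly what one niche's config could look like; the field names and example values here are mine, so check the repo for the exact schema the workflows expect:

```typescript
// Illustrative per-niche config, expressed as a typed object.
interface NicheConfig {
  niche: string;
  subreddits: string[];
  competitorChannelIds: string[];
  keywords: string[];
  scoringWeights: { virality: number; search: number; relevance: number };
  scoreThreshold: number;
  channelVoice: string;
  publishGapDays: number;
}

const exampleNiche: NicheConfig = {
  niche: "osint",
  subreddits: ["OSINT", "cybersecurity", "netsec"],
  competitorChannelIds: ["UC_PLACEHOLDER"],
  keywords: ["osint tools", "open source intelligence"],
  scoringWeights: { virality: 0.4, search: 0.35, relevance: 0.25 },
  scoreThreshold: 65,
  channelVoice: "direct, practical, no fluff",
  publishGapDays: 3,
};
```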

I've open-sourced the full pipeline on GitHub: youtube-content-pipeline. If you're running a content operation and doing the research and scripting manually, it's there to use.

Over to you

If you're publishing content regularly on YouTube, what's eating most of your time: finding ideas, writing scripts, or keeping track of what you've already covered? I'm curious whether the research bottleneck is as common as it looks from the Upwork jobs I keep seeing.
