DEV Community

RenderIO

Posted on • Originally published at renderio.dev

Build an AI UGC Video Processing Pipeline

The real bottleneck in AI UGC video production

AI-generated UGC for ads and social media has moved past the "can we do this" phase. Tools like HeyGen, Synthesia, and D-ID produce convincing avatar videos. The generation part works. Everything after generation is where teams get stuck.

You generate a video. Then you need to post-process it so it doesn't scream "AI." Then you need variations for A/B testing across ad sets. Then each variation needs reformatting for different platforms. One base video can turn into 50-100 output files. Without a pipeline, each one is manual work in Premiere or CapCut.

This guide walks through building that pipeline with FFmpeg and the RenderIO API, from raw AI output to platform-ready content.

How the AI UGC video processing pipeline works

AI Generation → Post-Processing → Variation → Platform Formatting → Distribution
  (HeyGen)       (RenderIO)     (RenderIO)     (RenderIO)         (n8n/Zapier)

Each stage is a separate API call. Each call runs independently. The entire pipeline from generation to distribution takes under 10 minutes for 50+ output files.

Stage 1: choose your AI generation tool

Pick the tool that matches what you're building:

HeyGen works best for talking-head UGC with custom avatars. If you're creating product demos or testimonial-style content, this is probably where you start. Their avatar quality has gotten noticeably better since late 2025. See our guide on converting HeyGen output to Instagram Reels for the full post-processing workflow.

Synthesia is more corporate. Training videos, internal comms, that sort of thing. The avatars feel professional but not "social media native."

D-ID turns a single photo into a talking video. Useful when you don't have studio footage. Less realistic than HeyGen but faster to set up.

Runway combined with a voice-over tool works for creative or lifestyle UGC where you want more visual flexibility than a talking head.

Output from any of these: one raw MP4 file, typically 16:9, 30-60 seconds.

Stage 2: post-processing raw AI video

Raw AI video has tells. Metadata flags it as AI-generated. Audio levels are inconsistent. The video looks "too clean" compared to native social content. Post-processing fixes all of that in one API call per base video.

Here's why each step matters:

  • -map_metadata -1 strips generation metadata that platforms can detect
  • nlmeans + noise adds natural film grain (AI video is unnaturally clean)
  • eq shifts color just enough to break perceptual fingerprints
  • loudnorm normalizes audio to -14 LUFS (what TikTok and Reels expect)
curl -X POST https://renderio.dev/api/v1/run-ffmpeg-command \
  -H "X-API-KEY: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "ffmpeg_command": "-i {{in_video}} -map_metadata -1 -vf \"nlmeans=s=6:p=3:r=9,noise=alls=12:allf=t,eq=brightness=0.01:contrast=1.03:saturation=0.97\" -af \"loudnorm=I=-14:TP=-2:LRA=7\" -c:v libx264 -crf 18 -c:a aac -b:a 128k {{out_video}}",
    "input_files": { "in_video": "https://example.com/heygen-raw.mp4" },
    "output_files": { "out_video": "base-processed.mp4" }
  }'

The output is a clean base for variation creation. For more on making AI video look natural, we have a dedicated guide.

Troubleshooting post-processing

A few things that trip people up:

Grain looks blocky on short videos (under 15 seconds). Lower the noise value from alls=12 to alls=6. Short clips get compressed harder by platforms, and heavy grain turns into blocky artifacts after re-encoding.

Audio sounds distorted after loudnorm. This usually happens when the source audio is already very loud (above -8 LUFS). Add a limiter before loudnorm: -af "alimiter=limit=0.9,loudnorm=I=-14:TP=-2:LRA=7".

HeyGen output has variable frame rate. Force constant frame rate early in the chain by adding -r 30 before the output filename. Variable frame rate causes sync issues in some platform players.
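The three fixes above can be folded into one command builder. Here's a sketch using a hypothetical `buildPostProcessCommand` helper — the flag values mirror the troubleshooting notes above, not a tested preset:

```javascript
// Hypothetical helper: assembles the Stage 2 command with the
// troubleshooting fixes applied based on what you know about the source.
function buildPostProcessCommand({ shortClip = false, loudSource = false } = {}) {
  // Short clips get re-compressed harder by platforms, so use lighter grain.
  const noise = shortClip ? "noise=alls=6:allf=t" : "noise=alls=12:allf=t";
  const vf = `nlmeans=s=6:p=3:r=9,${noise},eq=brightness=0.01:contrast=1.03:saturation=0.97`;

  // Very loud sources (above roughly -8 LUFS) distort under loudnorm alone,
  // so insert a limiter in front of it.
  const af = loudSource
    ? "alimiter=limit=0.9,loudnorm=I=-14:TP=-2:LRA=7"
    : "loudnorm=I=-14:TP=-2:LRA=7";

  // -r 30 as an output option forces constant frame rate, which avoids
  // sync issues with variable-frame-rate HeyGen exports.
  return `-i {{in_video}} -map_metadata -1 -vf "${vf}" -af "${af}" -c:v libx264 -crf 18 -c:a aac -b:a 128k -r 30 {{out_video}}`;
}
```

Pass the resulting string as `ffmpeg_command` in the same POST body shown in Stage 2.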

Stage 3: creating AI UGC video variations

One base video becomes 10-20 unique variations. Each variation uses different FFmpeg parameters so every output has a distinct fingerprint. This matters for ad testing (different creatives per ad set) and for posting across accounts without duplicate detection.

Color grade variations

const colorVariations = [
  { name: "warm", filter: "colortemperature=temperature=5000" }, // below the 6500K default warms the image
  { name: "cool", filter: "colortemperature=temperature=8000" }, // above it cools the image
  { name: "vivid", filter: "eq=saturation=1.3:contrast=1.1" },
  { name: "muted", filter: "eq=saturation=0.7:contrast=0.95" },
  { name: "vintage", filter: "eq=saturation=0.8:contrast=1.1,colorbalance=rs=0.05:gs=-0.02:bs=-0.05" },
];

Speed variations

const speedVariations = [
  { name: "normal", filter: "setpts=1.0*PTS", afilter: "atempo=1.0" },
  { name: "fast", filter: "setpts=0.9*PTS", afilter: "atempo=1.11" },
  { name: "slow", filter: "setpts=1.1*PTS", afilter: "atempo=0.91" },
];
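The video and audio factors have to stay inverses of each other: setpts=0.9*PTS plays video at 1/0.9 ≈ 1.11x, so atempo must be 1.11 to keep lip sync. A small helper (hypothetical, not part of any API) can derive the pair from a single factor:

```javascript
// Derive a matched video/audio filter pair from a setpts factor.
// setpts=f*PTS stretches timestamps by f, so playback speed is 1/f,
// and atempo has to match that speed to keep audio in sync.
function speedVariation(name, setptsFactor) {
  return {
    name,
    filter: `setpts=${setptsFactor}*PTS`,
    afilter: `atempo=${(1 / setptsFactor).toFixed(2)}`,
  };
}
```

`speedVariation("fast", 0.9)` reproduces the "fast" entry above. Note that atempo only accepts factors between 0.5 and 100; chain two atempo filters for anything slower than half speed.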

Crop variations

Different crop positions change the video's perceptual hash, which helps if you're posting variations across multiple accounts. See our guide on batch processing AI videos for social media for platform-specific crop strategies.

const cropVariations = [
  { name: "center", filter: "crop=ih*9/16:ih:(iw-ih*9/16)/2:0,scale=1080:1920" },
  { name: "left", filter: "crop=ih*9/16:ih:iw*0.1:0,scale=1080:1920" },
  { name: "right", filter: "crop=ih*9/16:ih:iw*0.5:0,scale=1080:1920" },
  { name: "tight", filter: "crop=iw*0.6:ih*0.6:iw*0.2:ih*0.1,scale=1080:1920" },
];
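One gotcha with 16:9 → 9:16 crops: on a 1920x1080 source, ih*9/16 evaluates to 607.5, and an odd or fractional width will fail with yuv420p encoders like libx264 unless you rescale afterwards. A hypothetical helper that computes an explicit even-width center crop:

```javascript
// Compute an explicit center crop to 9:16 with even pixel dimensions,
// which yuv420p encoders like libx264 require. Hypothetical helper.
function centerCrop916(width, height) {
  let cropW = Math.floor((height * 9) / 16);
  if (cropW % 2 !== 0) cropW -= 1;           // force even width
  const x = Math.floor((width - cropW) / 2); // center horizontally
  return `crop=${cropW}:${height}:${x}:0`;
}
```

Appending scale=1080:1920 after the crop (as the "tight" variation does) achieves the same effect and also normalizes output size across variations.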

Combined variation generator

async function createVariations(baseVideoUrl) {
  const variations = [];

  for (const color of colorVariations) {
    for (const speed of speedVariations) {
      const name = `${color.name}-${speed.name}`;
      const vf = `${speed.filter},${color.filter}`;
      const af = speed.afilter;

      variations.push({
        name,
        command: `-i {{in_video}} -vf "${vf}" -af "${af}" -c:v libx264 -crf 22 -c:a aac -b:a 128k {{out_video}}`,
      });
    }
  }

  // 5 colors x 3 speeds = 15 variations
  const jobs = variations.map(v =>
    fetch("https://renderio.dev/api/v1/run-ffmpeg-command", {
      method: "POST",
      headers: {
        "X-API-KEY": process.env.RENDERIO_API_KEY,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        ffmpeg_command: v.command,
        input_files: { in_video: baseVideoUrl },
        output_files: { out_video: `${v.name}.mp4` },
      }),
    }).then(r => r.json())
  );

  return Promise.all(jobs);
}

15 variations, all processing in parallel. Total time: same as processing one video.

Stage 4: platform formatting for AI UGC videos

Each variation needs platform-specific formatting. This multiplies your output count. For the full breakdown of specs per platform, see batch processing AI videos for social media.

const platformConfigs = {
  tiktok: {
    command: `-i {{in_video}} -filter_complex "[0:v]scale=1080:1920:force_original_aspect_ratio=increase,crop=1080:1920[v]" -map "[v]" -map 0:a -c:v libx264 -crf 22 -c:a aac -movflags +faststart {{out_video}}`,
  },
  reels: {
    command: `-i {{in_video}} -t 90 -filter_complex "[0:v]scale=1080:1920:force_original_aspect_ratio=increase,crop=1080:1920[v]" -map "[v]" -map 0:a -c:v libx264 -crf 22 -c:a aac -movflags +faststart {{out_video}}`,
  },
  shorts: {
    command: `-i {{in_video}} -t 60 -filter_complex "[0:v]scale=1080:1920:force_original_aspect_ratio=increase,crop=1080:1920[v]" -map "[v]" -map 0:a -c:v libx264 -crf 20 -c:a aac -movflags +faststart {{out_video}}`,
  },
  linkedin: {
    command: `-i {{in_video}} -vf "scale=1920:1080:force_original_aspect_ratio=decrease,pad=1920:1080:(ow-iw)/2:(oh-ih)/2" -af "loudnorm=I=-16" -c:v libx264 -crf 20 -c:a aac -movflags +faststart {{out_video}}`,
  },
};

async function formatForPlatforms(variationUrl, variationName) {
  const jobs = Object.entries(platformConfigs).map(([platform, config]) =>
    fetch("https://renderio.dev/api/v1/run-ffmpeg-command", {
      method: "POST",
      headers: {
        "X-API-KEY": process.env.RENDERIO_API_KEY,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        ffmpeg_command: config.command,
        input_files: { in_video: variationUrl },
        output_files: { out_video: `${variationName}-${platform}.mp4` },
      }),
    }).then(r => r.json())
  );

  return Promise.all(jobs);
}

15 variations x 4 platforms = 60 platform-ready videos. All from one AI generation.

Stage 5: distribution with n8n

Wire it all together with n8n (or Zapier). Check the n8n video processing guide for setup details.

  1. Webhook trigger receives HeyGen export URL
  2. HTTP Request sends POST to RenderIO for post-processing
  3. Wait/Poll checks command status until complete
  4. Loop iterates each variation config, sends POST to RenderIO
  5. Wait/Poll checks all variation commands
  6. Loop iterates each platform, sends POST to RenderIO
  7. Wait/Poll checks all platform commands
  8. Upload sends results to respective platform APIs or scheduling tools

The entire pipeline runs automatically. You input one HeyGen URL and get 60 platform-ready videos.
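As a sanity check before wiring up n8n, you can model the fan-out as a pure function and confirm the job counts. This sketch builds the plan, not the API calls; the variation and platform name lists mirror the earlier stages:

```javascript
// Model the pipeline fan-out: 1 post-processing job, one job per
// variation, and one job per (variation, platform) pair.
function buildPipelinePlan(colorNames, speedNames, platformNames) {
  const plan = [{ stage: "post-process", name: "base" }];
  for (const color of colorNames) {
    for (const speed of speedNames) {
      const name = `${color}-${speed}`;
      plan.push({ stage: "variation", name });
      for (const platform of platformNames) {
        plan.push({ stage: "format", name: `${name}-${platform}` });
      }
    }
  }
  return plan;
}

const plan = buildPipelinePlan(
  ["warm", "cool", "vivid", "muted", "vintage"],
  ["normal", "fast", "slow"],
  ["tiktok", "reels", "shorts", "linkedin"]
);
// 1 post-process + 15 variations + 60 formats = 76 jobs per base video
```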

Cost analysis

| Stage | API calls per base video | Notes |
| --- | --- | --- |
| Post-processing | 1 | One-time cleanup |
| Variations | 15 | 5 colors x 3 speeds |
| Platform formatting | 60 | 15 variations x 4 platforms |
| **Total** | **76** | Per base video |

On RenderIO's Growth plan at $29/month (1,000 commands), you can process about 13 base videos per month through the full pipeline. For higher volumes, the Business plan at $99/month (20,000 commands) handles 263 base videos per month.

  • Cost per output video (Business): ~$0.005
  • Cost per base video on Business (76 outputs): ~$0.38
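The plan-capacity numbers follow directly from the 76-calls-per-base figure. A quick check, assuming commands are the only metered unit:

```javascript
// How many base videos fit in a monthly command quota, given the
// calls-per-base-video total from the cost table above.
const CALLS_PER_BASE = 1 + 5 * 3 + 5 * 3 * 4; // 76

function baseVideosPerMonth(monthlyCommands) {
  return Math.floor(monthlyCommands / CALLS_PER_BASE);
}

function costPerOutputVideo(monthlyPrice, monthlyCommands) {
  // Each command produces one output file in this pipeline.
  return monthlyPrice / monthlyCommands;
}
```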

Here's how it compares:

| Method | Cost per base video | Your time |
| --- | --- | --- |
| Manual processing in Premiere | $100-150 (at $50/hr) | 2-3 hours |
| Adobe Premiere batch export | ~$25 of time | 30 min |
| RenderIO pipeline (Business) | $0.38 | 0 min |

Getting started

Start with a simpler pipeline and expand:

  1. Week 1: Post-processing only (1 API call per video)
  2. Week 2: Add 3 color variations (4 API calls per video)
  3. Week 3: Add platform formatting (16 API calls per video)
  4. Week 4: Add speed variations and full automation (76 API calls per video)

The Starter plan ($9/month, 500 commands) covers weeks 1-2 for most teams. Scale to Growth or Business as your volume increases. You can also compress video with FFmpeg to reduce storage costs before uploading.

FAQ

How long does the full pipeline take to process one base video?

Under 10 minutes for all 76 API calls. RenderIO processes commands in parallel on Cloudflare's edge network, so 15 variation calls finish in roughly the same time as one.

Can I use this pipeline with AI video tools other than HeyGen?

Yes. The pipeline is tool-agnostic after stage 1. Any MP4 output works, whether it comes from HeyGen, Synthesia, D-ID, Runway, or even screen recordings. The post-processing and variation stages don't care how the video was generated.

What happens if an API call fails mid-pipeline?

Each command returns a status you can poll. Failed commands return an error with details. In an n8n workflow, add an error branch that retries failed calls up to 3 times with a 10-second delay between attempts.
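The same retry logic can live outside n8n. A minimal sketch, assuming any async job function that throws on failure:

```javascript
// Retry an async job up to `retries` extra times, waiting `delayMs`
// between attempts; rethrows the last error if every attempt fails.
async function retryWithDelay(fn, { retries = 3, delayMs = 10_000 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < retries) await new Promise(r => setTimeout(r, delayMs));
    }
  }
  throw lastError;
}
```

Wrap each fetch to the RenderIO endpoint in retryWithDelay so one transient failure among the 76 calls doesn't stall the whole batch.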

Do I need all 15 variations, or can I start with fewer?

Start with 3 color variations and skip speed variations. That gives you 12 platform-ready files (3 variations x 4 platforms) from 16 API calls: 1 post-processing, 3 variations, and 12 platform formats. Add speed variations once you're comfortable with the workflow.

Which RenderIO plan fits a UGC pipeline?

Depends on your volume. For 1-5 base videos per month, the Starter plan ($9/month, 500 commands) is enough. For 10-13 base videos, Growth ($29/month, 1,000 commands). For 50+ base videos, Business ($99/month, 20,000 commands).
