Saviel Yamani

My 14-day log building an AI Character Generator pipeline

Quick Summary

  • Scaling ad creative testing manually is a mathematically losing battle for a single developer.
  • Integrating the Meta Ads API with local video generation queues usually ends in memory leaks and orphan processes.
  • Offloading the render step to a third-party API solves the infrastructure problem, but introduces webhook latency issues that require strict idempotency.

Building a profitable SaaS requires constant top-of-funnel testing, but manually recording variations of the same video ad is soul-crushing. To keep my CTR from flatlining, I realized I needed a reliable AI Character Generator to produce talking-head videos from dynamic scripts. As a solo founder running a Node backend with Stripe for billing and the Meta Ads API for distribution, the goal was to build a fully automated UGC Ad Generator.

Here is the unedited log of my attempt to automate this over two weeks, including the dead ends and the final pipeline.

October 2: The Baseline Problem

Ad fatigue is a quantifiable reality. Meta's algorithm penalizes creatives that run too long without variation. I currently generate about $4,000 MRR, but my customer acquisition cost (CAC) creeps up by 4% every week the same ad runs.

To test hooks effectively, I need to generate 20 to 30 video variations a week. I cannot sit in front of a camera and do this. I need a pipeline that takes a CSV of text hooks, generates the video, and pushes it directly to Meta's Graph API.
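
The first stage of that pipeline, turning the CSV into a list of render jobs, could look roughly like the sketch below. The function name, the job shape, and the one-column layout with a `hook` header are assumptions for illustration, not the actual file format used here.

```javascript
// Hypothetical helper: turns a one-column CSV of hooks into render jobs.
// Assumes a header row and one hook per line, with no embedded newlines.
function parseHooksCsv(csvText) {
  const lines = csvText
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter(Boolean);
  const [, ...rows] = lines; // drop the header row
  return rows.map((row, index) => {
    // Strip optional surrounding quotes and unescape doubled quotes
    const quoted = row.length >= 2 && row.startsWith('"') && row.endsWith('"');
    const hook = quoted ? row.slice(1, -1).replace(/""/g, '"') : row;
    return { id: index + 1, hook, status: 'PENDING' };
  });
}

const jobs = parseHooksCsv(
  'hook\nStop recording ads by hand\n"Your CAC is creeping up, here is why"\n'
);
console.log(jobs.length); // 2
```

Each job carries its own status, so a later stage can persist it to Postgres and retry failed renders individually.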

My stack is standard: a Node.js backend, Postgres for state, and cron jobs. The plan is to write a script that generates the assets locally, concatenates them, and ships them out.

October 6: Local Rendering Attempt

I spent the weekend trying to run open-source models locally. The concept was to generate audio via an API, then use an open-source lipsync repository to map the audio to a static image.

I wrote a Node worker that spawns child processes to run the Python lipsync scripts and then concatenates the results with fluent-ffmpeg.

This immediately fell over. Meta requires specific bitrates and aspect ratios for placements. Transcoding the output to fit these specs meant running heavy compute jobs on my API server. Within three hours of deploying the worker, my server stopped responding.
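
To make the transcoding step concrete, here is a minimal sketch of building an ffmpeg argument list that fits a clip into a fixed frame and caps the bitrate. The spec values (1080x1920, 4M, 30 fps) are placeholders I picked for illustration, not Meta's actual requirements, and `buildTranscodeArgs` is a hypothetical helper rather than the code from this log.

```javascript
// Hypothetical target spec; real numbers must come from Meta's current placement docs.
const VERTICAL_SPEC = { width: 1080, height: 1920, videoBitrate: '4M', fps: 30 };

// Builds an ffmpeg argument list that scales the clip to fit the target frame,
// pads the remainder so nothing is stretched, and sets the video bitrate.
function buildTranscodeArgs(inputPath, outputPath, spec) {
  const { width, height, videoBitrate, fps } = spec;
  const filter = [
    `scale=${width}:${height}:force_original_aspect_ratio=decrease`,
    `pad=${width}:${height}:(ow-iw)/2:(oh-ih)/2`,
  ].join(',');
  return [
    '-y', '-i', inputPath,
    '-vf', filter,
    '-r', String(fps),
    '-c:v', 'libx264', '-b:v', videoBitrate,
    '-c:a', 'aac',
    '-movflags', '+faststart',
    outputPath,
  ];
}

const args = buildTranscodeArgs('raw.mp4', 'ad.mp4', VERTICAL_SPEC);
console.log(args.join(' '));
```

The array can be handed to `spawn('ffmpeg', args)`. The encode itself is still CPU-heavy, which is why running it on the same box as the API server is a problem.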

October 9: The Buffer Overflow

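I spent two days debugging why the server was crashing. It turns out child_process.spawn() pipes the child's stdout and stderr back to the parent by default, and each of those pipes is backed by a small, fixed-size OS buffer. (The 1 MB maxBuffer default applies to exec() and execFile(), not spawn().)

Because the background rendering tools spit out hundreds of lines of progress logs per second and my worker never read them, the stderr pipe was filling up almost instantly. The render process would then block on its next write and never exit, leaving stalled processes running in the background. These orphans slowly ate all the RAM.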

I fixed the crash by explicitly ignoring the streams I didn't need and writing a bash script to parse process logs with jq to monitor status instead.

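const { spawn } = require('child_process');

// `payload` is the render input, defined elsewhere in the worker.
// The fix: explicitly ignore stdio so an unread pipe can never fill up
const renderProcess = spawn('python3', ['render.py', '--input', payload], {
  stdio: ['ignore', 'ignore', 'ignore']
});

// 'error' fires when the process cannot be started at all (e.g. python3 is missing)
renderProcess.on('error', (err) => {
  console.error('Render process failed to start', err);
});

// `code` is null when the process was killed by a signal
renderProcess.on('exit', (code, signal) => {
  if (code !== 0) {
    console.error(`Render failed with code ${code} (signal: ${signal})`);
  }
});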
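
Ignoring stderr throws away the only clue when a render dies. A possible middle ground is to keep reading the stream but retain only its last few kilobytes. The sketch below is not what I shipped; `createTailBuffer` is a hypothetical helper, and it assumes the progress logs are plain ASCII so that cutting at a byte boundary is harmless.

```javascript
// Keeps only the last `maxBytes` of a noisy stream so memory stays bounded
// while the pipe is still being drained.
function createTailBuffer(maxBytes) {
  let tail = Buffer.alloc(0);
  return {
    push(chunk) {
      const combined = Buffer.concat([tail, Buffer.from(chunk)]);
      tail = combined.length > maxBytes
        ? combined.subarray(combined.length - maxBytes)
        : combined;
    },
    toString() {
      return tail.toString('utf8');
    },
  };
}

const stderrTail = createTailBuffer(16);
stderrTail.push('frame=1 fps=24\n');
stderrTail.push('frame=2 fps=24\n');
console.log(JSON.stringify(stderrTail.toString()));
```

To wire it up, spawn with `stdio: ['ignore', 'ignore', 'pipe']`, attach `child.stderr.on('data', (chunk) => stderrTail.push(chunk))`, and log `stderrTail.toString()` in the exit handler when the code is non-zero.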

The system stabilized, but the latency was unacceptable. Generating a 15-second clip took seven minutes. During this testing phase, a badly configured script pushed a broken video to Meta and spent exactly $114.62 on an ad set that consisted entirely of a distorted face stretching infinitely across the screen.
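
A cheap guard against that kind of spend is to inspect every file before it goes anywhere near the ad account. The sketch below checks parsed ffprobe output against expected dimensions and duration. The limits are placeholders, `findVideoProblems` is a hypothetical name, and a structural check like this would not catch every visual defect, only files whose shape is obviously wrong.

```javascript
// Hypothetical limits for illustration; tune them to the placements you target.
const LIMITS = { width: 1080, height: 1920, minSeconds: 5, maxSeconds: 20 };

// Takes parsed ffprobe JSON (from `-print_format json -show_streams -show_format`)
// and returns a list of problems; an empty list means the clip passed the checks.
function findVideoProblems(probe, limits) {
  const video = (probe.streams || []).find((s) => s.codec_type === 'video');
  if (!video) return ['no video stream'];

  const problems = [];
  if (video.width !== limits.width || video.height !== limits.height) {
    problems.push(`unexpected size ${video.width}x${video.height}`);
  }
  // ffprobe reports the duration as a string, so convert before comparing
  const seconds = Number(probe.format && probe.format.duration);
  if (!Number.isFinite(seconds) || seconds < limits.minSeconds || seconds > limits.maxSeconds) {
    problems.push(`unexpected duration ${seconds}`);
  }
  return problems;
}

const goodProbe = {
  streams: [{ codec_type: 'video', width: 1080, height: 1920 }],
  format: { duration: '14.96' },
};
console.log(findVideoProblems(goodProbe, LIMITS).length); // 0
```

Anything that returns a non-empty list gets marked as failed in Postgres instead of being uploaded.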

Also, the descale light on my espresso machine has been blinking for three weeks and I am actively ignoring it. I don't have the patience to maintain local machine learning environments.

October 14: Acknowledging Infrastructure Limits

Running local instances to generate synthetic humans is not a side project; it is a full-time DevOps job. My Postgres database was filling up with failed render statuses. I needed an API that accepted text and returned an MP4 URL.

I spent the day reading API documentation for various synthetic media vendors. Most of them are heavily optimized for enterprise marketing teams, which means they hide their pricing behind "Book a Demo" buttons. I immediately discarded those.

October 18: Vendor Selection

I narrowed the choices down to three platforms that actually expose a public API for developers.

Platform    | Authentication | Webhook Support | Billing Model
------------|----------------|-----------------|---------------------------
Nextify.ai  | Bearer Token   | Yes             | Monthly credit buckets
Adsmaker.ai | API Key        | Polling only    | Flat monthly fee + overage
UGCVideo.ai | Bearer Token   | Yes             | Pay-per-second of output

I decided to integrate UGCVideo.ai as the generation layer. My reasoning was purely based on the billing model. Nextify requires you to buy buckets of credits that expire every 30 days, which makes no sense for my batch-testing workflow. UGCVideo bills per second of generated video, which fits my 10-to-15 second ad structure without leaving unused credits on the table.
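
The billing argument is easier to see as arithmetic. The volume figures below come straight from this log (20 to 30 clips a week, 10 to 15 seconds each). The rate and bucket parameters are placeholders, not real vendor prices, and the sketch assumes purely for illustration that a credit bucket covers a fixed number of output seconds.

```javascript
// Weekly output volume implied by the plan.
function weeklySeconds(clips, secondsPerClip) {
  return clips * secondsPerClip;
}

// Cost under per-second billing. The rate is a placeholder, not a real price.
function perSecondCost(totalSeconds, ratePerSecond) {
  return totalSeconds * ratePerSecond;
}

// Cost under expiring buckets: whole buckets must be bought, so any unused
// remainder is paid for anyway. All values are placeholders.
function bucketCost(totalSeconds, secondsPerBucket, pricePerBucket) {
  return Math.ceil(totalSeconds / secondsPerBucket) * pricePerBucket;
}

const low = weeklySeconds(20, 10);  // 200
const high = weeklySeconds(30, 15); // 450
console.log(low, high);
```

With per-second billing, a quiet week of 200 seconds costs less than half of a busy week of 450. With buckets, both weeks can cost the same if they fall inside one bucket.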

The integration was standard HTTP requests, but it is not without flaws. I have two specific criticisms of their API in production:

  1. Webhook Latency: Their video.completed webhook occasionally fires up to four minutes after the render is actually finished. If you are polling as a fallback, you will end up processing the same video twice.
  2. Lip-sync Artifacts: The rendering engine struggles with the "th" phoneme. If your script has words like "through" or "thousand," the avatar's teeth occasionally blur into the bottom lip for a few frames.

I had to rewrite my scripts to avoid certain words to mitigate the visual glitches.
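
That rewrite can be partly automated with a pre-flight check on each hook. The sketch below flags words containing the letters "th". Spelling is only a rough proxy for the phoneme, so hits are candidates for review rather than guaranteed glitches, and `findThWords` is a hypothetical helper name.

```javascript
// Returns the distinct words in a hook that contain the letters "th".
function findThWords(hook) {
  const words = hook.toLowerCase().match(/[a-z']+/g) || [];
  return [...new Set(words.filter((word) => word.includes('th')))];
}

console.log(findThWords('Cut through a thousand tasks in one click'));
// → [ 'through', 'thousand' ]
```

Running this over the CSV before submitting jobs surfaces the risky hooks while they are still cheap to reword.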


Pipeline Architecture Notes

If you are building an automated video pipeline using external APIs, you cannot trust the network or the vendor's webhooks. Because third-party rendering APIs take time and sometimes misfire their callbacks, your webhook receiver must be strictly idempotent.

If you blindly accept a video.completed webhook and charge a client's Stripe account or push to the Meta Ads API, a duplicate webhook will execute the action twice.

Here is the pseudo-code for the idempotency wrapper I use to handle delayed or duplicate webhooks.

const { Pool } = require('pg');

// Connection settings are read from the standard PG* environment variables
const pool = new Pool();

// pushToMetaAds(downloadUrl) is defined elsewhere in the worker
async function handleVideoWebhook(req, res) {
  const { videoId, status, downloadUrl } = req.body;

  // 1. Acknowledge receipt immediately to prevent vendor retries
  res.status(200).send('OK');

  // 2. Start a database transaction
  const client = await pool.connect();

  try {
    await client.query('BEGIN');

    // 3. Lock the row for this specific video.
    // NOWAIT raises a lock error right away if another delivery already holds
    // the row, which lands in the catch block below instead of queueing.
    const { rows } = await client.query(`
      SELECT status FROM renders
      WHERE vendor_id = $1
      FOR UPDATE NOWAIT
    `, [videoId]);

    if (rows.length === 0) throw new Error('Unknown video ID');
    if (rows[0].status === 'COMPLETED') {
      // Duplicate webhook, safely ignore
      await client.query('ROLLBACK');
      return;
    }

    // 4. Update status and push to Meta Ads API
    await client.query(`
      UPDATE renders
      SET status = 'COMPLETED', url = $1
      WHERE vendor_id = $2
    `, [downloadUrl, videoId]);

    // If the push throws, the ROLLBACK below leaves the row un-completed
    await pushToMetaAds(downloadUrl);

    await client.query('COMMIT');
  } catch (err) {
    await client.query('ROLLBACK');
    console.error('Webhook processing failed', err);
  } finally {
    // Always return the connection to the pool
    client.release();
  }
}
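
If both the webhook handler and a polling fallback can reach the same render, another option is to let a single conditional UPDATE decide who wins. This is a minimal sketch of that alternative, not the handler above. `buildClaimQuery` and `wonClaim` are hypothetical helpers, the table and column names follow the handler above, and it assumes `status` is never NULL.

```javascript
// Builds a pg-style query config that marks a render COMPLETED only if it
// is not already completed. The update is atomic, so only one caller matches.
function buildClaimQuery(videoId, downloadUrl) {
  return {
    text: `
      UPDATE renders
      SET status = 'COMPLETED', url = $1
      WHERE vendor_id = $2 AND status <> 'COMPLETED'
      RETURNING vendor_id
    `,
    values: [downloadUrl, videoId],
  };
}

// Exactly one updated row means this caller won the claim and should do the push.
function wonClaim(result) {
  return result.rowCount === 1;
}

const claim = buildClaimQuery('vid_123', 'https://example.com/out.mp4');
console.log(claim.values);
```

Usage would be `const result = await client.query(buildClaimQuery(id, url));` followed by the push only when `wonClaim(result)` is true. The trade-off is that the row is marked completed before the push happens, so a failed push needs its own retry state.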

Offloading the rendering step was the correct architectural choice. The pipeline now runs via cron every Tuesday at 2 AM, passing the CSV hooks to the API, catching the completed MP4s, and creating the Meta ad creatives. It is boring, and boring is exactly what background workers should be.

Disclosure: I pay for UGCVideo.ai. No other affiliation.