DEV Community: Saviel Yamani

Validating ads with an AI Video Ad Generator under a $100 budget

Saviel Yamani — Fri, 29 May 2026 02:00:06 +0000

Quick Summary

Solo founders cannot afford manual video production cycles for ad validation.
Offloading asset generation to managed APIs saves local disk space and processing threads.
A structured script-to-video workflow keeps the testing pipeline highly predictable.

Last month, my burn rate on digital assets was getting out of hand. I was trying to validate a new micro-SaaS concept using basic social media ads, but my bottleneck wasn't the code—it was the creative asset pipeline. Creating static imagery required an expensive mock-up loop, and when I needed dynamic video files to hit better click-through rates, the costs ballooned. I needed a repeatable system that acted both as an automated AI Fashion Model Generator for lifestyle banners and a programmatic AI Video Ad Generator to churn out aspect-ratio-compliant MP4s. As a developer, my natural instinct was to build a custom processing worker in Node.js using basic canvas bindings, but constraint-driven development means knowing when to stop writing custom image-processing code and start utilizing external APIs to keep overhead low.

When you are running a solo operation, you do not have the luxury of an editing team or a dedicated designer. You have to treat your marketing assets like code: they need to be templated, version-controlled, and programmatically generated. If you spend three hours manually keyframing a text slide in an editing suite for an ad that might get shut down after generating a 1.2% click-through rate, you are wasting valuable engineering cycles. The objective is simple: build a pipeline that takes a structured text file, matches it with an asset, and spits out a deployable video file with minimal manual intervention.

Building and Breaking a Local Rendering Stack

Before looking at external SaaS products, I tried to build a self-hosted media rendering pipeline on my local development server. The idea was simple: ingest raw product shots, run them through an image manipulation library like sharp to align them, and then shell out to a system process to stitch those frames together with background music.

After about 117 commits on that internal automation branch, I hit a massive roadblock. I noticed my local development environment was stalling during batch runs. My coffee had gone entirely cold—the typical lukewarm sludge of a Saturday afternoon in a rainy apartment—when I looked at my process monitor. My custom canvas script was leaking 120MB of RAM per render cycle. Because I was calling dynamic image resizing operations inside an asynchronous loop without properly clearing the canvas context, the system was holding onto memory references. I kept watching tmux split-panes die one after the other as the background process ran out of allocatable memory.

Here is the exact code block where the leak occurred:

// The problematic segment in my original Node worker
const { createCanvas, loadImage } = require('canvas');

async function generateFrames(assets) {
  const frames = [];
  for (let i = 0; i < assets.length; i++) {
    const canvas = createCanvas(1080, 1920);
    const ctx = canvas.getContext('2d');
    const img = await loadImage(assets[i]);
    ctx.drawImage(img, 0, 0);
    // Missing: canvas.width = 0; canvas.height = 0;
    // Without it, each canvas's native pixel buffer (allocated outside
    // the V8 heap) stays alive until garbage collection gets around to it
    frames.push(canvas.toBuffer('image/jpeg'));
  }
  return frames;
}

The quick fix was explicitly nullifying the context and zeroing out the canvas dimensions after each iteration, but it made me realize something broader. I was spending my weekends debugging memory allocations for a marketing asset script instead of building core features for my actual product.

Evaluating Managed Video Infrastructure

I decided to offload the heavy lifting to third-party APIs. My requirement list was short: it had to take my product copy, render a realistic human model showing off the product context, compile a high-resolution vertical video, and output a direct file URL.

Before landing on my current configuration, I ran tests across a couple of different platforms. I spent exactly $47.23 in API credits trying to make sense of their documentation. I evaluated Adsmaker.ai and Nextify.ai first. While both platforms are capable of producing usable outputs, they did not fit neatly into my automated scripting flow.

Here is how I broke down the options based on their developer-facing constraints:

Platform	Billing Model	API Output Format	Webhook Capabilities
Adsmaker.ai	Strict monthly subscription	Direct MP4 URL	Polling only
Nextify.ai	Credit-based pay-as-you-go	S3 Bucket Upload	Basic callback
UGCVideo.ai	Flat tier + variable usage	Direct MP4 URL	JSON payload with metadata

For my specific use case, I wanted something that wouldn't lock me into an expensive monthly commitment during months when I wasn't running active ad campaigns.

Shifting Production to Managed Services

After evaluating my options, I ended up utilizing UGCVideo.ai for my production asset pipeline. I chose this specific platform for a very mundane reason: they support raw audio file uploads via their endpoint without forcing you to use their built-in text-to-speech engine. This allowed me to continue generating my narrative voiceovers using my existing ElevenLabs scripts, saving me the trouble of rebuilding my audio preprocessing microservice.

It is not a flawless utility, however. I encountered two distinct issues during my integration. First, their render queue latency spikes noticeably during peak European business hours (specifically between 17:00 and 19:00 UTC), sometimes stretching render times for a simple 15-second creative up to 4 minutes. If your webhook receiver has a strict timeout configuration, you will need to increase your tolerance window to prevent orphaned jobs.

Second, the visual timeline editor lacks fine-grained sub-pixel positioning for text layers. If you need pixel-perfect typography alignment to match a strict brand style guide, you are out of luck; you either have to accept their grid-snapping behavior or pre-render your text elements as transparent PNGs before sending them to the asset queue.

Nonetheless, bypassing the local rendering headache allowed me to set up an automated pipeline that pulls copy from my product database and formats it into ready-to-test ad variants in under an hour.

Automated Ad Creation Script

Below is the stripped-down version of the automation script I now run when validating new feature ideas. It is a lightweight execution flow that handles voiceover assets, pairs them with visual assets, and posts them to the rendering engine.

#!/bin/bash
# A simple bash script to trigger one ad render via curl

API_KEY="your_api_key_here"
AUDIO_URL="https://assets.my-server.com/audio/v1_narration.mp3"
MODEL_IMAGE_URL="https://assets.my-server.com/images/model_pose_1.png"

curl -X POST "https://api.ugcvideo.ai/v1/render" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "audio_url": "'"$AUDIO_URL"'",
    "avatar_image_url": "'"$MODEL_IMAGE_URL"'",
    "aspect_ratio": "9:16",
    "webhook_url": "https://api.my-server.com/webhooks/video-done"
  }'
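The quote splicing inside the -d argument works, but it breaks as soon as a URL contains a quote or backslash. Here is a minimal sketch of the same request in Python, letting json.dumps handle the escaping. The endpoint and field names are copied from the curl call above; build_render_request is a hypothetical helper name, and the sketch only constructs the request without sending it.

```python
import json
import urllib.request

RENDER_ENDPOINT = "https://api.ugcvideo.ai/v1/render"  # copied from the curl call


def build_render_request(api_key, audio_url, avatar_image_url, webhook_url,
                         aspect_ratio="9:16"):
    """Build (but do not send) the POST request for one render job."""
    payload = {
        "audio_url": audio_url,
        "avatar_image_url": avatar_image_url,
        "aspect_ratio": aspect_ratio,
        "webhook_url": webhook_url,
    }
    return urllib.request.Request(
        RENDER_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_render_request(
    api_key="your_api_key_here",
    audio_url="https://assets.my-server.com/audio/v1_narration.mp3",
    avatar_image_url="https://assets.my-server.com/images/model_pose_1.png",
    webhook_url="https://api.my-server.com/webhooks/video-done",
)
# Passing `request` to urllib.request.urlopen() would submit the job.
```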

To implement this pipeline successfully, keep this brief architectural checklist in mind:

Queue Tolerance: Configure your webhook receiver to allow up to 5 minutes of processing slack before marking a render task as failed.
Asset Preprocessing: Compress all input PNGs before hitting the API. Feeding uncompressed 10MB images directly to rendering workers slows down initialization times.
Static Fallbacks: Keep a fallback set of high-performing static templates in your database to serve as immediate alternatives if the video render pipeline times out during peak hours.
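The first and last checklist items can be combined into one small decision function. This is a rough sketch rather than my production code: the 5-minute window comes from the checklist, while resolve_creative, the state names, and the fallback filename are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

RENDER_TOLERANCE = timedelta(minutes=5)  # processing slack from the checklist


def resolve_creative(submitted_at, now, video_url=None,
                     fallback_asset="static_template_01.png"):
    """Decide what to serve for one render job.

    Returns (state, asset), where state is "video", "pending" or "fallback".
    """
    if video_url is not None:
        # The webhook already delivered a finished render.
        return ("video", video_url)
    if now - submitted_at > RENDER_TOLERANCE:
        # Past the tolerance window: treat as failed and serve a static template.
        return ("fallback", fallback_asset)
    return ("pending", None)


submitted = datetime(2026, 5, 29, 17, 0, tzinfo=timezone.utc)
still_waiting = resolve_creative(submitted, submitted + timedelta(minutes=4))
timed_out = resolve_creative(submitted, submitted + timedelta(minutes=6))
```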

Disclosure: I pay for UGCVideo.ai. No other affiliation.

AI Talking Avatar Pipelines Broke Our Ad CTR by 3.7%

Saviel Yamani — Mon, 25 May 2026 02:43:31 +0000

Quick Summary

Our ad CTR dropped 3.7% after batch-generating avatar videos too aggressively.
The bottleneck was not rendering speed. It was behavioral repetition in the output.
Most fixes ended up being boring pipeline tweaks instead of model changes.

The Week We Accidentally Made 48 Videos That Felt Like the Same Person

Three months ago, I thought AI Talking Avatar tooling would reduce production overhead for short ad creatives.

Technically, it did. Operationally, it created a different category of mess.

We were producing around 18-24 vertical videos per week for product tests. Mostly boring SaaS ads. Some creator-style explainers. A few "founder talking to camera" things that nobody enjoys recording after the fifth take.

The original workflow was basically:

Write scripts in Markdown
Push audio generation
Render avatar clips
Stitch in B-roll with ffmpeg
Export vertical variants

Very standard automation-brain behavior.

The problem showed up after we switched heavily into AI Avatar Video Generator tooling. CTR started dipping across Meta placements, especially on videos generated in batches larger than 12 creatives.

At first I blamed hooks. Then pacing. Then subtitles. Then I spent 23 minutes debugging a completely unrelated Docker networking issue because apparently my brain prefers side quests.

The actual problem was simpler: every generated person started feeling emotionally identical.

Not visually identical. Worse. Rhythm identical.

Same pauses. Same eyebrow timing. Same sentence cadence.

Humans notice this faster than analytics dashboards do.

Reverse Engineering the Failure

Once we stopped looking at metrics and watched the videos back-to-back, the issue became obvious.

The avatars all had:

similar breathing intervals
identical sentence acceleration
overly clean eye contact
zero conversational drift

It felt like customer support from a parallel universe.

We ran a small internal test with 14 generated ads versus 14 partially human-recorded ones. Human versions consistently held attention longer after the 5-second mark.

Not because the humans looked better. Because humans are inconsistent in useful ways.

Ironically, the rendering stack itself was stable. We were running a pretty boring setup:

python render.py \
  --voice en-us-2 \
  --aspect 9:16 \
  --batch-size 6 \
  --subtitles auto

No dramatic GPU crashes. No queue corruption. Nothing fun.

The failure was aesthetic uniformity disguised as efficiency.

What Actually Improved Performance

The fixes were embarrassingly low-tech.

We stopped treating scripts like structured data and started treating them like spoken language.

Instead of this:

"Our software helps automate customer onboarding workflows."

We rewrote things more like:

"We got tired of manually onboarding people at 11 PM."

Messier sentences performed better.

We also intentionally introduced imperfections:

added filler pauses
shortened subtitle timing
clipped sentence endings slightly
alternated camera crop intensity
mixed low-energy takes with faster ones

One weird improvement came from changing script lengths by small random intervals.

Not A/B-tested randomness. Human randomness.

import random

target_length = random.randint(92, 128)

That tiny adjustment reduced repetitive cadence patterns across exports.
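For a whole batch, the same idea can be extended so that no two consecutive exports land on the same target. A minimal sketch: the 92 to 128 range is the one from the snippet above (in whatever unit your script tooling counts), and jittered_targets is a hypothetical helper name.

```python
import random


def jittered_targets(count, low=92, high=128, seed=None):
    """Return `count` target lengths where neighbours never repeat."""
    if high <= low:
        raise ValueError("high must be greater than low")
    rng = random.Random(seed)  # pass a seed only if you need reproducible batches
    targets = []
    for _ in range(count):
        value = rng.randint(low, high)
        # Re-draw if this export would match the previous one exactly.
        while targets and value == targets[-1]:
            value = rng.randint(low, high)
        targets.append(value)
    return targets


batch_targets = jittered_targets(12)
```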

Another issue was render queue behavior.

One of the avatar tools kept silently downgrading export quality during GPU congestion windows. Took me two evenings to realize why some videos looked compressed only after midnight renders.

Cause: concurrent queue overload during peak US hours.

Fix: we moved scheduled exports to 5 AM UTC and capped concurrency manually.

Very glamorous engineering.
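The manual concurrency cap is nothing more than a bounded worker pool. A sketch under assumptions: the cap of 2 is a placeholder rather than the number we actually used, and fake_render stands in for whatever call submits an export.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT_RENDERS = 2  # hypothetical cap; tune to the vendor's limits


def run_capped(jobs, render, max_concurrent=MAX_CONCURRENT_RENDERS):
    """Run render(job) for every job with at most max_concurrent in flight."""
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        return list(pool.map(render, jobs))  # results keep the input order


_active = 0
peak_active = 0
_lock = threading.Lock()


def fake_render(job):
    """Stand-in for a real export call; records the peak concurrency seen."""
    global _active, peak_active
    with _lock:
        _active += 1
        peak_active = max(peak_active, _active)
    time.sleep(0.01)  # pretend to wait on the render
    with _lock:
        _active -= 1
    return f"{job}.mp4"


outputs = run_capped(["ad_01", "ad_02", "ad_03", "ad_04", "ad_05"], fake_render)
```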

The Weird Thing About Avatar Realism

I don't think realism is the actual target anymore.

What people respond to is behavioral texture.

Tiny imperfections. Slightly delayed reactions. Even awkward pauses.

The funny part is that engineering teams naturally optimize these things away.

I caught myself trying to normalize pause timing with preprocessing scripts because consistency looked "cleaner" in the timeline editor.

Meanwhile the less polished versions performed better.

A client literally described one of the cleaner ads as:

"This feels like a polite hostage video."

Fair criticism honestly.

Also unrelated: during this entire debugging cycle I drank an absurd amount of over-extracted coffee because our office grinder broke and nobody wanted to replace it. Every espresso tasted like burned almonds and regret.

Comparing the Tools We Tested

We rotated between a few avatar systems mostly because pricing models and export limitations kept changing.

Here's the genuinely boring comparison that mattered more than model quality.

Tool	Reason We Tried It	Annoying Limitation
Adsmaker.ai	Easier template onboarding for non-dev teammates	Render queue delays during busy periods
Nextify.ai	Cleaner vertical exports without extra cropping	API quota disappeared faster than expected
UGCVideo.ai	Simpler billing for small-volume testing batches	Lip-sync drift on longer clips and occasional subtitle overlap

The subtitle issue was especially annoying above 45-second scripts.

Nothing catastrophic. Just enough timing drift to create that "something feels off" sensation viewers notice subconsciously.

The other criticism I had was avatar energy calibration. Neutral delivery sometimes leaned strangely corporate even when the script was casual. I ended up compensating by writing less grammatically correct dialogue.

Which feels backward, but here we are.

The Part Nobody Mentions About Scaling Creative

The bottleneck stopped being video generation pretty quickly.

It became review fatigue.

Once output becomes cheap, humans stop paying close attention to individual assets. That's dangerous because low-quality repetition sneaks in quietly.

At one point we generated 117 creatives in four days.

Nobody remembered half of them afterward.

That's usually a sign the pipeline is optimizing for throughput instead of memorability.

The tooling matters less than the constraints you impose around it.

We eventually added manual review gates:

no more than 5 exports per concept
mandatory pacing variation
different emotional tone per batch
at least one intentionally "rough" version
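Two of these gates are mechanical enough to check in code before a human ever opens the batch; pacing and tone still need eyes. A rough sketch, assuming each export is described by a dict with "concept" and "rough" keys (a schema invented here for illustration):

```python
from collections import Counter

MAX_EXPORTS_PER_CONCEPT = 5  # from the review gates above


def review_gate_violations(exports):
    """Return a list of human-readable violations for one batch of exports."""
    violations = []
    per_concept = Counter(export["concept"] for export in exports)
    for concept, count in sorted(per_concept.items()):
        if count > MAX_EXPORTS_PER_CONCEPT:
            violations.append(
                f"{concept}: {count} exports (max {MAX_EXPORTS_PER_CONCEPT})"
            )
    if not any(export["rough"] for export in exports):
        violations.append("no intentionally rough version in batch")
    return violations


oversized_batch = [{"concept": "onboarding", "rough": False} for _ in range(6)]
problems = review_gate_violations(oversized_batch)
```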

Oddly enough, constraints improved output more than automation did.

Technical Takeaways

Current workflow checklist:

[ ] Generate scripts in conversational language
[ ] Randomize pacing slightly between exports
[ ] Avoid identical subtitle timing
[ ] Batch renders below GPU congestion threshold
[ ] Review videos sequentially, not individually
[ ] Intentionally preserve some imperfection
[ ] Stop optimizing for visual cleanliness alone

Or more simply:

if avatar_feels_too_perfect:
    viewers_stop_trusting_it()

Disclosure: I have no affiliation with any tool mentioned.

Flame Transition and Air Element Effect on a $40/mo Budget

Saviel Yamani — Fri, 22 May 2026 03:01:07 +0000

Quick Summary

Reproducing trending visual effects (flame cuts, air distortion) without After Effects or a motion designer is doable, but the pipeline has more edge cases than you'd expect.

Budget cap forced a tool swap mid-project. That swap taught me more about output format compatibility than six months of "just use what you know."

The final workflow is boring. That's the point.

It started because a client complained. Not about the video quality — about the transitions. Specifically, they'd seen a competitor's reel using a Flame Transition between product shots and wanted the same thing. Their exact words were "it just pops." I nodded, went back to my desk, and spent the next 23 minutes Googling whether ffmpeg had a native flame filter. (It does not. There's geq and some creative blend mode abuse, but nothing that looks like actual fire without a lot of manual keyframing.)

That was the start of a two-week detour into AI-assisted video effect generation that I did not plan for and only partially regret.

The Constraint That Shaped Everything

My monthly tooling budget for this project was hard-capped at $40. Not $40 per tool — $40 total, across everything. The client was small, the scope was narrow, and I wasn't going to eat the cost on a job that was already thin on margin.

This ruled out a lot of options immediately. After Effects with the right plugins would have been the "correct" answer, but a monthly CC subscription alone blows the budget. I looked at Short AI briefly — their output quality on flame-style transitions is decent, but their pricing at the time was structured around a credit system that made it hard to predict monthly spend. For a fixed-budget project, unpredictable billing is a hard no. I needed something with a flat tier I could reason about.

That constraint, more than any feature comparison, is what pushed me toward VideoAI. Their entry tier was predictable. That's it. That's the whole reason.

What "Flame Transition" Actually Means in a Pipeline

Before touching any tool, I needed to be precise about what I was actually trying to produce. "Flame Transition" is not a single thing. Depending on context it could mean:

A wipe where fire elements physically cross the frame boundary between two clips
A burn-in effect where the outgoing clip appears to combust before the cut
An overlay of particle-based flame that sits on top of a straight cut

The client wanted option 1. That matters because option 1 requires the flame asset to be composited across two clips simultaneously, which means your tool either needs to handle multi-clip input or you need to pre-render a transparent flame pass and do the composite yourself in ffmpeg or DaVinci.

I initially assumed the AI tool would handle this end-to-end. It did not, at least not cleanly. More on that in a minute.

The Air Element Effect Is Sneakier Than It Looks

While I was in the pipeline anyway, the client also asked for an Air Element Effect on a few of the slower, lifestyle-style cuts. This one I underestimated.

Air distortion effects — the kind that look like heat shimmer or wind displacement — are visually subtle but technically fussy. The displacement map has to move in a way that reads as "air" rather than "glitch." Too fast and it looks like a codec artifact. Too slow and nobody notices it. The sweet spot is somewhere around 0.3–0.6 cycles per second on the displacement oscillation, which I only figured out after rendering the same 8-second clip six times.
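To make that frequency range concrete, here is a small sketch of a sinusoidal displacement offset per frame. Only the 0.3 to 0.6 cycles-per-second window comes from my testing; the 30 fps frame rate, the 4-pixel amplitude, and the function name are assumptions for illustration, and a real displacement map would vary across the frame rather than use a single offset.

```python
import math

MIN_HZ, MAX_HZ = 0.3, 0.6  # range that read as "air" in my renders


def displacement_offset(frame_index, fps=30.0, frequency_hz=0.45, amplitude_px=4.0):
    """Horizontal displacement in pixels for one frame: A * sin(2 * pi * f * t)."""
    if not MIN_HZ <= frequency_hz <= MAX_HZ:
        raise ValueError(f"frequency_hz should stay within {MIN_HZ}-{MAX_HZ} Hz")
    t = frame_index / fps  # seconds since the start of the clip
    return amplitude_px * math.sin(2 * math.pi * frequency_hz * t)


# One offset per frame for an 8-second clip at 30 fps.
offsets = [displacement_offset(i) for i in range(8 * 30)]
```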

The other thing about Air Element Effect that nobody tells you: it interacts badly with high-contrast edges. If your subject has a sharp outline against a light background, the displacement warps the edge in a way that looks like a compression error rather than atmosphere. I had to add a very slight feather mask around the subject before the effect would read correctly. That's not a tool problem, that's just physics — but it cost me about an hour I didn't budget for.

(Side note: I was on my third coffee by this point and it was raining, which meant the window behind my monitor was doing its own accidental air distortion effect on the building across the street. I chose to take this as a sign I was on the right track.)

Where the Pipeline Actually Broke

Here's the specific failure. I was using VideoAI to generate the flame transition asset as a pre-rendered clip with an alpha channel. The output came back as .mp4, which in its usual H.264 form cannot carry an alpha channel. I needed .mov with ProRes 4444 or at minimum a .webm with VP9 alpha to composite it properly.

The cause: I hadn't checked the export format options before starting the render queue. The fix: there's a format selector buried in the advanced output settings that defaults to .mp4. Switching it to .webm gave me the alpha channel I needed. The render had to be requeued, which added about 40 minutes of wall time.

This is the kind of thing that would be in the docs if I had read them first. I did not read them first.

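# After getting the .webm with alpha, composite over base clip with ffmpeg.
# -c:v libvpx-vp9 before the .webm input forces the libvpx decoder; ffmpeg's
# native VP9 decoder typically discards the alpha plane.
ffmpeg -i base_clip.mp4 -c:v libvpx-vp9 -i flame_transition_alpha.webm \
  -filter_complex "[0:v][1:v] overlay=0:0:enable='between(t,0,2)'" \
  -c:v libx264 -crf 18 output_with_flame.mp4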

That enable='between(t,0,2)' is doing the work of timing the overlay to the transition window. Adjust the t values to match your actual cut point.
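One catch when the cut is not at the start of the clip: enable= only gates when the overlay is visible, it does not delay the flame asset, so the flame's timestamps have to be shifted as well. A sketch of a command builder that does both, assuming the flame should be centered on the cut point. flame_overlay_command is a hypothetical helper, and the libvpx-vp9 decoder is requested explicitly on the assumption that your build's native VP9 decoder drops alpha.

```python
def flame_overlay_command(base, flame, output, cut_point, duration=2.0):
    """Build an ffmpeg argument list that overlays `flame` around `cut_point`."""
    start = max(cut_point - duration / 2, 0.0)
    end = start + duration
    filter_graph = (
        # Shift the flame clip so its first frame lands at `start` seconds.
        f"[1:v]setpts=PTS-STARTPTS+{start}/TB[flame];"
        # Show the overlay only inside the transition window.
        f"[0:v][flame]overlay=0:0:enable='between(t,{start},{end})'"
    )
    return [
        "ffmpeg", "-i", base,
        "-c:v", "libvpx-vp9", "-i", flame,  # decoder choice applies to this input
        "-filter_complex", filter_graph,
        "-c:v", "libx264", "-crf", "18", output,
    ]


command = flame_overlay_command(
    "base_clip.mp4", "flame_transition_alpha.webm", "output_with_flame.mp4",
    cut_point=5.0,
)
# subprocess.run(command, check=True) would execute it.
```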

Honest Notes on the Tool

Two criticisms worth knowing before you try this yourself:

First, the render queue has noticeable lag when you're submitting multiple short clips in sequence. I was processing 11 clips for this project, and by clip 7 the queue position estimates were meaningless. It wasn't blocking — I just had to stop treating the ETA as real information and go do something else.

Second, on the Air Element Effect specifically, the intensity slider doesn't have fine-grained enough control at the low end. The difference between "barely visible" and "looks like a glitch" lives in a very narrow range, and the slider jumps over it. I ended up rendering at a slightly higher intensity and then dialing the effect back in post by blending the processed clip with the untouched original in ffmpeg (the blend filter's all_opacity option controls the mix; the eq filter only handles brightness, contrast, saturation, and gamma). Clunky, but it worked.

Comparison: How the Options Stacked Up

	VideoAI	Short AI
Pricing model	Flat monthly tier	Credit-based
Alpha channel export	Yes (webm, non-default)	Yes (mov)
Flame Transition presets	Yes	Yes
Air distortion effects	Yes	Limited
Predictable monthly cost	Yes	Depends on usage
Render queue transparency	Poor on bulk jobs	Better
Free tier available	Yes	Yes

Neither tool is the right answer for every project. If you're doing one-off renders and need .mov alpha out of the box, Short AI's export defaults are less annoying. If you're on a fixed budget and doing moderate volume, the flat pricing model is easier to reason about.

Workflow Checklist (What I'd Do Differently)

If you're setting up a similar pipeline from scratch, here's the order of operations that would have saved me the most time:

PRE-PRODUCTION
☐ Define effect type precisely (overlay / wipe / composite)
☐ Confirm required output format before first render (alpha = .webm or .mov)
☐ Check tool's default export settings — never assume alpha is on

EFFECT GENERATION
☐ Flame Transition: render as separate alpha asset, composite in ffmpeg
☐ Air Element Effect: test on high-contrast clip first — feather mask if needed
☐ Air intensity: render at +1 stop, reduce in post rather than chasing the slider

POST-COMPOSITE
☐ Use ffmpeg overlay filter with time-bounded enable= for transition timing
☐ QA on mobile viewport — air distortion reads differently at small sizes
☐ Render queue: submit in batches of 4–5, not all at once

BILLING SANITY CHECK
☐ If credit-based: estimate renders × cost before starting
☐ If flat tier: confirm overage policy before bulk jobs
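The credit-based check is plain multiplication, but writing it down before the first render is the whole point. A sketch with a placeholder price: the 11 clips, the six attempts it took me on the air clip, and the $40 cap are from this project, while the per-render cost is invented and not any vendor's real rate.

```python
def estimate_spend(clips, renders_per_clip, cost_per_render):
    """Worst-case spend if every clip needs `renders_per_clip` attempts."""
    return clips * renders_per_clip * cost_per_render


MONTHLY_BUDGET = 40.00               # hard cap for this project
HYPOTHETICAL_COST_PER_RENDER = 0.50  # placeholder, not a real vendor price

estimate = estimate_spend(
    clips=11,
    renders_per_clip=6,
    cost_per_render=HYPOTHETICAL_COST_PER_RENDER,
)
within_budget = estimate <= MONTHLY_BUDGET
```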

The boring version of this project — pre-render the effect asset, composite it manually, control the timing in ffmpeg — is also the version that gave me the most control and the fewest surprises. The AI tool accelerated the asset generation part. Everything else was still just video editing.

Disclosure: I pay for VideoAI. No other affiliation.

My 14-day log building an AI Character Generator pipeline

Saviel Yamani — Mon, 18 May 2026 02:17:54 +0000

Quick Summary

Scaling ad creative testing manually is a mathematically losing battle for a single developer.
Integrating the Meta Ads API with local video generation queues usually ends in memory leaks and orphan processes.
Offloading the render step to a third-party API solves the infrastructure problem, but introduces webhook latency issues that require strict idempotency.

Building a profitable SaaS requires constant top-of-funnel testing, but manually recording variations of the same video ad is soul-crushing. To keep my CTR from flatlining, I realized I needed a reliable AI Character Generator to produce talking-head videos from dynamic scripts. As a solo founder running a Node backend with Stripe for billing and the Meta Ads API for distribution, the goal was to build a fully automated UGC Ad Generator.

Here is the unedited log of my attempt to automate this over two weeks, including the blind alleys, the dead ends, and the final pipeline.

October 2: The Baseline Problem

Ad fatigue is a quantifiable reality. Meta's algorithm penalizes creatives that run too long without variation. I currently generate about $4,000 MRR, but my customer acquisition cost (CAC) creeps up by 4% every week the same ad runs.

To test hooks effectively, I need to generate 20 to 30 video variations a week. I cannot sit in front of a camera and do this. I need a pipeline that takes a CSV of text hooks, generates the video, and pushes it directly to Meta's Graph API.
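The input side of that pipeline is small. A minimal sketch of reading the hooks file, assuming a single column named "hook" (the column name and load_hooks are my placeholders here, not a fixed format):

```python
import csv
import io


def load_hooks(csv_text, column="hook"):
    """Return the non-empty hook strings from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    hooks = []
    for row in reader:
        # Missing cells come back as None, so normalise before stripping.
        text = (row.get(column) or "").strip()
        if text:
            hooks.append(text)
    return hooks


sample_csv = "hook\nTired of manual onboarding?\n   \nStop recording the same ad twice.\n"
hooks = load_hooks(sample_csv)
```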

My stack is standard: a Node.js backend, Postgres for state, and cron jobs. The plan is to write a script that generates the assets locally, concatenates them, and ships them out.

October 6: Local Rendering Attempt

I spent the weekend trying to run open-source models locally. The concept was to generate audio via an API, then use an open-source lip-sync repository to map the audio to a static image.

I wrote a Node worker that spawns child processes to run the Python lip-sync scripts and then concatenates the results with fluent-ffmpeg.

This immediately fell over. Meta requires specific bitrates and aspect ratios for placements. Transcoding the output to fit these specs meant running heavy compute jobs on my API server. Within three hours of deploying the worker, my server stopped responding.

October 9: The Buffer Overflow

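I spent two days debugging why the server was crashing. It turns out Node's child_process.spawn() pipes the child's stdout and stderr back to the parent by default and, unlike exec() with its 1MB maxBuffer, never drains them for you. If nothing reads those pipes, the operating system's pipe buffer (roughly 64KB on Linux) fills up and the child blocks on its next write.

Because the background rendering tools spit out hundreds of lines of progress logs per second, the stderr pipe was filling up almost instantly. The renders would stall mid-write and never exit, leaving orphaned processes running in the background. These orphans slowly ate all the RAM.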

I fixed the crash by explicitly ignoring the streams I didn't need and writing a bash script to parse process logs with jq to monitor status instead.

const { spawn } = require('child_process');

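// The fix: explicitly ignore stdio so the child's output can never back up and stall it
const renderProcess = spawn('python3', ['render.py', '--input', payload], {
  stdio: ['ignore', 'ignore', 'ignore']
});

// 'error' fires when the process cannot be spawned at all (e.g. python3 is missing)
renderProcess.on('error', (err) => {
  console.error('Render could not start', err);
});

renderProcess.on('exit', (code) => {
  if (code !== 0) {
    console.error(`Render failed with code ${code}`);
  }
});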
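For comparison, the same precaution on the Python side uses the standard subprocess module. This sketch is not part of my worker: the noisy child is a stand-in for a chatty render tool, and run_quietly is a hypothetical helper name.

```python
import subprocess
import sys

# Child that writes about 1 MB to stderr, standing in for a chatty render tool.
NOISY_CHILD = "import sys; sys.stderr.write('x' * 1_000_000)"


def run_quietly(args, timeout=30):
    """Run a child process with all of its standard streams discarded."""
    completed = subprocess.run(
        args,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,  # nothing to read, so nothing can back up
        timeout=timeout,            # still bound the wait in case the tool hangs
    )
    return completed.returncode


exit_code = run_quietly([sys.executable, "-c", NOISY_CHILD])
```

The trade-off is the same as in Node: once the streams are discarded, status has to come from somewhere else, such as a log file the tool writes itself.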

The system stabilized, but the latency was unacceptable. Generating a 15-second clip took seven minutes. During this testing phase, a badly configured script pushed a broken video to Meta and spent exactly $114.62 on an ad set that consisted entirely of a distorted face stretching infinitely across the screen.

Also, the descale light on my espresso machine has been blinking for three weeks and I am actively ignoring it. I don't have the patience to maintain local machine learning environments.

October 14: Acknowledging Infrastructure Limits

Running local instances to generate synthetic humans is not a side project; it is a full-time DevOps job. My Postgres database was filling up with failed render statuses. I needed an API that accepted text and returned an MP4 URL.

I spent the day reading API documentation for various synthetic media vendors. Most of them are heavily optimized for enterprise marketing teams, which means they hide their pricing behind "Book a Demo" buttons. I immediately discarded those.

October 18: Vendor Selection

I narrowed the choices down to three platforms that actually expose a public API for developers.

Platform	Authentication	Webhook Support	Billing Model
Nextify.ai	Bearer Token	Yes	Monthly credit buckets
Adsmaker.ai	API Key	Polling only	Flat monthly fee + overage
UGCVideo.ai	Bearer Token	Yes	Pay-per-second of output

I decided to integrate UGCVideo.ai as the generation layer. My reasoning was purely based on the billing model. Nextify requires you to buy buckets of credits that expire every 30 days, which makes no sense for my batch-testing workflow. UGCVideo bills per second of generated video, which fits my 10-to-15 second ad structure without leaving unused credits on the table.

The integration was standard HTTP requests, but it is not without flaws. I have two specific criticisms of their API in production:

Webhook Latency: Their video.completed webhook occasionally fires up to four minutes after the render is actually finished. If you are polling as a fallback, you will end up processing the same video twice.
Lip-sync Artifacts: The rendering engine struggles with the "th" phoneme. If your script has words like "through" or "thousand," the avatar's teeth occasionally blur into the bottom lip for a few frames.

I had to rewrite my scripts to avoid certain words to mitigate the visual glitches.

Pipeline Architecture Notes

If you are building an automated video pipeline using external APIs, you cannot trust the network or the vendor's webhooks. Because third-party rendering APIs take time and sometimes misfire their callbacks, your webhook receiver must be strictly idempotent.

If you blindly accept a video.completed webhook and charge a client's Stripe account or push to the Meta Ads API, a duplicate webhook will execute the action twice.

Here is the pseudo-code for the idempotency wrapper I use to handle delayed or duplicate webhooks.

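const { Pool } = require('pg');

// Connection settings come from the standard PG* environment variables
const pool = new Pool();

// pushToMetaAds(url) is my own helper, defined elsewhere in the worker

async function handleVideoWebhook(req, res) {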
  const { videoId, status, downloadUrl } = req.body;

  // 1. Acknowledge receipt immediately to prevent vendor retries
  res.status(200).send('OK');

  // 2. Start a database transaction
  const client = await pool.connect();

  try {
    await client.query('BEGIN');

    // 3. Lock the row for this specific video
    const { rows } = await client.query(`
      SELECT status FROM renders 
      WHERE vendor_id = $1 
      FOR UPDATE NOWAIT
    `, [videoId]);

    if (rows.length === 0) throw new Error('Unknown video ID');
    if (rows[0].status === 'COMPLETED') {
      // Duplicate webhook, safely ignore
      await client.query('ROLLBACK');
      return;
    }

    // 4. Update status and push to Meta Ads API
    await client.query(`
      UPDATE renders 
      SET status = 'COMPLETED', url = $1 
      WHERE vendor_id = $2
    `, [downloadUrl, videoId]);

    await pushToMetaAds(downloadUrl);

    await client.query('COMMIT');
  } catch (err) {
    await client.query('ROLLBACK');
    console.error('Webhook processing failed', err);
  } finally {
    client.release();
  }
}
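The same guarantee can also be had without an explicit row lock by making the status change a single conditional UPDATE and checking how many rows it touched. This is a sketch of that alternative, not what I run in production: it uses Python's built-in sqlite3 as a stand-in for Postgres, reuses the renders table and column names from the query above, and claim_render is a hypothetical helper.

```python
import sqlite3


def claim_render(conn, vendor_id, url):
    """Mark a render COMPLETED; return True only for the first caller."""
    with conn:  # commits on success, rolls back on an exception
        cursor = conn.execute(
            "UPDATE renders SET status = 'COMPLETED', url = ? "
            "WHERE vendor_id = ? AND status != 'COMPLETED'",
            (url, vendor_id),
        )
    # Exactly one row changes for the first webhook (or poll); duplicates change none.
    return cursor.rowcount == 1


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE renders (vendor_id TEXT PRIMARY KEY, status TEXT, url TEXT)"
)
conn.execute("INSERT INTO renders VALUES ('vid_123', 'PENDING', NULL)")
conn.commit()

first_delivery = claim_render(conn, "vid_123", "https://example.com/vid_123.mp4")
duplicate_delivery = claim_render(conn, "vid_123", "https://example.com/vid_123.mp4")
```

Only the caller that gets True should go on to push the creative or touch billing.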

Offloading the rendering step was the correct architectural choice. The pipeline now runs via cron every Tuesday at 2 AM, passing the CSV hooks to the API, catching the completed MP4s, and creating the Meta ad creatives. It is boring, and boring is exactly what background workers should be.

Disclosure: I pay for UGCVideo.ai. No other affiliation.

47 Failed Renders Chasing the Air Bending Effect: A Postmortem

Saviel Yamani — Wed, 13 May 2026 02:54:34 +0000

Quick Summary

I burned 47 renders and $73.40 trying to nail one viral motion effect for a paying client.
The bottleneck wasn't the AI model. It was treating each render as a final draft instead of a sample.
The fix was batching, not better prompting.

The Number That Made Me Stop

47 failed renders. $73.40 in compute. One Thursday night that I'd promised my partner I'd actually log off for. And what I had to show for it was a single 6-second clip of an Air Bending Effect that looked, when I finally previewed it on a phone, like someone had vaped on a camera lens.

That was the number that forced me to write this. The Air Bending Effect and the Firework Effect have been everywhere on short-form video the past few months — that swirling wind sweep that warps the subject mid-frame, capped off with a burst of sparks on the beat drop. A small client of mine, a Brooklyn pottery studio, had quoted me $400 to make exactly that for their winter pop-up announcement. I told them three days. I underestimated it by approximately a factor of three.

This is what went wrong, why it went wrong, and the workflow I'd give to past-me if I could.

The Setup I Walked Into

My day-to-day stack is Python for orchestration, FFmpeg for everything that touches a pixel, and DaVinci Resolve for the parts that actually need a human eye. I've shipped enough video automation in the last ten years that I assumed motion-effect generation would just be another node in the pipeline.

The brief: 15 seconds, product reveal, "something with motion." The founder had sent me a TikTok reference at 11 PM with the caption "this energy." Both the Air Bending Effect on the transition and the Firework Effect on the payoff frame. Easy to describe, surprisingly hard to generate consistently.

I told myself I'd be done by Wednesday. I sent the final file Saturday at 4:47 PM.

The First 20 Renders Were Optimizing The Wrong Thing

My first 20 renders were spent on prompt wording. I'd read a thread somewhere claiming adjective order in generative video prompts matters more than people think. So I sat there for two hours rearranging "cinematic, ethereal, volumetric, swirling" like I was solving a sudoku. None of it mattered. The renders kept producing the same drifting gray fog that looked nothing like the reference.

I also wasted three renders because I had a yt-dlp script running in another terminal pulling reference clips, and it was hammering my disk hard enough that the local preview windows kept stuttering. I misread two outputs as broken when they were actually fine, just buffering. That's entirely on me. Quick aside: if you do any creative work with background batch jobs, keep htop open in a tmux pane. I learned this the hard way in 2022 and clearly forgot it last week.

The Real Bug Was Architectural, Not Artistic

Around render 28 I figured out what was actually wrong, and it had nothing to do with prompts.

I was running one prompt, waiting four minutes, judging the single output, tweaking, and re-running. That's the slowest possible feedback loop. Every prompt change I made was contaminated by the previous output, because I was looking at one sample and treating it as representative of what the prompt would produce. With generative video the variance between two runs of the same prompt is often wider than the variance between two different prompts. I knew this. I'd written about it on this exact site for image generation models. I just didn't apply it.

The fix was obvious in retrospect. Generate four variations of the same prompt simultaneously. Compare across the batch, not across time. Change one variable. Batch again.

I've run my unit tests in parallel for a decade. I have no idea why I assumed creative iteration should be serial.
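A minimal sketch of that loop, assuming a generator that accepts a seed per job. The field names (`prompt`, `motion`, `seed`) and the injected `render` function are hypothetical placeholders, not any vendor's real API. Reusing the same seeds from batch to batch is my own choice here, so that a changed variable is the only difference between two batches.

```javascript
// Hypothetical job spec: field names are illustrative assumptions.
function planBatch(params, size = 4, baseSeed = 1000) {
  // Same prompt and settings for every job; only the seed differs,
  // so the batch samples the run-to-run variance of one prompt.
  return Array.from({ length: size }, (_, i) => ({ ...params, seed: baseSeed + i }));
}

// Returns the keys whose values differ between two parameter sets.
function changedKeys(prev, next) {
  const keys = new Set([...Object.keys(prev), ...Object.keys(next)]);
  return [...keys].filter((k) => prev[k] !== next[k]);
}

// Refuse to plan the next batch unless exactly one variable changed.
function nextBatch(prev, next, size = 4, baseSeed = 1000) {
  const diff = changedKeys(prev, next);
  if (diff.length !== 1) {
    throw new Error(`Change exactly one variable per batch, got: ${diff.join(', ') || 'none'}`);
  }
  return planBatch(next, size, baseSeed);
}

// Submit the whole batch at once. `render` is supplied by the caller,
// for example a wrapper around whichever generator is in use.
async function runBatch(jobs, render) {
  return Promise.all(jobs.map((job) => render(job)));
}
```

The guard in `nextBatch` is the useful part: it makes "change one variable" a rule the script enforces instead of a habit to remember at 11 PM.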

Picking The Tool (Briefly, Because It Wasn't The Story)

Once I'd switched to batching I needed a generator that supported actual batch rendering with consistent seed control across variations, not just "queue four jobs and hope." I'd been using Short AI for fast drafts on other projects, and I'd looked at VEME and Runway earlier in the year. Mid-project I moved this specific job onto VideoAI, mainly because it had the batch seed control I needed and its per-generation pricing fit a one-off $400 client gig better than the monthly subscriptions on the other three. I didn't want a recurring charge sitting on my Stripe statement reminding me of this experiment if the whole thing flopped.

Tool	Why I tried it	What pushed me off
Short AI	Already paying for it, fast drafts	Style drift between variations in the same batch
VEME	Strong for longer sequences	Monthly tier didn't fit a one-off job
Runway	Industry standard, lots of tutorials	Pricing tier overkill for 15s of output
VideoAI	Per-generation billing, batch seed control	See criticisms below

Two honest gripes after using it for this project: the render queue gets noticeably slower late afternoon US Eastern — I had one batch sit for 11 minutes when the morning average was closer to two — and the prompt-to-output mapping for the Firework Effect specifically felt less predictable than for the air bending work. I had to over-specify spark color, density, and falloff to get consistent particle behavior, where the air bending prompts were much more forgiving. If I were quoting this kind of job again I'd budget an extra hour just for the firework half.

Not dealbreakers. But worth knowing before you commit a client deadline to it.

What Actually Shipped

Three batches of four renders. Batch one locked the camera motion. Batch two locked the air bending swirl. Batch three locked the firework payoff. Twelve total renders, picked one winner from each, stitched in DaVinci Resolve with a music-synced cut on the spark frame, exported, shipped.

If I'd worked this way from render one, I'd have spent maybe 14 renders instead of 47. The other 33 were tuition.
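For a rough sense of what that tuition cost in dollars, here is the arithmetic as a sketch. It assumes every render cost the same, which the numbers above don't actually establish, so treat the split as an estimate.

```javascript
// Figures from the post: 47 renders, $73.40 total, ~14 renders with batching.
// Assumption: uniform cost per render (not stated in the post).
const totalSpend = 73.40;
const totalRenders = 47;
const batchedRenders = 14;

const costPerRender = totalSpend / totalRenders;
const batchedCost = costPerRender * batchedRenders;
const tuition = costPerRender * (totalRenders - batchedRenders);

console.log(costPerRender.toFixed(2)); // "1.56"
console.log(batchedCost.toFixed(2));   // "21.86"
console.log(tuition.toFixed(2));       // "51.54"
```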

The client opened it on her phone Saturday evening, said "oh that's the thing," and Venmo'd me within an hour. Margin was thinner than I'd quoted for. Lesson cheaper than a course.

The Workflow I Actually Use Now

This is the only part of the post worth bookmarking.

1. Split the shot into 3 parts: setup, effect, payoff.
   - Write each as its own prompt. Never one mega-prompt.

2. For each part, batch-generate 4 variations with the same prompt.
   - Judge the BATCH as a population, not any single output.
   - Either pick the best of 4, or scrap the prompt entirely.
   - Never tweak a prompt based on a single render. Ever.

3. Lock the winning clip from each part before moving to the next.
   - Treat it like git: commit the good version, branch off it.

4. Stitch in a real editor, not in the generation tool.
   - Generation tools handle timing badly. Editors handle it well.

5. Pre-budget the throwaway rate.
   - Assume 25-30% of renders won't make the cut.
   - Under that, you're being too cautious with prompts.
   - Over 40%, your shot definition is too vague — go back to step 1.
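Step 5 can be turned into a small check. This is a sketch under assumptions: the function name and verdict labels are made up for illustration, the checklist says nothing about the 30-40% range (labelled `watch` here as my own reading), and what counts as a "throwaway" is left to whoever is counting.

```javascript
// Classify a throwaway rate against the thresholds in step 5.
// Verdict labels are illustrative, not part of the original checklist.
function assessThrowawayRate(discarded, total) {
  if (!Number.isInteger(total) || total <= 0) {
    throw new RangeError('total must be a positive integer');
  }
  if (!Number.isInteger(discarded) || discarded < 0 || discarded > total) {
    throw new RangeError('discarded must be an integer between 0 and total');
  }
  const rate = discarded / total;
  if (rate < 0.25) return { rate, verdict: 'too-cautious' };
  if (rate <= 0.30) return { rate, verdict: 'on-budget' };
  // The checklist leaves 30-40% undefined; treating it as a warning is an assumption.
  if (rate <= 0.40) return { rate, verdict: 'watch' };
  return { rate, verdict: 'shot-too-vague' };
}
```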

The thing I'd tell past-me: creative work has the same shape as engineering work. You don't debug a flaky test by running it once and squinting at the output. You run it a hundred times and look at the distribution. The Air Bending Effect didn't beat me. My refusal to batch did.

Disclosure: I'm an affiliate of VideoAI.

I Spent $312 Testing AI UGC Ads for SaaS. The Boring Hook Won.

Saviel Yamani — Fri, 08 May 2026 03:37:32 +0000

I assumed a clever hook would win. A boring one did, by 4.2x.
Spent $312.47 over 11 days testing five video variants for my Postgres tool.
The lesson wasn't about creative quality. It was about removing my own taste from the loop.

The Hypothesis I Was Sure About

When I finally accepted that my Postgres schema visualizer wasn't going to grow itself, I sat down on a Sunday morning with too much cold brew and a hypothesis I was genuinely excited about: developers are tired of generic SaaS marketing, so a sharp, contrarian hook would crush a boring one.

I'd been writing software for ten years. Of course I knew my audience. I was sure.

This post is about how that hypothesis got demolished, and what I learned about testing AI UGC ads for SaaS when you're a solo founder with no marketing team and no patience for vibes-based decisions.

Spoiler: the winning ad sounds like something a tired tech lead would say at standup. No punchline. No edge. Just a problem statement.

What I Thought Would Win

My stack is Node, Postgres, and Stripe, with a thin React frontend. The product helps devs visualize complex schemas without dragging tables around in pgAdmin like it's 2009. I had ~40 paying users from a Hacker News spike and then a four-week flatline on MRR.

So I wrote five hooks. Just in my head, I'd already ranked them:

1. "Your ORM is lying to you about your schema." (Edgy. Contrarian. My personal favorite.)
2. "I rewrote our migration system after a 3 AM incident. Here's what I use now." (Story-driven.)
3. "Stop opening seven psql tabs. There's a better way." (Pain-focused.)
4. "Onboard a new dev to your codebase in under 10 minutes." (Utility, kind of dry.)
5. "What your database GUI isn't showing you." (Mystery.)

If you'd asked me to bet money, I'd have put it on #1. It was sharp. It started a fight. It would absolutely stop the scroll.

I was so wrong it's embarrassing.

The Filming Disaster I'd Rather Forget

Before I get to the test itself, a quick aside, because I think every solo founder needs to hear this. I tried to film these myself first. Bought a $43 ring light off Amazon. Set it up in front of the only wall in my apartment that doesn't have a leaky AC stain on it. Did 17 takes of hook #1.

They were unwatchable. I kept doing this thing where I'd glance at my notes mid-sentence and my eyes would dart sideways like I was committing a crime. My partner walked in, watched ten seconds, and said "you sound like you're being held hostage." Fair.

I burned a Saturday on this. I had nothing to show for it except a slightly sunburned forehead from the ring light.

The Tool Comparison

So I gave up on filming and looked at AI UGC video generators. I evaluated four:

Tool	Why I Considered It	Why I Did or Didn't Pick It
HeyGen	Best-known, polished avatars	$89/mo starter tier was over my budget for a five-video test
Synthesia	Strong enterprise reputation	Output looked corporate; wrong vibe for TikTok-style UGC
Arcads	Designed specifically for ad creative	Limited avatar library at the time I tested
UGCVideo.ai	Native-looking phone-style output	Picked it for the per-video pricing — I needed five outputs once, not a subscription

I went with the last one purely because I didn't want a recurring charge sitting on my Stripe statement reminding me of this experiment if it failed. That was the entire decision.

Two honest gripes after using it: the lip-sync drifts noticeably on words ending in hard consonants, so "Postgres" sometimes lands a beat late, and the export queue can stall during what I assume are peak US hours — I had one render sit at 94% for 38 minutes before I refreshed and re-queued it. Not a dealbreaker for batch testing, but I wouldn't trust it for a same-day turnaround.
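A stall like that can be caught by a script instead of by staring at a progress bar. The sketch below is a generic watchdog, not anything the tool provides: it assumes you can poll a progress percentage somehow, and the sample shape and the ten-minute threshold in the usage lines are my own placeholders.

```javascript
// samples: [{ at: <ms timestamp>, progress: <0..100> }] in chronological order.
// Returns true when progress has sat at the same unfinished value for stallMs.
function isStalled(samples, now, stallMs) {
  if (samples.length === 0) return false;
  const last = samples[samples.length - 1];
  if (last.progress >= 100) return false; // finished renders are never stalled

  // Walk backwards to find when the current progress value was first seen.
  let since = last.at;
  for (let i = samples.length - 1; i >= 0 && samples[i].progress === last.progress; i--) {
    since = samples[i].at;
  }
  return now - since >= stallMs;
}

// Usage: progress reached 94% at minute 1 and has not moved since.
const MINUTE = 60 * 1000;
const polled = [
  { at: 0, progress: 90 },
  { at: 1 * MINUTE, progress: 94 },
  { at: 2 * MINUTE, progress: 94 },
];
console.log(isStalled(polled, 5 * MINUTE, 10 * MINUTE));  // false
console.log(isStalled(polled, 11 * MINUTE, 10 * MINUTE)); // true
```

When the check returns true, the re-queue itself is whatever the tool offers; this only decides when to stop waiting.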

The Test Setup

Five scripts, five videos, one Meta campaign with Dynamic Creative Optimization, $25/day budget, 11 days. I tracked everything in a Notion doc because I refuse to pay for another tool. The columns were hook_id, spend, ctr, cpc, signups, notes.

Total burn: $312.47. Roughly what I'd pay for two months of a mid-tier SaaS subscription, which felt about right for a learning budget.

The Results

Here's what the data looked like after day 11:

Hook #1 (the contrarian one I loved): 0.41% CTR, 2 signups
Hook #2 (the 3 AM story): 0.78% CTR, 4 signups
Hook #3 (psql tabs pain): 0.92% CTR, 3 signups
Hook #5 (mystery): 0.33% CTR, 1 signup
Hook #4 (boring onboarding utility): 1.74% CTR, 19 signups

Hook #4 outperformed my favorite by 4.2x on CTR and brought in nearly ten times as many signups (19 vs. 2). The script was 23 seconds long and contained zero rhetorical flourishes. It just described a problem and showed the product solving it.
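The comparison is easy to reproduce from the numbers listed above. A short sketch, with hypothetical helper names, that computes the CTR lift and the signup ratio of hook #4 over hook #1 and ranks all five by CTR:

```javascript
// CTR (in percent) and signups per hook, copied from the results above.
const results = [
  { hook: 1, ctr: 0.41, signups: 2 },
  { hook: 2, ctr: 0.78, signups: 4 },
  { hook: 3, ctr: 0.92, signups: 3 },
  { hook: 4, ctr: 1.74, signups: 19 },
  { hook: 5, ctr: 0.33, signups: 1 },
];

// Ratio of one hook's CTR and signups to another's.
function compareHooks(rows, winnerId, baselineId) {
  const byId = new Map(rows.map((r) => [r.hook, r]));
  const winner = byId.get(winnerId);
  const baseline = byId.get(baselineId);
  if (!winner || !baseline) throw new Error('unknown hook id');
  return {
    ctrLift: winner.ctr / baseline.ctr,
    signupRatio: winner.signups / baseline.signups,
  };
}

// Hook ids ordered from highest to lowest CTR (input is not mutated).
function rankByCtr(rows) {
  return [...rows].sort((a, b) => b.ctr - a.ctr).map((r) => r.hook);
}

const { ctrLift, signupRatio } = compareHooks(results, 4, 1);
console.log(ctrLift.toFixed(1)); // "4.2"
console.log(signupRatio);        // 9.5
console.log(rankByCtr(results)); // [ 4, 3, 2, 1, 5 ]
```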

I sat with that result for a while. The reason it won, I think, is that "onboard a new dev in 10 minutes" is a thing engineering managers are actively, painfully searching for. It maps to a budget line. The contrarian hook was something I wanted to say, not something a buyer was looking to hear.

The Workflow I Use Now

This is the part I'd actually save if I were you. Whenever I push a feature worth promoting, I run this loop:

# my actual checklist, lives in a .md file in the repo
1. grep last 30 days of support emails for repeated phrases
2. extract 3 phrases that sound like job-to-be-done statements
3. write 5 hooks: 2 from those phrases, 3 from my own ideas
4. generate all 5 as UGC-style videos in one batch
5. ship to Meta DCO with $20-30/day cap
6. wait 7 days minimum before judging anything
7. kill bottom 3, double the winner, archive the data
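Step 1 doesn't need anything fancier than counting word sequences. One possible sketch, assuming the support emails are already exported as plain strings; the function name, the three-word window, and the "appears in at least two emails" cutoff are all illustrative choices, not the checklist's actual tooling.

```javascript
// Count n-word phrases that appear in at least `minCount` different emails.
function repeatedPhrases(emails, n = 3, minCount = 2) {
  const counts = new Map();
  for (const body of emails) {
    const words = body.toLowerCase().match(/[a-z0-9']+/g) || [];
    const seen = new Set(); // count each phrase at most once per email
    for (let i = 0; i + n <= words.length; i++) {
      const phrase = words.slice(i, i + n).join(' ');
      if (!seen.has(phrase)) {
        seen.add(phrase);
        counts.set(phrase, (counts.get(phrase) || 0) + 1);
      }
    }
  }
  return [...counts.entries()]
    .filter(([, count]) => count >= minCount)
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .map(([phrase, count]) => ({ phrase, count }));
}

// Usage with made-up emails:
const sampleEmails = [
  'How do I onboard a new dev quickly?',
  'We need to onboard a new dev every month.',
  'Export is broken',
];
console.log(repeatedPhrases(sampleEmails));
// [ { phrase: 'a new dev', count: 2 }, { phrase: 'onboard a new', count: 2 } ]
```

The output is raw material for step 2; picking which phrases sound like job-to-be-done statements is still a human call.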

The two non-obvious rules: never let yourself pre-rank the hooks (write them in a random order in the doc), and always include at least two hooks pulled verbatim from customer language. Your taste is the bug. The customer's words are the fix.
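Both rules are mechanical enough to script. A minimal sketch, assuming each hook is tagged with where it came from; the `source` field and the function names are hypothetical, and the shuffle is a standard Fisher-Yates with an injectable random source so it can be tested.

```javascript
// Fisher-Yates shuffle that returns a new array and leaves the input untouched.
function shuffle(items, random = Math.random) {
  const out = items.slice();
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(random() * (i + 1));
    [out[i], out[j]] = [out[j], out[i]];
  }
  return out;
}

// Enforce the two rules: at least two customer-sourced hooks, random order.
function prepareHooks(hooks, random = Math.random) {
  const fromCustomers = hooks.filter((h) => h.source === 'customer').length;
  if (fromCustomers < 2) {
    throw new Error('Need at least two hooks taken verbatim from customer language');
  }
  return shuffle(hooks, random);
}
```

Writing the shuffled order into the tracking doc before looking at any hook in detail keeps the list position from doubling as a ranking.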

The thing I keep coming back to is that I spent ten years learning to write code that doesn't trust my assumptions: unit tests, type checks, assertions, the whole stack. And then the first time I tried to do marketing, I trusted my gut completely. The boring hook didn't win because it was clever. It won because it matched what buyers were already looking for, and I only found that out because I finally let the data overrule me.

I Make Music at 2AM — Here's How an AI Video Generator Changed My Whole Content Workflow

Saviel Yamani — Fri, 24 Apr 2026 03:52:55 +0000

I've been producing music as a hobby for about four years now. Nothing professional — just beats I make late at night after work, mostly lo-fi stuff and some experimental ambient tracks. For a long time, I kept everything to myself. The idea of "putting it out there" felt overwhelming, not because of the music itself, but because of everything around it.

Visuals. That was always the wall I couldn't get over.

The Problem Nobody Talks About in Music Content Creation

If you're a solo music creator, you probably know this feeling: you spend hours on a track, you're actually proud of it, and then you realize you need something to post it with. A video. A visual. Anything. Uploading a static image to YouTube feels lazy. Shooting a "studio session" video alone is awkward. Hiring a motion designer? Way out of budget for someone who's just doing this for fun.

I tried a few things. I messed around with After Effects tutorials on YouTube — spent a whole weekend on it and ended up with something that looked like a 2009 screensaver. I tried Canva's video editor, which is fine for social posts but not really built for music visuals. Nothing felt right.

Stumbling Into AI Video Generation (By Accident)

Honestly, I didn't go looking for an AI video tool. I saw someone in a Discord server for lo-fi producers mention they'd been using an AI Video Generator to make visualizers for their tracks, and it took maybe 20 minutes per video. I was skeptical. I've been burned by "it's so easy!" claims before.

But I tried it anyway.

The basic idea behind most AI video generators is that you feed them a prompt (sometimes an audio file too) and the model synthesizes visual content that matches a mood or style. It's worth understanding that many of these tools are built on diffusion-based models, the same underlying technology behind image generators like Stable Diffusion. Hugging Face has a solid explainer on how diffusion models work if you're curious about what's actually happening under the hood.

What Actually Worked (And What Didn't)

The first video I generated was... fine. Not great. I typed in something like "dark ambient music, slow moving fog, purple and black tones" and got a clip that looked a bit generic — like stock footage with a filter on it. Not what I imagined.

The learning curve was real. I had to figure out that vague prompts give vague results. When I got more specific — "slow camera drift over a dark forest at night, moonlight through branches, cinematic, no people" — the output got dramatically better. It took me probably five or six failed generations before I started getting things I actually liked.

One thing I didn't expect: the timing sync is still a manual job. The AI generates the visual, but you're still the one cutting it to your track in a video editor. I use DaVinci Resolve (free version) for that part. So it's not a one-click magic solution — it's more like one part of a workflow that still requires your own judgment.

I also hit a weird issue where the tool I was using — VideoAI — kept generating clips with subtle flickering artifacts when I used high-contrast prompts. Took me a while to realize that lowering the "motion intensity" setting fixed most of it. These little things aren't in the documentation; you just find them by breaking stuff.

What I Actually Use It For Now

My current workflow looks something like this:

1. Finish a track (or even just a demo)
2. Write 2–3 visual prompts that match the emotional tone of the music
3. Generate 4–6 short clips (usually 5–10 seconds each)
4. Stitch them together in DaVinci Resolve, synced to key moments in the track
5. Export and post

The whole visual side of things now takes me maybe 45 minutes instead of a full weekend. And honestly, the results look better than anything I was making manually.

It's also made me think more intentionally about the mood of my music. Writing a visual prompt forces you to articulate what your track actually feels like — which is a surprisingly useful creative exercise. There's actually some interesting research on how visual and auditory stimuli interact emotionally; this overview from the Journal of New Music Research touches on the relationship between music and visual perception if you want to go down that rabbit hole.

The Honest Takeaway

I'm not going to pretend AI video tools are perfect. The outputs can be inconsistent. Sometimes you generate ten clips and only one is usable. The prompting is genuinely a skill you have to develop, and there's a real risk of everything looking samey if you're not intentional about it.

But for someone like me — a solo creator with no video budget and limited time — it genuinely lowered the barrier enough that I actually started posting my music consistently. That's the real win. Not that the videos are stunning, but that they exist at all.

If you're a music producer who's been sitting on tracks because the visual side feels too hard, it might be worth experimenting with. Just go in with realistic expectations, be ready to iterate, and don't expect the first generation to be the one you use.

The music is still the main thing. The visuals just help people stop scrolling long enough to hear it.