<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Maksims Gavrilovs</title>
    <description>The latest articles on DEV Community by Maksims Gavrilovs (@dasein108).</description>
    <link>https://dev.to/dasein108</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2834130%2F3829c1e4-0e97-4998-b2f2-2904b1d4f698.jpg</url>
      <title>DEV Community: Maksims Gavrilovs</title>
      <link>https://dev.to/dasein108</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/dasein108"/>
    <language>en</language>
    <item>
      <title>Zero to Autopilot, Part 7: Closing the Loop — the Channel That Runs Itself</title>
      <dc:creator>Maksims Gavrilovs</dc:creator>
      <pubDate>Fri, 12 Jun 2026 02:28:39 +0000</pubDate>
      <link>https://dev.to/dasein108/zero-to-autopilot-part-7-closing-the-loop-the-channel-that-runs-itself-2194</link>
      <guid>https://dev.to/dasein108/zero-to-autopilot-part-7-closing-the-loop-the-channel-that-runs-itself-2194</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Series:&lt;/strong&gt; &lt;em&gt;Zero to Autopilot — Building a Self-Improving AI Media Channel.&lt;/em&gt; Part 7 of 7 — the finale. &lt;a href="https://dev.to/dasein108/zero-to-autopilot-part-1-i-built-an-ai-that-runs-a-youtube-channel-the-landscape-and-my-10-1ki6"&gt;Part 1&lt;/a&gt; landscape · &lt;a href="https://dev.to/dasein108/zero-to-autopilot-part-2-one-line-of-text-a-published-short-in-7-stages-inp"&gt;2 pipeline&lt;/a&gt; · &lt;a href="https://dev.to/dasein108/zero-to-autopilot-part-3-giving-a-still-image-real-motion-for-000-1a5b"&gt;3 free motion&lt;/a&gt; · &lt;a href="https://dev.to/dasein108/zero-to-autopilot-part-4-the-cost-collapse-1050-006-per-video-16j3"&gt;4 cost&lt;/a&gt; · &lt;a href="https://dev.to/dasein108/zero-to-autopilot-part-5-teaching-a-youtube-channel-to-remember-390g"&gt;5 memory&lt;/a&gt; · &lt;a href="https://dev.to/dasein108/zero-to-autopilot-part-6-a-thompson-sampling-bandit-that-picks-the-next-video-3bpn"&gt;6 bandit&lt;/a&gt;. Now I remove myself from the loop.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data status (Part 7): real-now.&lt;/strong&gt; Metrics refreshed from YouTube on 2026-06-12: 24 measured videos, 1,742 total views, 48 likes, &lt;strong&gt;+10 subscribers&lt;/strong&gt;, $5.04 measured production spend, 7 wins, 6 losses, and 11 neutral results. Small-channel data is noisy, but it is real enough to grade the loop honestly.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frchqrny27xdm6z1x79xt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frchqrny27xdm6z1x79xt.png" alt="A frame from the channel — the audience the loop is built to grow, hands-off." width="768" height="1344"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Everything's built. Now delete the operator.
&lt;/h2&gt;

&lt;p&gt;Here's where six parts leave us. I can turn an idea into a finished Short (Parts 2–3) for six cents (Part 4), remember every bet and score it against my own portfolio (Part 5), and decide what to make next with a bandit (Part 6). Each of those is a &lt;em&gt;command I run&lt;/em&gt;. The last step is making me unnecessary: a scheduler that runs the whole cycle while I sleep.&lt;/p&gt;

&lt;p&gt;There's exactly one thing that makes this hard, and it's not the AI. It's &lt;strong&gt;time&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The crux: measurement is deferred
&lt;/h2&gt;

&lt;p&gt;A freshly published Short's metrics are meaningless. Views/day, retention, engagement — they don't stabilize for &lt;strong&gt;48–72 hours&lt;/strong&gt;. So you cannot write the loop as a straight-line script (&lt;code&gt;ideate → produce → measure → learn&lt;/code&gt;), because between &lt;em&gt;produce&lt;/em&gt; and &lt;em&gt;measure&lt;/em&gt; there's a two-to-three-day wait, and during that wait the loop should be doing &lt;em&gt;other&lt;/em&gt; useful things (producing the next bet, reflecting on older ones).&lt;/p&gt;

&lt;p&gt;So the loop isn't a script. It's a &lt;strong&gt;state machine over time.&lt;/strong&gt; Each tick, it asks one question: &lt;em&gt;given the journal and the current time, what is the single most useful thing to do right now?&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# studio/marketing/loop.py — the five possible actions
#   measure  →  one or more videos have matured; fetch their stats
#   learn    →  enough new measurements have accrued; reflect into strategy
#   ideate   →  the backlog is running low; generate fresh bets
#   produce  →  cadence allows another video; make the next backlog bet (budget-sized)
#   idle     →  nothing due (waiting on maturation or the produce cadence)
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  &lt;code&gt;plan()&lt;/code&gt; — one tick, one decision
&lt;/h2&gt;

&lt;p&gt;The whole engine is a pure function: &lt;code&gt;plan(journal, now) → Plan&lt;/code&gt;. It returns the &lt;em&gt;one&lt;/em&gt; due action, in priority order. Measuring matured videos comes first (that data unlocks everything else), then reflecting, then refilling the backlog, then producing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Plan&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;phase&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;                      &lt;span class="c1"&gt;# cold-start | optimizing
&lt;/span&gt;    &lt;span class="nb"&gt;next&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;idle&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;              &lt;span class="c1"&gt;# measure | learn | ideate | produce | idle
&lt;/span&gt;    &lt;span class="n"&gt;measure_due&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;     &lt;span class="c1"&gt;# entry ids past the maturation window
&lt;/span&gt;    &lt;span class="n"&gt;learn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;             &lt;span class="c1"&gt;# enough new measurements to reflect?
&lt;/span&gt;    &lt;span class="n"&gt;produce_entry&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;         &lt;span class="c1"&gt;# the bet to produce next (chosen by the bandit)
&lt;/span&gt;    &lt;span class="n"&gt;produce_max_cost&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;   &lt;span class="c1"&gt;# budget cap for that produce
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And the core of the decision — note it's all time-driven off &lt;code&gt;published_at&lt;/code&gt; and a few cadence knobs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;plan&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;cfg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;loop_config&lt;/span&gt;
    &lt;span class="n"&gt;phase&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cold-start&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;in_cold_start&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;optimizing&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="c1"&gt;# 1) measure: deployed videos past the maturation window, not yet measured
&lt;/span&gt;    &lt;span class="n"&gt;measure_due&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;deployed&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                   &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;_age_hours&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;published_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;maturation_hours&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="c1"&gt;# 2) learn: count measurements newer than the last reflection
&lt;/span&gt;    &lt;span class="n"&gt;new_measured&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;measured&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;last_learn_at&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;metrics&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fetched_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;last_learn_at&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;learn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;new_measured&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;cfg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;learn_every&lt;/span&gt;

    &lt;span class="c1"&gt;# 3) produce/ideate cadence → pick the next bet with the bandit (Part 6)
&lt;/span&gt;    &lt;span class="c1"&gt;# ...priority: measure &amp;gt; learn &amp;gt; ideate (backlog low) &amp;gt; produce (cadence ok) &amp;gt; idle
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
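&lt;p&gt;The priority comment at the end of that snippet can be read as a first-match rule chain. Below is a minimal sketch of that reading, not the actual body of &lt;code&gt;plan()&lt;/code&gt;: the &lt;code&gt;Signals&lt;/code&gt; tuple and &lt;code&gt;next_action&lt;/code&gt; are hypothetical names, and the four flags are assumed to have been derived from the journal already.&lt;/p&gt;

```python
from typing import NamedTuple


class Signals(NamedTuple):
    """Hypothetical per-tick flags; the real plan() derives these from the journal."""
    measure_due: bool  # some deployed video is past the maturation window
    learn_due: bool    # at least learn_every new measurements since the last reflection
    backlog_low: bool  # planned bets dropped below backlog_min
    cadence_ok: bool   # enough hours have passed since the last produce


def next_action(s):
    """Return the first due action in priority order, else 'idle'."""
    rules = [
        ("measure", s.measure_due),
        ("learn", s.learn_due),
        ("ideate", s.backlog_low),
        ("produce", s.cadence_ok),
    ]
    for action, due in rules:
        if due:
            return action
    return "idle"


print(next_action(Signals(False, False, True, True)))  # ideate
```

&lt;p&gt;Keeping the order in one list means a tick can never produce while a matured video is still waiting to be measured.&lt;/p&gt;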



&lt;p&gt;The knobs are all in one config — maturation window, produce cadence, how often to reflect, when to refill the backlog:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;LoopConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;maturation_hours&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;60.0&lt;/span&gt;          &lt;span class="c1"&gt;# wait ~2.5 days before measuring
&lt;/span&gt;    &lt;span class="n"&gt;min_hours_between_produces&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;20.0&lt;/span&gt;  &lt;span class="c1"&gt;# ≈ 1 video/day
&lt;/span&gt;    &lt;span class="n"&gt;daily_produce_cap&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;              &lt;span class="c1"&gt;# hard ceiling: at most 2 produces per day&lt;/span&gt;
    &lt;span class="n"&gt;learn_every&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;                    &lt;span class="c1"&gt;# reflect after 3 new measurements
&lt;/span&gt;    &lt;span class="n"&gt;backlog_min&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;                    &lt;span class="c1"&gt;# ideate when planned bets drop below this
&lt;/span&gt;    &lt;span class="n"&gt;select&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bandit&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;                  &lt;span class="c1"&gt;# next-bet picker (Part 6)
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The driver: &lt;code&gt;tick&lt;/code&gt; and &lt;code&gt;autopilot&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;plan()&lt;/code&gt; decides; two CLI commands act. &lt;code&gt;studio marketing tick&lt;/code&gt; runs exactly one due action and exits — perfect for a cron job. &lt;code&gt;studio marketing autopilot&lt;/code&gt; loops ticks for a session. Put &lt;code&gt;tick&lt;/code&gt; on a schedule (cron, a systemd timer, &lt;code&gt;/loop&lt;/code&gt;) and the channel runs itself:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# one cron line ≈ a self-running channel&lt;/span&gt;
0 &lt;span class="k"&gt;*&lt;/span&gt;/6 &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt; &lt;span class="k"&gt;*&lt;/span&gt;  &lt;span class="nb"&gt;cd&lt;/span&gt; /path/to/slope-studio &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; studio marketing tick &lt;span class="nt"&gt;--channel&lt;/span&gt; pilot
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every 6 hours it wakes, asks &lt;code&gt;plan()&lt;/code&gt; what's due, does that one thing — measure a matured video, reflect, ideate, or produce the next bandit-picked bet — and goes back to sleep. The deferred-measurement problem disappears because the state machine simply &lt;em&gt;doesn't&lt;/em&gt; measure until &lt;code&gt;published_at + maturation_hours&lt;/code&gt;, and spends the wait producing and reflecting instead.&lt;/p&gt;
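&lt;p&gt;To make the timing concrete, here is a small sketch of how the 60-hour maturation window and the 20-hour produce gap interact with a 6-hour tick. Both helpers are hypothetical (they are not functions from the studio codebase), and the second one assumes every tick is free to produce, which the priority order does not guarantee.&lt;/p&gt;

```python
import math
from datetime import datetime, timedelta, timezone


def hours_until_measurable(published_at, now, maturation_hours=60.0):
    """Hours left before a video may be measured; 0.0 once it has matured."""
    matures_at = published_at + timedelta(hours=maturation_hours)
    remaining = (matures_at - now).total_seconds() / 3600.0
    return max(0.0, remaining)


def effective_produce_interval(min_hours_between_produces=20.0, tick_hours=6.0):
    """Shortest gap between produces when actions only happen on tick boundaries."""
    return math.ceil(min_hours_between_produces / tick_hours) * tick_hours


published = datetime(2026, 6, 10, tzinfo=timezone.utc)
print(hours_until_measurable(published, published + timedelta(hours=12)))  # 48.0
print(hours_until_measurable(published, published + timedelta(hours=72)))  # 0.0
print(effective_produce_interval())  # 24.0
```

&lt;p&gt;Under these assumptions the 20-hour knob rounds up to one produce every 24 hours on a 6-hour cron, which matches the "≈ 1 video/day" comment in the config.&lt;/p&gt;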

&lt;p&gt;A reproducibility detail that matters here: the bandit's RNG is seeded from journal state, so &lt;code&gt;tick&lt;/code&gt; called twice in the same state makes the &lt;em&gt;same&lt;/em&gt; decision. No double-producing, no races.&lt;/p&gt;
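&lt;p&gt;One way to get that property, sketched under assumptions: hash a canonical snapshot of the state into a seed, then do the Thompson draw with an RNG built from it. The article does not show the real seeding scheme, so &lt;code&gt;rng_for_state&lt;/code&gt;, &lt;code&gt;pick_bet&lt;/code&gt;, and the &lt;code&gt;arms&lt;/code&gt; / &lt;code&gt;wins&lt;/code&gt; / &lt;code&gt;losses&lt;/code&gt; layout are all hypothetical.&lt;/p&gt;

```python
import hashlib
import json
import random


def rng_for_state(state):
    """Build a deterministic RNG from a JSON-serializable snapshot of state."""
    blob = json.dumps(state, sort_keys=True).encode("utf-8")
    seed = int.from_bytes(hashlib.sha256(blob).digest()[:8], "big")
    return random.Random(seed)


def pick_bet(state):
    """Thompson-style pick: sample Beta(wins + 1, losses + 1) per arm, take the max."""
    rng = rng_for_state(state)
    samples = {
        arm: rng.betavariate(stats["wins"] + 1, stats["losses"] + 1)
        for arm, stats in sorted(state["arms"].items())
    }
    return max(samples, key=samples.get)


# Hypothetical journal snapshot: two topic arms with their win/loss counts.
state = {"arms": {"philosophy": {"wins": 3, "losses": 1},
                  "physics": {"wins": 2, "losses": 2}}}
print(pick_bet(state) == pick_bet(state))  # True
```

&lt;p&gt;Because the seed depends only on the snapshot, the pick changes when the journal changes (for example, after a new measurement lands) and stays fixed otherwise.&lt;/p&gt;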

&lt;h2&gt;
  
  
  Even the channel setup is automated
&lt;/h2&gt;

&lt;p&gt;One loose end: a new channel needs a brand. So that's a Lego block too: &lt;code&gt;studio brand &amp;lt;spec.json&amp;gt;&lt;/code&gt; generates a full kit (banner, profile avatar, a transparent watermark logo, plus keywords and an SEO description) into &lt;code&gt;runs/_brand/&amp;lt;slug&amp;gt;/&lt;/code&gt;. The generated art is text-free, and the wordmark is composited with Pillow inside the safe area. Zero-to-channel, including the identity, is scriptable.&lt;/p&gt;

&lt;h2&gt;
  
  
  Internal agents vs skill-based orchestration
&lt;/h2&gt;

&lt;p&gt;There are two ways to run this kind of loop.&lt;/p&gt;

&lt;p&gt;The first is &lt;strong&gt;internal agent orchestration&lt;/strong&gt;: the system owns the whole state machine, calls its own tools, and treats every step as part of one product. That is what &lt;code&gt;studio marketing tick&lt;/code&gt; does. It knows the journal schema, the maturation window, the budget config, and the next due action. It is tight, reproducible, and cron-friendly.&lt;/p&gt;

&lt;p&gt;The second is &lt;strong&gt;skill-based orchestration&lt;/strong&gt;: the same work is decomposed into portable operating instructions that any capable external LLM can follow — Claude, Codex, Gemini, or whatever agent shell you prefer. In that mode, the skill is the durable interface: &lt;em&gt;measure this channel, learn from the journal, pick a bet, deploy it, report the result&lt;/em&gt;. The external model brings reasoning, writing, critique, and research; the CLI remains the deterministic I/O layer. That is less sealed than a pure internal agent, but more flexible: you can swap models, run the same marketing workflow from different agent environments, and keep the operational knowledge outside any one vendor's hidden prompt.&lt;/p&gt;

&lt;p&gt;In practice I want both. The internal autopilot handles boring scheduled execution. The skills let a stronger external agent step in for strategy, critique, and one-off investigation without rewriting the studio.&lt;/p&gt;

&lt;h2&gt;
  
  
  The first thing autonomy taught me: cheap + automated = automated garbage
&lt;/h2&gt;

&lt;p&gt;The day the loop ran end-to-end with no one watching, it published a video that was technically fine and still felt wrong. The frames moved. The narration lined up. The audio ducked under the voice. It had all the machinery from the first six parts.&lt;/p&gt;

&lt;p&gt;But the story was weak.&lt;/p&gt;

&lt;p&gt;Some early Shorts were raw in a way that only became obvious after watching a batch together: not enough concrete explanation, inconsistent emotional arc, pretty frames carrying a script that didn't quite earn the viewer's minute. I had spent six articles making production cheap, and the first lesson of autonomy was blunt: a cheap content machine can manufacture weak stories faster.&lt;/p&gt;

&lt;p&gt;Effects are polish; &lt;strong&gt;content is the product&lt;/strong&gt;. A bandit picks a good &lt;em&gt;topic&lt;/em&gt;, but "topic" is not a script. The writing still has to deliver a real fact and a real feeling, and nothing in the pipeline was checking for that.&lt;/p&gt;

&lt;p&gt;So I added a new stage between &lt;em&gt;script&lt;/em&gt; and &lt;em&gt;spend&lt;/em&gt;: a &lt;strong&gt;content critic&lt;/strong&gt; (&lt;code&gt;stages/critic.py&lt;/code&gt;). It's an LLM-as-judge that reads the scenario and scores it on four things before a cent goes to image or video generation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# studio/models.py — the bar a scenario has to clear
&lt;/span&gt;&lt;span class="n"&gt;CRITIC_CRITERIA&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;topic_revealed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;viewer comes away KNOWING the thing&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fact_explained&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;a concrete fact/idea/event is STATED and EXPLAINED&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;informative_interesting&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;teaches something non-obvious with a curiosity gap&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;emotional_payoff&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;lands a clear emotion&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each criterion returns pass/fail, a 1–5 score, one specific note, and &lt;code&gt;revision_notes&lt;/code&gt; the writer can act on. The important part is not the prompt. It's where the prompt sits in control flow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;attempt&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;retries&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;verdict&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;critic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;script&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;verdict&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;passed&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;script&lt;/span&gt;
    &lt;span class="n"&gt;script&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;write_again&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;revision_notes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;verdict&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;revision_notes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The real code keeps the best-scoring attempt, caps retries with &lt;code&gt;--critic-retries&lt;/code&gt;, and can either proceed-best (&lt;code&gt;--critic on&lt;/code&gt;) or abort (&lt;code&gt;--critic strict&lt;/code&gt;). No framework, no infinite loop, just a bounded &lt;code&gt;script -&amp;gt; critic -&amp;gt; rewrite&lt;/code&gt; gate inside &lt;code&gt;studio run&lt;/code&gt;. The headless cron inherits it by default, which is the entire point: the gate has to live where there is no human in the seat.&lt;/p&gt;
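&lt;p&gt;A minimal sketch of that bounded gate, assuming a critic that returns a pass flag, a numeric score, and revision notes. &lt;code&gt;Verdict&lt;/code&gt;, &lt;code&gt;gate&lt;/code&gt;, and the &lt;code&gt;rewrite&lt;/code&gt; callback are hypothetical names; only the proceed-best versus strict behaviour is taken from the description above.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    passed: bool
    score: float
    revision_notes: str


def gate(script, critic, rewrite, retries=2, strict=False):
    """Bounded critic/rewrite loop that remembers the best failed attempt."""
    attempts = []  # (score, script) for every attempt the critic rejected
    for attempt in range(retries + 1):
        verdict = critic(script)
        if verdict.passed:
            return script
        attempts.append((verdict.score, script))
        if attempt != retries:  # no point rewriting after the final attempt
            script = rewrite(script, verdict.revision_notes)
    if strict:
        raise RuntimeError("critic rejected every attempt")
    # proceed-best: fall back to the highest-scoring rejected attempt
    return max(attempts, key=lambda a: a[0])[1]


# Toy critic: passes once the script mentions a fact; score is just the length.
toy_critic = lambda s: Verdict("fact" in s, float(len(s)), "state a concrete fact")
print(gate("draft", toy_critic, lambda s, notes: s + " with a fact"))  # draft with a fact
```

&lt;p&gt;The loop makes at most &lt;code&gt;retries + 1&lt;/code&gt; critic calls, so the cost of the gate is bounded before any image or video spend happens.&lt;/p&gt;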

&lt;h2&gt;
  
  
  The rewrite that made the failure legible
&lt;/h2&gt;

&lt;p&gt;The cleanest example was Fermat.&lt;/p&gt;

&lt;p&gt;I had an older Short about Fermat's Last Theorem: &lt;a href="https://youtube.com/shorts/F3STKw8Nlr8" rel="noopener noreferrer"&gt;the note in the margin that took 358 years to solve&lt;/a&gt;. It had the right ingredients — Fermat's taunt, Andrew Wiles, a famous unsolved problem — but the story was soft. It gestured at the myth more than it explained the hook.&lt;/p&gt;

&lt;p&gt;The critic made the problem concrete:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json-doc"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"fact_explained"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2/5 — Wiles' proof is mentioned, but the fix and concepts are not explained"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"emotional_payoff"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2/5 — highlight Wiles' despair after the fatal flaw, then the triumph"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is a useful failure. "Make it better" is vague. "State the equation, explain the 358-year gap, show the fatal hole, then land Wiles alone finding the fix" is executable.&lt;/p&gt;

&lt;p&gt;So I re-made the Short with the same basic media path but a stronger scenario: &lt;a href="https://youtube.com/shorts/rozAXRztijQ" rel="noopener noreferrer"&gt;Fermat's Last Theorem&lt;/a&gt;. The new narration opens with the actual equation shape, names the margin note, gives Wiles the seven-year attic beat, and spends the payoff on the near-collapse of the proof:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"In 1994 he unveiled the proof. Then a referee found a fatal hole in it. For a year it looked dead, until, alone, Wiles suddenly saw how to fix it."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The rewrite is still an early read, so I do not mix it into the mature cohort dashboard below. But the direction was not subtle:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Version&lt;/th&gt;
&lt;th&gt;URL&lt;/th&gt;
&lt;th&gt;Views&lt;/th&gt;
&lt;th&gt;Likes&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;th&gt;Result&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;softer story&lt;/td&gt;
&lt;td&gt;&lt;a href="https://youtube.com/shorts/F3STKw8Nlr8" rel="noopener noreferrer"&gt;&lt;code&gt;F3STKw8Nlr8&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;$0.776&lt;/td&gt;
&lt;td&gt;P29, neutral&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;critic-guided rewrite&lt;/td&gt;
&lt;td&gt;&lt;a href="https://youtube.com/shorts/rozAXRztijQ" rel="noopener noreferrer"&gt;&lt;code&gt;rozAXRztijQ&lt;/code&gt;&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;119&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;$0.208&lt;/td&gt;
&lt;td&gt;P92, win&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Roughly &lt;strong&gt;10x the views&lt;/strong&gt;, some actual likes, and less money spent because the rewrite reused the cheap path instead of treating the whole thing as a fresh premium render. That's the kind of result I want from an eval: not an abstract "quality score," but a concrete edit that changes the video and the market response.&lt;/p&gt;

&lt;p&gt;There is a second lesson hiding inside this one. An eval can expose weak output, but it cannot author the fix by itself. On another video, the critic made the writer-model choice obvious: &lt;code&gt;The Universe Has No Edge&lt;/code&gt; failed with a cheap writer, then passed after switching to a stronger writer model. The cost floor and the quality floor live in different places. Keep visuals and motion cheap, but do not cheap out on the script when the whole video depends on it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The results
&lt;/h2&gt;

&lt;p&gt;I refreshed the YouTube measurements on 2026-06-12. The journal had 24 measured videos, 4 planned bets, 1,742 total views, 48 likes, &lt;strong&gt;10 new subscribers&lt;/strong&gt;, 0 comments, and $5.04 of measured production spend. The average measured cost was about &lt;strong&gt;$0.21 per video&lt;/strong&gt;, with 7 wins, 11 neutral results, and 6 losses by the channel's own portfolio-relative scoring.&lt;/p&gt;

&lt;p&gt;The top of the portfolio is not one format. That's the useful fact. The loop found wins in philosophy, physics, math, and even poetry, but the winners all had a sharper emotional or conceptual promise than the flops.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Rank&lt;/th&gt;
&lt;th&gt;Video&lt;/th&gt;
&lt;th&gt;Views&lt;/th&gt;
&lt;th&gt;Likes&lt;/th&gt;
&lt;th&gt;Retention&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;th&gt;Percentile&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;&lt;a href="https://youtube.com/shorts/z-tig-SB8VE" rel="noopener noreferrer"&gt;Diogenes and the rich man's spotless palace&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;268&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;74.29%&lt;/td&gt;
&lt;td&gt;$0.207&lt;/td&gt;
&lt;td&gt;P100&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;&lt;a href="https://youtube.com/shorts/mcMqLgCod5o" rel="noopener noreferrer"&gt;Black hole information paradox&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;170&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;n/a&lt;/td&gt;
&lt;td&gt;$0.234&lt;/td&gt;
&lt;td&gt;P96&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;&lt;a href="https://youtube.com/shorts/rozAXRztijQ" rel="noopener noreferrer"&gt;Fermat's Last Theorem&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;119&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;77.68%&lt;/td&gt;
&lt;td&gt;$0.208&lt;/td&gt;
&lt;td&gt;P92&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;&lt;a href="https://youtube.com/shorts/M8b1NWXT9Ls" rel="noopener noreferrer"&gt;Rubaiyat — Awake!&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;132&lt;/td&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;td&gt;61.40%&lt;/td&gt;
&lt;td&gt;$0.084&lt;/td&gt;
&lt;td&gt;P88&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;&lt;a href="https://youtube.com/shorts/b2Pg-k6wr-U" rel="noopener noreferrer"&gt;The Universe Has No Edge&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;182&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;50.07%&lt;/td&gt;
&lt;td&gt;$0.581&lt;/td&gt;
&lt;td&gt;P83&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The best result was not the most expensive one. Diogenes cost $0.207 and landed P100. Fermat's rewrite cost $0.208 and landed P92. Rubaiyat Awake cost $0.084 and landed P88. The signal is not "spend more." The signal is that the idea, story shape, and hook have to earn their minute before the pipeline spends anything.&lt;/p&gt;

&lt;p&gt;There is also a weird failure mode I do not want to over-explain: some videos appear to get no initial push at all. A few more are near-zero after several days: Rabies has 2 views, Population of Italy has 4, and the first Galois version has 5. I cannot tell from this data whether that is a metadata problem, a topic problem, a batch-upload penalty, a Shorts distribution quirk, or simply YouTube deciding not to kick-start those uploads. The honest takeaway is that a small-channel autopilot is not only learning audience taste; it is also learning around platform distribution randomness.&lt;/p&gt;

&lt;p&gt;That changes how I read losses. A video with 150 views and weak engagement is a content lesson. A video with near-zero views is partly a distribution lesson. The loop can still score both, but the strategy should treat them differently: content critique for videos that got a chance, packaging and cadence experiments for videos that never entered the room.&lt;/p&gt;

&lt;h2&gt;
  
  
  The whole arc, in one breath
&lt;/h2&gt;

&lt;p&gt;A faceless AI channel is a &lt;strong&gt;search problem&lt;/strong&gt;. Make the unit cost trivial (free motion + right-sized models → six cents), record every video as a &lt;strong&gt;falsifiable bet&lt;/strong&gt; with measured cost and portfolio-relative score, let a &lt;strong&gt;bandit&lt;/strong&gt; exploit what wins while exploring the rest, and wrap it in a &lt;strong&gt;time-aware state machine&lt;/strong&gt; that runs the cycle unattended. None of it needed a vector DB, a fine-tune, or a render farm — just boring architecture, honest cost accounting, and a willingness to let the data, not the ego, pick the next video.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'd tell another AI engineer
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Takeaway:&lt;/strong&gt; When an agent's feedback is &lt;em&gt;delayed&lt;/em&gt;, don't model the workflow as a pipeline — model it as a &lt;strong&gt;state machine over time&lt;/strong&gt; whose tick asks "what's the single most useful thing to do &lt;em&gt;now&lt;/em&gt;?" Deferred reward (here, 48–72h of metric maturation) is the norm in real systems, not the exception; a &lt;code&gt;plan(state, now) → one action&lt;/code&gt; function handles it cleanly, stays cron-friendly, and (seeded from state) stays reproducible. Automate the boring 90%, be loud about the 10% you can't, and let the loop compound.&lt;/p&gt;
&lt;/blockquote&gt;
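&lt;p&gt;To make the &lt;code&gt;plan(state, now) → one action&lt;/code&gt; shape concrete, here is a minimal sketch. The &lt;code&gt;State&lt;/code&gt; and &lt;code&gt;Video&lt;/code&gt; classes, the action names, the 48-hour maturation constant, and the one-a-day cadence are all assumptions chosen for illustration, not the repo's actual types.&lt;/p&gt;

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional

MATURATION = timedelta(hours=48)  # assumed: lower end of the 48-72h wait


@dataclass
class Video:
    title: str
    published_at: datetime
    scored: bool = False


@dataclass
class State:
    published: list = field(default_factory=list)  # Video objects
    backlog: list = field(default_factory=list)    # planned bet titles
    last_publish: Optional[datetime] = None


def plan(state, now, cadence=timedelta(hours=24)):
    """Return the single most useful action for this tick as (name, target)."""
    # 1. Deferred reward first: score any video whose metrics have matured.
    for video in state.published:
        if not video.scored and now - video.published_at >= MATURATION:
            return ("measure", video.title)
    # 2. Nothing to pick from: generate new bets.
    if not state.backlog:
        return ("ideate", None)
    # 3. Publish only when the (hypothetical) cadence allows it.
    if state.last_publish is None or now - state.last_publish >= cadence:
        return ("produce", state.backlog[0])
    # 4. Otherwise do nothing; the next cron tick will ask again.
    return ("wait", None)


t0 = datetime(2026, 6, 1, 12, 0)
print(plan(State(backlog=["Cantor"]), t0))  # ('produce', 'Cantor')
```

&lt;p&gt;Because the function only reads &lt;code&gt;state&lt;/code&gt; and &lt;code&gt;now&lt;/code&gt;, a cron job can call it as often as it likes: a tick that lands during the maturation wait simply returns &lt;code&gt;wait&lt;/code&gt;.&lt;/p&gt;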




&lt;p&gt;&lt;strong&gt;That's the series.&lt;/strong&gt; Zero to autopilot: a channel that writes, renders, publishes, scores, and decides — for cents, on a schedule. It's all open source; go break it, fork it, or beat it.&lt;/p&gt;

&lt;p&gt;▶ &lt;strong&gt;Live effects gallery:&lt;/strong&gt; &lt;a href="https://dasein108.github.io/slope-studio/" rel="noopener noreferrer"&gt;dasein108.github.io/slope-studio&lt;/a&gt;&lt;br&gt;
⭐ &lt;strong&gt;Star the repo:&lt;/strong&gt; &lt;a href="https://github.com/dasein108/slope-studio" rel="noopener noreferrer"&gt;github.com/dasein108/slope-studio&lt;/a&gt;&lt;br&gt;
🔔 &lt;strong&gt;Subscribe&lt;/strong&gt; to watch the experiment continue: &lt;a href="https://www.youtube.com/shorts/gaR76MiAK0U" rel="noopener noreferrer"&gt;the channel&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>automation</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>Zero to Autopilot, Part 6: A Thompson-Sampling Bandit That Picks the Next Video</title>
      <dc:creator>Maksims Gavrilovs</dc:creator>
      <pubDate>Thu, 11 Jun 2026 14:31:53 +0000</pubDate>
      <link>https://dev.to/dasein108/zero-to-autopilot-part-6-a-thompson-sampling-bandit-that-picks-the-next-video-3bpn</link>
      <guid>https://dev.to/dasein108/zero-to-autopilot-part-6-a-thompson-sampling-bandit-that-picks-the-next-video-3bpn</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Series:&lt;/strong&gt; &lt;em&gt;Zero to Autopilot — Building a Self-Improving AI Media Channel.&lt;/em&gt; Part 6 of 7. &lt;a href="https://dev.to/dasein108/zero-to-autopilot-part-5-teaching-a-youtube-channel-to-remember-390g"&gt;Part 5&lt;/a&gt; gave the channel a memory. This part gives it a &lt;strong&gt;decision&lt;/strong&gt; — the explore/exploit engine that picks what to make next.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data status (Part 6): real-now (mechanism).&lt;/strong&gt; The bandit, its math, and the real bets it's choosing among are shown below. &lt;em&gt;Which arms won&lt;/em&gt; (the quantitative payoff) lands in &lt;strong&gt;&lt;a href="https://dev.to/dasein108/zero-to-autopilot-part-7-closing-the-loop-the-channel-that-runs-itself-2194"&gt;Part 7&lt;/a&gt;&lt;/strong&gt;, once the data matures.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9py3di24h57y8ov7v5wm.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9py3di24h57y8ov7v5wm.png" alt="The channel's winner depicting " width="768" height="1344"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The dilemma, made concrete
&lt;/h2&gt;

&lt;p&gt;Part 5 ends with the channel remembering that the "heretic mathematician" format won big. So… just make that forever? No — that's how a channel flatlines. But chasing novelty every time throws away everything you learned. This is the &lt;strong&gt;explore/exploit&lt;/strong&gt; dilemma, and for a small channel it bites hard: you have maybe one video a day of budget, so every pick is expensive. Over-exploit and you plateau; over-explore and you never compound.&lt;/p&gt;

&lt;p&gt;The honest first version of this in my code was a &lt;strong&gt;fixed 60/40 split&lt;/strong&gt; — 60% of the time make something like a known winner, 40% try something new. It works, but it's dumb in two specific ways:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;It &lt;strong&gt;over-explores weak arms&lt;/strong&gt; — a 40% explore rate keeps spending on themes that have already proven mediocre.&lt;/li&gt;
&lt;li&gt;It's &lt;strong&gt;context-blind&lt;/strong&gt; — it treats "make a winner" as one bucket, ignoring &lt;em&gt;which features&lt;/em&gt; of past videos actually drove the wins.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;A contextual bandit fixes both. But first, a phase gate.&lt;/p&gt;

&lt;h2&gt;
  
  
  Phase 1: cold-start (you have no baseline yet)
&lt;/h2&gt;

&lt;p&gt;You can't run a bandit with zero data — and worse, on a brand-new channel even your "good" videos get tiny numbers, so absolute scores lie. So the channel runs a &lt;strong&gt;cold-start&lt;/strong&gt; phase first: the first 10 deployed videos are pure exploration, deliberately spread across themes, with no winner/loser judgment at all.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@property&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;in_cold_start&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;deployed_count&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;bootstrap_target&lt;/span&gt;   &lt;span class="c1"&gt;# default 10
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Relative scoring (the portfolio percentile from Part 5) only unlocks once there are enough videos to &lt;em&gt;be&lt;/em&gt; a portfolio. Until then: explore, gather, don't pretend you know anything. After that, the bandit takes over.&lt;/p&gt;

&lt;p&gt;There's a hidden footgun here: the seed set teaches the bandit what the universe looks like. If the first ten videos are all the same shape, or all weak scripts, the posterior doesn't learn "audience taste" — it learns your bad sampling strategy. Cold-start needs &lt;strong&gt;varied but hooky&lt;/strong&gt; seed videos: different themes, different emotional promises, different formats, each still a real falsifiable bet. You are not feeding it random content. You are giving it enough distinct arms that "exploit the winner" will mean something later.&lt;/p&gt;

&lt;h2&gt;
  
  
  Phase 2: a warm-started contextual Thompson bandit
&lt;/h2&gt;

&lt;p&gt;Once there's a baseline, picking the next bet becomes a Thompson-sampling problem. Three design decisions make it fit this domain:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Context = what's knowable &lt;em&gt;before&lt;/em&gt; production.&lt;/strong&gt; A bet's features are its &lt;code&gt;theme&lt;/code&gt; and &lt;code&gt;tags&lt;/code&gt;. Not its effects or animators — those only exist &lt;em&gt;after&lt;/em&gt; rendering, so they're a learning/attribution concern, not a selection signal.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_features&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Entry&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;tuple&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]]:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Selection context known at planning time: theme + tags.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;feats&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;theme&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;feats&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;theme&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;theme&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
    &lt;span class="n"&gt;feats&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tag&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tags&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;feats&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;2. Per-feature Beta-Bernoulli posteriors, warm-started from the channel's base rate.&lt;/strong&gt; Each feature (&lt;code&gt;theme:infinity&lt;/code&gt;, &lt;code&gt;tag:heretic-format&lt;/code&gt;, …) gets its own Beta(α, β) win-probability posterior. The key trick: a flat &lt;code&gt;Beta(1,1)&lt;/code&gt; prior has mean 0.5, which is optimistic whenever the channel's real win rate is lower, so every brand-new arm looks amazing and gets over-explored. Instead, I warm-start the prior from the channel's &lt;em&gt;actual&lt;/em&gt; base win rate, with a weak pseudo-count (&lt;code&gt;prior_strength=2.0&lt;/code&gt;, roughly two videos' worth of evidence) so real data dominates fast:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;collections&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;defaultdict&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;posteriors&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;measured&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prior_strength&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;2.0&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;base&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;_base_rate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;_evidence&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;measured&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;                  &lt;span class="c1"&gt;# channel's actual win rate
&lt;/span&gt;    &lt;span class="n"&gt;pa&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pb&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;prior_strength&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;base&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;prior_strength&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;stats&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;defaultdict&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;pa&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pb&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;                   &lt;span class="c1"&gt;# every feature starts here
&lt;/span&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;win&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;_evidence&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;measured&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;_features&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;win&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;               &lt;span class="c1"&gt;# +win → α, +loss → β
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;stats&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A win on a feature pushes its α up; a loss pushes β up. Wins and losses are the &lt;strong&gt;relative&lt;/strong&gt; outcomes from Part 5 — only measured, non-cold-start bets with a real percentile count as evidence.&lt;/p&gt;
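&lt;p&gt;A worked example may help show what the warm start does to the numbers. The sketch below is self-contained and simplified: it takes plain &lt;code&gt;(features, win)&lt;/code&gt; pairs instead of journal entries, it assumes the base rate is just the fraction of wins (the real &lt;code&gt;_base_rate&lt;/code&gt; is not shown in this post), and the four evidence rows are invented for illustration.&lt;/p&gt;

```python
from collections import defaultdict


def warm_posteriors(evidence, prior_strength=2.0):
    """Map each feature to [alpha, beta], starting from a base-rate prior.

    evidence is a list of (features, win) pairs; this shape is an assumption.
    """
    wins = sum(1 for _, win in evidence if win)
    base = wins / len(evidence) if evidence else 0.5  # assumed fallback
    pa = max(base * prior_strength, 0.5)
    pb = max((1 - base) * prior_strength, 0.5)
    stats = defaultdict(lambda: [pa, pb])
    for feats, win in evidence:
        for f in feats:
            stats[f][0 if win else 1] += 1.0  # win -> alpha, loss -> beta
    return stats


def mean(ab):
    """Posterior mean of a Beta(alpha, beta)."""
    return ab[0] / (ab[0] + ab[1])


# Invented evidence: one win in four videos, so the base rate is 0.25.
evidence = [
    ([("theme", "geometry"), ("tag", "heretic-format")], True),
    ([("theme", "set theory"), ("tag", "heretic-format")], False),
    ([("theme", "poetry")], False),
    ([("theme", "virology")], False),
]
stats = warm_posteriors(evidence)
print(stats[("tag", "heretic-format")], mean(stats[("tag", "heretic-format")]))
print(stats[("theme", "poetry")], mean(stats[("theme", "poetry")]))
```

&lt;p&gt;With a base rate of 0.25 the prior is Beta(0.5, 1.5), so an untried feature starts at a mean of 0.25 rather than the 0.5 a flat prior would give it. One win and one loss move &lt;code&gt;tag:heretic-format&lt;/code&gt; to Beta(1.5, 2.5), a mean of 0.375.&lt;/p&gt;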

&lt;p&gt;&lt;strong&gt;3. Score a candidate by Thompson-sampling its features and averaging.&lt;/strong&gt; For each planned bet, draw a sample from each of its features' posteriors and average them. Arms with little history have &lt;em&gt;wide&lt;/em&gt; posteriors, so they sometimes draw high — that's exploration emerging naturally from the uncertainty, no explicit explore-rate knob needed:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prior&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rng&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;feats&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;_features&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;feats&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;rng&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;betavariate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;prior&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;samples&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;rng&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;betavariate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prior&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;feats&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;samples&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;samples&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;pick&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;planned&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;measured&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...):&lt;/span&gt;
    &lt;span class="c1"&gt;# highest Thompson draw wins; well-proven arms usually win,
&lt;/span&gt;    &lt;span class="c1"&gt;# but uncertain arms self-explore via their wide posteriors
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;rank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;planned&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;measured&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A proven feature (&lt;code&gt;tag:heretic-format&lt;/code&gt; with lots of wins) has a tight, high posterior and usually wins the draw — &lt;em&gt;exploit&lt;/em&gt;. A fresh theme has a wide posterior and occasionally spikes — &lt;em&gt;explore&lt;/em&gt;. The split is &lt;strong&gt;adaptive and per-feature&lt;/strong&gt;, not a global 60/40.&lt;/p&gt;
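&lt;p&gt;One way to see the adaptive split is to let a proven arm and a fresh arm duel over many Thompson draws and count how often the fresh one wins. This is only a sketch: &lt;code&gt;thompson_duel&lt;/code&gt; is a hypothetical helper, and both posteriors are invented (a Beta(0.5, 1.5) warm prior, untouched for the fresh arm and updated with eight wins and one loss for the proven arm).&lt;/p&gt;

```python
import random


def thompson_duel(proven, fresh, draws=10_000, seed=0):
    """Fraction of draws in which the fresh arm out-samples the proven arm."""
    rng = random.Random(seed)  # fixed seed so the experiment is repeatable
    fresh_wins = 0
    for _ in range(draws):
        if rng.betavariate(*fresh) > rng.betavariate(*proven):
            fresh_wins += 1
    return fresh_wins / draws


proven = (8.5, 2.5)  # hypothetical: prior (0.5, 1.5) plus 8 wins and 1 loss
fresh = (0.5, 1.5)   # hypothetical: prior only, no evidence yet
explore_share = thompson_duel(proven, fresh)
print(f"fresh arm wins {explore_share:.1%} of draws")
```

&lt;p&gt;The share the fresh arm wins plays the role of an explore rate, but it is not a constant: it shrinks as the proven arm gathers wins and grows if the proven arm starts losing.&lt;/p&gt;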

&lt;p&gt;One practical detail: &lt;code&gt;pick()&lt;/code&gt; is stochastic (that's the whole point), but the caller passes a &lt;strong&gt;state-seeded RNG&lt;/strong&gt;, so the same journal state yields the same pick. That matters because the autonomous driver (&lt;a href="https://dev.to/dasein108/zero-to-autopilot-part-7-closing-the-loop-the-channel-that-runs-itself-2194"&gt;Part 7&lt;/a&gt;) calls this from two places per cycle, and the two calls must agree.&lt;/p&gt;
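&lt;p&gt;The post does not show how the seed is derived, so the following is just one plausible way to do it: hash a canonical JSON dump of the state and use the digest as the seed. The helper name &lt;code&gt;rng_from_state&lt;/code&gt; and the toy &lt;code&gt;journal&lt;/code&gt; dict are assumptions for illustration.&lt;/p&gt;

```python
import hashlib
import json
import random


def rng_from_state(state):
    """Build a deterministic RNG from a JSON-serializable state dict."""
    # sort_keys makes the dump independent of dict insertion order.
    canonical = json.dumps(state, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))


# Toy stand-in for the journal; the real journal has more fields.
journal = {"deployed_count": 12, "measured": ["lobachevsky", "cantor"]}
first = rng_from_state(journal).betavariate(1.5, 2.5)
second = rng_from_state(journal).betavariate(1.5, 2.5)
print(first == second)  # True: same state, same draw
```

&lt;p&gt;Two callers that rebuild the RNG from the same journal get identical draws, and the next recorded measurement changes the state and therefore the seed.&lt;/p&gt;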

&lt;h2&gt;
  
  
  Where do new candidates come from? &lt;code&gt;ideate&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;The bandit &lt;em&gt;chooses among&lt;/em&gt; planned bets — but something has to &lt;em&gt;generate&lt;/em&gt; them, or it'd just reshuffle the same backlog. That's &lt;code&gt;ideate&lt;/code&gt;: an LLM proposes new bets from three inputs — the learned &lt;code&gt;Strategy&lt;/code&gt;, the most &lt;em&gt;relevant&lt;/em&gt; past episodes (via &lt;code&gt;recall&lt;/code&gt; from Part 5), and &lt;strong&gt;live trend signals gathered by web search&lt;/strong&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# ideate.generate(): build the prompt from learned state + recalled winners + trends
&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strategy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;next_seeds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strategy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;winning_patterns&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...])&lt;/span&gt;
&lt;span class="n"&gt;episodes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;recall_block&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="c1"&gt;# the relevant past, not the recent past
# → LLM returns new bets: {idea, hook, assumption, goal, theme, tags}
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So exploration isn't random either — it's &lt;em&gt;informed&lt;/em&gt; exploration: new bets that rhyme with what's working and with what's currently trending, each still a falsifiable hypothesis. (No LLM key? It falls back to deterministic seeds from the strategy.)&lt;/p&gt;

&lt;h2&gt;
  
  
  The loop, end to end
&lt;/h2&gt;

&lt;p&gt;Put together, the decision engine is a closed cycle:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ideate ──► backlog of planned bets (each: idea + hook + assumption + tags)
   ▲                │
   │                ▼
 learn        bandit.pick()  ── exploit proven theme+tags, explore uncertain ones
   ▲                │
   │                ▼
 measure ◄──── produce + publish  (the cheap pipeline from Parts 2–4)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One guard rail sits inside that &lt;code&gt;produce&lt;/code&gt; step. The bandit picks &lt;em&gt;what&lt;/em&gt; to make, but a topic isn't a script — so before any money is spent, the chosen bet's scenario passes through a &lt;strong&gt;content critic&lt;/strong&gt; that can send it back for a rewrite if the writing is hollow. The bandit chooses the bet; the critic guards the execution. That gate is its own &lt;a href="https://dev.to/dasein108/zero-to-autopilot-part-7-closing-the-loop-the-channel-that-runs-itself-2194"&gt;Part 7&lt;/a&gt; story (it exists because the autopilot, unsupervised, shipped an uninformative video); here it's enough to know the loop won't spend on a good pick with a bad script.&lt;/p&gt;

&lt;p&gt;And it's running on real bets right now. The journal's winning pattern — the heretic-mathematician format, &lt;code&gt;tag:heretic-format&lt;/code&gt; — means the bandit favors arms carrying that feature, which is why the backlog filled with Cantor (infinity → asylum), Galois (algebra → fatal duel), Russell (one sentence breaks math), Gödel (math can't prove itself). Each is the &lt;em&gt;same proven feature&lt;/em&gt; (heretic + tragedy + paradox) on a &lt;em&gt;new theme&lt;/em&gt; (set theory, group theory, logic) — textbook exploit-the-feature-while-exploring-the-instance. The bandit didn't invent the format; the memory learned it and the bandit is pressing it, while leaving room for the occasional wildcard to keep finding new winners.&lt;/p&gt;

&lt;p&gt;And those wildcards are real, not hypothetical. Alongside the math-mystery core, the loop has spent explore-picks on genuinely different lanes: deadpan academic humor ("how mathematicians catch a lion"), science-horror (a 100%-fatal-virus explainer), and a run of atmospheric Persian poetry. Each carries a &lt;code&gt;theme&lt;/code&gt;+&lt;code&gt;tags&lt;/code&gt; combination the posteriors had never seen, so their wide priors occasionally win the Thompson draw and buy a probe into fresh territory. &lt;em&gt;Which&lt;/em&gt; of those probes hardened into new winning arms is the quantitative reveal I'm saving for &lt;a href="https://dev.to/dasein108/zero-to-autopilot-part-7-closing-the-loop-the-channel-that-runs-itself-2194"&gt;Part 7&lt;/a&gt; — the point here is that the exploration is informed and deliberate, emerging from each arm's uncertainty, not a blind 40% dice roll.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'd tell another AI engineer
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Takeaway:&lt;/strong&gt; A fixed explore/exploit split is a code smell — it's a constant where you want a &lt;em&gt;posterior&lt;/em&gt;. Make exploration &lt;strong&gt;emerge from uncertainty&lt;/strong&gt;: per-feature Beta-Bernoulli posteriors, Thompson-sampled, and the wide posteriors of under-tried arms self-explore for free. Two domain details earned their keep: &lt;strong&gt;warm-start the prior from your own base rate&lt;/strong&gt; (a flat optimistic prior over-explores), and &lt;strong&gt;only use features knowable at decision time&lt;/strong&gt; as context (everything else is post-hoc attribution). Seed the RNG from state so an autonomous caller is reproducible. The result is a picker with one honest knob (&lt;code&gt;prior_strength&lt;/code&gt;) instead of a magic split.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;&lt;strong&gt;Next — &lt;a href="https://dev.to/dasein108/zero-to-autopilot-part-7-closing-the-loop-the-channel-that-runs-itself-2194"&gt;Part 7: Autopilot.&lt;/a&gt;&lt;/strong&gt; Every piece now exists — cheap production, memory, scoring, a bandit, ideation. The finale wires them into a scheduler that runs the whole loop unattended (handling the 48–72h measurement wait), and — finally — reveals the &lt;strong&gt;real numbers&lt;/strong&gt;: what the channel did, what the autonomous loop decided, and what actually worked.&lt;/p&gt;

&lt;p&gt;▶ &lt;strong&gt;Live effects gallery:&lt;/strong&gt; &lt;a href="https://dasein108.github.io/slope-studio/" rel="noopener noreferrer"&gt;dasein108.github.io/slope-studio&lt;/a&gt;&lt;br&gt;
⭐ &lt;strong&gt;Star the repo:&lt;/strong&gt; &lt;a href="https://github.com/dasein108/slope-studio" rel="noopener noreferrer"&gt;github.com/dasein108/slope-studio&lt;/a&gt;&lt;br&gt;
🔔 &lt;strong&gt;Subscribe&lt;/strong&gt; to watch the experiment grow from zero: &lt;a href="https://www.youtube.com/shorts/gaR76MiAK0U" rel="noopener noreferrer"&gt;the Lobachevsky Short&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>python</category>
      <category>reinforcementlearning</category>
    </item>
    <item>
      <title>Zero to Autopilot, Part 5: Teaching a YouTube Channel to Remember</title>
      <dc:creator>Maksims Gavrilovs</dc:creator>
      <pubDate>Thu, 11 Jun 2026 14:22:32 +0000</pubDate>
      <link>https://dev.to/dasein108/zero-to-autopilot-part-5-teaching-a-youtube-channel-to-remember-390g</link>
      <guid>https://dev.to/dasein108/zero-to-autopilot-part-5-teaching-a-youtube-channel-to-remember-390g</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Series:&lt;/strong&gt; &lt;em&gt;Zero to Autopilot — Building a Self-Improving AI Media Channel.&lt;/em&gt; Part 5 of 7. &lt;a href="https://dev.to/dasein108/zero-to-autopilot-part-1-i-built-an-ai-that-runs-a-youtube-channel-the-landscape-and-my-10-1ki6"&gt;Part 1&lt;/a&gt; landscape · &lt;a href="https://dev.to/dasein108/zero-to-autopilot-part-2-one-line-of-text-a-published-short-in-7-stages-inp"&gt;Part 2&lt;/a&gt; pipeline · &lt;a href="https://dev.to/dasein108/zero-to-autopilot-part-3-giving-a-still-image-real-motion-for-000-1a5b"&gt;Part 3&lt;/a&gt; free motion · &lt;a href="https://dev.to/dasein108/zero-to-autopilot-part-4-the-cost-collapse-1050-006-per-video-16j3"&gt;Part 4&lt;/a&gt; cost collapse, which together turn an idea into a published Short for &lt;strong&gt;six cents&lt;/strong&gt;. Now the back half: giving the channel a brain. This part is &lt;strong&gt;memory&lt;/strong&gt;; &lt;a href="https://dev.to/dasein108/zero-to-autopilot-part-6-a-thompson-sampling-bandit-that-picks-the-next-video-3bpn"&gt;Part 6&lt;/a&gt; is &lt;strong&gt;deciding&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data status (Part 5): real-now (qualitative).&lt;/strong&gt; The memory architecture and the &lt;em&gt;patterns&lt;/em&gt; it has already learned are real and shown below. The &lt;em&gt;quantitative&lt;/em&gt; virality scores are defined here but reported with real numbers in &lt;strong&gt;&lt;a href="https://dev.to/dasein108/zero-to-autopilot-part-7-closing-the-loop-the-channel-that-runs-itself-2194"&gt;Part 7&lt;/a&gt;&lt;/strong&gt;, after the data matures (≥1 week).&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1d8bwhc8y00rox1swugb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1d8bwhc8y00rox1swugb.png" alt="A frame from the channel's breakout winner — the video the memory system is built to learn from and repeat." width="768" height="1344"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Cheap content is a search problem
&lt;/h2&gt;

&lt;p&gt;Here's where Part 4 leaves us: I can make a hundred videos for six bucks. That sounds great until you realize it just moves the hard problem. Making videos was never the bottleneck — knowing &lt;strong&gt;which&lt;/strong&gt; videos to make is. A hundred random Shorts is a hundred coin flips. To make it a &lt;em&gt;search&lt;/em&gt;, the channel needs to remember what it tried and what happened.&lt;/p&gt;

&lt;p&gt;So I gave it a memory — and I modeled it on how human memory actually splits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Semantic memory&lt;/strong&gt; — the durable, generalized lessons ("tragic-genius math stories work").&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Episodic memory&lt;/strong&gt; — the specific events ("on June 3 I posted the Lobachevsky one and it hit 50×").&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Retrieval&lt;/strong&gt; — pulling the relevant episodes back up when facing a new decision.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In the code that's three pieces: a long-term &lt;code&gt;Strategy&lt;/code&gt;, an episodic &lt;code&gt;Entry[]&lt;/code&gt; ledger, and a &lt;code&gt;recall()&lt;/code&gt; function. All of it lives in a per-channel journal (&lt;code&gt;runs/_marketing/&amp;lt;channel&amp;gt;/journal.json&lt;/code&gt; + a human-readable &lt;code&gt;.md&lt;/code&gt;).&lt;/p&gt;

&lt;h2&gt;
  
  
  Every video is a falsifiable bet
&lt;/h2&gt;

&lt;p&gt;The unit of episodic memory is the &lt;code&gt;Entry&lt;/code&gt;, and its most important design choice is that &lt;strong&gt;a video isn't just content — it's a hypothesis.&lt;/strong&gt; Before anything renders, an entry states what it believes and how it'll be judged:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Entry&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;idea&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;hook&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;
    &lt;span class="n"&gt;assumption&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;   &lt;span class="c1"&gt;# WHY we think this goes viral  ← the falsifiable claim
&lt;/span&gt;    &lt;span class="n"&gt;goal&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;         &lt;span class="c1"&gt;# the target, e.g. "&amp;gt;=P75 virality vs the channel's portfolio"
&lt;/span&gt;    &lt;span class="n"&gt;theme&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;
    &lt;span class="n"&gt;tags&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;explore&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;   &lt;span class="c1"&gt;# an exploration bet, or exploiting a known winner?
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These aren't hypothetical — here are three real entries from my channel's journal, each a stated bet:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json-doc"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"idea"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"The Madman Who Counted Infinity: Cantor"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"hook"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"He proved some infinities are BIGGER than others — and it drove him to the asylum."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"assumption"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Counterintuitive 'sizes of infinity' + tragic-genius arc = the exact Lobachevsky formula that hit 50x."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"theme"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"infinity / set theory"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"tags"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"math-mystery"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s2"&gt;"heretic-format"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s2"&gt;"explore"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"idea"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"The Equation Written the Night Before a Duel: Galois"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"hook"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"A 20-year-old invented modern algebra in one night — then died in a duel at dawn."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"assumption"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Ticking-clock tragedy + 'one night of genius' is an irresistible curiosity gap."&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;

&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"idea"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"One Sentence That Destroyed All of Mathematics: Russell"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"assumption"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"'One sentence breaks everything' is a pure curiosity gap; paradoxes are trending."&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Writing the assumption &lt;em&gt;down, before publishing&lt;/em&gt; is the whole trick. When the numbers land, I'm not asking "did it do well?" — I'm asking "was my stated assumption &lt;strong&gt;right&lt;/strong&gt;?" That's the difference between a content diary and a science.&lt;/p&gt;

&lt;p&gt;And it pays off most when a bet is &lt;strong&gt;wrong&lt;/strong&gt;. I ran two videos in the same "deadpan academic humor" lane: one on the absurd, straight-faced &lt;em&gt;"how mathematicians catch a lion,"&lt;/em&gt; and one on relatable &lt;em&gt;"which scientist are you?"&lt;/em&gt; lab-personality bait. The first landed; the second didn't. Because both assumptions were on the record, the lesson came out &lt;em&gt;precise&lt;/em&gt; instead of vague: it isn't that "humor works," it's that the &lt;strong&gt;absurd, specific method&lt;/strong&gt; is the hook and broad relatability is not. Two falsifiable bets turned a hunch into a rule the next idea inherits — which is exactly what the reflection step (below) writes down.&lt;/p&gt;

&lt;h2&gt;
  
  
  The entry also remembers how it was made
&lt;/h2&gt;

&lt;p&gt;Each entry doesn't just record the bet and the outcome — it captures its own &lt;strong&gt;production telemetry&lt;/strong&gt;, pulled from the run manifest (the measured-cost ledger from Part 2 finally pays a second dividend):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;    &lt;span class="n"&gt;cost_usd&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;          &lt;span class="c1"&gt;# measured $ to produce
&lt;/span&gt;    &lt;span class="n"&gt;tier&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;                 &lt;span class="c1"&gt;# free | cheap | balanced | premium
&lt;/span&gt;    &lt;span class="n"&gt;video_model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;          &lt;span class="c1"&gt;# kling | ltx | … | kenburns
&lt;/span&gt;    &lt;span class="n"&gt;animators&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;      &lt;span class="c1"&gt;# distinct animators across scenes
&lt;/span&gt;    &lt;span class="n"&gt;effects&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;        &lt;span class="c1"&gt;# fx + atmosphere used
&lt;/span&gt;    &lt;span class="n"&gt;n_scenes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So later I can ask not just "do heretic-mathematician stories win?" but "do the &lt;strong&gt;Flux-Schnell, kinetic-heavy, 60-second&lt;/strong&gt; ones win?" The memory spans &lt;em&gt;content&lt;/em&gt; and &lt;em&gt;craft&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;That join turns out to matter more than I expected. Once cost, model, effects, music provider, SFX provider, and market outcome sit on the same row, the channel can ask craft questions too: did the $0.20 music bed actually earn its keep, or did a free synth drone do the job? Did the video win because of the topic, the sound, the animation style, or because the script finally had a real story? The first version of this was just "latest metrics." I later added age-bucket snapshots (1d, 3d, 7d, 14d, 30d) because comparing a one-day upload to a thirty-day upload is lying with extra steps. The real slice-and-compare receipts stay in &lt;a href="https://dev.to/dasein108/zero-to-autopilot-part-7-closing-the-loop-the-channel-that-runs-itself-2194"&gt;Part 7&lt;/a&gt;; the important design point here is that the memory row is no longer just an idea log. It's the place where production choices meet market feedback.&lt;/p&gt;
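
&lt;p&gt;A minimal sketch of the age-bucket comparison, under stated assumptions: each video is taken to carry a dict of view counts keyed by bucket name, and the helper names &lt;code&gt;common_bucket&lt;/code&gt; and &lt;code&gt;compare_views&lt;/code&gt; are hypothetical, not the repo's. All numbers are invented.&lt;/p&gt;

```python
BUCKETS = ["1d", "3d", "7d", "14d", "30d"]  # ordered youngest to oldest


def common_bucket(snap_a, snap_b):
    """Oldest age bucket that both videos have reached, or None."""
    shared = [b for b in BUCKETS if b in snap_a and b in snap_b]
    return shared[-1] if shared else None


def compare_views(snap_a, snap_b):
    """Views of both videos at the same age, so a fresh upload
    is not judged against a month of accumulated reach."""
    bucket = common_bucket(snap_a, snap_b)
    if bucket is None:
        return None
    return bucket, snap_a[bucket], snap_b[bucket]


# Invented numbers: one upload is a month old, the other three days old.
old_video = {"1d": 120, "3d": 400, "7d": 900, "14d": 1500, "30d": 2100}
new_video = {"1d": 300, "3d": 650}

result = compare_views(old_video, new_video)
print(result)
```

&lt;p&gt;With these numbers the comparison happens at the 3d bucket (400 vs 650 views), not at 2100 vs 650, which is the whole point of snapshotting by age.&lt;/p&gt;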
&lt;h2&gt;
  
  
  Scoring virality — against yourself
&lt;/h2&gt;

&lt;p&gt;When results come in, each entry gets a &lt;code&gt;virality&lt;/code&gt; score. The composite is deliberately simple and weighted toward what "viral" actually feels like — velocity — while guarding against cheap reach that doesn't convert:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;

&lt;span class="n"&gt;W_VELOCITY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;W_RETENTION&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;W_ENGAGEMENT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;W_SUBS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;virality&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;W_VELOCITY&lt;/span&gt;  &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log10&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;velocity&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;          &lt;span class="c1"&gt;# views/day, log-damped
&lt;/span&gt;        &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;W_RETENTION&lt;/span&gt;  &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;retention&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;
        &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;W_ENGAGEMENT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;engagement&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;      &lt;span class="c1"&gt;# ~5% engagement saturates
&lt;/span&gt;        &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;W_SUBS&lt;/span&gt;       &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subs_conv&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;       &lt;span class="c1"&gt;# ~2% sub-rate saturates (assumes subs_conv is a field on m)
&lt;/span&gt;    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
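
&lt;p&gt;To get a feel for how the four terms trade off, the composite can be run on made-up numbers. This is a sketch, not the repo code: the &lt;code&gt;Metrics&lt;/code&gt; dataclass, its field names, and every value are hypothetical, and &lt;code&gt;subs_conv&lt;/code&gt; is assumed to be a fraction stored next to the other metrics.&lt;/p&gt;

```python
import math
from dataclasses import dataclass

# Weights copied from the snippet above.
W_VELOCITY, W_RETENTION, W_ENGAGEMENT, W_SUBS = 0.5, 0.2, 0.2, 0.1


@dataclass
class Metrics:
    """Hypothetical metrics row; field names are assumptions, not the repo schema."""
    velocity: float     # views per day
    retention: float    # average percentage viewed, 0 to 100
    engagement: float   # interactions / views, as a fraction
    subs_conv: float    # subscribers gained / views, as a fraction


def virality(m):
    return (
        W_VELOCITY * math.log10(m.velocity + 1)
        + W_RETENTION * (m.retention or 0) / 100
        + W_ENGAGEMENT * min(m.engagement * 20, 1.0)
        + W_SUBS * min(m.subs_conv * 50, 1.0)
    )


# Two invented videos: a fast mover with weak retention,
# and a slow one that holds viewers and converts them.
fast = Metrics(velocity=999, retention=40, engagement=0.01, subs_conv=0.0)
slow = Metrics(velocity=9, retention=80, engagement=0.05, subs_conv=0.02)

print(round(virality(fast), 3))  # 1.5 + 0.08 + 0.04 + 0.0
print(round(virality(slow), 3))  # 0.5 + 0.16 + 0.2 + 0.1
```

&lt;p&gt;The arithmetic makes the weighting concrete: retention, engagement, and subscriber conversion together are capped at 0.5, while the log-damped velocity term has no ceiling, so 999 views/day contributes 1.5 by itself.&lt;/p&gt;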



&lt;p&gt;But an absolute score is meaningless for a small channel — 800 views might be a smash or a dud depending on your baseline. So the score that &lt;em&gt;decides&lt;/em&gt; anything is &lt;strong&gt;relative to the channel's own portfolio&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;relativize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scores&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;   &lt;span class="c1"&gt;# percentile rank within THIS channel's history
&lt;/span&gt;    &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scores&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;100.0&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;scores&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;scores&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;outcome&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;percentile&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cold_start&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;cold_start&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cold-start&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;percentile&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;75&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;win&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;percentile&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;loss&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;neutral&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A video is a &lt;strong&gt;win&lt;/strong&gt; if it lands in the top quartile &lt;em&gt;of my own videos&lt;/em&gt;, a &lt;strong&gt;loss&lt;/strong&gt; if it lands in the bottom quartile. Self-relative grading means the loop keeps working whether the channel does 50 views or 50,000: the bar is always &lt;em&gt;the top quartile of my own catalogue&lt;/em&gt;, and that bar rises as the catalogue improves, which is exactly what compounding growth needs. (The real percentile numbers go public in &lt;a href="https://dev.to/dasein108/zero-to-autopilot-part-7-closing-the-loop-the-channel-that-runs-itself-2194"&gt;Part 7&lt;/a&gt;.)&lt;/p&gt;
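
&lt;p&gt;Here is the grading applied to a toy portfolio, as a self-contained sketch. The eight scores are invented, and the &lt;code&gt;cold_start=False&lt;/code&gt; default is added only so the example stays short; the two functions otherwise follow the snippets above.&lt;/p&gt;

```python
def relativize(scores):
    """Percentile rank of each score within the same list (0 to 100)."""
    n = len(scores)
    return [round(100.0 * sum(x >= s for s in scores) / n, 1) for x in scores]


def outcome(percentile, cold_start=False):
    if cold_start:
        return "cold-start"
    if percentile >= 75:
        return "win"
    if 25 >= percentile:
        return "loss"
    return "neutral"


# Invented virality scores for an eight-video portfolio.
scores = [0.31, 0.42, 0.55, 0.58, 0.71, 0.96, 1.10, 1.62]
ranks = relativize(scores)
labels = [outcome(p) for p in ranks]

for s, p, label in zip(scores, ranks, labels):
    print(f"{s:.2f}  P{p:5.1f}  {label}")
```

&lt;p&gt;One property worth knowing: the rank counts the video itself, so the best video always sits at P100 and the worst at 100/n. With eight distinct scores that yields three wins, three neutrals, and two losses rather than a perfectly symmetric split.&lt;/p&gt;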

&lt;p&gt;Virality is the &lt;strong&gt;post-publish&lt;/strong&gt; eval — a verdict from the market, after the fact. It turns out to have a mirror image: a &lt;strong&gt;pre-spend&lt;/strong&gt; eval that judges a scenario &lt;em&gt;before&lt;/em&gt; a cent is spent — a content critic that asks "does this script actually reveal a fact and land a feeling?" and reworks it if not. Two judges, two timings: one on the idea after the audience sees it, one on the script before the camera rolls. The pre-spend critic earns its own story in &lt;a href="https://dev.to/dasein108/zero-to-autopilot-part-7-closing-the-loop-the-channel-that-runs-itself-2194"&gt;Part 7&lt;/a&gt; — it exists because the autopilot, left alone, cheerfully published something hollow.&lt;/p&gt;

&lt;h2&gt;
  
  
  Recall: pulling up the relevant past
&lt;/h2&gt;

&lt;p&gt;When the channel is about to decide what to make next (Part 6), it shouldn't reason from its entire history — it should pull the episodes &lt;em&gt;relevant to the current direction&lt;/em&gt;. That's &lt;code&gt;recall()&lt;/code&gt;, and I kept it deliberately dependency-free: relevance is lexical token-overlap, ties broken by virality, so a relevant winner outranks a relevant flop:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;recall&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Top-k measured episodes most relevant to `query`, best first.
    Ties broken by virality, so a relevant winner outranks a relevant flop.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;q&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;_tokens&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;scored&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="nf"&gt;_relevance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;q&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;_episode_tokens&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;virality&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
              &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;measured&lt;/span&gt;&lt;span class="p"&gt;()]&lt;/span&gt;                 &lt;span class="c1"&gt;# only measured bets have a lesson
&lt;/span&gt;    &lt;span class="n"&gt;scored&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sort&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]),&lt;/span&gt; &lt;span class="n"&gt;reverse&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="c1"&gt;# by relevance, then virality
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;rel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;scored&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;rel&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The seam is intentional — you could swap in embeddings here — but lexical works, costs nothing, and runs offline. The default in this whole project is "free and local unless paying clearly wins."&lt;/p&gt;

&lt;h2&gt;
  
  
  Reflection: turning outcomes into strategy
&lt;/h2&gt;

&lt;p&gt;The last piece closes the loop. After a few new videos are measured, a &lt;code&gt;reflect()&lt;/code&gt; step feeds the scored bets to an LLM and asks it to update the long-term strategy — what's winning, what's losing, what to try next:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Strategy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;niche&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;
    &lt;span class="n"&gt;current_direction&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;
    &lt;span class="n"&gt;winning_patterns&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;losing_patterns&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="n"&gt;next_seeds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;        &lt;span class="c1"&gt;# concrete idea seeds for the next ideation
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This isn't aspirational — it's the &lt;strong&gt;actual current strategy&lt;/strong&gt; in my channel's journal right now, rewritten by the LLM reflecting on real outcomes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json-doc"&gt;&lt;code&gt;&lt;span class="nl"&gt;"niche"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"math &amp;amp; physics mystery — rebels, paradoxes, forbidden knowledge (anime-noir visuals)"&lt;/span&gt;&lt;span class="err"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="nl"&gt;"winning_patterns"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="s2"&gt;"Outsider-genius figures, mysticism, and high personal stakes (early death, divine inspiration) in math/physics"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="s2"&gt;"Intellectual shock + curiosity gaps framed around 'everything breaking' or a foundational paradox"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="s2"&gt;"Absurdist, deadpan academic humor rooted in one specific bizarre concept (mathematicians hunting a lion)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="s2"&gt;"Highly active, vivid, grand imagery in short poetic forms — not contemplative or melancholic ones"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="err"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="nl"&gt;"losing_patterns"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="s2"&gt;"Contemplative, melancholic, abstract poetry that lacks active imagery and a dramatic hook"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="s2"&gt;"Pure science-horror missing the 'mystery / rebel / paradox' element central to the niche"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="s2"&gt;"Generic 'relatable academic humor' that isn't rooted in a truly absurd, deadpan concept"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="s2"&gt;"Historical mysteries lacking an immediate, shocking, or deeply personal angle"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The important thing isn't the list, it's that the list &lt;strong&gt;moved&lt;/strong&gt;. The very first lesson this loop ever recorded was the cat-anatomy flop from Part 1: &lt;em&gt;don't batch-dump near-identical clips&lt;/em&gt; (that series cannibalized itself at three-to-six views each). Everything above is what it has reflected its way toward &lt;em&gt;since&lt;/em&gt; — through the math-hero winners, then a deliberate push outside the core into deadpan humor and a run of poetry reels. Look at that first losing pattern: "melancholic poetry that lacks active imagery." The loop learned that from &lt;em&gt;my own&lt;/em&gt; poetry experiments underperforming, and wrote itself a rule about it. That's the system caught in the act of learning, not a strategy I typed in.&lt;/p&gt;

&lt;p&gt;There's a heuristic fallback too (top and bottom performers by score) so reflection still works with no LLM key, but with one the lessons get sharper and feed straight back into the next idea. &lt;code&gt;reflect()&lt;/code&gt; writes &lt;code&gt;Strategy&lt;/code&gt;; ideation (Part 6) reads it. The snake eats its tail, and gets smarter each lap.&lt;/p&gt;
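
&lt;p&gt;A rough sketch of what such a no-LLM fallback could look like. Everything here is an assumption made for illustration: the name &lt;code&gt;heuristic_reflect&lt;/code&gt;, the &lt;code&gt;Bet&lt;/code&gt; stand-in, the trimmed &lt;code&gt;Strategy&lt;/code&gt;, the pattern wording, and the scores. The repo's fallback may rank and phrase things differently.&lt;/p&gt;

```python
from dataclasses import dataclass, field


@dataclass
class Bet:
    """Stand-in for a measured Entry, trimmed to what the fallback needs."""
    idea: str
    theme: str
    virality: float


@dataclass
class Strategy:
    winning_patterns: list = field(default_factory=list)
    losing_patterns: list = field(default_factory=list)


def heuristic_reflect(bets, top_n=2):
    """Best-scoring themes become winning patterns, worst become losing ones."""
    ranked = sorted(bets, key=lambda b: b.virality, reverse=True)
    top = ranked[:top_n]
    # Skip anything already in `top` so a tiny portfolio
    # never lists the same bet as both a winner and a loser.
    bottom = [b for b in ranked[-top_n:] if b not in top]
    return Strategy(
        winning_patterns=[f"{b.theme} (e.g. {b.idea})" for b in top],
        losing_patterns=[f"{b.theme} (e.g. {b.idea})" for b in reversed(bottom)],
    )


# Invented scores and titles, for illustration only.
bets = [
    Bet("Cantor", "tragic-genius math", 1.4),
    Bet("Lion hunt", "absurd deadpan humor", 1.1),
    Bet("Which scientist are you?", "relatable humor", 0.4),
    Bet("Autumn elegy", "melancholic poetry", 0.2),
]
strategy = heuristic_reflect(bets)
print(strategy.winning_patterns)
print(strategy.losing_patterns)
```

&lt;p&gt;The output is blunter than an LLM reflection, since it can only echo themes instead of explaining why they worked, but it keeps the strategy file moving when no key is configured.&lt;/p&gt;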

&lt;h2&gt;
  
  
  What I'd tell another AI engineer
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Takeaway:&lt;/strong&gt; If you want a system that improves, make every action a &lt;strong&gt;falsifiable bet recorded before the outcome&lt;/strong&gt;: the idea, the &lt;em&gt;why&lt;/em&gt;, and the bar to clear. Split memory the way human memory splits, into a durable strategy, an episodic ledger, and cheap retrieval, and score outcomes &lt;strong&gt;relative to the agent's own history&lt;/strong&gt; so the loop is scale-invariant. Capture production telemetry alongside results so the agent can learn craft, not just content. None of this needs a vector DB or a fine-tune: a JSON ledger, a weighted score, token-overlap recall, and one reflection prompt already close the loop.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;&lt;strong&gt;Next — &lt;a href="https://dev.to/dasein108/zero-to-autopilot-part-6-a-thompson-sampling-bandit-that-picks-the-next-video-3bpn"&gt;Part 6: The Bandit&lt;/a&gt;.&lt;/strong&gt; Memory tells the channel what &lt;em&gt;worked&lt;/em&gt;; now it has to &lt;em&gt;decide&lt;/em&gt; what to try next, balancing exploiting known winners against exploring new bets. I'll wire up a warm-started Thompson-sampling bandit over theme+tags — the actual explore/exploit engine that picks the next video.&lt;/p&gt;

&lt;p&gt;▶ &lt;strong&gt;Live effects gallery:&lt;/strong&gt; &lt;a href="https://dasein108.github.io/slope-studio/" rel="noopener noreferrer"&gt;dasein108.github.io/slope-studio&lt;/a&gt;&lt;br&gt;
⭐ &lt;strong&gt;Star the repo:&lt;/strong&gt; &lt;a href="https://github.com/dasein108/slope-studio" rel="noopener noreferrer"&gt;github.com/dasein108/slope-studio&lt;/a&gt;&lt;br&gt;
🔔 &lt;strong&gt;Subscribe&lt;/strong&gt; to watch the experiment grow from zero: &lt;a href="https://www.youtube.com/shorts/gaR76MiAK0U" rel="noopener noreferrer"&gt;the Lobachevsky Short&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>python</category>
      <category>llm</category>
    </item>
    <item>
      <title>Zero to Autopilot, Part 4: The Cost Collapse — $10.50 → $0.06 per Video</title>
      <dc:creator>Maksims Gavrilovs</dc:creator>
      <pubDate>Mon, 08 Jun 2026 14:37:26 +0000</pubDate>
      <link>https://dev.to/dasein108/zero-to-autopilot-part-4-the-cost-collapse-1050-006-per-video-16j3</link>
      <guid>https://dev.to/dasein108/zero-to-autopilot-part-4-the-cost-collapse-1050-006-per-video-16j3</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Series:&lt;/strong&gt; &lt;em&gt;Zero to Autopilot — Building a Self-Improving AI Media Channel.&lt;/em&gt; Part 4 of 7. &lt;a href="https://dev.to/dasein108/zero-to-autopilot-part-1-i-built-an-ai-that-runs-a-youtube-channel-the-landscape-and-my-10-1ki6"&gt;Part 1&lt;/a&gt; landscape · &lt;a href="https://dev.to/dasein108/zero-to-autopilot-part-2-one-line-of-text-a-published-short-in-7-stages-inp"&gt;Part 2&lt;/a&gt; pipeline · &lt;a href="https://dev.to/dasein108/zero-to-autopilot-part-3-giving-a-still-image-real-motion-for-000-1a5b"&gt;Part 3&lt;/a&gt; free motion. Now the headline number: how a video went from &lt;strong&gt;$10.50 to six cents.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data status (Part 4): real-now.&lt;/strong&gt; Every figure is a measured &lt;code&gt;cost_usd&lt;/code&gt; from the manifest, not an estimate. Code is straight from the repo.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0wk4tia3i0udyochz7jg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0wk4tia3i0udyochz7jg.png" alt="This Short cost six cents to produce. A Flux Schnell still (~$0.005), free motion, free voice, a sliver of AI sound." width="800" height="1400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Where the money actually goes
&lt;/h2&gt;

&lt;p&gt;After Part 3, motion is free — I animate stills in ffmpeg for $0. So a video's cost collapses to just two line items that can cost real money:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Images&lt;/strong&gt; — one still per scene.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI video&lt;/strong&gt; — &lt;em&gt;if and only if&lt;/em&gt; I choose to use it on a scene.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Everything else (script on a local LLM, narration on edge-TTS, stitching, muxing, publishing) is already $0. So the cost game is entirely about those two knobs. Let's turn them down without making slop.&lt;/p&gt;

&lt;h2&gt;
  
  
  Knob 1: the per-second video bomb
&lt;/h2&gt;

&lt;p&gt;Recap of the villain from Part 1 — hosted AI image-to-video bills &lt;strong&gt;per second of output&lt;/strong&gt;. The cost of one clip isn't a flat fee; it's &lt;code&gt;duration × rate&lt;/code&gt;, snapped to the model's accepted duration grid:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# studio/providers/video.py
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;estimate_cost&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;seconds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;spec&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;FAL_MODELS&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;FAL_MODELS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;kling&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;_clip_dur&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;seconds&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;spec&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;per_s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="c1"&gt;# seconds × $/s
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At kling's $0.07/s, a 150-second Short with AI video on every scene is &lt;strong&gt;~$10.50&lt;/strong&gt;. That was my first video. The fix isn't a cheaper model (though &lt;code&gt;ltx&lt;/code&gt; at $0.04/s helps) — it's &lt;em&gt;using AI video far more selectively&lt;/em&gt;, which I'll get to. First, the cheaper knob.&lt;/p&gt;
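
&lt;p&gt;To make the arithmetic concrete, here is a minimal sketch of per-second billing with duration snapping. The two rates are the ones quoted above; the duration grid &lt;code&gt;ASSUMED_GRID_S&lt;/code&gt; and the helper names are illustrative assumptions, not the repo's &lt;code&gt;_clip_dur&lt;/code&gt;:&lt;/p&gt;

```python
# Rates quoted in this article (USD per second of generated video).
PER_SECOND_USD = {"kling": 0.07, "ltx": 0.04}

# Assumption for illustration only: the clip lengths a model accepts.
ASSUMED_GRID_S = (5, 10)


def snap_duration(seconds: float, grid=ASSUMED_GRID_S) -> int:
    """Smallest grid duration that covers `seconds`, else the longest one."""
    for candidate in sorted(grid):
        if candidate >= seconds:
            return candidate
    return max(grid)


def clip_cost(model: str, seconds: float) -> float:
    """Billed cost of one clip: snapped duration times the per-second rate."""
    return round(snap_duration(seconds) * PER_SECOND_USD[model], 4)


def all_ai_cost(model: str, scene_durations) -> float:
    """Cost of putting AI video on every scene."""
    return round(sum(clip_cost(model, s) for s in scene_durations), 2)


scenes = [10] * 15  # fifteen 10-second scenes = a 150-second Short
print(all_ai_cost("kling", scenes))  # 10.5
print(all_ai_cost("ltx", scenes))    # 6.0
print(clip_cost("kling", 7.2))       # a 7.2 s scene is billed as a 10 s clip
```

&lt;p&gt;Under that assumed grid, a 7.2-second scene is billed as a full 10-second clip, so trimming scene length saves less than it looks like it should. Skipping the clip entirely is what actually moves the bill.&lt;/p&gt;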

&lt;h2&gt;
  
  
  Knob 2: right-size the image model
&lt;/h2&gt;

&lt;p&gt;I had been defaulting every image to &lt;strong&gt;Nano Banana&lt;/strong&gt; ($0.039/img) — Google's Gemini 2.5 Flash Image. It's gorgeous and, crucially, supports character-reference consistency, which you want for photoreal or recurring-character content like my noir Kafka series:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4o5749e0d5wiy0y274n0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4o5749e0d5wiy0y274n0.png" alt="Nano Banana ($0.039): photoreal noir, character-consistent. Worth it when the look demands it." width="800" height="457"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;But a goofy "why do cats have fur" explainer doesn't need photoreal noir. It needs clean flat cartoon — and for that, &lt;strong&gt;Flux Schnell&lt;/strong&gt; at &lt;strong&gt;$0.003/megapixel&lt;/strong&gt; (~half a cent an image) is perfect:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0wk4tia3i0udyochz7jg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0wk4tia3i0udyochz7jg.png" alt="Flux Schnell (~$0.005): flat cartoon, ~8× cheaper. Right tool, right price." width="800" height="1400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Same pipeline, one config change, &lt;strong&gt;~8× cheaper images&lt;/strong&gt; when the style allows. The lesson generalizes: &lt;em&gt;don't pay for capabilities the scene doesn't use.&lt;/em&gt; Photoreal + character-ref? Nano Banana. Flat/graphic/cartoon? Flux. The system keeps both wired as &lt;code&gt;image&lt;/code&gt; and &lt;code&gt;image_cheap&lt;/code&gt;.&lt;/p&gt;
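
&lt;p&gt;That rule of thumb fits in a few lines. This is a sketch, not the repo's code: the provider names match the tier presets in this article, the per-image prices are the approximate figures quoted above, and the &lt;code&gt;photoreal&lt;/code&gt; and &lt;code&gt;needs_character_ref&lt;/code&gt; flags are hypothetical scene attributes I made up for the example:&lt;/p&gt;

```python
# Approximate per-image prices quoted in this article (USD).
IMAGE_COST_USD = {"fal-nanobanana": 0.039, "fal-flux-schnell": 0.005}


def pick_image_provider(photoreal: bool, needs_character_ref: bool) -> str:
    """Pay for the premium model only when the scene uses what it offers."""
    if photoreal or needs_character_ref:
        return "fal-nanobanana"
    return "fal-flux-schnell"


def image_bill(provider: str, n_scenes: int) -> float:
    """Total still-image cost for a video with one still per scene."""
    return round(IMAGE_COST_USD[provider] * n_scenes, 3)


cartoon = pick_image_provider(photoreal=False, needs_character_ref=False)
noir = pick_image_provider(photoreal=True, needs_character_ref=True)
print(cartoon, image_bill(cartoon, 15))
print(noir, image_bill(noir, 15))
```

&lt;p&gt;For a hypothetical 15-scene video, the difference is roughly $0.59 versus $0.08 in stills, which is the ~8× gap from the paragraph above.&lt;/p&gt;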

&lt;h2&gt;
  
  
  The tiers: one knob to set them all
&lt;/h2&gt;

&lt;p&gt;Rather than fiddle providers per stage, I bundled the choices into four tiers. This is the whole config:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# studio/tiers.py
&lt;/span&gt;&lt;span class="n"&gt;TIER_PRESETS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;free&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;     &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;card&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;voice&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;edge&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;       &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;strategy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;kenburns&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cheap&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fal-flux-schnell&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;voice&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;edge&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;      &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;strategy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;kenburns&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                 &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sfx&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;local&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;music&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;local&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;balanced&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fal-nanobanana&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;voice&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;edge&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;       &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;strategy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;auto&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;   &lt;span class="c1"&gt;# fill AI within budget
&lt;/span&gt;    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;premium&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;image&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fal-nanobanana&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;voice&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;openai-tts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;strategy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;all&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;    &lt;span class="c1"&gt;# AI every scene
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And the resulting cost ladder for a 150s Short:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tier&lt;/th&gt;
&lt;th&gt;Images&lt;/th&gt;
&lt;th&gt;Video strategy&lt;/th&gt;
&lt;th&gt;~Cost / 150s&lt;/th&gt;
&lt;th&gt;When&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;free&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;offline card&lt;/td&gt;
&lt;td&gt;Ken-Burns&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$0&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;wiring / drafts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;cheap&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Flux Schnell&lt;/td&gt;
&lt;td&gt;Ken-Burns&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~$0.06&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;budget volume&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;balanced&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Nano Banana&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;auto&lt;/code&gt; (AI on hero scenes)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;= your &lt;code&gt;--max-cost&lt;/code&gt;&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;best per dollar&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;premium&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Nano Banana&lt;/td&gt;
&lt;td&gt;AI every scene&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$6–10+&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;quality first&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;code&gt;--tier&lt;/code&gt; sets everything; any &lt;code&gt;--*-provider&lt;/code&gt; flag still overrides a single choice. The interesting one is &lt;code&gt;balanced&lt;/code&gt;, because of how &lt;code&gt;auto&lt;/code&gt; works.&lt;/p&gt;
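
&lt;p&gt;The "tier first, flags override" behavior can be sketched as a plain dictionary merge. &lt;code&gt;resolve_config&lt;/code&gt; is a hypothetical helper for illustration (the real CLI wiring isn't shown in this post), and the presets here are a trimmed copy of the ones above:&lt;/p&gt;

```python
# Trimmed copy of the presets above, just enough to show the merge.
TIER_PRESETS = {
    "cheap": {"image": "fal-flux-schnell", "voice": "edge", "strategy": "kenburns"},
    "balanced": {"image": "fal-nanobanana", "voice": "edge", "strategy": "auto"},
}


def resolve_config(tier: str, overrides=None) -> dict:
    """Start from the tier preset, then apply explicit per-stage overrides."""
    config = dict(TIER_PRESETS[tier])  # copy so the preset itself stays untouched
    for key, value in (overrides or {}).items():
        if value is not None:  # None stands for "flag was not passed"
            config[key] = value
    return config


config = resolve_config("cheap", {"image": "fal-nanobanana", "voice": None})
print(config)
```

&lt;p&gt;In this sketch the &lt;code&gt;cheap&lt;/code&gt; tier keeps its free voice and Ken-Burns motion, and only the image provider is swapped.&lt;/p&gt;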

&lt;h2&gt;
  
  
  &lt;code&gt;auto&lt;/code&gt;: spend the budget where it matters
&lt;/h2&gt;

&lt;p&gt;Most scenes are fine as a drifting still. A few — the hook, the climax, the outro — earn real AI motion. So &lt;code&gt;auto&lt;/code&gt; is a tiny greedy knapsack: rank scenes by priority, then spend the AI budget on the highest-priority ones that fit, Ken-Burns the rest.&lt;/p&gt;

&lt;p&gt;Priority is either explicitly set on a scene, or inferred by a &lt;strong&gt;hero heuristic&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# studio/stages/clips.py
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_effective_priority&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scene&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;scene&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;priority&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scene&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;priority&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mf"&gt;3.0&lt;/span&gt;    &lt;span class="c1"&gt;# the hook
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;total&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mf"&gt;2.5&lt;/span&gt;    &lt;span class="c1"&gt;# outro / CTA
&lt;/span&gt;    &lt;span class="c1"&gt;# ...else an evenly-spread beat gets a mid priority
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then fill the budget greedily, highest priority first:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;budget&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;max_cost&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;max_cost&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="nf"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;inf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;_effective_priority&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scenes&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;reverse&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;video&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;estimate_cost&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fal-i2v&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;scenes&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;duration_s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;spent&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;budget&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;per_scene&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;scenes&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fal-i2v&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;   &lt;span class="c1"&gt;# animate this one with AI
&lt;/span&gt;        &lt;span class="n"&gt;spent&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;
    &lt;span class="c1"&gt;# else: it stays Ken-Burns (free)
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So &lt;code&gt;--tier balanced --max-cost 1.50&lt;/code&gt; means: &lt;em&gt;"give me AI motion on the hook and a couple of key beats, free motion everywhere else, and never spend more than $1.50."&lt;/em&gt; You get the perceptual punch of AI video where viewers actually notice it, at a fraction of all-AI cost.&lt;/p&gt;
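
&lt;p&gt;Here is a self-contained toy version of that greedy fill you can run on its own. It is a sketch under stated assumptions: the &lt;code&gt;Scene&lt;/code&gt; dataclass, the flat mid priority of 1.0, and &lt;code&gt;plan_auto&lt;/code&gt; are stand-ins for illustration, and it skips duration snapping to keep the numbers easy to follow:&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class Scene:
    id: str
    duration_s: float
    priority: float = 0.0  # 0 means "not set, fall back to the heuristic"


def effective_priority(scene: Scene, index: int, total: int) -> float:
    if scene.priority:
        return float(scene.priority)
    if index == 0:
        return 3.0  # the hook
    if index >= total - 2:
        return 2.5  # outro / CTA
    return 1.0  # assumption: flat mid priority for every other beat


def plan_auto(scenes, per_s: float, max_cost: float):
    """Greedy fill: highest priority first, skip any clip that would not fit."""
    n = len(scenes)
    order = sorted(
        range(n),
        key=lambda i: (effective_priority(scenes[i], i, n), -i),
        reverse=True,  # ties go to the earlier scene because of the -i term
    )
    plan = {s.id: "kenburns" for s in scenes}
    spent = 0.0
    for i in order:
        cost = round(scenes[i].duration_s * per_s, 4)
        if max_cost >= spent + cost:
            plan[scenes[i].id] = "fal-i2v"  # animate this one with AI
            spent += cost
    return plan, round(spent, 4)


scenes = [Scene(f"s{i}", 10.0) for i in range(6)]
plan, spent = plan_auto(scenes, per_s=0.04, max_cost=1.50)
print(plan)
print(spent)
```

&lt;p&gt;With six 10-second scenes at the &lt;code&gt;ltx&lt;/code&gt; rate and a $1.50 cap, this toy run puts AI motion on the hook and the last two scenes and leaves the middle three on Ken-Burns.&lt;/p&gt;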

&lt;h2&gt;
  
  
  The pre-flight that refuses to overspend
&lt;/h2&gt;

&lt;p&gt;Costs are estimated &lt;strong&gt;before&lt;/strong&gt; a single API call. &lt;code&gt;auto&lt;/code&gt; trims to fit; the rigid strategies (&lt;code&gt;all&lt;/code&gt;/&lt;code&gt;hybrid&lt;/code&gt;) &lt;em&gt;abort&lt;/em&gt; if the estimate exceeds the budget rather than surprise you with a bill:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;studio estimate lobachevsky &lt;span class="nt"&gt;--budget&lt;/span&gt; 3
&lt;span class="gp"&gt;  kling   150s → $&lt;/span&gt;10.50   ❌ over budget
&lt;span class="gp"&gt;  ltx     150s → $&lt;/span&gt;6.00    ❌ over budget
&lt;span class="gp"&gt;  auto    (fills $&lt;/span&gt;3.00&lt;span class="o"&gt;)&lt;/span&gt;   ✅ AI on 6 hero scenes, Ken-Burns the rest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;studio run&lt;/code&gt; defaults to &lt;code&gt;--max-cost 3&lt;/code&gt; and the clips stage won't blow past it. A running guard backstops the estimate in case a provider returns something unexpected. The golden rule from Part 2 pays off here: because every provider reports its &lt;em&gt;real&lt;/em&gt; cost, the budget logic is exact, not hopeful.&lt;/p&gt;
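
&lt;p&gt;The two layers, a pre-flight check plus a running guard, might look roughly like this. Every name here (&lt;code&gt;OverBudget&lt;/code&gt;, &lt;code&gt;preflight&lt;/code&gt;, &lt;code&gt;RunningGuard&lt;/code&gt;) is hypothetical and only illustrates the idea; it is not the repo's implementation:&lt;/p&gt;

```python
class OverBudget(RuntimeError):
    """Raised when a run would exceed (or has exceeded) its cost cap."""


def preflight(strategy: str, estimate_usd: float, max_cost_usd: float) -> None:
    """Abort rigid strategies before any API call if the estimate is too high."""
    if strategy in ("all", "hybrid") and estimate_usd > max_cost_usd:
        raise OverBudget(
            f"{strategy}: estimated ${estimate_usd:.2f} exceeds cap ${max_cost_usd:.2f}"
        )


class RunningGuard:
    """Backstop: add up provider-reported costs and stop once the cap is crossed."""

    def __init__(self, max_cost_usd: float):
        self.max_cost_usd = max_cost_usd
        self.spent_usd = 0.0

    def charge(self, reported_cost_usd: float) -> None:
        self.spent_usd += reported_cost_usd
        if self.spent_usd > self.max_cost_usd:
            raise OverBudget(
                f"spent ${self.spent_usd:.2f}, cap ${self.max_cost_usd:.2f}"
            )


try:
    preflight("all", estimate_usd=10.50, max_cost_usd=3.0)
except OverBudget as err:
    print("aborted:", err)
```

&lt;p&gt;The pre-flight check is the one that saves money, since it fires before anything is billed. The guard in this sketch only limits the damage if a reported cost comes back higher than estimated.&lt;/p&gt;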

&lt;h2&gt;
  
  
  The receipts
&lt;/h2&gt;

&lt;p&gt;Same ~150s video, every tier, measured from the manifests:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Build&lt;/th&gt;
&lt;th&gt;Images&lt;/th&gt;
&lt;th&gt;Video&lt;/th&gt;
&lt;th&gt;Sound&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;premium (my first video)&lt;/td&gt;
&lt;td&gt;Nano Banana&lt;/td&gt;
&lt;td&gt;kling, every scene&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~$10.50&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;balanced&lt;/td&gt;
&lt;td&gt;Nano Banana ($0.585)&lt;/td&gt;
&lt;td&gt;a few AI clips ($0.75)&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$1.34&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;cheap (Nano + free motion)&lt;/td&gt;
&lt;td&gt;Nano Banana&lt;/td&gt;
&lt;td&gt;Ken-Burns&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$0.585&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;cheap (Flux + free motion + AI SFX)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Flux ($0.054)&lt;/td&gt;
&lt;td&gt;Ken-Burns&lt;/td&gt;
&lt;td&gt;$0.0076&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$0.06&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;$10.50 → $0.06. About a 175× cut&lt;/strong&gt;, and the cheap version isn't a toy — it's a published Short with real narration, free motion, and atmosphere. The quality lever moved to &lt;strong&gt;art direction and pacing&lt;/strong&gt; (free), not the size of the model bill.&lt;/p&gt;

&lt;p&gt;A fair caveat, though: $0.06 is the &lt;em&gt;floor&lt;/em&gt; — a deliberately minimal Short. Once I turn the art-direction layer all the way up — parallax with generated plates, atmosphere, a vintage grade, a few Nano-Banana hero stills where they earn it — a &lt;strong&gt;fully art-directed, near-premium&lt;/strong&gt; video lands around &lt;strong&gt;$0.15–0.25&lt;/strong&gt;. That's still &lt;strong&gt;40–65× cheaper&lt;/strong&gt; than the ~$10 all-AI cut, at quality I genuinely can't tell apart in a feed. So read this as a ladder, not a single number:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Build&lt;/th&gt;
&lt;th&gt;~Cost&lt;/th&gt;
&lt;th&gt;When&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;floor (minimal effects)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$0.06&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;volume, throwaway tests&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;fully effected, near-premium&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~$0.15–0.25&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;the realistic everyday build&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;premium (AI video every scene)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~$10&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;almost never worth it&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The honest anchor is that middle rung. "The $0.06 Short" is the hook; "a great-looking Short for a quarter" is the number I actually run on.&lt;/p&gt;

&lt;h2&gt;
  
  
  A field update: what the catalog actually cost
&lt;/h2&gt;

&lt;p&gt;I wrote that ladder as a forecast. Since then I've built a real back-catalog, so I can replace the forecast with the receipts — and the receipts are blunter than I expected. Across the dated runs in the repo, the &lt;strong&gt;median cost is well under a cent&lt;/strong&gt;, and the cheapest &lt;em&gt;published&lt;/em&gt; Shorts — full 60-second explainers with narration and free motion — measured &lt;strong&gt;$0.006&lt;/strong&gt;. That's a tenth of the $0.06 I just called the floor. The real floor turned out an order of magnitude lower:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Real video (measured from its manifest)&lt;/th&gt;
&lt;th&gt;What it used&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Cost&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Chandrasekhar (60s)&lt;/td&gt;
&lt;td&gt;1 Flux still, free motion, edge-TTS&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$0.006&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gödel, "math can't prove itself" (60s)&lt;/td&gt;
&lt;td&gt;2 Flux stills, free motion, edge-TTS&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$0.012&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Galois, "the duel"&lt;/td&gt;
&lt;td&gt;Nano stills + a little AI SFX&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$0.18&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rabies (60s)&lt;/td&gt;
&lt;td&gt;5 Nano stills + SFX + a music bed&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$0.41&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fermat, "the margin note"&lt;/td&gt;
&lt;td&gt;Nano stills + &lt;code&gt;ltx&lt;/code&gt; AI clips + music&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$0.78&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;What moves the needle is never the script or the motion; those are free in every row. It's exactly three opt-in knobs: &lt;strong&gt;Nano stills instead of Flux&lt;/strong&gt; (about $0.14–0.20 a video), the &lt;strong&gt;paid audio layer&lt;/strong&gt; (AI SFX plus a stable-audio music bed, about $0.20), and any &lt;strong&gt;AI video clips&lt;/strong&gt; (&lt;code&gt;ltx&lt;/code&gt; at $0.40 a hero beat). Turn all three off and you land at about six-tenths of a cent ($0.006). Turn all three on and you're &lt;em&gt;still&lt;/em&gt; under a dollar. The only way back to a $10 video is AI motion on every scene, which, as the receipts above keep saying, you almost never should.&lt;/p&gt;

&lt;h2&gt;
  
  
  The one line item I never cut: sound
&lt;/h2&gt;

&lt;p&gt;Cost-optimizing sounds like "cut everything," but the real skill is knowing what punches above its price, and then &lt;em&gt;keeping&lt;/em&gt; it. The audio layer is the clearest case. It runs from about &lt;strong&gt;$0.0076&lt;/strong&gt; a video (AI sound effects only) to about &lt;strong&gt;$0.20&lt;/strong&gt; (SFX plus a music bed), small change next to a per-second AI video bill, and it does more for perceived quality than anything else on the list.&lt;/p&gt;

&lt;p&gt;The reason is that sound doesn't just decorate the picture — it cues the viewer's imagination to render the rest. A gust of wind, a distant bell, a low cello under a line of narration: the still shows a single frozen frame, but the soundscape makes the mind supply the motion, the depth, and the room the scene lives in. A fuller "video" plays out in the viewer's head that the image never actually contained. A real share of the production value a viewer &lt;em&gt;feels&lt;/em&gt; is happening behind their own eyes, prompted by a few cents of audio.&lt;/p&gt;

&lt;p&gt;So when I trim cost, sound is the last thing to go, and usually it never does. It's the highest return-on-investment line in the whole pipeline: pennies for atmosphere and liveness you can't buy any other way. "Right-size the spend" cuts both directions — kill the costs that don't earn their keep, and protect the cheap ones that punch far above their weight.&lt;/p&gt;

&lt;h2&gt;
  
  
  And cheaper actually wins
&lt;/h2&gt;

&lt;p&gt;That last claim isn't theoretical. My most expensive video was the premium Lobachevsky cut — AI video on every scene, ~$10.50, hours of fussing. One of my cheapest real bets was Ramanujan: 8 Nano-Banana stills, free ffmpeg motion plus a sliver of cheap &lt;code&gt;ltx&lt;/code&gt; on the hero beats, &lt;strong&gt;$0.65 measured, start to finish in about an hour&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;🎬 &lt;strong&gt;&lt;a href="https://www.youtube.com/shorts/rsk8XruZWBQ" rel="noopener noreferrer"&gt;Ramanujan: Math's Divine Genius → youtube.com/shorts/rsk8XruZWBQ&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The 65-cent video &lt;strong&gt;outperformed&lt;/strong&gt; the ten-dollar one. (Full numbers land in &lt;a href="https://dev.to/dasein108/zero-to-autopilot-part-7-closing-the-loop-the-channel-that-runs-itself-2194"&gt;Part 7&lt;/a&gt;, per the series' data policy — but the direction is already unambiguous.) That's the empirical version of the whole argument: once free motion clears the "doesn't look like slop" bar, extra dollars buy shockingly little. Production quality is barely a success factor — the hook, the subject, and the story are. So the right move is to floor the cost and spend your real effort on &lt;em&gt;which&lt;/em&gt; videos to make.&lt;/p&gt;

&lt;p&gt;And you don't have to take my word that this scales. Channels like &lt;strong&gt;&lt;a href="https://www.youtube.com/@cuentosdelachoza" rel="noopener noreferrer"&gt;Cuentos de la Choza&lt;/a&gt;&lt;/strong&gt; — Spanish folklore and horror tales — sit at &lt;strong&gt;400k+ subscribers across 1,200+ videos&lt;/strong&gt;, built on AI-generated stills, narration, and simple motion. Sit with that catalog size for a second: at 1,200 videos, &lt;em&gt;nobody&lt;/em&gt; is paying per-second for AI video on every scene. The unit economics simply don't allow it. The "post at volume" play and the "drive cost to the floor" play are the same play — which is the entire reason the rest of this series exists.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this is the whole ballgame
&lt;/h2&gt;

&lt;p&gt;A $10 video is a precious artifact you agonize over. A six-cent video is an &lt;em&gt;experiment&lt;/em&gt;. At six cents, a hundred attempts costs six dollars — so I can stop guessing what works and start &lt;em&gt;measuring&lt;/em&gt; it. Cheap unit cost is what turns "make content" into "run a search over content."&lt;/p&gt;

&lt;p&gt;Which raises the obvious question: if I can cheaply make hundreds of videos, &lt;strong&gt;which&lt;/strong&gt; hundreds should I make? That needs a brain — a memory of what worked and a way to decide what to try next. That's the back half of this series.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'd tell another AI engineer
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Takeaway:&lt;/strong&gt; Cost-optimize by &lt;em&gt;removing capabilities you aren't using&lt;/em&gt;, not by buying the cheapest everything. Free motion killed the per-second video bill; right-sizing the image model (photoreal vs flat) cut images ~8×; an &lt;code&gt;auto&lt;/code&gt; strategy spends the remaining budget only on the scenes that perceptually earn it; and a pre-flight estimate makes the cap exact. The payoff isn't the saved dollars — it's that a cheap-enough unit cost converts a craft into a &lt;em&gt;search&lt;/em&gt;, which is the only thing that makes the learning loop (next) affordable.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;&lt;strong&gt;Next — &lt;a href="https://dev.to/dasein108/zero-to-autopilot-part-5-teaching-a-youtube-channel-to-remember-390g"&gt;Part 5: Memory &amp;amp; Self-Reflection&lt;/a&gt;.&lt;/strong&gt; Now that videos are cheap, the channel needs to remember. I'll build the per-channel journal — a long-term strategy plus an episodic ledger of every bet, with virality scoring and an LLM reflection step that turns measured outcomes into an updated game plan.&lt;/p&gt;

&lt;p&gt;▶ &lt;strong&gt;Live effects gallery:&lt;/strong&gt; &lt;a href="https://dasein108.github.io/slope-studio/" rel="noopener noreferrer"&gt;dasein108.github.io/slope-studio&lt;/a&gt;&lt;br&gt;
⭐ &lt;strong&gt;Star the repo:&lt;/strong&gt; &lt;a href="https://github.com/dasein108/slope-studio" rel="noopener noreferrer"&gt;github.com/dasein108/slope-studio&lt;/a&gt;&lt;br&gt;
🔔 &lt;strong&gt;Subscribe&lt;/strong&gt; to watch the experiment grow from zero: &lt;a href="https://www.youtube.com/shorts/gaR76MiAK0U" rel="noopener noreferrer"&gt;the Lobachevsky Short&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>cost</category>
      <category>video</category>
    </item>
    <item>
      <title>Zero to Autopilot, Part 3: Giving a Still Image Real Motion for $0.00</title>
      <dc:creator>Maksims Gavrilovs</dc:creator>
      <pubDate>Mon, 08 Jun 2026 01:13:35 +0000</pubDate>
      <link>https://dev.to/dasein108/zero-to-autopilot-part-3-giving-a-still-image-real-motion-for-000-1a5b</link>
      <guid>https://dev.to/dasein108/zero-to-autopilot-part-3-giving-a-still-image-real-motion-for-000-1a5b</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Series:&lt;/strong&gt; &lt;em&gt;Zero to Autopilot — Building a Self-Improving AI Media Channel.&lt;/em&gt; Part 3 of 7. &lt;a href="https://dev.to/dasein108/zero-to-autopilot-part-1-i-built-an-ai-that-runs-a-youtube-channel-the-landscape-and-my-10-1ki6"&gt;Part 1&lt;/a&gt; was the landscape and my $10 wake-up call; &lt;a href="https://dev.to/dasein108/zero-to-autopilot-part-2-one-line-of-text-a-published-short-in-7-stages-inp"&gt;Part 2&lt;/a&gt; was the 7-stage pipeline. This one is the engineering centerpiece: replacing paid AI video with &lt;strong&gt;free&lt;/strong&gt; motion.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data status: real-now&lt;/strong&gt; — real ffmpeg filtergraphs from the repo. Every effect here is playing in the live gallery (&lt;strong&gt;&lt;a href="https://dasein108.github.io/slope-studio/" rel="noopener noreferrer"&gt;dasein108.github.io/slope-studio&lt;/a&gt;&lt;/strong&gt;); code is &lt;a href="https://github.com/dasein108/slope-studio" rel="noopener noreferrer"&gt;open source&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1ne3q46wtspdr03o2jv9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1ne3q46wtspdr03o2jv9.png" alt="Six free looks on one generated still — grain, vignette, chroma, glitch, sunrise, and a colour-graded variant. None of them cost a cent." width="744" height="870"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Viewers don't need generated video. They need motion.
&lt;/h2&gt;

&lt;p&gt;The recap from Part 1 is one line of arithmetic: hosted AI image-to-video bills &lt;strong&gt;per second&lt;/strong&gt; — kling at $0.07/s makes a 150-second Short cost about &lt;strong&gt;$10.50&lt;/strong&gt;. Fine for a single hero shot; absurd as the default for every scene when the whole strategy depends on making hundreds of cheap experiments.&lt;/p&gt;
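
&lt;p&gt;That arithmetic, written out as a tiny sketch. The helper name is hypothetical; the price and duration are the ones quoted above.&lt;/p&gt;

```python
# Hypothetical helper: per-second billing for hosted image-to-video.
def clip_cost(seconds, price_per_second):
    """Return the cost in dollars of generating `seconds` of video."""
    return seconds * price_per_second


# Figures from the text: kling at $0.07/s, a 150-second Short.
full_short = clip_cost(150, 0.07)
print(f"every scene as AI video: ${full_short:.2f}")

# The same bill at the scale the strategy depends on.
print(f"a hundred of them: ${100 * full_short:.2f}")
```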

&lt;p&gt;But viewers were never asking for &lt;em&gt;generated&lt;/em&gt; video. They want the &lt;strong&gt;feeling&lt;/strong&gt; of motion: a still that drifts, breathes, and cuts on the beat holds attention perfectly well. I'd internalized this years ago shipping indie games, where the entire craft is faking expensive things with cheap math — no budget for a particle artist, so you write a particle system; no budget for animation, so you parallax-scroll a few layers and call it atmosphere. The same instinct ports straight to AI media. Everything below is one still image, ffmpeg, and zero dollars.&lt;/p&gt;

&lt;h2&gt;
  
  
  ffmpeg is the whole trick: an effect is a string
&lt;/h2&gt;

&lt;p&gt;The quiet hero here is &lt;strong&gt;ffmpeg&lt;/strong&gt;. It ships with roughly 400 built-in filters, and an "effect" is just a few of them chained with commas — no render engine, no GPU shaders, no SDK, no per-call cost. One binary you already have. Every motion in this series is an ffmpeg filtergraph, which means &lt;em&gt;adding an effect is adding a string.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Here is the entire implementation of &lt;code&gt;oldfilm&lt;/code&gt;, the vintage look:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[0:v]colorchannelmixer=.393:.769:.189:0:.349:.686:.168:0:.272:.534:.131,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;  &lt;span class="c1"&gt;# → sepia
&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;eq=contrast=1.12:saturation=0.82:brightness=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;0.035*sin(27*t)+0.025*sin(11*t)&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;  &lt;span class="c1"&gt;# flicker
&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;noise=alls=22:allf=t,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;     &lt;span class="c1"&gt;# film grain, re-rolled every frame
&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;vignette=PI/4[v]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;          &lt;span class="c1"&gt;# darkened corners
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Read it like a Unix pipe; each comma is "then":&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;colorchannelmixer&lt;/code&gt; — a 3×3 RGB matrix that maps the image to a sepia tone.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;eq=…brightness='…sin(t)…'&lt;/code&gt; — &lt;code&gt;t&lt;/code&gt; is the frame's timestamp, so brightness &lt;em&gt;wobbles&lt;/em&gt; over time: the projector-gate flicker. &lt;strong&gt;Time expressions are what make an effect animate&lt;/strong&gt; — &lt;code&gt;sin(t)&lt;/code&gt; here, a creeping &lt;code&gt;zoom&lt;/code&gt; in Ken-Burns next.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;noise=allf=t&lt;/code&gt; — the &lt;code&gt;t&lt;/code&gt; (temporal) flag re-randomizes the grain every frame, so it shimmers instead of sitting frozen.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;vignette=PI/4&lt;/code&gt; — darken the corners.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Four stock filters, one string, and it moves. A glitch is &lt;code&gt;rgbashift&lt;/code&gt; + &lt;code&gt;noise&lt;/code&gt;; chromatic aberration is just &lt;code&gt;rgbashift&lt;/code&gt;; rain is a particle layer composited with &lt;code&gt;overlay&lt;/code&gt;. The reason this channel can afford hundreds of videos isn't a cheaper model — it's that the effect budget is a text editor and &lt;code&gt;ffmpeg -filter_complex&lt;/code&gt;.&lt;/p&gt;
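
&lt;p&gt;To make "adding an effect is adding a string" concrete, here is a minimal sketch of how such a chain could be assembled and handed to ffmpeg. The filter strings are the ones quoted above; the helper names (&lt;code&gt;filtergraph&lt;/code&gt;, &lt;code&gt;ffmpeg_args&lt;/code&gt;) and the file names are hypothetical, not taken from the repo. The command is only built here, not executed.&lt;/p&gt;

```python
# Sketch: an effect is a list of filter strings joined with commas.
# Helper names and file names are hypothetical.
OLDFILM = [
    "colorchannelmixer=.393:.769:.189:0:.349:.686:.168:0:.272:.534:.131",
    "eq=contrast=1.12:saturation=0.82:brightness='0.035*sin(27*t)+0.025*sin(11*t)'",
    "noise=alls=22:allf=t",
    "vignette=PI/4",
]


def filtergraph(filters, src="[0:v]", dst="[v]"):
    """Chain filters left to right between an input pad and an output pad."""
    return src + ",".join(filters) + dst


def ffmpeg_args(image, out, seconds, filters):
    """Build (but do not run) an ffmpeg command that animates one still."""
    return [
        "ffmpeg", "-y",
        "-loop", "1", "-i", image,      # repeat the still as a video stream
        "-t", str(seconds),             # stop after this many seconds
        "-filter_complex", filtergraph(filters),
        "-map", "[v]",                  # encode the graph's labelled output
        out,
    ]


args = ffmpeg_args("still.png", "oldfilm.mp4", 5, OLDFILM)
print(args[args.index("-filter_complex") + 1])
```

&lt;p&gt;Passing the arguments as a list (for example to &lt;code&gt;subprocess.run&lt;/code&gt;) sidesteps shell quoting, which matters once expressions contain quotes and parentheses.&lt;/p&gt;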

&lt;h2&gt;
  
  
  The effect families
&lt;/h2&gt;

&lt;p&gt;That one binary buys a whole vocabulary. The catalog sorts into a handful of families, each answering a different question — &lt;em&gt;what does this scene need?&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Camera motion&lt;/strong&gt; — &lt;code&gt;kenburns&lt;/code&gt;, &lt;code&gt;motion-drift{left,right,up,down}&lt;/code&gt;, &lt;code&gt;motion-zoom{in,out}&lt;/code&gt;, &lt;code&gt;pulse&lt;/code&gt;. The cheapest possible life: a still pans, drifts, breathes. The default for most scenes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Depth&lt;/strong&gt; — &lt;code&gt;parallax&lt;/code&gt;, &lt;code&gt;blurred-parallax&lt;/code&gt;. Real 2.5D: the foreground subject holds still while the background drifts behind it. For scenery with a clear subject.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Kinetic type&lt;/strong&gt; — &lt;code&gt;kinetic&lt;/code&gt;. Emphasis: a headline slides in over the shot. For the hook or a key stat, not every scene.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Atmosphere&lt;/strong&gt; — &lt;code&gt;rain&lt;/code&gt;, &lt;code&gt;snow&lt;/code&gt;, &lt;code&gt;fog&lt;/code&gt;, &lt;code&gt;embers&lt;/code&gt;, &lt;code&gt;blood&lt;/code&gt;, &lt;code&gt;petals&lt;/code&gt;, &lt;code&gt;leaves&lt;/code&gt;, &lt;code&gt;wind&lt;/code&gt;. Mood and a sense of place — the emotional weather, composited for free.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Colour &amp;amp; look grades&lt;/strong&gt; — &lt;code&gt;grain&lt;/code&gt;, &lt;code&gt;vignette&lt;/code&gt;, &lt;code&gt;oldfilm&lt;/code&gt;, &lt;code&gt;sunrise&lt;/code&gt;, &lt;code&gt;sunset&lt;/code&gt;, &lt;code&gt;godrays&lt;/code&gt;, &lt;code&gt;chroma&lt;/code&gt;. Tone and era. This family does the most to separate &lt;em&gt;intentional&lt;/em&gt; from &lt;em&gt;slop&lt;/em&gt;: grain and a vignette alone (the cover image is one still run through six of these) read as "graded by someone who cares."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Impact&lt;/strong&gt; — &lt;code&gt;flash[-white/-yellow/-red/-black]&lt;/code&gt;, &lt;code&gt;blood&lt;/code&gt;. A 2–3 frame punch for an action beat. Rare by design.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Characters&lt;/strong&gt; — &lt;code&gt;puppet&lt;/code&gt; (a cutout figure that hops or nods), &lt;code&gt;talkinghead&lt;/code&gt; (Rhubarb lip-sync). A figure that &lt;em&gt;acts&lt;/em&gt; or &lt;em&gt;speaks&lt;/em&gt;, with no avatar model.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vector&lt;/strong&gt; — &lt;code&gt;manim&lt;/code&gt;. Literal concept and maths visualization, 3Blue1Brown-style. The education power tool (and the one I haven't tamed — more below).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transitions&lt;/strong&gt; — &lt;code&gt;cut&lt;/code&gt;, &lt;code&gt;fade&lt;/code&gt;, &lt;code&gt;dissolve&lt;/code&gt;, &lt;code&gt;wipeleft&lt;/code&gt;, &lt;code&gt;slideup&lt;/code&gt;, &lt;code&gt;slice&lt;/code&gt;. Rhythm: how one scene becomes the next.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;They're all the same idea underneath — a filtergraph string — so the rest of this piece takes apart the three most interesting ones.&lt;/p&gt;

&lt;h2&gt;
  
  
  How the motion is wired
&lt;/h2&gt;

&lt;p&gt;Each scene names an &lt;code&gt;animator&lt;/code&gt;, and one dispatch function routes to the implementation. The important property is the fallback, which comes after the branches excerpted here: &lt;strong&gt;anything that fails falls back to Ken-Burns and records why in the manifest&lt;/strong&gt;, so a missing optional dependency degrades the look instead of breaking the render.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# studio/animate.py
&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;animator&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;kenburns&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;kenburns&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;   &lt;span class="n"&gt;ffmpeg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ken_burns&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dst&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;seconds&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;motion-&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;    &lt;span class="n"&gt;ffmpeg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;motion&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dst&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;seconds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;preset&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;kinetic&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;             &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;_kinetic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scene&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dst&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;seconds&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;parallax&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;_parallax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scene&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dst&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;seconds&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;slice&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;               &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;_slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scene&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dst&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;seconds&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;puppet&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;              &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;_puppet&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scene&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dst&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;seconds&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;talkinghead&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;         &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;_talkinghead&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scene&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dst&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;seconds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;audio&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;manim&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;               &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;_manim&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;scene&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;dst&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;seconds&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
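
&lt;p&gt;The fallback behaviour can be sketched in a few lines. This is an illustration under stated assumptions, not the repo's code: the animator functions are stand-ins that only return a label, and the manifest layout (&lt;code&gt;requested&lt;/code&gt;, &lt;code&gt;used&lt;/code&gt;, &lt;code&gt;fallback_reason&lt;/code&gt;) is invented for the example.&lt;/p&gt;

```python
# Sketch of dispatch-with-fallback. The animators are stand-ins and the
# manifest layout is an assumption.
def ken_burns(image, dst, seconds):
    return "kenburns"


def parallax(image, dst, seconds):
    # Simulate a missing optional dependency such as rembg.
    raise ImportError("rembg is not installed")


ANIMATORS = {"kenburns": ken_burns, "parallax": parallax}


def animate(animator, image, dst, seconds, manifest):
    """Run the requested animator; on any failure, fall back to Ken-Burns."""
    name = (animator or "kenburns").strip() or "kenburns"
    try:
        used = ANIMATORS[name](image, dst, seconds)
        reason = None
    except Exception as exc:  # unknown name (KeyError) or a failed render
        used = ken_burns(image, dst, seconds)
        reason = f"{name}: {type(exc).__name__}: {exc}"
    manifest.append({"requested": name, "used": used, "fallback_reason": reason})
    return used


manifest = []
animate("parallax", "still.png", "scene01.mp4", 4.0, manifest)
animate(None, "still.png", "scene02.mp4", 4.0, manifest)
for entry in manifest:
    print(entry)
```

&lt;p&gt;Recording the reason next to the scene is what keeps a silent degradation debuggable later: the render succeeds, and the manifest still says which look was lost and why.&lt;/p&gt;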



&lt;p&gt;The workhorse, Ken-Burns, is a single &lt;code&gt;zoompan&lt;/code&gt; expression — over-scale the source 2× first so the crop never reaches an edge:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# studio/ffmpeg.py — ken_burns()
&lt;/span&gt;&lt;span class="n"&gt;vf&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;crop=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;h&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
      &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;zoompan=z=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;min(zoom+0.0012,1.12)&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;:d=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;frames&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;:s=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;x&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;h&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;:fps=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;fps&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
      &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;x=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;iw/2-(iw/zoom/2)&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;:y=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ih/2-(ih/zoom/2)&lt;/span&gt;&lt;span class="sh"&gt;'"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;z='min(zoom+0.0012,1.12)'&lt;/code&gt; creeps the zoom in a hair per frame, capped at 1.12×. The &lt;code&gt;motion-*&lt;/code&gt; presets are the same machine with different &lt;code&gt;z&lt;/code&gt;/&lt;code&gt;x&lt;/code&gt;/&lt;code&gt;y&lt;/code&gt; expressions — a whole family of movement from one filtergraph.&lt;/p&gt;
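
&lt;p&gt;A quick worked check of those two expressions, as plain Python. The step and cap come from the filtergraph above; the 30 fps rate and the 2160-pixel source width are illustrative assumptions, and the zoom is taken to start at 1.0.&lt;/p&gt;

```python
# Worked numbers for z='min(zoom+0.0012,1.12)' and x='iw/2-(iw/zoom/2)'.
# The 30 fps rate and 2160-pixel source width are illustrative assumptions.
STEP, CAP = 0.0012, 1.12


def frames_to_cap(step=STEP, cap=CAP):
    """Frames until the zoom, starting at 1.0, reaches its cap."""
    return round((cap - 1.0) / step)


def crop_x(iw, zoom):
    """Left edge of the zoompan window that keeps the view centred."""
    return iw / 2 - (iw / zoom / 2)


frames = frames_to_cap()
print(frames, "frames to reach the cap")
print(round(frames / 30, 2), "seconds at 30 fps")
print(round(crop_x(2160, CAP), 1), "px left offset at full zoom")
```

&lt;p&gt;Under those assumptions the zoom hits its cap after 100 frames, a little over three seconds at 30 fps, so on a longer scene the shot creeps in and then holds at 1.12×.&lt;/p&gt;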

&lt;h2&gt;
  
  
  Parallax, the one effect ffmpeg can't do alone
&lt;/h2&gt;

&lt;p&gt;Parallax — hold the subject still, drift the background behind it for depth — is the exception to "an effect is a string." ffmpeg can composite layers but it can't &lt;em&gt;find&lt;/em&gt; a subject, so this one needs a small, very indie-dev hack first: &lt;code&gt;rembg&lt;/code&gt; cuts the subject (the static foreground), Python builds a clean background plane, and only then does ffmpeg drift the back and &lt;code&gt;overlay&lt;/code&gt; the front.&lt;/p&gt;

&lt;p&gt;The "clean background" is the whole problem. The naive version drifts the &lt;em&gt;original&lt;/em&gt; still behind the cutout — but that still already contains the subject, so you get a creepy &lt;strong&gt;ghost twin&lt;/strong&gt; smearing across the back. The fix is to give ffmpeg a background that's complete &lt;em&gt;behind&lt;/em&gt; the subject, two ways:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Inpaint it out of the same image&lt;/strong&gt; (default) — a free blur-diffusion fill: repeatedly blur, then re-stamp the known pixels so the subject's hole heals with its surroundings.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Generate a separate plate&lt;/strong&gt; — re-prompt the scene &lt;em&gt;without&lt;/em&gt; the subject (&lt;code&gt;--parallax-plates&lt;/code&gt;, +1 still). Cleaner, no inpaint guesswork.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# studio/animate.py — _inpaint_subject() (heal the subject's hole)
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;iters&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;blurred&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ImageFilter&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;GaussianBlur&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;radius&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;bg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;composite&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;blurred&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;subject_mask&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# keep outside, heal inside
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
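
&lt;p&gt;To see why "blur, then re-stamp the known pixels" heals a hole, here is a dependency-free sketch of the same idea on a tiny grayscale grid. It is an illustration, not the repo's implementation: a 3×3 box blur stands in for the Gaussian blur, the function names are hypothetical, and the mask polarity is made explicit (&lt;code&gt;True&lt;/code&gt; marks the hole to be filled).&lt;/p&gt;

```python
# Blur-diffusion fill on a small grid. A 3x3 box blur stands in for the
# Gaussian blur; names and grid size are illustrative assumptions.
def box_blur(grid):
    """3x3 mean blur with edge clamping."""
    h, w = len(grid), len(grid[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += grid[yy][xx]
            row.append(total / 9.0)
        out.append(row)
    return out


def heal_hole(grid, hole, iters=50):
    """Repeatedly blur, then re-stamp known pixels, so only the hole changes."""
    original = [row[:] for row in grid]
    current = [row[:] for row in grid]
    for _ in range(iters):
        blurred = box_blur(current)
        current = [
            [blurred[y][x] if hole[y][x] else original[y][x]
             for x in range(len(grid[0]))]
            for y in range(len(grid))
        ]
    return current


# 5x5 background of brightness 100 with a 3x3 "subject" hole in the middle.
SIZE = 5
hole = [[y in (1, 2, 3) and x in (1, 2, 3) for x in range(SIZE)] for y in range(SIZE)]
image = [[0.0 if hole[y][x] else 100.0 for x in range(SIZE)] for y in range(SIZE)]
healed = heal_hole(image, hole)
print([round(v, 2) for v in healed[2]])
```

&lt;p&gt;Each pass lets the surrounding colour bleed one step further into the hole while the known pixels are pinned back to their original values, so the fill converges on a smooth continuation of the background. With Pillow, note that &lt;code&gt;Image.composite&lt;/code&gt; takes the first image where the mask is white and the second where it is black, so mask polarity decides which side gets healed.&lt;/p&gt;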



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk42cg613wp95oq8pbumv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fk42cg613wp95oq8pbumv.png" alt="The subject (left) is cut out and the background (right) healed, so the drifting plane has no ghost." width="652" height="575"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;There's also a cheaper third option that &lt;em&gt;embraces&lt;/em&gt; the twin: blur the drifting plane hard so the duplicate melts into soft bokeh (&lt;code&gt;blurred-parallax&lt;/code&gt;) — on busy backgrounds it reads as dreamy depth-of-field rather than a brittle cutout. A bug turned into a second legitimate look.&lt;/p&gt;

&lt;h2&gt;
  
  
  Text, and the font library that wasn't there
&lt;/h2&gt;

&lt;p&gt;Kinetic type slides a headline in over a gently pulsing still. The text is rendered by Pillow into a transparent PNG and &lt;code&gt;overlay&lt;/code&gt;-ed with an animated &lt;code&gt;y&lt;/code&gt; so it rises into place:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# headline rises and settles over the first 0.6s
&lt;/span&gt;&lt;span class="n"&gt;over&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[bg][t]overlay=x=(W-w)/2:y=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;H*0.18 - 50*min(t/0.6,1)&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;:format=auto[v]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
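
&lt;p&gt;The &lt;code&gt;y&lt;/code&gt; expression is easier to reason about when evaluated by hand. This sketch mirrors it in plain Python; the function name and the 1920-pixel frame height are assumptions for illustration.&lt;/p&gt;

```python
# Mirrors y='H*0.18 - 50*min(t/0.6,1)' in plain Python.
# The 1920-pixel frame height is an illustrative assumption.
def headline_y(t, frame_h, rise_px=50.0, settle_s=0.6):
    """Top edge of the headline PNG at time t (seconds)."""
    progress = min(t / settle_s, 1.0)   # 0 at the cut, 1 once settled
    return frame_h * 0.18 - rise_px * progress


for t in (0.0, 0.3, 0.6, 2.0):
    print(t, round(headline_y(t, 1920), 1))
```

&lt;p&gt;The headline starts 18% of the way down the frame, rises 50 pixels at a constant rate over the first 0.6 s, then stays put, because &lt;code&gt;min(…,1)&lt;/code&gt; clamps the progress. Swapping the linear ramp for an easing curve would be a change to that one sub-expression.&lt;/p&gt;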



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff5irmonldq99f9t16hyb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff5irmonldq99f9t16hyb.png" alt="A kinetic headline over a pulsing still — the text is a Pillow PNG, animated in ffmpeg." width="300" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Why Pillow and not ffmpeg's &lt;code&gt;drawtext&lt;/code&gt;? Because the box this renders on has an ffmpeg built &lt;strong&gt;without &lt;code&gt;libfreetype&lt;/code&gt; and without &lt;code&gt;libass&lt;/code&gt;&lt;/strong&gt; — so &lt;code&gt;drawtext&lt;/code&gt; and &lt;code&gt;subtitles=&lt;/code&gt; both simply fail. Rather than fight the build, I render &lt;em&gt;all&lt;/em&gt; text — headlines and burned caption strips alike — as Pillow PNGs and overlay them. The constraint forced a more portable design that happens to give pixel-perfect typographic control.&lt;/p&gt;

&lt;h2&gt;
  
  
  Choosing the effect: the model proposes, code constrains
&lt;/h2&gt;

&lt;p&gt;A library this size is worthless if every scene defaults to Ken-Burns — which is exactly where this started. So a small art-direction layer (&lt;code&gt;studio/artdirect.py&lt;/code&gt;) decides, with a deliberately hybrid policy:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;script model proposes&lt;/strong&gt; a per-scene &lt;code&gt;animator&lt;/code&gt; / &lt;code&gt;atmosphere&lt;/code&gt; / &lt;code&gt;fx&lt;/code&gt; / &lt;code&gt;transition&lt;/code&gt;, choosing from a documented menu in its prompt, so the picks match the scene's mood — a duel gets &lt;code&gt;embers&lt;/code&gt; and a red &lt;code&gt;flash&lt;/code&gt;; a memory gets &lt;code&gt;oldfilm&lt;/code&gt;; a landscape gets &lt;code&gt;parallax&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;deterministic pass then constrains it&lt;/strong&gt;: it validates the names, fills anything the model skipped with position and keyword heuristics (hook → &lt;code&gt;kinetic&lt;/code&gt;, scenery → &lt;code&gt;parallax&lt;/code&gt;), and applies taste caps — a &lt;code&gt;flash&lt;/code&gt; is an impact, so it survives on at most one scene; a single atmosphere can't blanket the whole video.&lt;/li&gt;
&lt;/ul&gt;
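
&lt;p&gt;A minimal sketch of what such a constraining pass might look like. Everything here is illustrative rather than the contents of &lt;code&gt;studio/artdirect.py&lt;/code&gt;: the menus are a subset of the effect names listed earlier, and the scene fields, the keyword rule, and the &lt;code&gt;constrain&lt;/code&gt; function are assumptions.&lt;/p&gt;

```python
# Sketch of "model proposes, code constrains". The menus are a subset of the
# effect names listed earlier; scene fields and rules are assumptions.
ANIMATORS = {"kenburns", "parallax", "kinetic", "puppet", "talkinghead", "manim"}
FX = {"flash-red", "flash-white", "oldfilm", "grain", "vignette"}


def constrain(scenes, max_flashes=1):
    """Validate proposed names, fill gaps with heuristics, apply taste caps."""
    flashes = 0
    result = []
    for index, scene in enumerate(scenes):
        animator = scene.get("animator")
        if animator not in ANIMATORS:            # unknown or missing name
            text = scene.get("text", "").lower()
            if index == 0:
                animator = "kinetic"             # the hook gets a headline
            elif "landscape" in text:
                animator = "parallax"            # scenery gets depth
            else:
                animator = "kenburns"
        kept = []
        for name in scene.get("fx", []):
            if name not in FX:
                continue                         # drop names off the menu
            if name.startswith("flash"):
                if flashes == max_flashes:
                    continue                     # cap reached: drop the flash
                flashes += 1
            kept.append(name)
        result.append({**scene, "animator": animator, "fx": kept})
    return result


proposed = [
    {"text": "He was shot at dawn."},
    {"text": "A misty landscape.", "animator": "paralax", "fx": ["flash-red"]},
    {"text": "The duel.", "animator": "kenburns", "fx": ["flash-red", "sparkle"]},
]
constrained = constrain(proposed)
for scene in constrained:
    print(scene["animator"], scene["fx"])
```

&lt;p&gt;This version keeps the first flash it meets; a real pass could rank candidate scenes before applying the cap. The point is only that every rule is deterministic, so the same proposal always yields the same art direction.&lt;/p&gt;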

&lt;p&gt;"Model proposes, code constrains" recurs throughout this project; it's a good default whenever you want a model's judgement without its inconsistency. And because the same pass runs on the keyless &lt;code&gt;stub&lt;/code&gt; path, every video gets real art direction instead of a wall of identical pans.&lt;/p&gt;

&lt;p&gt;One concrete payoff: cheap punctuation for violence without gore (which also keeps the image model's content filter happy). A red &lt;code&gt;flash&lt;/code&gt; on the cut plus a &lt;code&gt;blood&lt;/code&gt; overlay, a few frames total — the viewer's mind fills in the rest, the narration carries the meaning, and it costs nothing.&lt;/p&gt;

&lt;h2&gt;
  
  
  The one I haven't cracked: manim
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.manim.community/" rel="noopener noreferrer"&gt;Manim&lt;/a&gt;, the engine behind 3Blue1Brown, is the most promising tool here and the least solved. True vector animation — a circle morphing into a square, a graph plotting itself, an equation transforming term by term — is close to a cheat code for an educational channel, rendered crisp for $0. A scene can carry a &lt;code&gt;manim_code&lt;/code&gt; field the model writes, and the pipeline renders it.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl6f4bgajb5zrisq9ip9s.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fl6f4bgajb5zrisq9ip9s.png" alt="Manim demos rendering as real vectors in the gallery — a rising sun, a morphing shape, a sine curve drawing itself, an orbit, a bar chart." width="800" height="287"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The catch is getting a model to author &lt;em&gt;good, literal, compiling&lt;/em&gt; manim on demand. It reaches for abstract moving lines when what sells is the literal shape; the code is indentation-sensitive; and a meaningful fraction of generated scenes fail and fall back to Ken-Burns. For now it's hand-authored for hero beats, not trusted to the loop — the single biggest unlock left for the educational side, and squarely on the roadmap. If you've cracked LLM→manim, I genuinely want to hear it.&lt;/p&gt;

&lt;h2&gt;
  
  
  And the ears
&lt;/h2&gt;

&lt;p&gt;Visual motion is only half of "not slop"; a silent Short feels dead. So there's a matching audio layer — AI-generated sound effects plus a music bed ducked under the narration via sidechain compression (the voice always wins; the bed sits at −24 dB). On one Short that entire layer cost &lt;strong&gt;$0.0076&lt;/strong&gt;. The "make it feel produced" budget, picture and sound together, rounds to zero.&lt;/p&gt;
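&lt;p&gt;A minimal sketch of how such a ducking graph could be assembled, not the repo's actual code. The function names, stream labels, input order, and compressor settings (threshold, ratio, attack, release) are assumptions for illustration; only the −24 dB bed level comes from the paragraph above. It uses ffmpeg's &lt;code&gt;asplit&lt;/code&gt;, &lt;code&gt;volume&lt;/code&gt;, &lt;code&gt;sidechaincompress&lt;/code&gt;, and &lt;code&gt;amix&lt;/code&gt; filters.&lt;/p&gt;

```python
def ducking_filtergraph(bed_db=-24.0, threshold=0.05, ratio=8.0):
    """Build an ffmpeg -filter_complex string that ducks music under a voice.

    Assumes input 0 is the narration and input 1 is the music bed.
    """
    return (
        # Split the voice: one copy is heard, one copy keys the compressor.
        "[0:a]asplit=2[voice][key];"
        # Drop the bed to its resting level.
        f"[1:a]volume={bed_db}dB[bed];"
        # Compress the bed whenever the key (the voice) crosses the threshold.
        f"[bed][key]sidechaincompress=threshold={threshold}:ratio={ratio}"
        ":attack=20:release=300[ducked];"
        # Mix the untouched voice back over the ducked bed.
        "[voice][ducked]amix=inputs=2:duration=first[out]"
    )


def db_to_gain(db):
    """Convert decibels to a linear amplitude multiplier."""
    return 10 ** (db / 20)


print(ducking_filtergraph())
print(round(db_to_gain(-24.0), 4))
```

&lt;p&gt;The string would be passed to &lt;code&gt;ffmpeg -filter_complex&lt;/code&gt; with &lt;code&gt;-map "[out]"&lt;/code&gt;. The second helper is only there to make the level concrete: −24 dB is a linear gain of roughly 0.063, so the bed sits at about one sixteenth of full amplitude before the compressor pushes it lower.&lt;/p&gt;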

&lt;h2&gt;
  
  
  The road not taken: self-hosting the video model
&lt;/h2&gt;

&lt;p&gt;There's a tempting middle path I should address, because every engineer asks it: the video models are open-weight now — why not run one &lt;em&gt;locally&lt;/em&gt; and get real AI video for free too? I have a MacBook M4 with 36 GB of unified memory, so I wired a local &lt;a href="https://github.com/comfyanonymous/ComfyUI" rel="noopener noreferrer"&gt;ComfyUI&lt;/a&gt; + &lt;strong&gt;Wan 2.2 5B&lt;/strong&gt; backend into the pipeline as a &lt;code&gt;local-i2v&lt;/code&gt; provider and found out. Short version: it works, it's free, and it's a draft-tier toy you should keep out of your render path.&lt;/p&gt;

&lt;p&gt;The log, honestly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;fp8 weights are broken on Apple's MPS backend&lt;/strong&gt; — they load and produce &lt;code&gt;NaN&lt;/code&gt;. So everything is GGUF-quantized (Wan 5B at Q4 ≈ 3.4 GB, plus a ~3.6 GB text encoder).&lt;/li&gt;
&lt;li&gt;The &lt;em&gt;full-precision&lt;/em&gt; version (~22 GB resident) plus the video VAE-decode spike blew past physical RAM, and because MPS has no real offload, macOS &lt;strong&gt;swapped and hung the whole machine&lt;/strong&gt; — not the process, the OS. The fix is a PyTorch MPS watermark cap so a runaway allocation kills the process cleanly instead.&lt;/li&gt;
&lt;li&gt;Even stable, it's slow: a &lt;strong&gt;2-second clip took about 15 minutes&lt;/strong&gt;, and per-step time &lt;em&gt;balloons&lt;/em&gt; once memory pressure starts evicting.&lt;/li&gt;
&lt;li&gt;And it improvises. On the Persian-miniature still below, Wan added genuine motion — then warped the ornate border and &lt;strong&gt;invented a hooded figure&lt;/strong&gt; that wasn't in the source.&lt;/li&gt;
&lt;/ul&gt;
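&lt;p&gt;A minimal sketch of that watermark cap, assuming it is applied through PyTorch's &lt;code&gt;PYTORCH_MPS_HIGH_WATERMARK_RATIO&lt;/code&gt; and &lt;code&gt;PYTORCH_MPS_LOW_WATERMARK_RATIO&lt;/code&gt; environment variables before the worker process starts. The helper name, the 0.7 and 0.5 ratios, and the commented launch command are illustrative assumptions, not values from the repo.&lt;/p&gt;

```python
import os


def mps_capped_env(high_ratio=0.7, low_ratio=0.5, base=None):
    """Return a copy of the environment with MPS allocator caps set.

    The ratios are relative to the recommended max working set size.
    PyTorch reads them when the MPS allocator initializes, so they must
    be in place before the process that imports torch starts.
    """
    if low_ratio != min(low_ratio, high_ratio):
        raise ValueError("low watermark must not exceed the high watermark")
    env = dict(os.environ if base is None else base)
    env["PYTORCH_MPS_HIGH_WATERMARK_RATIO"] = str(high_ratio)
    env["PYTORCH_MPS_LOW_WATERMARK_RATIO"] = str(low_ratio)
    return env


env = mps_capped_env()
print(env["PYTORCH_MPS_HIGH_WATERMARK_RATIO"])
# Hypothetical launch of the local worker with the cap applied:
# subprocess.run(["python", "main.py"], env=env, check=True)
```

&lt;p&gt;With a cap like this, an allocation past the high watermark fails inside the process as an out-of-memory error, which is much easier to recover from than a machine that has swapped itself to a standstill.&lt;/p&gt;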

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc38b6tntspvqo0mjozte.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc38b6tntspvqo0mjozte.png" alt="Real motion from a locally-run Wan clip — and a hooded figure the model hallucinated that wasn't in the source still. Free, impressive that it runs, and not production." width="480" height="832"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Set that against the hosted option — kling renders a 6-second hero clip in under a minute for about 42 cents — and "free" local generation costs you 15+ minutes, a fragile machine, and a worse result. Free isn't free when it's measured in wall-clock. So the verdict loops right back to this article's thesis: free ffmpeg motion for the overwhelming majority of scenes, a few cents of &lt;em&gt;hosted&lt;/em&gt; video for the rare hero shot, and if you must run local, cap it to 1–2 seconds of motion on one or two scenes and Ken-Burns the rest. It stays in the repo as a draft-tier provider — glad I tried it, glad I didn't ship it.&lt;/p&gt;

&lt;p&gt;That last pattern is exactly how I built &lt;a href="https://www.youtube.com/watch?v=fos3g5OAP5k" rel="noopener noreferrer"&gt;this 55-second Rubaiyat reel&lt;/a&gt;: two of its four scenes got ~2 seconds of local Wan motion (then hold the last frame for the rest of the line), the other two are pure Ken Burns — total video-generation cost, $0. It's the honest sweet spot for local i2v on a Mac: a brief breath of real generated motion where it counts, free camera motion everywhere else.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'd tell another AI engineer
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;Before paying a generative model, ask what the viewer actually needs — usually the &lt;em&gt;perception&lt;/em&gt; of motion and intention, not literally generated video. A &lt;code&gt;zoompan&lt;/code&gt; expression, a parallax composite, a grain overlay, and a ducked music bed deliver that for nothing, and the indie-game-dev instinct (fake the expensive thing with cheap math) ports directly to AI media. Route every effect through one module, give each a graceful fallback, and the pipeline gets cheaper &lt;em&gt;and&lt;/em&gt; sturdier at once.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;&lt;strong&gt;Next — &lt;a href="https://dev.to/dasein108/zero-to-autopilot-part-4-the-cost-collapse-1050-006-per-video-16j3"&gt;Part 4: The Cost Collapse&lt;/a&gt;, $10 → $0.06.&lt;/strong&gt; With motion free, the full cost model: per-second video math, right-sizing the image model (Nano Banana vs Flux Schnell), the tier system, the &lt;code&gt;auto&lt;/code&gt; strategy that spends only on hero scenes, and the &lt;code&gt;--max-cost&lt;/code&gt; pre-flight that refuses to overspend.&lt;/p&gt;

&lt;p&gt;▶ &lt;strong&gt;Live effects gallery:&lt;/strong&gt; &lt;a href="https://dasein108.github.io/slope-studio/" rel="noopener noreferrer"&gt;dasein108.github.io/slope-studio&lt;/a&gt;&lt;br&gt;
⭐ &lt;strong&gt;Star the repo:&lt;/strong&gt; &lt;a href="https://github.com/dasein108/slope-studio" rel="noopener noreferrer"&gt;github.com/dasein108/slope-studio&lt;/a&gt;&lt;br&gt;
🔔 &lt;strong&gt;Subscribe&lt;/strong&gt; to watch the experiment grow from zero: &lt;a href="https://www.youtube.com/shorts/gaR76MiAK0U" rel="noopener noreferrer"&gt;the channel&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ffmpeg</category>
      <category>python</category>
      <category>ai</category>
      <category>video</category>
    </item>
    <item>
      <title>Zero to Autopilot, Part 2: One Line of Text → a Published Short, in 7 Stages</title>
      <dc:creator>Maksims Gavrilovs</dc:creator>
      <pubDate>Sat, 06 Jun 2026 06:17:19 +0000</pubDate>
      <link>https://dev.to/dasein108/zero-to-autopilot-part-2-one-line-of-text-a-published-short-in-7-stages-inp</link>
      <guid>https://dev.to/dasein108/zero-to-autopilot-part-2-one-line-of-text-a-published-short-in-7-stages-inp</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Series:&lt;/strong&gt; &lt;em&gt;Zero to Autopilot — Building a Self-Improving AI Media Channel.&lt;/em&gt; Part 2 of 7. &lt;a href="https://dev.to/dasein108/zero-to-autopilot-part-1-i-built-an-ai-that-runs-a-youtube-channel-the-landscape-and-my-10-1ki6"&gt;Part 1&lt;/a&gt; covered the landscape and my $10 wake-up call. This one is the architecture: how a single line of text becomes an uploaded Short without me ever opening a video editor.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data status (Part 2): real-now.&lt;/strong&gt; Code, file layout, and measured costs straight from the repo. No audience metrics — those are sandbagged to Part 7.&lt;/p&gt;

&lt;p&gt;⭐ &lt;strong&gt;The whole thing is open source: &lt;a href="https://github.com/dasein108/slope-studio" rel="noopener noreferrer"&gt;github.com/dasein108/slope-studio&lt;/a&gt;.&lt;/strong&gt; Clone along — there's a zero-API-key smoke test at the bottom.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh6hs3s6u05bs6wep3vys.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh6hs3s6u05bs6wep3vys.png" alt="The opening frame of the Lobachevsky Short — railroad tracks vanishing into a question mark. This is what " width="768" height="1344"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The mental model: a video is a Makefile
&lt;/h2&gt;

&lt;p&gt;Most "AI video generator" tools are a single monolith — one giant button, one black box, and when scene 14 comes out cursed you get to regenerate all 14. I've shipped enough software to know that's the wrong shape.&lt;/p&gt;

&lt;p&gt;So I stole the model from build systems: &lt;strong&gt;a video is a directed pipeline of stages, each stage is a pure function from files to files, and the whole thing is idempotent.&lt;/strong&gt; Re-run a stage, it skips work that's already done. Blow away one artifact, only that stage (and its dependents) rebuild. It's &lt;code&gt;make&lt;/code&gt; with a YouTube upload at the end.&lt;/p&gt;

&lt;p&gt;Here's the pipeline, top to bottom:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; idea ──► [1 script] ──► 01_script.json        (timed scenes + narration)
            │
            ├──► [2 visuals] ──► 02_visuals/scene_NN.png
            │
            ├──► [2.5 narrate] ─► 05_voice/scenes/*.mp3 + timing.json + captions.srt
            │
            ├──► [3 clips] ────► 03_clips/scene_NN.mp4   (animate the stills)
            │
            ├──► [4 stitch] ───► 04_stitched.mp4         (transitions, no audio)
            │
            ├──► [5 voice] ────► 05_voice/final.mp4      (TTS + music muxed)
            │
            ├──► [6 save] ─────► 06_final.mp4            (platform master)
            │
            └──► [7 publish] ──► YouTube
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every arrow writes a file. Every file lives under one run directory. Which brings us to the most important design decision in the whole project.&lt;/p&gt;

&lt;h2&gt;
  
  
  Everything is a file under &lt;code&gt;runs/&amp;lt;id&amp;gt;/&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;No database. No hidden state. One run = one directory, and the directory &lt;strong&gt;is&lt;/strong&gt; the state:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;runs/lobachevsky/
├── project.json          # the manifest: provider + cost + done-flag per stage
├── 01_script.json        # scenes, narration, title, hashtags
├── 02_visuals/scene_01..15.png
├── 03_clips/scene_NN.mp4
├── 04_stitched.mp4
├── 05_voice/
│   ├── scenes/*.mp3       # per-scene TTS
│   ├── timing.json        # per-scene durations (drives clip lengths)
│   ├── captions.srt
│   └── final.mp4
├── 06_final.mp4          # the master you upload
├── 06_final.json         # SEO title/description/tags
└── 07_publish.json       # the YouTube video id, once live
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This sounds almost too simple, but it buys you everything:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Debuggability&lt;/strong&gt; — something looks off? Open the PNG. Read the JSON. No "inspect the pipeline state" tooling needed; &lt;code&gt;ls&lt;/code&gt; and an image viewer are the debugger.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Resumability&lt;/strong&gt; — kill the process at scene 9, restart, it picks up at scene 9.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Idempotency&lt;/strong&gt; — each stage checks for its own output and skips the work if it's already there. Re-running &lt;code&gt;visuals&lt;/code&gt; won't re-bill you for 15 images you already have (pass &lt;code&gt;--force&lt;/code&gt; when you actually want to regenerate).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Version control of &lt;em&gt;artifacts&lt;/em&gt;&lt;/strong&gt; — every authored video in the repo is a folder you can diff, copy, or hand-edit.&lt;/li&gt;
&lt;/ul&gt;
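&lt;p&gt;The skip-if-present check behind resumability and idempotency can be sketched in a few lines. This is a simplified illustration rather than the repo's code: &lt;code&gt;run_stage&lt;/code&gt;, &lt;code&gt;fake_image&lt;/code&gt;, and the &lt;code&gt;force&lt;/code&gt; argument are hypothetical names standing in for a real stage and its &lt;code&gt;--force&lt;/code&gt; flag.&lt;/p&gt;

```python
import tempfile
from pathlib import Path


def run_stage(output, produce, force=False):
    """Run `produce(output)` only if `output` is missing (or force is set).

    Returns True when work was done, False when the stage was skipped.
    """
    output = Path(output)
    if output.exists() and not force:
        return False  # artifact already on disk: nothing to re-bill
    output.parent.mkdir(parents=True, exist_ok=True)
    produce(output)
    return True


calls = []


def fake_image(path):
    """Stand-in for a paid image call; records each invocation."""
    calls.append(path.name)
    path.write_bytes(b"png-placeholder")


with tempfile.TemporaryDirectory() as run_dir:
    target = Path(run_dir) / "02_visuals" / "scene_01.png"
    print(run_stage(target, fake_image))              # first run does the work
    print(run_stage(target, fake_image))              # second run skips
    print(run_stage(target, fake_image, force=True))  # force regenerates
print(len(calls))
```

&lt;p&gt;The file on disk is the only record of "done", which is why deleting one artifact is enough to make exactly that piece rebuild.&lt;/p&gt;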

&lt;p&gt;Canonical paths live in exactly one place (&lt;code&gt;studio/paths.py&lt;/code&gt;), so no stage ever hardcodes a filename:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;scene_image&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sid&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;visuals_dir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;scene_&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;sid&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;02&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;master&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;06_final.mp4&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Each stage is a CLI subcommand (and they chain)
&lt;/h2&gt;

&lt;p&gt;The pipeline is a &lt;a href="https://typer.tiangolo.com/" rel="noopener noreferrer"&gt;Typer&lt;/a&gt; app. Every stage is its own subcommand, so you can run the whole thing or surgically poke one stage:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# the whole pipeline, one idea in, one Short out:&lt;/span&gt;
studio run &lt;span class="s2"&gt;"lobachevsky geometry explained in a fun way"&lt;/span&gt; &lt;span class="nt"&gt;--duration&lt;/span&gt; 150

&lt;span class="c"&gt;# or drive it stage by stage and inspect between steps:&lt;/span&gt;
&lt;span class="nv"&gt;RID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;studio init &lt;span class="s2"&gt;"lobachevsky..."&lt;/span&gt; &lt;span class="nt"&gt;--duration&lt;/span&gt; 150&lt;span class="si"&gt;)&lt;/span&gt;
studio script  &lt;span class="nv"&gt;$RID&lt;/span&gt;     &lt;span class="c"&gt;# → 01_script.json   (read it! confirm the narration is real)&lt;/span&gt;
studio visuals &lt;span class="nv"&gt;$RID&lt;/span&gt;     &lt;span class="c"&gt;# → 02_visuals/*.png&lt;/span&gt;
studio status  &lt;span class="nv"&gt;$RID&lt;/span&gt;     &lt;span class="c"&gt;# render the manifest: what's done, what it cost&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The stage order is one list, and &lt;code&gt;run&lt;/code&gt; just walks it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;STAGE_ORDER&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;script&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;visuals&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;narrate&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;clips&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;stitch&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;audio&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;voice&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;save&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Adding a stage = write a pure function in &lt;code&gt;stages/&lt;/code&gt;, add a subcommand, drop its name in that list. Adding a &lt;em&gt;provider&lt;/em&gt; (a new image model, a new TTS) doesn't touch the pipeline at all — more on that next.&lt;/p&gt;
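&lt;p&gt;As a rough sketch of what "walks it" could look like, assuming a plain dict that maps stage names to callables. The &lt;code&gt;make_stage&lt;/code&gt; helper, the &lt;code&gt;stages&lt;/code&gt; registry, and the &lt;code&gt;start_at&lt;/code&gt; parameter are hypothetical and are not the repo's API; only the list itself is taken from above.&lt;/p&gt;

```python
STAGE_ORDER = ["script", "visuals", "narrate", "clips", "stitch", "audio", "voice", "save"]


def make_stage(name, log):
    """Build a placeholder stage that only records that it ran."""
    def stage(run_dir):
        log.append(name)
    return stage


def run(run_dir, stages, start_at=None):
    """Walk STAGE_ORDER in sequence, optionally starting from a named stage."""
    first = STAGE_ORDER.index(start_at) if start_at else 0
    for name in STAGE_ORDER[first:]:
        stages[name](run_dir)


log = []
stages = {name: make_stage(name, log) for name in STAGE_ORDER}
run("runs/demo", stages, start_at="stitch")
print(log)
```

&lt;p&gt;Because ordering lives in one list, a new stage changes the walk by adding one string, and the runner itself never needs editing.&lt;/p&gt;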

&lt;h2&gt;
  
  
  The provider contract: every model reports its own cost
&lt;/h2&gt;

&lt;p&gt;Here's the design choice I'm proudest of, because it's what makes the &lt;em&gt;whole rest of the series&lt;/em&gt; possible. Every media-producing provider — every LLM, image model, video model, TTS — returns the same dataclass:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@dataclass&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;GenResult&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
    &lt;span class="n"&gt;cost_usd&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;     &lt;span class="c1"&gt;# the REAL cost, computed by the provider
&lt;/span&gt;    &lt;span class="n"&gt;latency_s&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;
    &lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;
    &lt;span class="n"&gt;note&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That &lt;code&gt;cost_usd&lt;/code&gt; is not an estimate I jotted in a spreadsheet. The Nano Banana provider returns &lt;code&gt;$0.039&lt;/code&gt;. The kling provider computes &lt;code&gt;seconds × $0.07&lt;/code&gt;. The Ken-Burns animator returns &lt;code&gt;$0.00&lt;/code&gt;. So when a stage runs, the manifest records &lt;strong&gt;measured&lt;/strong&gt; cost, not guessed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;StageRecord&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;done&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
    &lt;span class="n"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;
    &lt;span class="n"&gt;cost_usd&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.0&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Manifest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# ...
&lt;/span&gt;    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;total_cost_usd&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cost_usd&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;values&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the foundation. You can't optimize what you don't measure, and you definitely can't put a &lt;em&gt;budget-aware bandit&lt;/em&gt; (Part 6) on top of costs you're guessing at. Every dollar in this series is a real dollar the system reported on itself.&lt;/p&gt;
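&lt;p&gt;To make the contract concrete, here is a stripped-down sketch of two providers reporting their own cost. It assumes the prices quoted above ($0.039 per Nano Banana image, $0.07 per second of kling video); the function names and the trimmed-down &lt;code&gt;GenResult&lt;/code&gt; are illustrative stand-ins, not the repo's classes.&lt;/p&gt;

```python
from dataclasses import dataclass

KLING_USD_PER_SECOND = 0.07   # per-second rate quoted in the text
NANO_BANANA_USD = 0.039       # flat per-image price quoted in the text


@dataclass
class GenResult:
    cost_usd: float = 0.0
    provider: str = ""


def kling_clip(seconds):
    """Hypothetical video provider: cost scales with clip length."""
    return GenResult(cost_usd=seconds * KLING_USD_PER_SECOND, provider="kling")


def nano_banana_image():
    """Hypothetical image provider: flat price per still."""
    return GenResult(cost_usd=NANO_BANANA_USD, provider="fal-nanobanana")


def stage_cost(results):
    """Sum what the providers reported, rounded to 4 places like the manifest total."""
    return round(sum(r.cost_usd for r in results), 4)


print(stage_cost([nano_banana_image() for _ in range(15)]))
print(stage_cost([kling_clip(6)]))
```

&lt;p&gt;Fifteen stills come to $0.585 and a 6-second clip to $0.42. The point of the design is that neither number is typed in by hand anywhere: the stage only adds up what each call said it cost.&lt;/p&gt;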

&lt;h2&gt;
  
  
  Six small LLMs, not one big one
&lt;/h2&gt;

&lt;p&gt;A thing worth flagging early, because it shapes the whole design: there is no single "AI" in this system. There are &lt;strong&gt;six narrow LLM jobs&lt;/strong&gt;, each doing one small thing, each with a deterministic fallback so the pipeline runs with zero API keys. Where each call sits:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;idea
 └─► [scriptwriter LLM] ──► timed scenes + narration
        └─► [art-director LLM] picks each scene's motion + look (animator, fx, atmosphere)
              └─► [vision LLM] locates a face's mouth for lip-sync (only on talkinghead)
 visuals → clips → stitch → voice → save
        └─► [SEO LLM] polishes title / description / tags before publish
 (growth loop)
   [ideator LLM] next falsifiable bet (+ web-search trends) → produce → measure →
   [reflector LLM] turns measured results into an updated strategy ─┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;th&gt;Where&lt;/th&gt;
&lt;th&gt;Job&lt;/th&gt;
&lt;th&gt;Fallback (keyless)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Scriptwriter&lt;/td&gt;
&lt;td&gt;&lt;code&gt;stages/script.py&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;idea → timed scenes + narration&lt;/td&gt;
&lt;td&gt;offline &lt;code&gt;stub&lt;/code&gt; split&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Art director&lt;/td&gt;
&lt;td&gt;&lt;code&gt;artdirect.py&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;pick per-scene animator / fx / atmosphere / transition&lt;/td&gt;
&lt;td&gt;heuristic rules&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Vision / mouth locator&lt;/td&gt;
&lt;td&gt;&lt;code&gt;animate._detect_mouth&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;find a face's mouth (pos + size) for lip-sync&lt;/td&gt;
&lt;td&gt;explicit coords / default&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SEO metadata&lt;/td&gt;
&lt;td&gt;&lt;code&gt;stages/metadata.py&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;polish title / description / tags&lt;/td&gt;
&lt;td&gt;script-derived&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Ideator&lt;/td&gt;
&lt;td&gt;&lt;code&gt;marketing/ideate.py&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;next viral bet + trend signals&lt;/td&gt;
&lt;td&gt;strategy seeds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Reflector&lt;/td&gt;
&lt;td&gt;&lt;code&gt;marketing/learn.py&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;measured bets → updated strategy&lt;/td&gt;
&lt;td&gt;top/bottom heuristic&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;And, deliberately, the parts that must be &lt;strong&gt;reproducible and auditable&lt;/strong&gt; are &lt;em&gt;not&lt;/em&gt; LLMs: the explore/exploit &lt;strong&gt;bandit&lt;/strong&gt; (Part 6) is plain Thompson sampling, and &lt;strong&gt;virality scoring&lt;/strong&gt; (Part 5) is a fixed formula. LLMs write and judge &lt;em&gt;taste&lt;/em&gt;; statistics make the &lt;em&gt;decisions&lt;/em&gt;. Keeping that line clean is most of what makes the system debuggable.&lt;/p&gt;

&lt;h2&gt;
  
  
  Watching it actually run
&lt;/h2&gt;

&lt;p&gt;Here's the real log from the Lobachevsky run — note each stage announcing its provider and cost as it goes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;» visuals
visuals 15 images via fal-nanobanana  $0.585
» clips
clips 15 clips via fal-i2v  $0.75
» stitch
stitch 15 clips
» voice
voice captions=burn via edge  $0.0
» save
save runs/lobachevsky/06_final.mp4
done lobachevsky  total $1.335
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Fifteen stills, fifteen animated clips, narration, captions, muxed and mastered — &lt;strong&gt;$1.34&lt;/strong&gt;, fully automated, from one line of text. (That run used a bit of paid AI video; the all-Ken-Burns version of the same Short is &lt;strong&gt;$0.585&lt;/strong&gt;, and the cheap-tier playbook from Part 1 gets a similar video to &lt;strong&gt;six cents&lt;/strong&gt;. The cost knobs are Part 4.) Here's a frame from the finished thing:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8kjax2pode3vute8zzvv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8kjax2pode3vute8zzvv.png" alt="A scene from the finished Lobachevsky Short — generated still, free motion, real narration." width="768" height="1344"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And the data shape underneath each scene — the script stage emits timed scenes the rest of the pipeline consumes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json-doc"&gt;&lt;code&gt;&lt;span class="c1"&gt;// 01_script.json (one scene)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"start_s"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"end_s"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"narration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"What if everything you were taught about parallel lines was secretly a lie?"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"visual_prompt"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"railroad tracks vanishing toward a glowing question mark, retro poster"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"on_screen_text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"...a lie?"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"motion_hint"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"slow push-in toward the vanishing point"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;narration&lt;/code&gt; drives the TTS (and therefore the clip length — audio leads, video follows, so nothing ever desyncs). &lt;code&gt;visual_prompt&lt;/code&gt; drives the image model. &lt;code&gt;motion_hint&lt;/code&gt; drives the free animator. One JSON object, three downstream stages.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it yourself (zero API keys, zero dollars)
&lt;/h2&gt;

&lt;p&gt;The repo ships an offline mode so you can watch the whole pipeline run without a single key or cent. Stub providers stand in for the paid ones; everything else is real ffmpeg:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/dasein108/slope-studio
&lt;span class="nb"&gt;cd &lt;/span&gt;slope-studio
uv venv &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;source&lt;/span&gt; .venv/bin/activate
uv pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="s2"&gt;".[fal]"&lt;/span&gt;

&lt;span class="c"&gt;# free, offline, end-to-end smoke test:&lt;/span&gt;
studio run &lt;span class="s2"&gt;"how black holes bend time"&lt;/span&gt; &lt;span class="nt"&gt;--duration&lt;/span&gt; 12 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--script-provider&lt;/span&gt; stub &lt;span class="nt"&gt;--image-provider&lt;/span&gt; stub &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--video-provider&lt;/span&gt; kenburns &lt;span class="nt"&gt;--voice-provider&lt;/span&gt; edge
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You'll get a real &lt;code&gt;runs/&amp;lt;id&amp;gt;/&lt;/code&gt; folder with a stitched, narrated &lt;code&gt;06_final.mp4&lt;/code&gt; — built entirely from free local tooling. (Heads up: &lt;code&gt;stub&lt;/code&gt; is a &lt;em&gt;wiring&lt;/em&gt; generator — it emits placeholder text so you can test the plumbing. Swap in a real LLM key before you spend money on visuals, or you'll lovingly render meaningless filler. Ask me how I know.)&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'd tell another AI engineer
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Takeaway:&lt;/strong&gt; Resist the monolith. Model your AI pipeline as &lt;strong&gt;stages of pure file-to-file functions over a single run directory&lt;/strong&gt;, make each one an independently runnable command, and give every provider a uniform result type that reports its own cost. You get free debuggability (&lt;code&gt;ls&lt;/code&gt; is your inspector), free resumability, free idempotency, and — crucially — a &lt;em&gt;measured&lt;/em&gt; cost ledger that everything smarter you build later (budgets, auto-strategies, bandits) gets to stand on. Boring architecture is a feature.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;&lt;strong&gt;Next — &lt;a href="https://dev.to/dasein108/zero-to-autopilot-part-3-giving-a-still-image-real-motion-for-000-1a5b"&gt;Part 3: Free Motion&lt;/a&gt;.&lt;/strong&gt; The fun part. AI video is $0.07/second; I'm going to take a single still image and give it real motion — drift, parallax with subject inpainting, kinetic type, atmospheric rain and embers — for &lt;strong&gt;$0.00&lt;/strong&gt;, with a deep dive into the ffmpeg filtergraphs and the indie-game-dev tricks behind them. (Spoiler: it's all already running in the &lt;strong&gt;&lt;a href="https://dasein108.github.io/slope-studio/" rel="noopener noreferrer"&gt;live effects gallery&lt;/a&gt;&lt;/strong&gt;.)&lt;/p&gt;

&lt;p&gt;▶ &lt;strong&gt;Live effects gallery:&lt;/strong&gt; &lt;a href="https://dasein108.github.io/slope-studio/" rel="noopener noreferrer"&gt;dasein108.github.io/slope-studio&lt;/a&gt;&lt;br&gt;
⭐ &lt;strong&gt;Star the repo to follow along:&lt;/strong&gt; &lt;a href="https://github.com/dasein108/slope-studio" rel="noopener noreferrer"&gt;github.com/dasein108/slope-studio&lt;/a&gt;&lt;br&gt;
🔔 &lt;strong&gt;Subscribe to the channel&lt;/strong&gt; to watch the experiment grow from zero: &lt;a href="https://www.youtube.com/shorts/gaR76MiAK0U" rel="noopener noreferrer"&gt;the Lobachevsky Short&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>architecture</category>
      <category>video</category>
    </item>
    <item>
      <title>Zero to Autopilot, Part 1: I Built an AI That Runs a YouTube Channel (the landscape, and my $10 wake-up call)</title>
      <dc:creator>Maksims Gavrilovs</dc:creator>
      <pubDate>Fri, 05 Jun 2026 14:59:45 +0000</pubDate>
      <link>https://dev.to/dasein108/zero-to-autopilot-part-1-i-built-an-ai-that-runs-a-youtube-channel-the-landscape-and-my-10-1ki6</link>
      <guid>https://dev.to/dasein108/zero-to-autopilot-part-1-i-built-an-ai-that-runs-a-youtube-channel-the-landscape-and-my-10-1ki6</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Series:&lt;/strong&gt; &lt;em&gt;Zero to Autopilot — Building a Self-Improving AI Media Channel.&lt;/em&gt; Part 1 of 7. I'm an AI engineer and this is the full build log of an autonomous AI short-video channel — one that writes, renders, publishes, &lt;em&gt;and&lt;/em&gt; decides what to make next, then grades its own homework. No face, no film crew, no me clicking "upload" at midnight.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data status (Part 1): real-now.&lt;/strong&gt; Everything below is code, costs, and public facts I can verify today. The juicy audience metrics from my own channel are held back until Part 7, so they have time to become real instead of noise.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fslxzml259mt8z9wz6qkw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fslxzml259mt8z9wz6qkw.png" alt="A lone figure walking through a crowd of silhouettes — a real frame from my channel, generated for $0.039." width="800" height="457"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The two-billion-view problem
&lt;/h2&gt;

&lt;p&gt;Late 2025, a channel called &lt;strong&gt;Bandar Apna Dost&lt;/strong&gt; crossed &lt;strong&gt;~2 billion views&lt;/strong&gt; and an estimated &lt;strong&gt;$4.25M/year (~₹38 crore)&lt;/strong&gt;. Its content? Short AI clips of a monkey and a Hulk-ish dude. No dialogue. No plot. No discernible reason to exist. (&lt;a href="https://www.techlusive.in/news/how-this-indian-ai-generated-youtube-channel-is-pulling-billions-of-views-and-millions-in-revenue-1635923/" rel="noopener noreferrer"&gt;techlusive&lt;/a&gt;, &lt;a href="https://www.business-standard.com/technology/tech-news/india-youtube-bandar-apna-dost-channel-global-ai-video-charts-slop-content-125123100396_1.html" rel="noopener noreferrer"&gt;Business Standard&lt;/a&gt;)&lt;/p&gt;

&lt;p&gt;Cue every dev's reaction: &lt;em&gt;"...I have a GPU and zero shame, how hard can this be?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Pretty hard, actually — because here's the part the get-rich-quick threads leave out. A few months later YouTube's &lt;strong&gt;"AI slop" crackdown&lt;/strong&gt; nuked an estimated &lt;strong&gt;4.7 billion views across 16 channels&lt;/strong&gt;, ~35M subs, and nearly &lt;strong&gt;$10M in revenue&lt;/strong&gt;. Among the bodies: &lt;strong&gt;Three Minute Wisdom&lt;/strong&gt;, a ~1.7M-sub / ~2B-view faceless AI channel, most of its catalog vaporized. (&lt;a href="https://outlierkit.com/resources/youtube-ai-slop-crackdown-2026/" rel="noopener noreferrer"&gt;OutlierKit&lt;/a&gt;, &lt;a href="https://miraflow.ai/blog/faceless-youtube-channel-explosion-ai-million-subscriber-creators-2026" rel="noopener noreferrer"&gt;Miraflow&lt;/a&gt;)&lt;/p&gt;

&lt;p&gt;So the lay of the land in mid-2026:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Faceless AI video is a &lt;strong&gt;real, monetizable&lt;/strong&gt; category. Billions of views, real revenue, nobody's face required.&lt;/li&gt;
&lt;li&gt;It's also a &lt;strong&gt;ban speedrun&lt;/strong&gt; if you ship slop. The platforms are now actively &lt;code&gt;rm -rf&lt;/code&gt;-ing low-effort content at scale.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I looked at that and saw a clean engineering problem with two non-negotiable constraints: &lt;strong&gt;don't make slop, and don't go broke making it.&lt;/strong&gt; This series is me brute-forcing both.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why "faceless" is catnip for an engineer
&lt;/h2&gt;

&lt;p&gt;Faceless means &lt;strong&gt;narration + visuals do all the work.&lt;/strong&gt; No on-camera talent, no lighting rig, no "can you do Tuesday?" Every input is a file that an LLM or a model can spit out. Which means the whole thing is &lt;em&gt;programmable&lt;/em&gt; — and anything programmable can be measured, costed, and (eventually) left to run while you sleep.&lt;/p&gt;

&lt;p&gt;The winning recipe is boringly well-documented: pick a niche, nail a 2-second hook, stay on-brand, keep people watching to the end, and build a deep library so the algorithm has something to binge-feed. Notice what's &lt;em&gt;not&lt;/em&gt; on that list: a human, per video. That's a &lt;strong&gt;system&lt;/strong&gt;, not a craft.&lt;/p&gt;

&lt;p&gt;The channels getting deleted skipped the system and cranked the volume knob to 11. The survivors — and the non-AI GOATs like Kurzgesagt and CrashCourse — win on structure, pacing, and actually having a point. My bet: an engineer can clear that quality bar &lt;em&gt;and&lt;/em&gt; the volume bar &lt;strong&gt;if&lt;/strong&gt; each video is cheap enough to run hundreds of experiments, with a learning loop deciding which ones to rerun.&lt;/p&gt;

&lt;h2&gt;
  
  
  Exhibit A: my first video quietly ate $10
&lt;/h2&gt;

&lt;p&gt;Here's video #1, live on the channel — Lobachevsky, the guy who broke geometry:&lt;/p&gt;

&lt;p&gt;🎬 &lt;strong&gt;&lt;a href="https://www.youtube.com/shorts/gaR76MiAK0U" rel="noopener noreferrer"&gt;The heretic who broke geometry → youtube.com/shorts/gaR76MiAK0U&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I did the rookie thing: reached for &lt;strong&gt;AI image-to-video on every single scene&lt;/strong&gt;, because that's what the shiny demos show. It looked great. Then I checked the bill.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Ten dollars. One Short.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The villain is one line of arithmetic — hosted AI video is priced &lt;strong&gt;per second&lt;/strong&gt;, not per clip:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# studio/providers/video.py — real per-second prices (verified on fal.ai, June 2026)
&lt;/span&gt;&lt;span class="n"&gt;FAL_MODELS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;kling&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;per_s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.07&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;   &lt;span class="c1"&gt;# 150s Short ≈ $10.50   &amp;lt;-- oof
&lt;/span&gt;    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ltx&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;      &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;per_s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.04&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;   &lt;span class="c1"&gt;# cheapest hosted i2v
&lt;/span&gt;    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;seedance&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;per_s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.30&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;   &lt;span class="c1"&gt;# 150s ≈ $45 (lol no)
&lt;/span&gt;    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hailuo&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;   &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;per_s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.045&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;wan&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;      &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;per_s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.16&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;150 seconds × $0.07 = &lt;strong&gt;$10.50&lt;/strong&gt;, no matter how you slice the clips. Now do the napkin math on a content &lt;em&gt;strategy&lt;/em&gt;: at ~$10/video, a hundred experiments is a thousand bucks, and a "post a lot and learn" loop only works if you can afford to repeat it. The economics were quietly DOA.&lt;/p&gt;
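&lt;p&gt;The same napkin math as a small script, in case you want to plug in your own durations. The prices are copied from the &lt;code&gt;FAL_MODELS&lt;/code&gt; table above and flattened into a plain dict; the helper names &lt;code&gt;short_cost&lt;/code&gt; and &lt;code&gt;experiment_budget&lt;/code&gt; are made up for this sketch.&lt;/p&gt;

```python
# Per-second prices copied from the FAL_MODELS table above (USD per second).
PER_SECOND_USD = {
    "kling": 0.07,
    "ltx": 0.04,
    "seedance": 0.30,
    "hailuo": 0.045,
    "wan": 0.16,
}


def short_cost(model: str, seconds: float) -> float:
    """Cost of rendering every second of a Short with one hosted model."""
    return round(PER_SECOND_USD[model] * seconds, 2)


def experiment_budget(model: str, seconds: float, videos: int) -> float:
    """Cost of repeating that Short `videos` times."""
    return round(short_cost(model, seconds) * videos, 2)


for model in PER_SECOND_USD:
    one = short_cost(model, 150)
    hundred = experiment_budget(model, 150, 100)
    print(f"{model:9s} 150s Short = ${one:6.2f}   x100 = ${hundred:8.2f}")
```

&lt;p&gt;The &lt;code&gt;kling&lt;/code&gt; row reproduces the $10.50 above, and its hundred-video column is the "thousand bucks" ($1,050). Even the cheapest hosted model in the table still lands at $6.00 per 150-second Short.&lt;/p&gt;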

&lt;h2&gt;
  
  
  Plot twist: I'd solved this before, in a past life
&lt;/h2&gt;

&lt;p&gt;Before AI ate my career, I shipped indie games. And indie game dev is a master class in &lt;strong&gt;faking expensive things for free&lt;/strong&gt;, because you've got a $0 art budget and a build due Saturday. You don't buy motion — you &lt;em&gt;engineer the feeling&lt;/em&gt; of motion: parallax scrolling layers, drifting backgrounds, snappy cuts, a little camera push. Cheap tricks, real game-feel.&lt;/p&gt;

&lt;p&gt;Same energy, new domain. Why pay $10.50 for AI video when I can take &lt;strong&gt;one still image&lt;/strong&gt; and add:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;drift / Ken-Burns&lt;/strong&gt; — slow pan + zoom, the still breathes;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;parallax&lt;/strong&gt; — split the frame into depth planes and slide them at different speeds (the background literally drifts behind a static subject);&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;cuts &amp;amp; transitions&lt;/strong&gt; — rhythm beats AI motion for retention anyway.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All in ffmpeg. All free. That's all of Part 3, and it's where most of the $10 goes to die. Spoiler: it does &lt;strong&gt;not&lt;/strong&gt; look like slop:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvhfbrm0rmlztmteu16ju.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvhfbrm0rmlztmteu16ju.png" alt="Noir woodcut village, lone figure — a single $0.039 Nano Banana still, animated for free." width="800" height="457"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkb6sn4j9lk21cbj4cas5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkb6sn4j9lk21cbj4cas5.png" alt="A man reading in an empty train carriage — same pipeline, free motion. The art *direction* is what kills the slop vibe, not an expensive model." width="800" height="457"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;(These stills don't move on the page — but every free effect is playing live in the &lt;strong&gt;&lt;a href="https://dasein108.github.io/slope-studio/" rel="noopener noreferrer"&gt;effects gallery&lt;/a&gt;&lt;/strong&gt;. Drift, parallax, rain, embers, glitch, all $0. Part 3 dissects how.)&lt;/p&gt;
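&lt;p&gt;For a rough idea of what "drift in ffmpeg" means before Part 3, here is a sketch that builds (but does not run) a slow zoom-in using ffmpeg's &lt;code&gt;zoompan&lt;/code&gt; filter. This is the textbook form of the effect, not necessarily the filtergraph the studio uses; the function name, file names, zoom step and 1080x1920 output size are all assumptions.&lt;/p&gt;

```python
import shlex


def ken_burns_cmd(image: str, out: str, seconds: int = 5, fps: int = 25) -> list[str]:
    """Build (not run) an ffmpeg command that slowly zooms into a still image."""
    frames = seconds * fps
    zoompan = (
        "zoompan="
        "z='min(zoom+0.0015,1.5)':"                    # nudge zoom up each frame, capped at 1.5x
        f"d={frames}:"                                  # output frames produced from the one still
        "x='iw/2-(iw/zoom/2)':y='ih/2-(ih/zoom/2)':"    # keep the crop window centered
        f"s=1080x1920:fps={fps}"                        # vertical output size and frame rate
    )
    return [
        "ffmpeg", "-y", "-i", image,
        "-vf", zoompan,
        "-c:v", "libx264", "-pix_fmt", "yuv420p",
        out,
    ]


cmd = ken_burns_cmd("scene_01.png", "scene_01.mp4")
print(shlex.join(cmd))
```

&lt;p&gt;The printed command can be pasted into a shell once ffmpeg is installed. It assumes the source still already has a 9:16 aspect ratio; anything else would need a crop or scale step first.&lt;/p&gt;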

&lt;h2&gt;
  
  
  Exhibit B: the six-cent video
&lt;/h2&gt;

&lt;p&gt;Killing AI video was step one. Step two was realizing &lt;strong&gt;Nano Banana isn't always the move.&lt;/strong&gt; For a goofy "why do cats have fur" Short, I didn't need photoreal noir — I needed clean flat cartoon. Enter &lt;strong&gt;Flux Schnell&lt;/strong&gt; at &lt;strong&gt;$0.003 per megapixel&lt;/strong&gt;, roughly &lt;strong&gt;half a cent an image&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fobcgq8u38r1guruxbz1w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fobcgq8u38r1guruxbz1w.png" alt="A Flux Schnell cat — about $0.005 to generate. Right tool, right price." width="800" height="1400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Here's that one, live:&lt;/p&gt;

&lt;p&gt;🎬 &lt;strong&gt;&lt;a href="https://www.youtube.com/shorts/FWtEJjeK_vI" rel="noopener noreferrer"&gt;Why do cats have fur? → youtube.com/shorts/FWtEJjeK_vI&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;And the receipts, straight from its manifest:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Stage&lt;/th&gt;
&lt;th&gt;Provider&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Script&lt;/td&gt;
&lt;td&gt;local LLM&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$0.00&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Visuals (10 images)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;fal-flux-schnell&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$0.054&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Motion (all scenes)&lt;/td&gt;
&lt;td&gt;Ken-Burns (ffmpeg)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$0.00&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Voice&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;edge-tts&lt;/code&gt; (neural)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$0.00&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Sound FX + music&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;fal-elevenlabs-sfx&lt;/code&gt; + local bed&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$0.0076&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Save + Publish&lt;/td&gt;
&lt;td&gt;ffmpeg / YouTube API&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$0.00&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;TOTAL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;≈ $0.06&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;From &lt;strong&gt;$10.50 → six cents.&lt;/strong&gt; Same pipeline, different knobs. That's a &lt;strong&gt;~175× cost cut&lt;/strong&gt;, and it's the difference between "fun demo" and "I can run hundreds of these and let a bandit pick the winners." (Full cost teardown: Part 4.)&lt;/p&gt;
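&lt;p&gt;If you want to check the receipts yourself, the arithmetic is short. The numbers below are copied from the table above; the dict keys are just labels I picked for this sketch, not the manifest's real field names.&lt;/p&gt;

```python
# Per-stage costs copied from the table above (USD).
stage_costs = {
    "script": 0.00,
    "visuals": 0.054,   # 10 Flux Schnell images
    "motion": 0.00,
    "voice": 0.00,
    "sound": 0.0076,
    "save_publish": 0.00,
}

total = sum(stage_costs.values())
baseline = 150 * 0.07  # the all-AI-video Short from Exhibit A

print(f"total per video: ${total:.4f}")
print(f"per image:       ${stage_costs['visuals'] / 10:.4f}")
print(f"cost cut:        {baseline / total:.0f}x")
```

&lt;p&gt;The unrounded total is $0.0616, which works out to about 170×; rounding the total to six cents gives the ~175× quoted above. Same order of magnitude either way.&lt;/p&gt;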

&lt;p&gt;That &lt;code&gt;$0.0076&lt;/code&gt; line is quietly important, too: it's an &lt;strong&gt;AI sound layer&lt;/strong&gt; — generated SFX plus a music bed ducked under the narration — and atmosphere is a big reason cheap doesn't read as &lt;em&gt;slop&lt;/em&gt;. The how is in Part 3.&lt;/p&gt;

&lt;h2&gt;
  
  
  The gap I'm actually building into
&lt;/h2&gt;

&lt;p&gt;After mapping the field, two things were suspiciously absent from every faceless-AI playbook:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Cost honesty.&lt;/strong&gt; Everyone screenshots the $4M. Nobody publishes a per-second price table or admits their first video cost $10. So they never explain how to afford video #100.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Autonomy.&lt;/strong&gt; "Just post consistently for 6 months" — cool, that's a full-time job done by hand. Nobody treats &lt;em&gt;what to make next&lt;/em&gt; as a decision a system can learn: explore vs. exploit, a memory of what won, a verdict on every bet.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That's the thesis. Over the next six parts I'll build a channel that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;turns a one-line idea into a finished, well-directed vertical Short (&lt;strong&gt;Part 2&lt;/strong&gt;),&lt;/li&gt;
&lt;li&gt;moves nearly all motion off paid AI video onto &lt;strong&gt;free custom effects&lt;/strong&gt; (&lt;strong&gt;Part 3&lt;/strong&gt;),&lt;/li&gt;
&lt;li&gt;drives cost per video from ~$10 toward &lt;strong&gt;pennies&lt;/strong&gt; (&lt;strong&gt;Part 4&lt;/strong&gt;),&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;remembers&lt;/strong&gt; what worked via a per-channel journal + self-reflection (&lt;strong&gt;Part 5&lt;/strong&gt;),&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;decides&lt;/strong&gt; what to make next with a Thompson-sampling bandit over a &lt;em&gt;falsifiable&lt;/em&gt; hypothesis (&lt;strong&gt;Part 6&lt;/strong&gt;),&lt;/li&gt;
&lt;li&gt;and &lt;strong&gt;runs itself&lt;/strong&gt; on a schedule, grading each post 48–72h later (&lt;strong&gt;Part 7&lt;/strong&gt;).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The learning loop is already showing its teeth. A batch of near-identical clips dumped in the same minute cannibalized itself (3–6 views each, brutal). Meanwhile one video (a real mathematician framed as a heretic, with a "this breaks reality" hook in the first two seconds) pulled roughly &lt;strong&gt;50× the views of the channel's other Shorts.&lt;/strong&gt; The rest of this series is the machine I'm building so that result is a &lt;em&gt;repeatable pattern&lt;/em&gt;, not a lucky roll.&lt;/p&gt;
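&lt;p&gt;As a preview of the explore-vs-exploit idea, here is a toy Beta-Bernoulli Thompson sampler. It is a generic textbook sketch, not the Part 6 implementation: the arm names, the hidden win rates and the notion of a "win" (say, a post clearing some view threshold when it gets graded) are all invented for the simulation.&lt;/p&gt;

```python
import random


def thompson_pick(stats: dict[str, list[int]], rng: random.Random) -> str:
    """Draw a plausible win rate per arm from Beta(wins+1, losses+1); pick the highest."""
    return max(
        stats,
        key=lambda arm: rng.betavariate(stats[arm][0] + 1, stats[arm][1] + 1),
    )


# Hypothetical arms with hidden win rates, used only to simulate feedback.
true_rate = {"heretic_hook": 0.6, "plain_explainer": 0.1}
stats = {arm: [0, 0] for arm in true_rate}  # [wins, losses] per arm
rng = random.Random(7)

for _ in range(300):
    arm = thompson_pick(stats, rng)
    won = true_rate[arm] > rng.random()  # simulated verdict for this post
    stats[arm][0 if won else 1] += 1

pulls = {arm: sum(counts) for arm, counts in stats.items()}
print(pulls)
```

&lt;p&gt;Early on both arms get tried because their Beta distributions are wide. As verdicts accumulate, the stronger arm's samples win more often and it takes most of the pulls, while the weaker one still gets an occasional retry.&lt;/p&gt;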

&lt;h2&gt;
  
  
  It's all open source — and it's a live experiment
&lt;/h2&gt;

&lt;p&gt;The whole studio is on GitHub — &lt;strong&gt;&lt;a href="https://github.com/dasein108/slope-studio" rel="noopener noreferrer"&gt;&lt;code&gt;slope-studio&lt;/code&gt;&lt;/a&gt;&lt;/strong&gt; (one letter from "slop", which, given the genre, is either a typo or a mission statement). Every line of code in this series lives there: the 7-stage pipeline, the free ffmpeg effects, the cost model, the bandit. Part 2 is the guided tour, with a one-command smoke test you can run with zero API keys.&lt;/p&gt;

&lt;p&gt;And this isn't a retrospective with the numbers airbrushed in — it's a &lt;strong&gt;live experiment&lt;/strong&gt; you can watch compound or faceplant in public. Every Short the system ships asks viewers to subscribe, because the whole point is watching an autonomous channel grow from zero. Consider it subscribing to the test harness.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'd tell another AI engineer
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Takeaway:&lt;/strong&gt; Treat content as a pipeline, not a craft. The instant every input — script, image, motion, voice, sound — is a function call with a &lt;em&gt;measured&lt;/em&gt; cost, three superpowers unlock: you can drive unit cost toward zero, run hundreds of cheap experiments, and bolt a learning loop on top that decides which experiments to repeat. The folks making millions optimized the system and the volume. The folks getting deleted &lt;em&gt;only&lt;/em&gt; had volume. The alpha is the system.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;&lt;strong&gt;Next: &lt;a href="https://dev.to/dasein108/zero-to-autopilot-part-2-one-line-of-text-a-published-short-in-7-stages-inp"&gt;Part 2: Idea → Published in 7 Stages&lt;/a&gt;.&lt;/strong&gt; The actual architecture: every stage as an independent CLI subcommand, the &lt;code&gt;runs/&amp;lt;id&amp;gt;/&lt;/code&gt; artifact flow, a manifest that records measured cost per stage, and how a single line of text becomes an uploaded Short without me touching a video editor.&lt;/p&gt;

&lt;p&gt;▶ &lt;strong&gt;Live effects gallery:&lt;/strong&gt; &lt;a href="https://dasein108.github.io/slope-studio/" rel="noopener noreferrer"&gt;dasein108.github.io/slope-studio&lt;/a&gt;&lt;br&gt;
⭐ &lt;strong&gt;Star the repo:&lt;/strong&gt; &lt;a href="https://github.com/dasein108/slope-studio" rel="noopener noreferrer"&gt;github.com/dasein108/slope-studio&lt;/a&gt;&lt;br&gt;
🔔 &lt;strong&gt;Subscribe&lt;/strong&gt; (watch the experiment from zero): &lt;a href="https://www.youtube.com/shorts/gaR76MiAK0U" rel="noopener noreferrer"&gt;the Lobachevsky Short&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Sources: &lt;a href="https://www.techlusive.in/news/how-this-indian-ai-generated-youtube-channel-is-pulling-billions-of-views-and-millions-in-revenue-1635923/" rel="noopener noreferrer"&gt;techlusive&lt;/a&gt; · &lt;a href="https://www.business-standard.com/technology/tech-news/india-youtube-bandar-apna-dost-channel-global-ai-video-charts-slop-content-125123100396_1.html" rel="noopener noreferrer"&gt;Business Standard&lt;/a&gt; · &lt;a href="https://outlierkit.com/resources/youtube-ai-slop-crackdown-2026/" rel="noopener noreferrer"&gt;OutlierKit (AI-slop crackdown)&lt;/a&gt; · &lt;a href="https://miraflow.ai/blog/faceless-youtube-channel-explosion-ai-million-subscriber-creators-2026" rel="noopener noreferrer"&gt;Miraflow (faceless explosion 2026)&lt;/a&gt;. View/revenue figures are third-party estimates.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>python</category>
      <category>video</category>
    </item>
  </channel>
</rss>
