Maksims Gavrilovs

Posted on Jul 2

Zero to Autopilot, Part 8: The $25 Company — an Org of AI Agents That Runs My Channel

#ai #automation #python #agents

Series: Zero to Autopilot — Building a Self-Improving AI Media Channel. Part 8. Part 1 built the channel; Parts 2–7 made it run itself (finale: Part 7). This part replaces that single loop with a company of agents that manages the channel.

Data status: real-now — costs, code, architecture, and qualitative outcomes, all measured today. Repo is open source.

I opened the dashboard one morning expecting nothing, and three videos had already shipped overnight. Scripts written, scenes rendered, voiced, captioned, QA'd, published to YouTube. I hadn't touched anything. There was no notification waiting for me either, because at no point did the work need a human. The 9am job had fired, a handful of agents passed the job among themselves, and by the time I looked it was done.

That's the system I want to describe. Not the videos, the org that makes them.

What this is

I run a faceless AI YouTube Shorts channel. Software does the whole thing: it picks a topic, writes a 60-second script, generates the keyframes and motion, synthesizes a voiceover, mixes audio, and uploads. The whole operation runs on about $25 a month, and it's structured like an actual company, with a CEO, a growth lead, a QA critic, a producer, the works.

For a while the brain of it was a single loop, one function on a timer that walked through ideate → produce → measure → learn and picked the next action each tick. It worked. It also had no judgment.

One process did everything, which meant nothing checked anything. The thing that wrote the script was the same thing that decided the script was good enough to spend money rendering. It never noticed a bug or re-thought a budget. It ran its if-statements and stopped there.

So I replaced the loop with a company.

The one goal

Everything below serves a single number. The company has one goal, the kind you'd give a real team: cross YouTube's monetization bar (1,000 subscribers plus the watch-time threshold). The CEO agent owns it; the Growth Lead works toward it. Every video is a bet placed against that goal, and every measurement is scored relative to it. How a goal like that actually turns into today's three video ideas is the interesting part, and it's the subject of Part 9. For now: there's a goal, and the org exists to move it.

The company

The brain is now eight role-specialized LLM agents running on Paperclip, a small local runtime that gives each agent an identity, a task inbox, and the ability to hand work to another agent. They don't chat in a free-for-all. They pass tickets, like a real team.

Here's what each one actually owns:

CEO / Operator. The board. Owns the company goal (for me, unlocking monetization), the budget caps, and the publishing policy. It approves spend and policy changes and makes the final call on direction. The other agents escalate decisions here; it doesn't write or render anything itself. (Sonnet 4.6.)
Growth Lead. The initiator, and the closest thing to a manager. Every cycle it reads the channel's state and decides the single most useful move right now: make something, measure matured videos, or reflect and update strategy. When it's "make," it picks which bet from the backlog and writes the SEO framing (title candidate, hook promise, target keyword), then hands the bet to the Screenwriter. (Sonnet 4.6.)
Screenwriter. Turns a one-line bet into a real script: a hook that lands in the first three seconds, one concrete idea actually explained, and visual prompts the renderer can use. If QA sends it back, it rewrites against the specific notes. (Opus 4.8, because script quality is the product.)
QA / Critic. The independent gate, and the reason I trust the thing. It runs twice: once on the script before any money is spent (is the hook real, is the payoff there, does the title overpromise, are the prompts safe to render), and once on the final video before publish (does it play, is the audio clean, does the metadata match). It can block either gate and send the work back. (Opus 4.8.)
Producer. Turns a passed script into a finished, published video. It runs the whole render pipeline (keyframes, motion clips, stitching, sound, voice, master, metadata), publishes to YouTube, and links the result back to the channel's journal so it can be measured later. It works inside the budget cap. (Sonnet 4.6.)
Analytics & Learning. Closes the loop. It waits for videos to mature (~60 hours), pulls the real YouTube numbers, scores each bet's virality relative to the channel's own history, and rewrites the strategy: what's winning, what's losing, what to try next. (Sonnet 4.6.)
Observability / Ops. The watchdog. Looks for stuck tickets, failed renders, published videos that never got linked, and budget drift, then opens incidents. (Haiku 4.5.)
Secretary. A daily Telegram digest so I can read the state of the company without opening anything. (Haiku 4.5.)

The model tiering is deliberate, and it's a cost decision: Opus only where judgment is the product (writing and gating), and cheaper models for all the coordination work. And because the agents run on a Claude subscription rather than metered API calls, the reasoning is effectively free. The only thing that costs real money is the AI video generation itself.
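To make the tiering rule concrete, here is a minimal sketch that writes the roster down as data and checks it against the rule "Opus only where judgment is the product." The model strings are just the labels used in this post, not real API identifiers, and `check_tiering` is a hypothetical helper, not something from the repo.

```python
# Roster as described in this post: role -> model label (labels, not API model IDs).
ROSTER = {
    "CEO / Operator": "Sonnet 4.6",
    "Growth Lead": "Sonnet 4.6",
    "Screenwriter": "Opus 4.8",
    "QA / Critic": "Opus 4.8",
    "Producer": "Sonnet 4.6",
    "Analytics & Learning": "Sonnet 4.6",
    "Observability / Ops": "Haiku 4.5",
    "Secretary": "Haiku 4.5",
}

# Roles where judgment is the product (writing and gating).
JUDGMENT_ROLES = {"Screenwriter", "QA / Critic"}


def check_tiering(roster, judgment_roles):
    """Return the roles whose model tier contradicts the tiering rule."""
    violations = []
    for role, model in roster.items():
        uses_opus = model.startswith("Opus")
        if uses_opus != (role in judgment_roles):
            violations.append(role)
    return violations


violations = check_tiering(ROSTER, JUDGMENT_ROLES)
print(violations or "tiering matches the rule")
```

A check like this is cheap insurance: if someone later bumps a coordination role to the expensive tier, it shows up as a named violation instead of a surprise.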

How a day runs

There is exactly one timer left in the system: a 9am job that wakes the Growth Lead. Everything after that is event-driven. Finishing one step assigns the next ticket to the next agent, and being assigned a ticket is what wakes that agent. A comment doesn't wake anyone; the assignment does.

Here's what that looks like in practice. These are the actual tickets for a single Short, "Hilbert's Infinite Hotel," from idea to measured:

Ticket	Real task name	Assignee
SLO-80	Script: j0034 — Hilbert's Infinite Hotel: The Paradox That Breaks Infinity	QA Critic
SLO-82	Produce: j0034 — Hilbert's Infinite Hotel (paid stages authorized)	QA Critic
SLO-85	Packaging/SEO gate: j0034 — Hilbert's Infinite Hotel	Growth Lead
SLO-86	Publish approval: j0034 — Hilbert's Infinite Hotel	CEO Operator
SLO-87	Publish: j0034 — Hilbert's Infinite Hotel	Producer
SLO-97	Measure + learn: j0033, j0034, j0036	Analytics & Learning

(The assignee is the agent that owned the ticket when it closed; a "Script" ticket finishes assigned to QA because that's who it was handed to for the gate. The numbers skip around because two sibling Shorts moved through the same morning's cycle in parallel: SLO-81, 83, and 84 belong to "The Arrow of Time," which the cron produced the same day.) Six tickets, six handoffs, none of which needed me.
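The "assignment wakes, comment doesn't" rule can be sketched in a few lines. This is an illustration of the idea only: `Ticket`, `Board`, and `wake_queue` are hypothetical names, and nothing here is Paperclip's actual API.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Ticket:
    key: str
    title: str
    assignee: Optional[str] = None
    comments: list = field(default_factory=list)


class Board:
    """Toy ticket board: only assignment enqueues a wake-up."""

    def __init__(self):
        self.wake_queue = deque()  # (agent, ticket key) pairs waiting to run

    def comment(self, ticket, text):
        # A comment is recorded but wakes nobody.
        ticket.comments.append(text)

    def assign(self, ticket, agent):
        # Handing the ticket over is the event that wakes the next agent.
        ticket.assignee = agent
        self.wake_queue.append((agent, ticket.key))


board = Board()
script = Ticket("SLO-80", "Script: j0034")
board.comment(script, "Hook needs to land in the first three seconds.")
woken_after_comment = len(board.wake_queue)  # still 0
board.assign(script, "QA Critic")
print(woken_after_comment, list(board.wake_queue))
```

Keeping the wake-up tied to a single event type means there is one place to look when work stalls: a ticket that nobody was assigned.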

How it decides what to make

This part isn't an LLM guessing. The scouting is mostly statistics.

Every published video is recorded as a falsifiable bet with a measured outcome. Analytics turns the winners and losers into explicit patterns and idea seeds. A plain Thompson-sampling bandit sits over the learned theme and format features and decides, per slot, whether to exploit a known winner or explore something new. Growth Lead takes the bandit's pick plus the learned patterns and shapes the actual bet. The creative agent only enters at the end, working from evidence rather than vibes.
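For readers who haven't met Thompson sampling, here is a minimal Beta-Bernoulli version. It assumes each bet is scored as a simple win or loss and that arms are plain theme labels; the channel's real bandit works over learned theme and format features, so treat this as a sketch of the technique rather than the repo's implementation.

```python
import random


class ThompsonBandit:
    """Beta-Bernoulli Thompson sampling over a fixed set of arms."""

    def __init__(self, arms, seed=None):
        self.rng = random.Random(seed)
        # One [alpha, beta] pair per arm, starting from a uniform Beta(1, 1) prior.
        self.posterior = {arm: [1, 1] for arm in arms}

    def update(self, arm, won):
        """Record one measured bet: a win bumps alpha, a loss bumps beta."""
        self.posterior[arm][0 if won else 1] += 1

    def pick(self):
        """Draw a plausible win rate for every arm and play the highest draw."""
        draws = {
            arm: self.rng.betavariate(alpha, beta)
            for arm, (alpha, beta) in self.posterior.items()
        }
        return max(draws, key=draws.get)


# Illustrative history, not the channel's real data.
arms = ["tragic-genius paradox", "demographic quiz", "untested theme"]
bandit = ThompsonBandit(arms, seed=7)
for _ in range(30):
    bandit.update("tragic-genius paradox", won=True)
    bandit.update("demographic quiz", won=False)

picks = [bandit.pick() for _ in range(200)]
print({arm: picks.count(arm) for arm in arms})
```

The known winner takes most slots, the proven loser almost none, and the untested theme still gets an occasional slot because its posterior is wide. That is the exploit-or-explore decision falling out of the sampling rather than out of a hand-tuned rule.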

What the evidence said for my channel: tragic-genius stories wrapped around a paradox, in math and physics, win. Melancholy, horror, and demographic quiz formats lose. The system figured that out from its own measured history, not from me.

What I control vs what the agents do

The clean line, because it's the whole point:

What I control (the levers):

The goal (what "winning" means: subscribers, watch time, monetization).
Budget caps: per-video and daily spend.
Promotion velocity: how many videos per day, how aggressively to push.
Publishing policy: attended (I approve each) vs unattended.
Content vision and guardrails: themes to chase or avoid.
Strategic development (curation): the occasional analysis brief that reshapes what the company makes.

What the agents do on their own:

Scout the next topic (bandit + learned patterns).
Write and rewrite scripts.
Gate quality at the script and the final cut.
Render, voice, publish, and link each video.
Measure real performance and rewrite the strategy.
Notice stuck work and fix their own coordination.

I set the rules of the game. They play it.

Why I barely touch it

Standing the company up was a one-time cost, and a small one. I declared it in a single config: the company and its goal (cross monetization), the eight agents with their roles, models, and prompts, one cron routine (the 9am wake), and the policy knobs — budget caps, promotion velocity (how many videos a day, how hard to push), and whether publishing is attended or not. Paperclip read that file, created the agents and the goal tree, and the org existed. No per-agent babysitting, no wiring handoffs by hand; the handoffs are just tickets the agents pass among themselves.
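As a rough picture of what those policy knobs look like as data, here is a sketch with a basic sanity check. This is not Paperclip's config format; every field name is hypothetical, and the dollar caps are placeholder values chosen for illustration, not the channel's real settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompanyPolicy:
    goal_subscribers: int      # the monetization bar the company works toward
    wake_cron: str             # the single timer left in the system
    videos_per_day: int        # promotion velocity
    per_video_cap_usd: float   # placeholder value below, not the real cap
    daily_cap_usd: float       # placeholder value below, not the real cap
    attended_publishing: bool  # True = a human approves each publish

    def problems(self):
        """Return a list of human-readable inconsistencies, empty if none."""
        issues = []
        if self.videos_per_day < 1:
            issues.append("videos_per_day must be at least 1")
        if self.per_video_cap_usd <= 0:
            issues.append("per_video_cap_usd must be positive")
        if self.per_video_cap_usd * self.videos_per_day > self.daily_cap_usd:
            issues.append("daily cap cannot cover a full day at the per-video cap")
        return issues


policy = CompanyPolicy(
    goal_subscribers=1000,
    wake_cron="0 9 * * *",  # standard cron syntax for 09:00 every day
    videos_per_day=3,
    per_video_cap_usd=0.25,
    daily_cap_usd=1.00,
    attended_publishing=False,
)
print(policy.problems() or "policy is consistent")
```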

Then I was heads-down on other work, and travelling for a few days. Worth being honest about one fragility here: Paperclip runs on my laptop, so when the lid is shut the 9am cron doesn't fire. While I was away the channel missed a few days of publishing — the price of hosting your autonomous company under your own desk instead of on a server. The rest of the week it ran without me.

Across that week I stepped in by hand three times, and every touch was direction, not labor:

I adjusted the content vision — moved to three videos a day, with one slot reserved as a deliberate experimental bet so the bandit always keeps exploring, not just exploiting. I left it as a comment on the policy task and the Growth Lead translated it into how it picks bets.
I rebranded the channel — an SEO-driven rename from Starship Pilot to Paradox Noir, to match what the data said was winning: paradoxes, told dark.
I asked it to study its own failures, and it re-tooled production. The most recent one, filed just this morning: a one-line brief — analyze the unpopular videos and find the common patterns — and I went back to other work. The Analytics agent ranked every loser and came back with seven loss patterns: off-brand genres failed without exception, a mechanism with no named human lost, vague stakes lost, and, against my own instinct, the videos carrying the most visual effects flopped hardest. The CEO agent turned that into policy on its own: a genre kill-list, a Kling-over-LTX rule, zero effects by default, and a mandatory screenwriter checklist — a named human in the first three seconds, one concrete outcome, a rejection-or-vindication arc, a keyword-first title. It wrote the rules into fresh tickets for the Screenwriter and the Producer and closed the loop without me.

Note the shape of that third one. I didn't run the analysis or write the rules; I asked one question and the company rewrote how it writes and produces. That's the management I actually do now: set the goal, now and then point at a weakness, read what comes back. Scouting, scripting, QA, rendering, publishing, measurement — none of it needs me. I check in to steer and to read, not to drive.

Receipts

Small channel, honest numbers. To date the company has logged 53 bets, produced 43 videos, measured 41, for about 5,388 views and 22 new subscribers. Nobody's quitting their day job. But the distribution is the interesting part:

Video	Views	Retention	Subs	Note
"The mathematician who proved a theorem decades too early"	523	70%	7	best — 100th percentile
"The Cat That Is Alive AND Dead: Superposition"	667	23%	3	most views, but reach without conversion

The cat video got the most eyeballs and taught the least: 23% retention, weak subs. The "genius ahead of his time" video got fewer views but held 70% of them and converted seven subscribers. The company now knows the difference, and the bandit weights toward the second kind.
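Scoring a video "relative to the channel's own history" can be as simple as a percentile rank. The sketch below is one plausible way to do it, not the repo's scoring formula: it ranks a single metric (retention), and the history list is illustrative apart from the 70% and 23% figures quoted above.

```python
def percentile_rank(value, history):
    """Percent of past measurements that this value meets or beats."""
    if not history:
        raise ValueError("need at least one past measurement")
    return 100.0 * sum(1 for past in history if past <= value) / len(history)


# Retention percentages; only 70 and 23 come from the table above.
retention_history = [23, 31, 38, 44, 52, 70]

best = percentile_rank(70, retention_history)
cat = percentile_rank(23, retention_history)
print(f"mathematician video: {best:.0f}th percentile")
print(f"cat video: {cat:.0f}th percentile")
```

Ranking against the channel's own past keeps the score meaningful on a small channel, where absolute view counts say little.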

On cost, the math is the whole pitch. A clean Short runs 7 to 25 cents. Across 21 days the channel spent $13.30 generating 39 videos, a run-rate of about $19 a month at three Shorts a day. The agents' reasoning doesn't add to that: it rides a flat Claude subscription instead of metered API calls, so video generation is the only real spend. Call it roughly $25 a month with headroom, for a company that writes, judges, and ships on its own.
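The run-rate arithmetic is short enough to check by hand. A small sketch using only the figures quoted above, and assuming a 30-day month:

```python
def monthly_run_rate(spend_usd, days_observed, days_per_month=30):
    """Scale observed spend to a monthly figure."""
    return spend_usd / days_observed * days_per_month


run_rate = monthly_run_rate(13.30, 21)  # about $19 a month
avg_per_video = 13.30 / 39              # average includes the one $2.81 outlier
headroom = 25 - run_rate                # slack under the $25 figure

print(f"run rate: ${run_rate:.2f}/month")
print(f"average per video: ${avg_per_video:.2f}")
print(f"headroom under $25: ${headroom:.2f}")
```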

Exactly one video blew past the per-video cap: $2.81. Worth describing why, because it's the kind of thing autonomy quietly does to your wallet. The render pipeline defaults its spend cap to $3 when no cap is passed, and the produce step didn't pass the channel's much tighter budget. So nothing stopped the Producer from generating premium AI video (the costliest model, around $0.31 a scene) on all nine scenes of the Short. Nine premium clips, about $2.80, for one 114-second video. When I looked at the data, spend and virality were negatively correlated: the expensive model wasn't measurably better, it just emptied the budget faster. The fix was to make every channel run derive its cap from the budget automatically, so the $3 default can never apply again.
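The shape of that bug and its fix, sketched under assumptions: the function names and the `per_video_cap_usd` key are hypothetical, and the 0.25 cap is a placeholder rather than the channel's real number. The $3 fallback and the roughly $0.31 premium scene cost are the figures from the paragraph above.

```python
import math

FALLBACK_CAP_USD = 3.00    # the pipeline default that applied when no cap was passed
PREMIUM_SCENE_USD = 0.31   # approximate premium cost per scene


def resolve_cap(channel_budget):
    """Derive the render cap from the channel budget; refuse to fall back."""
    cap = channel_budget.get("per_video_cap_usd")
    if cap is None or cap <= 0:
        raise ValueError("channel budget must define a positive per_video_cap_usd")
    return cap


def max_premium_scenes(cap_usd, scene_cost_usd=PREMIUM_SCENE_USD):
    """How many premium scenes fit under a given cap."""
    return math.floor(cap_usd / scene_cost_usd)


# Under the silent $3 fallback, all nine premium scenes fit.
print(max_premium_scenes(FALLBACK_CAP_USD))

# With a cap derived from the budget (placeholder value), none do.
cap = resolve_cap({"per_video_cap_usd": 0.25})
print(max_premium_scenes(cap))
```

Raising an error on a missing cap is the point of the design: a missing budget should stop the run loudly instead of quietly inheriting a permissive default.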

That's the whole system: a goal, a budget, a cadence, and eight agents that pass tickets until something ships. I set the rules and read the dashboard. The channel does the rest.

Part 9 — *Anatomy of a $25 AI Company* goes inside the machine: how the goal becomes tasks, the two kinds of memory the company runs on, how an agent decides what to do when it wakes, and why the guardrails have to live in code and not in a prompt. (Link when published.)

It's all open source: the company package, the agents, the render pipeline, the bandit. Go read it, fork it, or point out what I got wrong.

⭐ Repo: github.com/dasein108/slope-studio
▶ Live effects gallery: dasein108.github.io/slope-studio
📚 This is a continuation of the Zero to Autopilot series — start at Part 1: I built an AI that runs a YouTube channel, which covers the channel, the pipeline, the cost collapse, the memory, and the bandit, everything this piece builds on.