Lars Winstand

Posted on Jun 27 • Originally published at standardcompute.com

I thought a 24/7 life-ops agent would be one genius bot but it’s actually 10 boring ones

#ai #agents #automation #devops

I started this rabbit hole expecting sci-fi.

You know the pitch: one always-on agent on a Mac mini or home server, quietly running your life while you sleep. It fixes your Plex library, manages Home Assistant, plans trips, handles admin, watches RSS, and only pings you when something actually matters.

Then I read a thread on r/openclaw about a guy doing exactly this for a media server + personal ops setup.

And the interesting part was not the fantasy.

It was the architecture.

The setup used very normal tools: Unraid, Plex, Sonarr, Radarr, FileBot, Home Assistant, archive.org, Discord, Telegram. The agent wasn’t doing movie-trailer-demo intelligence. It was doing background work. Constantly.

That phrase stuck with me: background work.

That’s the real design constraint for 24/7 agents.

Not reasoning benchmarks.
Not AGI vibes.
Not whether GPT-5.4 or Claude Opus 4.6 wins one-shot prompts.

The hard part is building something that can grind through boring tasks all day without turning into either:

a state-management disaster
a billing disaster

And the first surprise is that the best version is usually not one agent.

It’s several small ones.

The winning pattern is not a super-agent

One of the best comments in that OpenClaw thread came from someone running roughly ten agents, with about six active daily, each with a narrow role.

That sounds less impressive than “I built Jarvis.”

It also sounds much more correct.

If you’re building personal ops, home-lab automation, or always-on assistants, the architecture looks more like a tiny ops team than a single autonomous brain.

Something like this:

An inbox/operator agent for triage and final decisions
A media agent for Plex, Sonarr, Radarr, FileBot, subtitle cleanup, missing episodes
A home agent for Home Assistant routines and device actions
A research agent for web lookups, archive.org pulls, ancestry, travel planning
An admin agent for reminders, summaries, follow-ups, recurring tasks
A notification layer that only escalates real interruptions to Telegram

That maps cleanly to how tools like OpenClaw are actually useful.

OpenClaw’s self-hosted Gateway acts like a control plane. Sessions stay isolated by agent, workspace, or sender across channels like Discord and Telegram.

That sounds like an implementation detail.

It’s not.

For long-running agents, session isolation is survival.

If your media cleanup task bleeds into your nonprofit fundraising draft, or your Home Assistant routine inherits context from a half-finished archive.org job, the whole system starts acting haunted.

Developers usually discover this the hard way: long-running agents stop being a prompt problem and start being an operations problem.

That’s true whether you use OpenClaw, n8n, Make, Zapier, or a custom Python worker farm.

Why these setups feel smart for a week and cursed by week three

A lot of people blame the model when their agent stack starts getting weird.

Usually it’s not the model.

It’s state drift.

The original Reddit post described exactly the kind of failure you see in real agent systems:

project lists drifting away from “waiting on me” lists
completed tasks reappearing
items vanishing
background workers getting timid or inconsistent

That’s not “LLMs are fake.”

That’s “you have no durable source of truth.”

One commenter said they fixed this by adding a shared memory/store underneath their lists so different views stopped disagreeing.

That’s why task state matters more than people think.

The board is the product

One of the least flashy and most important ideas in OpenClaw is Workboard.

Not because boards are exciting.

Because persistent agents need a ledger.

A real one.

If an agent drafts a reply but never sends it, should the task be done?
If a worker retries three times and fails, where do you see that?
If an alert fired at 3:14 AM, what run produced it?
If a session goes stale, how do you know what was in progress?

You need visible state tied to logs, run IDs, session IDs, retries, and event history.

That’s the difference between:

“my agent feels magical”
and “my agent can survive contact with reality”

For always-on agents, boards, logs, retries, and stale-session detection matter more than demo quality.

A practical life-ops stack is mostly boring software

This was my favorite part of the research.

The stack is not exotic.

It’s home-lab software with automation surfaces.

Media stack

The Reddit example used:

Unraid
Plex
Sonarr
Radarr
FileBot
live TV channels

That’s already enough surface area for a useful agent.

A media agent does not need cinematic taste.

It needs to:

detect broken naming
rename files correctly
notice missing episodes
fetch metadata/subtitles
escalate edge cases

This kind of command is more useful than 90% of “AI agent” demos:

filebot -rename -r "/input" \
  --db TheMovieDB::TV \
  -non-strict \
  --action duplicate \
  --output "/output" \
  --format "{plex.id}"

That’s real work.

Home automation

Home Assistant already has an OpenAI integration and can control exposed entities through Assist.

That’s powerful.

It’s also telling that the docs explicitly warn users to monitor API usage and set limits.

That warning is not a footnote. It’s a design signal.

Always-on automation creates lots of small calls.

Research and archive tasks

The same Reddit setup included:

archive.org downloads
ancestry research
backpacking trip planning
concert alerts
RSS monitoring

Again: normal tasks.

The internetarchive Python library already gives you a clean automation surface.

Example:

from internetarchive import search_items

query = 'collection:opensource_movies AND subject:"documentary"'
for item in search_items(query):
    print(item["identifier"])

Discord works well for conversational interaction.
Telegram works better for high-priority alerts because it feels distinct from general chat.

Nothing here is futuristic.

That’s why it’s credible.

The expensive part is not brilliance. It’s idling.

This is the part more devs should care about.

Persistent agents don’t get expensive because they’re doing one huge, brilliant task.

They get expensive because they never stop doing small tasks:

polling
summarizing
retrying
classifying
checking state
routing messages
rewriting outputs
generating alerts
logging

That’s where token anxiety comes from.

Not one giant prompt.

A thousand tiny background calls.

And this is where the compute model matters much earlier than most people expect.

If you have 3 to 10 workers doing low-grade activity all day, predictable monthly compute matters more than shaving pennies off a single prompt.

That’s true for OpenClaw.
It’s true for n8n.
It’s true for Make.
It’s true for Zapier.
It’s true for custom worker fleets.

Once you have agents running 24/7, per-token pricing becomes annoying in a very specific way: the expensive part is ambient traffic you stop noticing.

That’s exactly why flat-rate AI is compelling for agent workloads.

A drop-in OpenAI-compatible API with predictable monthly pricing is just a better fit for always-on systems than babysitting token burn across thousands of tiny calls.

That’s the core reason Standard Compute is interesting here.

If you’re building agent-heavy automations, Standard Compute gives you unlimited AI compute at a flat monthly price, works with OpenAI-compatible SDKs and HTTP clients, and removes the need to constantly meter background activity. For persistent workers, retries, summaries, and routing loops, that model makes more sense than per-token billing.

Not because “unlimited” sounds flashy.

Because boring background work is exactly what agents do most.

Which tool is actually best for what?

Not every part of this job belongs in the same interface.

My take is pretty simple:

Option	What it’s actually best for
OpenClaw	Best control plane for long-running personal ops: self-hosted Gateway, multi-channel access through Discord and Telegram, isolated agent sessions, and task tracking tied to ongoing work
Home Assistant + direct OpenAI integration	Best for controlling exposed entities and home routines, but weaker for multi-agent coordination because device control is only one part of the system
Claude Code or Codex	Best for code-heavy tasks, upgrades, debugging, and direct developer workflows where you want stronger hands-on execution
n8n / Make / Zapier	Best for structured workflow automation, SaaS integrations, and event-driven pipelines, but they still need good state management once AI workers run continuously

If I need a control plane for personal ops across Discord, Telegram, and long-running task state, I’d pick OpenClaw over direct Home Assistant + OpenAI.

If I need code edits, debugging, or developer execution, I’d pick Claude Code or Codex.

If I need integration-heavy pipelines, I’d use n8n or Make.

The mistake is assuming one tool should dominate the whole stack.

What I’d build first

If I were building this at home, I would start smaller than the Reddit dream.

Three agents, not ten.

First-pass architecture

OpenClaw Gateway on a Mac mini, VM, or home server
Discord for normal interaction
Telegram only for high-priority alerts
One media agent
One home agent
One admin agent
Workboard enabled from day one
Direct scripts/APIs for execution
GPT or Claude for planning/summarization

That last point matters.

Use LLMs for planning, summarization, classification, and communication.
Use deterministic tools for execution.

Examples:

FileBot CLI for file operations
Home Assistant actions for device control
Python scripts for archive.org tasks
Cron/systemd/timers/queue workers for scheduling

OpenClaw bootstrap

npm install -g openclaw@latest
openclaw onboard --install-daemon

Enable Workboard:

openclaw plugins enable workboard
openclaw gateway restart
openclaw dashboard

Example worker split

agents:
  media:
    responsibilities:
      - plex_health_checks
      - sonarr_radarr_exceptions
      - filebot_renames
      - subtitle_cleanup
    notify: telegram_on_blockers

  home:
    responsibilities:
      - morning_summary
      - failed_automation_retries
      - device_state_checks
    notify: telegram_on_safety_issues

  admin:
    responsibilities:
      - inbox_triage
      - reminders
      - follow_up_lists
      - daily_digest
    notify: discord_default

Example execution pattern

Keep the LLM out of shell execution as much as possible.

import subprocess

def rename_media(path_in: str, path_out: str):
    cmd = [
        "filebot",
        "-rename",
        "-r", path_in,
        "--db", "TheMovieDB::TV",
        "-non-strict",
        "--action", "duplicate",
        "--output", path_out,
        "--format", "{plex.id}",
    ]
    return subprocess.run(cmd, capture_output=True, text=True)

The model should decide when to call this.
It should not freestyle the command every time.

The real lesson: boring beats autonomous

The strongest pattern in these life-ops setups is almost annoying in how unglamorous it is.

The winner is not one dazzling autonomous agent.

It’s a stack of narrow workers doing tiny jobs reliably, with one operator in the middle and a task board keeping everyone honest.

That has two immediate implications for developers:

Treat long-running agents like ops systems, not chat sessions
Pick a compute model that can tolerate constant low-grade traffic

If your setup includes Plex, Home Assistant, archive.org, Discord, Telegram, RSS, and all the weird admin tasks that pile up around real life, I’d optimize in this order:

State hygiene
Session isolation
Predictable compute

Everything else comes after that.

Because the dream is not an agent that feels magical for one weekend.

It’s an agent that quietly handles boring work for months without wrecking your task state or making you afraid to check your API bill.

And if you’re already building this kind of thing with OpenAI-compatible tooling, n8n, Make, Zapier, OpenClaw, or custom workers, this is exactly where Standard Compute fits: flat-rate AI compute for always-on agent systems that do lots of small legitimate work all day.

That’s a much better foundation than pretending your background loops are free.

DEV Community