How I Built an AI-Run SaaS With 4 LLM-Powered Executives (Tutorial)

#ai #saas #automation #indiehackers

TL;DR — Four cron-scheduled LLM workflows act as autonomous "executives" for content/SEO, cold outreach, revenue ops, and weekly retrospectives. Together they offload roughly 130 daily tasks from a solo founder, run on a $15/month VPS, and report into Telegram. Below is the architecture, the prompt structure, and a runnable Python skeleton.

Why four executives, not one mega-agent?

Single-agent setups rely on one giant prompt that has to know everything. They drift, hallucinate context, and get expensive. Splitting the work into bounded roles, each with its own prompt, memory file, and cron schedule, has three benefits:

Cheap retries. A failed retrospective doesn't blow up your outreach pipeline.
Auditable state. Each executive owns one folder, one log, one memory file.
Token economy. In my setup, smaller prompts + targeted RAG meant roughly 60-70% lower spend versus a kitchen-sink agent.

The four roles I landed on after iterating for ~6 months:

Executive	Cadence	Reads	Writes
Content / SEO	Every 6h	Sitemap, search console, blog posts	New post drafts, schema markup, redirects
Outreach	Every 6h	Verified press CSV, prior threads	Personalized cold emails, follow-ups
Revenue Ops	Every 6h	GA4, Stripe/PayPal webhooks, pricing pages	A/B test changes, churn-risk alerts
Retrospective	Daily 22:00	All three above + business KPIs	Markdown report + 3 actions for tomorrow

Each executive is just a Python script. None of them calls another directly; they communicate through a shared state directory (./state/*.json) and a Telegram channel for human-in-the-loop approval.

The architecture in one diagram

   ┌─────────────────────────────────────────────────────┐
   │                  cron (every 6h)                    │
   └────────────┬───────────┬───────────┬────────────────┘
                │           │           │
        ┌───────▼──┐  ┌─────▼────┐  ┌──▼──────┐
        │ Content  │  │ Outreach │  │ RevOps  │
        │  / SEO   │  │          │  │         │
        └────┬─────┘  └─────┬────┘  └──┬──────┘
             │              │          │
             └─────┬────────┴──────────┘
                   │  writes JSON state
                   ▼
              ./state/*.json   ←── Retrospective reads daily
                   │
                   ▼
            Telegram alerts (human approves spend > $5)

The retrospective at 22:00 is the only executive that reads everyone's output and writes back recommendations for tomorrow. Think of it as the COO that nobody else reports to during the day.
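
A minimal sketch of that "read everyone's output" step, assuming each executive writes exactly one JSON file into ./state/ as described above. The names collect_state and build_retro_prompt are illustrative, and the prompt wording is only a placeholder:

```python
import json
import pathlib


def collect_state(state_dir: pathlib.Path, skip: str = "retrospective") -> dict:
    """Load every executive's JSON state file, keyed by file name without extension."""
    snapshot = {}
    for path in sorted(state_dir.glob("*.json")):
        if path.stem == skip:
            continue  # the retrospective should not review its own file
        try:
            snapshot[path.stem] = json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            # A broken file is reported instead of crashing the nightly run.
            snapshot[path.stem] = {"error": "unreadable state file"}
    return snapshot


def build_retro_prompt(snapshot: dict) -> str:
    """Turn the snapshot into one prompt with a section per executive."""
    sections = [
        f"## {name}\n{json.dumps(data, ensure_ascii=False)}"
        for name, data in snapshot.items()
    ]
    header = "Review today's output and propose 3 actions for tomorrow."
    return header + "\n\n" + "\n\n".join(sections)
```

Because the retrospective only reads files, a failed daytime executive shows up as a missing or unreadable section rather than as an exception in the 22:00 job.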

A minimal Python skeleton

Each executive follows the same three-step shape: load context → call the LLM → write JSON output and a Telegram summary.

# executives/_base.py
import json
import os
import pathlib
import urllib.request

# All executives share one state directory next to the executives/ folder.
STATE_DIR = pathlib.Path(__file__).resolve().parent.parent / "state"
STATE_DIR.mkdir(exist_ok=True)

def load_memory(name: str) -> dict:
    """Return the executive's saved state, or {} on its first run."""
    p = STATE_DIR / f"{name}.json"
    return json.loads(p.read_text(encoding="utf-8")) if p.exists() else {}

def save_memory(name: str, data: dict) -> None:
    """Overwrite the executive's state file with `data`."""
    p = STATE_DIR / f"{name}.json"
    p.write_text(json.dumps(data, ensure_ascii=False, indent=2), encoding="utf-8")

def tg_send(text: str) -> None:
    """Post a plain-text message to the Telegram chat."""
    token = os.environ["TG_BOT_TOKEN"]
    chat  = os.environ["TG_CHAT_ID"]
    # Telegram caps a text message at 4096 characters, so truncate.
    body = json.dumps({"chat_id": chat, "text": text[:4096]}).encode()
    req  = urllib.request.Request(
        f"https://api.telegram.org/bot{token}/sendMessage",
        data=body, headers={"Content-Type": "application/json"})
    urllib.request.urlopen(req, timeout=15).read()
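
Since the state files are the only channel between executives, a run that crashes halfway through a write could leave truncated JSON for the next reader. One possible hardening, shown here as a sketch rather than part of the skeleton above, is to write to a temporary file and rename it into place. The function name save_memory_atomic and its state_dir parameter are illustrative:

```python
import contextlib
import json
import os
import pathlib
import tempfile


def save_memory_atomic(state_dir: pathlib.Path, name: str, data: dict) -> None:
    """Write <name>.json so readers see either the old file or the new one."""
    target = state_dir / f"{name}.json"
    # The temp file lives in the same directory so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=state_dir, prefix=f".{name}.", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f, ensure_ascii=False, indent=2)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before the rename
        os.replace(tmp_name, target)  # swaps the file in one step
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_name)
        raise
```

The temporary file ends in .tmp, so anything that scans ./state/*.json never picks up a half-written file.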

The content executive looks like this. It runs as-is once you supply call_llm, which is provider-agnostic: swap in your client of choice.

# executives/content_seo.py
import json

from _base import load_memory, save_memory, tg_send
from llm import call_llm  # any chat-completion client

PROMPT = """You are the Content/SEO executive for a small SaaS.
Last 7 days you published: {recent_titles}.
Search Console queries with rising CTR: {queries}.

Pick ONE new blog topic that:
- targets a long-tail query with KD < 20
- ties back to one of our product pages
- avoids any topic shipped in the last 30 days

Return JSON: {{ "title": ..., "outline": [...], "internal_links": [...] }}.
"""

def run():
    mem = load_memory("content_seo")
    raw = call_llm(PROMPT.format(
        recent_titles=mem.get("recent", []),
        queries=mem.get("rising_queries", []),
    ))
    # Chat clients usually return text, so parse it before indexing into it.
    out = json.loads(raw) if isinstance(raw, str) else raw
    mem.setdefault("queue", []).append(out)  # queue the draft for publishing
    save_memory("content_seo", mem)
    tg_send(f"📝 Content: queued '{out['title']}'")

if __name__ == "__main__":
    run()
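
The script indexes the model's reply like a dict (out['title']), so the reply has to be parsed and checked somewhere, either inside call_llm or in run(). A minimal sketch of such a helper, assuming call_llm hands back raw text that may wrap the JSON in prose or a code fence. The name parse_llm_json is hypothetical; the required keys are taken from the prompt above:

```python
import json


def parse_llm_json(raw: str, required=("title", "outline", "internal_links")) -> dict:
    """Extract the outermost JSON object from a model reply and check its keys."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    obj = json.loads(raw[start:end + 1])  # raises json.JSONDecodeError if malformed
    missing = [key for key in required if key not in obj]
    if missing:
        raise ValueError(f"model output is missing keys: {missing}")
    return obj
```

Raising on a bad reply keeps a malformed draft out of the queue; the cron job fails for that run and the next one retries.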

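Scheduling on Linux is one line per executive in a file under /etc/cron.d (note the user field, root here). cron starts jobs with a minimal environment, so TG_BOT_TOKEN and TG_CHAT_ID have to be defined at the top of this file or loaded by the script itself: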

# /etc/cron.d/kunstudio_executives
0 */6 * * * root /usr/bin/python3 /opt/kunstudio/executives/content_seo.py
30 */6 * * * root /usr/bin/python3 /opt/kunstudio/executives/outreach.py
45 */6 * * * root /usr/bin/python3 /opt/kunstudio/executives/revops.py
0 22 * * * root /usr/bin/python3 /opt/kunstudio/executives/retrospective.py

On Windows I run the same scripts via Task Scheduler — same script, different glue.

Lessons that surprised me

Logs > dashboards. A logs/YYYY-MM-DD.log per executive plus the Telegram digest is enough. I deleted three Grafana dashboards.
Refuse interactive input. Each executive must run unattended. The day you add an interactive prompt is the day automation dies.
Hard token budgets. I cap each run at 6,000 prompt tokens. If context grows beyond that, the executive must summarize its own memory first. This forced clean state design.
"Approve > $5" is the single best human-in-the-loop rule. Anything below $5 the executive just does. Anything above, it asks Telegram. Zero surprise charges in six months.
Retrospective is the moat. The agent that reads everyone else's output and proposes tomorrow's three priorities is what makes the system feel like a team, not a swarm.
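
Two of the rules above, the 6,000-token cap and the $5 approval threshold, can be enforced by small guards. The sketch below is one way to do it under stated assumptions: the 4-characters-per-token estimate is a rough heuristic (swap in your provider's tokenizer), the summarize, do, and ask callables are supplied by the caller, and every function name is illustrative. How an approval reply comes back from Telegram is left out:

```python
from typing import Callable

MAX_PROMPT_TOKENS = 6000       # per-run cap from the lesson above
APPROVAL_THRESHOLD_USD = 5.0   # spend above this needs a human


def estimate_tokens(text: str) -> int:
    """Rough token estimate: about 4 characters per token (an assumption)."""
    return (len(text) + 3) // 4


def build_prompt(template: str, memory_text: str,
                 summarize: Callable[[str], str],
                 budget: int = MAX_PROMPT_TOKENS) -> str:
    """Fill the template, summarizing memory first if the prompt is over budget."""
    prompt = template.format(memory=memory_text)
    if estimate_tokens(prompt) <= budget:
        return prompt
    prompt = template.format(memory=summarize(memory_text))
    if estimate_tokens(prompt) > budget:
        raise ValueError("prompt still over budget after summarizing memory")
    return prompt


def execute_or_ask(action: str, cost_usd: float,
                   do: Callable[[], None],
                   ask: Callable[[str], None],
                   threshold: float = APPROVAL_THRESHOLD_USD) -> bool:
    """Run cheap actions immediately; send expensive ones to a human instead.

    Returns True if the action ran, False if approval was requested.
    A cost of exactly the threshold runs without asking (a choice made here).
    """
    if cost_usd > threshold:
        ask(f"Approve '{action}' for ${cost_usd:.2f}?")
        return False
    do()
    return True
```

In an executive, ask would typically be tg_send, and summarize would be a second, cheaper LLM call that compresses the memory file.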

What it doesn't do

This is not AGI. There is no live blackboard that agents negotiate over, and no peer review. Each executive is a small, scoped Python script with a clear prompt and a clear output schema. The "team" emerges from cron + a shared state directory + a daily retro.

That's the whole point — boring infrastructure makes autonomy reliable.

The four-role split came out of running a small global SaaS for Korean-style personality reports, where splitting work across role-specific prompts cut my own daily ops to under an hour. If you want to go deeper, I'm happy to share the cron file and the retrospective prompt; ping me below.