Building a Free Autonomous Content Pipeline with Claude CLI and Python (Max Plan, Zero Per-Token Cost)


I run a small content pipeline that drafts articles every morning without me touching it. The interesting part isn't the automation — it's that it costs nothing beyond a subscription I already pay for. Here's how it actually works, including the parts that are annoying.

The cost trick: CLI instead of API

The Anthropic API bills per token. For a pipeline that generates several long drafts a day, that adds up fast and makes you nervous about every loop. The Claude CLI on a Max plan is different: usage is covered by the flat monthly subscription. You can call it as many times as your rate limits allow without watching a token meter.

So the core idea is simple — instead of import anthropic and an API key, I shell out to the claude command and capture stdout. Same model, no per-token accounting.

import subprocess

def claude(prompt: str, timeout: int = 300) -> str:
    """Send one prompt to the Claude CLI and return its stdout."""
    result = subprocess.run(
        ["claude", "-p", prompt],   # -p = headless "print" mode
        capture_output=True,
        text=True,
        timeout=timeout,            # raises subprocess.TimeoutExpired if exceeded
        encoding="utf-8",
    )
    if result.returncode != 0:
        raise RuntimeError(result.stderr.strip())
    return result.stdout.strip()

claude -p runs non-interactively: it takes the prompt, prints the answer, and exits. That single function is the whole bridge between Python and the model.
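The wrapper has a few failure modes that are worth collapsing into one exception type: the binary missing from PATH, a timeout, and a non-zero exit. The following is a minimal sketch of that idea, not a definitive implementation. The names `CliError` and `run_cli` are hypothetical helpers introduced here for illustration, and resolving the executable with `shutil.which` is an assumption that may help on Windows installs where the command is a `.cmd` shim.

```python
import shutil
import subprocess


class CliError(RuntimeError):
    """Hypothetical error type: the wrapped command produced no usable output."""


def run_cli(argv: list[str], timeout: int = 300) -> str:
    """Run a command, return stripped stdout, raise CliError on any failure."""
    exe = shutil.which(argv[0])  # resolve via PATH (and PATHEXT on Windows)
    if exe is None:
        raise CliError(f"{argv[0]!r} not found on PATH")
    try:
        result = subprocess.run(
            [exe, *argv[1:]],
            capture_output=True,
            text=True,
            encoding="utf-8",
            timeout=timeout,
        )
    except subprocess.TimeoutExpired as exc:
        raise CliError(f"timed out after {timeout}s") from exc
    if result.returncode != 0:
        # stderr can be empty, so fall back to the exit code
        raise CliError(result.stderr.strip() or f"exit code {result.returncode}")
    return result.stdout.strip()


def claude(prompt: str, timeout: int = 300) -> str:
    """Same contract as the wrapper above, built on run_cli."""
    return run_cli(["claude", "-p", prompt], timeout=timeout)
```

With a single exception type, the orchestration code only needs one `except CliError` to decide whether to retry, skip the item, or abort the run.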

Structuring the pipeline

I keep the orchestration in plain Python because it's easy to debug and doesn't need a framework. Each channel (blog post, summary, tags) is just a function that builds a prompt and calls claude().

def generate_post(topic: str) -> dict:
    body = claude(f"Write a 700-word dev article about: {topic}. "
                  "Return Markdown only.")
    title = claude(f"Write one concise SEO title for this article:\n\n{body[:800]}")
    return {"title": title, "body": body}

The pattern that matters: one responsibility per call. Asking for the article, the title, and the tags in a single prompt gives you a tangled blob that's hard to validate. Separate calls are slower but each output is trivially checkable.
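To make "trivially checkable" concrete, here is a rough sketch of per-output checks that could run before the judge call. The function names and every threshold (90 characters, 500 to 900 words) are assumptions chosen for illustration, not values from my actual pipeline.

```python
def check_title(title: str, max_len: int = 90) -> list[str]:
    """Return a list of problems with a generated title (empty list = OK)."""
    problems = []
    text = title.strip()
    if not text:
        problems.append("title is empty")
    if "\n" in text:
        problems.append("title spans more than one line")
    if len(text) > max_len:
        problems.append(f"title is {len(text)} chars, limit {max_len}")
    return problems


def check_body(body: str, min_words: int = 500, max_words: int = 900) -> list[str]:
    """Return a list of problems with a generated body (empty list = OK)."""
    problems = []
    words = len(body.split())
    if not min_words <= words <= max_words:
        problems.append(f"body has {words} words, expected {min_words}-{max_words}")
    return problems
```

Because each call returns exactly one kind of output, each check stays a few lines long. A combined title-plus-body-plus-tags response would need parsing before any of this could run.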

The quality gate (this is the important part)

Autonomous generation drifts. The most common failure I hit was the title promising one thing and the body delivering another. So before anything gets saved, a second call acts as a judge:

def passes_gate(title: str, body: str) -> bool:
    verdict = claude(
        f"Does this body match its title? Title: {title}\n\n"
        f"Body: {body[:1500]}\n\n"
        "Answer only PASS or FAIL with one reason."
    )
    return verdict.upper().startswith("PASS")

If it fails, the item is discarded and logged, never published. Using the model to check its own output isn't perfect (and this gate only sees the first 1,500 characters of the body, so drift later in the article can slip through), but it catches the obvious drift cheaply, and on a flat plan the extra verification call adds no per-token cost.
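The "discarded and logged" step can be sketched as follows, assuming the judge's reply is free text that should start with PASS or FAIL. `parse_verdict` and `log_discard` are hypothetical helpers, and the JSON-lines log format is an assumption. Anything that cannot be parsed is treated as a failure, so a rambling verdict never lets a draft through.

```python
import json
import re
from datetime import datetime, timezone
from pathlib import Path

# Verdict must begin with the whole word PASS or FAIL (markup like ** is tolerated)
_VERDICT = re.compile(r"^\W*(PASS|FAIL)\b[\s:.,*\-]*(.*)$", re.IGNORECASE | re.DOTALL)


def parse_verdict(verdict: str) -> tuple[bool, str]:
    """Split a judge reply into (passed, reason); unparseable replies fail."""
    text = verdict.strip()
    match = _VERDICT.match(text)
    if match is None:
        return False, f"unparseable verdict: {text[:80]!r}"
    return match.group(1).upper() == "PASS", match.group(2).strip()


def log_discard(path: Path, title: str, reason: str) -> None:
    """Append one JSON line describing a discarded draft."""
    record = {
        "time": datetime.now(timezone.utc).isoformat(),
        "title": title,
        "reason": reason,
    }
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
```

Keeping the judge's reason next to the title makes it possible to review later whether the gate is rejecting drafts for sensible causes.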

Running it every morning

On Windows I use Task Scheduler; on Linux, cron. The script writes drafts to a folder, appends a line to a log, and exits. No daemon, no always-on bot — a single process that runs and dies is far easier to reason about than a long-lived service.

# crontab: every day at 07:00
0 7 * * * /usr/bin/python3 /home/me/pipeline/run.py >> /home/me/pipeline/run.log 2>&1

Keep one log per run with timestamps. When a morning produces garbage, the log is the only thing that tells you whether the model, the prompt, or the gate was at fault.
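One way to get a timestamped log file per run is the standard `logging` module. This is a sketch under assumptions: the `open_run_log` name, the `run-YYYYmmdd-HHMMSS.log` file naming, and the `stage=... result=...` message style are all illustrative choices.

```python
import logging
from datetime import datetime
from pathlib import Path


def open_run_log(log_dir: Path) -> logging.Logger:
    """Create a logger that writes timestamped lines to a fresh file for this run."""
    log_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    logger = logging.getLogger(f"pipeline.{stamp}")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep output in the run file only
    handler = logging.FileHandler(log_dir / f"run-{stamp}.log", encoding="utf-8")
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger
```

A run would then call something like `log = open_run_log(Path("logs"))` once at startup and record each stage, for example `log.info("stage=gate result=FAIL title=%r", title)`, so a bad morning can be traced to the prompt, the model output, or the gate.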

Honest limitations

Drafts, not publishing. I let it generate and gate, but the final post-to-platform step stays manual or semi-automated. Fully automated publishing to accounts you care about risks bans and embarrassing mistakes — not worth it.
Rate limits are real. A Max plan is generous, but a tight loop will hit limits. Add retries with backoff and cap the number of items per run.
Subscription terms. Driving the CLI from scripts should respect Anthropic's usage policies and rate limits — this is for personal automation, not reselling generations.
Quality is "good first draft," not "ship it." The gate removes obvious failures; it doesn't make the writing genuinely good. A human still edits.
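The "retries with backoff" and "cap the number of items per run" advice could look roughly like this. It is a sketch, not my exact code: `with_backoff`, `MAX_ITEMS_PER_RUN`, the delay values, and the choice to retry on `RuntimeError` (the exception the `claude()` wrapper raises) are assumptions for illustration.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")

MAX_ITEMS_PER_RUN = 5  # hypothetical cap; tune to your own limits


def with_backoff(
    call: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 5.0,
    sleep: Callable[[float], object] = time.sleep,
) -> T:
    """Retry call() on RuntimeError with exponential backoff plus jitter."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return call()
        except RuntimeError:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error
            # 5s, 10s, 20s, ... plus up to 1s of jitter
            sleep(base_delay * 2 ** attempt + random.uniform(0, 1))
    raise AssertionError("unreachable")


def cap_items(topics: list[str], limit: int = MAX_ITEMS_PER_RUN) -> list[str]:
    """Keep only the first `limit` topics so one run cannot loop unbounded."""
    return topics[:limit]
```

Usage would be along the lines of `with_backoff(lambda: claude(prompt))` inside a loop over `cap_items(topics)`. The `sleep` parameter exists only so the delay can be swapped out in tests.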

Takeaway

The whole pipeline is three pieces: a subprocess wrapper around claude -p, small single-purpose generation functions, and a model-as-judge gate before anything is saved — kicked off by a scheduler once a day. The Max plan turns "every extra call costs money" into "calls are effectively free," which changes how you design: you can afford verification passes, retries, and throwaway drafts. Start with one channel, log everything, and keep the publish step human until you trust the output.

