German Yamil

Posted on Apr 16 • Edited on May 16

Python Ebook Automation in 2026: The Complete Stack for Solo Developers

#python #selfpublishing #tutorial #automation

Python ebook automation in 2026 is no longer experimental. The tooling is stable, the APIs are reliable, and the economics work for solo developers. This is the full stack, no fluff.

🎁 Free resource: AI Publishing Checklist — 7 steps to ship a technical ebook with Python (free, no email required) · Full pipeline + 10 scripts: germy5.gumroad.com/l/xhxkzz (pay what you want, min $9.99)

The Four Layers

🚀 If this is useful to you: the complete pipeline (state machine, AST + subprocess validation, bilingual EPUB assembly, and Gumroad API integration) is available as a ready-to-run system at germy5.gumroad.com/l/xhxkzz ($9.99, 30-day money-back guarantee).

Every production-ready pipeline has these four layers:

Generation — Claude API (claude-opus-4-5 or sonnet) writes chapter drafts
Validation — AST parsing + subprocess checks enforce technical correctness
Compilation — Pandoc converts Markdown to EPUB (KDP-compliant)
Distribution — Gumroad API publishes instantly; KDP gets a manual upload

Here is how they connect in a single orchestrator.

The Orchestrator Script

#!/usr/bin/env python3
"""
ebook_orchestrator.py — Full pipeline: generate → validate → compile → publish
"""
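import json
import os
import subprocess
import sys
from pathlib import Path

import anthropic
import requests

CHECKPOINT = Path("checkpoint.json")
# Read the token from the environment rather than hard-coding a secret in source.
GUMROAD_TOKEN = os.environ.get("GUMROAD_TOKEN", "your_gumroad_token")
PRODUCT_ID = "xhxkzz"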

def load_state() -> dict:
    if CHECKPOINT.exists():
        return json.loads(CHECKPOINT.read_text())
    return {"stage": "init", "chapters": {}}

def save_state(state: dict):
    CHECKPOINT.write_text(json.dumps(state, indent=2))

# --- Layer 1: Generation ---
def generate_chapter(client: anthropic.Anthropic, title: str, outline: str) -> str:
    print(f"  Generating: {title}")
    message = client.messages.create(
        model="claude-opus-4-5",
        max_tokens=4096,
        messages=[{
            "role": "user",
            "content": (
                f"Write a technical ebook chapter titled '{title}'.\n"
                f"Outline:\n{outline}\n\n"
                "Requirements: Python code examples, no filler, ~1200 words."
            )
        }]
    )
    return message.content[0].text

# --- Layer 2: Validation ---
def validate_chapter(content: str, chapter_path: Path) -> bool:
    chapter_path.write_text(content)
    # Extract and AST-check all Python fences
    import ast, re
    blocks = re.findall(r"```python\n(.*?)```", content, re.DOTALL)
    for i, block in enumerate(blocks):
        try:
            ast.parse(block)
        except SyntaxError as e:
            print(f"  SyntaxError in block {i}: {e}")
            return False
    print(f"  Validated {len(blocks)} code block(s)")
    return True

# --- Layer 3: Compilation ---
def compile_epub(chapters_dir: Path, output_path: Path, metadata: dict) -> Path:
    chapter_files = sorted(chapters_dir.glob("ch*.md"))
    cmd = [
        "pandoc",
        "--from", "markdown",
        "--to", "epub3",
        "--metadata", f"title={metadata['title']}",
        "--metadata", f"author={metadata['author']}",
        "--epub-cover-image", metadata["cover"],
        "-o", str(output_path),
        *[str(f) for f in chapter_files]
    ]
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(f"Pandoc failed: {result.stderr}")
    print(f"  EPUB compiled: {output_path} ({output_path.stat().st_size // 1024} KB)")
    return output_path

# --- Layer 4: Distribution ---
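def update_gumroad_product(epub_path: Path, price_cents: int = 999) -> dict:
    # NOTE: the HTTP method and the "url" file field are kept as written here;
    # check both against the current Gumroad API reference before relying on them.
    url = f"https://api.gumroad.com/v2/products/{PRODUCT_ID}"
    with open(epub_path, "rb") as f:
        response = requests.post(url, data={
            "access_token": GUMROAD_TOKEN,
            "price": price_cents,  # in cents: 999 = $9.99, the price quoted in this article
            "name": "Python Ebook Automation Pipeline",
        }, files={"url": f}, timeout=60)  # fail instead of hanging on a stalled upload
    response.raise_for_status()
    return response.json()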

# --- Main Orchestrator ---
def run_pipeline(outlines: list[dict], metadata: dict):
    client = anthropic.Anthropic()
    state = load_state()
    chapters_dir = Path("chapters")
    chapters_dir.mkdir(exist_ok=True)

    # Stage 1: Generate & validate
    if state["stage"] in ("init", "generating"):
        state["stage"] = "generating"
        for item in outlines:
            key = item["title"]
            if key in state["chapters"]:
                print(f"  Skipping (cached): {key}")
                continue
            content = generate_chapter(client, item["title"], item["outline"])
            path = chapters_dir / item["filename"]
            if validate_chapter(content, path):
                state["chapters"][key] = str(path)
                save_state(state)
            else:
                print(f"  Validation failed for {key}. Fix and re-run.")
                sys.exit(1)
        state["stage"] = "compiling"
        save_state(state)

    # Stage 2: Compile
    if state["stage"] == "compiling":
        epub = compile_epub(chapters_dir, Path("book.epub"), metadata)
        state["stage"] = "publishing"
        state["epub"] = str(epub)
        save_state(state)

    # Stage 3: Publish
    if state["stage"] == "publishing":
        result = update_gumroad_product(Path(state["epub"]))
        print(f"  Published: {result['product']['short_url']}")
        state["stage"] = "done"
        save_state(state)

    print("Pipeline complete.")

if __name__ == "__main__":
    outlines = [
        {
            "title": "Setting Up the Generation Pipeline",
            "filename": "ch01.md",
            "outline": "Claude API auth, prompt structure, streaming vs batch"
        },
        {
            "title": "Code Validation with AST",
            "filename": "ch02.md",
            "outline": "ast.parse, extracting fences, subprocess test runner"
        },
    ]
    metadata = {
        "title": "Python Ebook Automation",
        "author": "Your Name",
        "cover": "cover.jpg"
    }
    run_pipeline(outlines, metadata)

Why Each Layer Matters

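Generation without validation is unreliable. LLMs produce plausible-looking but broken code. The AST pass catches syntax errors before they reach readers. It does not execute anything, though, so runtime failures such as a bad import or a wrong function name need a separate subprocess check.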
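The orchestrator above only runs the AST pass. A minimal sketch of the subprocess half might look like the following. The `run_block` name, the 10-second default timeout, and the temporary-directory layout are assumptions for illustration, not part of the original script. It runs generated code on your machine, so treat it as something to use inside a sandbox or container.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_block(code: str, timeout_s: float = 10.0) -> tuple[bool, str]:
    """Run one code block in a separate interpreter; return (ok, stderr)."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "block.py"
        script.write_text(code, encoding="utf-8")
        try:
            result = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True,
                text=True,
                timeout=timeout_s,
                cwd=tmp,  # keep any files the block writes out of the project
            )
        except subprocess.TimeoutExpired:
            return False, f"timed out after {timeout_s} s"
    # A non-zero exit code means the block raised or called sys.exit(n != 0).
    return result.returncode == 0, result.stderr
```

A block that passes `ast.parse` but imports a module that does not exist fails here with a `ModuleNotFoundError` in the returned stderr, which is the class of error the AST pass cannot see.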

Validation without compilation is incomplete. Pandoc's epub3 output is generally KDP-compliant as long as you pass the title, author, and cover-image flags shown in compile_epub. Assembling the EPUB by hand invites formatting errors.
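One way to keep the flags reviewable is to build the Pandoc command in a pure function and run it separately. The sketch below is an assumption-laden variant of `compile_epub`: `build_pandoc_cmd`, the `lang` metadata default, and the table-of-contents depth are illustrative choices, and nothing here claims to be a KDP requirement.

```python
from pathlib import Path


def build_pandoc_cmd(chapters: list[Path], output: Path, metadata: dict,
                     toc: bool = True) -> list[str]:
    """Return the pandoc argument list without running it."""
    cmd = [
        "pandoc", "--from", "markdown", "--to", "epub3",
        "--metadata", f"title={metadata['title']}",
        "--metadata", f"author={metadata['author']}",
        # Assumed default; set "lang" in metadata to override.
        "--metadata", f"lang={metadata.get('lang', 'en-US')}",
    ]
    if toc:
        cmd += ["--toc", "--toc-depth=2"]
    if metadata.get("cover"):
        cmd += ["--epub-cover-image", metadata["cover"]]
    cmd += ["-o", str(output), *map(str, chapters)]
    return cmd
```

Because the function only builds a list, it can be unit-tested on a machine that does not have Pandoc installed; pass the result to `subprocess.run` when you are ready to compile.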

Publishing without a product link is invisible. Gumroad's API lets you update the file programmatically so your buy page URL never changes across editions.

Runtime Expectations

On a 10-chapter book:

Generation: 8–12 minutes (rate limits permitting)
Validation: under 5 seconds
Pandoc compilation: under 30 seconds
Gumroad API call: under 2 seconds

Total wall time: roughly 15 minutes of active pipeline, plus however long you spend reviewing the output.
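These figures will vary with model, rate limits, and machine, so it is worth measuring your own run. A minimal sketch, assuming a hypothetical `timed` helper and a module-level `timings` dict that are not part of the original script:

```python
import time
from contextlib import contextmanager

# Stage name -> elapsed seconds for the most recent run of that stage.
timings: dict[str, float] = {}


@contextmanager
def timed(stage: str):
    """Record wall-clock time for a stage, even if it raises."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = time.perf_counter() - start


# Usage: wrap each stage of the orchestrator in its own block.
with timed("validation"):
    sum(range(1000))  # stand-in for the real validation work

for stage, seconds in timings.items():
    print(f"{stage}: {seconds:.3f} s")
```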

State Machine Design

The checkpoint.json approach is non-negotiable for production. Claude API calls cost money. If compilation fails after 10 successful chapters, you want to resume from compiling, not regenerate everything.
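A checkpoint is only useful if it survives a crash in the middle of a write. One common approach, sketched here under the assumption that the checkpoint lives on a local filesystem, is to write to a temporary file and then rename it over the real one. `save_state_atomic` is a hypothetical replacement for `save_state`, not the original implementation.

```python
import json
import os
import tempfile
from pathlib import Path


def save_state_atomic(state: dict, checkpoint: Path) -> None:
    """Write the checkpoint via a temp file so readers never see a partial file."""
    fd, tmp_name = tempfile.mkstemp(dir=checkpoint.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(state, tmp, indent=2)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_name, checkpoint)  # replaces the old checkpoint in one step
    except BaseException:
        os.unlink(tmp_name)  # clean up the temp file if anything failed
        raise
```

If the process dies before `os.replace`, the previous checkpoint is still intact and the next run resumes from it.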

The state transitions are: init → generating → compiling → publishing → done. A new stage is written to checkpoint.json only after the previous stage has finished, so a failure leaves the saved state at the failing stage and the next run resumes there.
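The transition order can be made explicit so that a corrupted or hand-edited checkpoint fails loudly instead of silently skipping stages. This is a minimal sketch; `STAGES` and `next_stage` are illustrative names that do not appear in the orchestrator above.

```python
# Stage order taken from the transitions described in this article.
STAGES = ["init", "generating", "compiling", "publishing", "done"]


def next_stage(current: str) -> str:
    """Return the stage that follows `current`; "done" stays "done"."""
    if current not in STAGES:
        raise ValueError(f"unknown stage in checkpoint: {current!r}")
    i = STAGES.index(current)
    return STAGES[min(i + 1, len(STAGES) - 1)]


print(next_stage("init"))       # generating
print(next_stage("compiling"))  # publishing
print(next_stage("done"))       # done
```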

What This Stack Costs

Claude API: ~$0.50–$2.00 per book depending on model and length
Gumroad: 10% + payment fees per sale
KDP: 35% or 70% royalty, depending on list price and the royalty option you qualify for
Pandoc: free
Python runtime: whatever you already run
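The figures above can be turned into a rough break-even check. This sketch uses the 10% Gumroad fee and the $9.99 price from this article; the payment-processing fee is left as a parameter defaulting to zero because the article does not give an amount, so substitute your own. Both function names are hypothetical.

```python
import math


def net_per_sale(price: float, platform_fee_rate: float = 0.10,
                 payment_fee: float = 0.0) -> float:
    """Revenue kept per sale after the platform cut and a flat payment fee."""
    return price * (1 - platform_fee_rate) - payment_fee


def sales_to_cover(cost: float, price: float, **fees: float) -> int:
    """Smallest number of sales whose net revenue covers `cost`."""
    return math.ceil(cost / net_per_sale(price, **fees))


print(round(net_per_sale(9.99), 2))  # 8.99
print(sales_to_cover(2.00, 9.99))    # 1
```

Under these assumptions a single Gumroad sale covers even the upper $2.00 estimate for API cost per book.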

This pipeline is documented in full (prompts, validation logic, state machine, Gumroad + KDP integration) in the Python Ebook Automation Pipeline guide ($9.99, 30-day money-back guarantee).

If this saved you time, the ❤️ button helps other developers find it.

Get the Full Pipeline

This article is part of the Python AI Publishing Pipeline series — a complete system to write, validate, and publish technical ebooks with Python and Claude.

📋 Free checklist: 7 steps to ship a Python ebook — PDF, no email required.

🚀 Full pipeline + source code: germy5.gumroad.com/l/xhxkzz — $9.99, 30-day money-back guarantee.
