If you search "AI ebook generator" you will find SaaS tools that promise a book in minutes. They work fine for general non-fiction. They fail for technical content — broken code examples, shallow explanations, no AST validation, no control over the prompt structure. Building your own takes a day. It pays back on every book after the first.
## Why Off-the-Shelf Tools Fall Short
The core problem is that technical ebooks require:
- Code that actually runs — SaaS generators treat code as text. There is no execution or AST check.
- Consistent terminology — "container" means something specific in Docker chapters. Generic tools drift.
- Controllable depth — a chapter on async Python needs different treatment than a chapter on CI/CD.
- Resumability — a 10-chapter generation run takes 15–20 minutes and costs API credits. If it crashes at chapter 8, you need checkpointing.
None of the major SaaS AI ebook generators handle any of these. They are content mills with an LLM front end.
## The Core Generation Loop
The minimum viable AI ebook generator is a prompt template, a state machine, and an output validator.
```python
#!/usr/bin/env python3
"""
generator.py — LLM generation loop with checkpoint.json state machine
"""
import ast
import json
import re
import sys
from pathlib import Path

import anthropic

CHECKPOINT_FILE = Path("checkpoint.json")
CHAPTERS_DIR = Path("chapters")


# ── State Machine ─────────────────────────────────────────────────────────────

def load_state() -> dict:
    """Load generation state; resume from last successful stage."""
    if CHECKPOINT_FILE.exists():
        state = json.loads(CHECKPOINT_FILE.read_text())
        completed = sum(1 for v in state.get("chapters", {}).values() if v["status"] == "done")
        print(f"Resuming: {completed} chapters already done")
        return state
    return {"stage": "init", "chapters": {}, "errors": []}


def save_state(state: dict):
    CHECKPOINT_FILE.write_text(json.dumps(state, indent=2))


def mark_chapter(state: dict, slug: str, status: str, path: str = ""):
    state["chapters"].setdefault(slug, {})
    state["chapters"][slug]["status"] = status
    if path:
        state["chapters"][slug]["path"] = path
    save_state(state)


# ── Prompt Engineering ────────────────────────────────────────────────────────

SYSTEM_PROMPT = """You are writing a chapter for a technical ebook targeting senior developers.
Rules:
- Every code example must be complete and syntactically valid Python 3.11+
- No filler phrases like "In this chapter we will explore..."
- Use concrete examples, not abstract descriptions
- Target 1100-1300 words
- Markdown output only"""


def build_chapter_prompt(chapter: dict, book_context: str) -> str:
    return (
        f"Book context: {book_context}\n\n"
        f"Chapter title: {chapter['title']}\n"
        f"Topics to cover:\n"
        + "\n".join(f"- {t}" for t in chapter["topics"])
        + "\n\nWrite the full chapter now."
    )


# ── Generation ────────────────────────────────────────────────────────────────

def generate_chapter(
    client: anthropic.Anthropic,
    chapter: dict,
    book_context: str,
    model: str = "claude-sonnet-4-5",
) -> str:
    response = client.messages.create(
        model=model,
        max_tokens=4096,
        system=SYSTEM_PROMPT,
        messages=[{
            "role": "user",
            "content": build_chapter_prompt(chapter, book_context),
        }],
    )
    return response.content[0].text


# ── Validation ────────────────────────────────────────────────────────────────

def extract_python_blocks(content: str) -> list[str]:
    return re.findall(r"```python\n(.*?)```", content, re.DOTALL)


def validate_chapter(content: str) -> tuple[bool, list[str]]:
    """AST-check all Python fences. Returns (ok, errors)."""
    errors = []
    blocks = extract_python_blocks(content)
    if not blocks:
        # Warn but don't fail — some chapters may have no code
        print("  Warning: no Python blocks found")
    for i, block in enumerate(blocks):
        try:
            ast.parse(block)
        except SyntaxError as e:
            errors.append(f"Block {i}: {e.msg} at line {e.lineno}")
    return len(errors) == 0, errors


# ── Main Loop ─────────────────────────────────────────────────────────────────

def run(book_config: dict):
    client = anthropic.Anthropic()
    state = load_state()
    CHAPTERS_DIR.mkdir(exist_ok=True)

    chapters = book_config["chapters"]
    book_context = book_config["context"]

    for chapter in chapters:
        slug = chapter["slug"]
        ch_state = state["chapters"].get(slug, {})

        # Skip completed chapters
        if ch_state.get("status") == "done":
            print(f"  Cached: {slug}")
            continue

        print(f"  Generating: {chapter['title']}")
        mark_chapter(state, slug, "generating")

        try:
            content = generate_chapter(client, chapter, book_context)
        except Exception as e:
            print(f"  API error: {e}")
            mark_chapter(state, slug, "api_error")
            sys.exit(1)

        # Validate
        ok, errors = validate_chapter(content)
        if not ok:
            print("  Validation failed:\n" + "\n".join(errors))
            # Save draft anyway for inspection
            draft_path = CHAPTERS_DIR / f"{slug}_DRAFT.md"
            draft_path.write_text(content)
            mark_chapter(state, slug, "validation_failed", str(draft_path))
            # Continue to next chapter; fix manually and re-run
            continue

        # Persist
        ch_path = CHAPTERS_DIR / f"{slug}.md"
        ch_path.write_text(content)
        mark_chapter(state, slug, "done", str(ch_path))
        print(f"  Done: {ch_path}")

    done = [s for s in state["chapters"].values() if s.get("status") == "done"]
    failed = [s for s in state["chapters"].values() if s.get("status") == "validation_failed"]
    print(f"\nSummary: {len(done)} done, {len(failed)} need review")


# ── Config ────────────────────────────────────────────────────────────────────

if __name__ == "__main__":
    config = {
        "context": "A practical guide to FastAPI production deployment on AWS for senior Python developers.",
        "chapters": [
            {
                "slug": "ch01-project-structure",
                "title": "Production FastAPI Project Structure",
                "topics": [
                    "Layered architecture (routers, services, repositories)",
                    "Dependency injection patterns",
                    "Settings management with pydantic-settings",
                ],
            },
            {
                "slug": "ch02-async-patterns",
                "title": "Async Patterns That Actually Matter",
                "topics": [
                    "When async helps and when it hurts",
                    "Background tasks vs Celery",
                    "Avoiding common async pitfalls",
                ],
            },
        ],
    }
    run(config)
```
## The State Machine in Detail
The `checkpoint.json` state machine is the feature that makes this production-grade. States per chapter:

- `generating` — API call in flight
- `done` — content written to disk, AST-clean
- `validation_failed` — draft saved, needs human review
- `api_error` — unrecoverable, pipeline halted
When you re-run after a partial failure, the loop reads checkpoint state and skips done chapters. You only pay for regeneration when validation fails or you explicitly want a fresh version.
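For illustration, here is what a hypothetical `checkpoint.json` might look like after a run where chapter 1 succeeded and chapter 2 failed validation (the field names follow the `mark_chapter` helper above):

```json
{
  "stage": "init",
  "chapters": {
    "ch01-project-structure": {
      "status": "done",
      "path": "chapters/ch01-project-structure.md"
    },
    "ch02-async-patterns": {
      "status": "validation_failed",
      "path": "chapters/ch02-async-patterns_DRAFT.md"
    }
  },
  "errors": []
}
```

On the next run, chapter 1 is skipped as cached and only chapter 2 is regenerated.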
## Custom vs. Off-the-Shelf: The Real Tradeoff
| Factor | SaaS Tool | Custom Pipeline |
|---|---|---|
| Setup time | 5 minutes | 4–8 hours |
| Code validation | None | AST + optional exec |
| Prompt control | Limited templates | Full control |
| Cost per book | $15–50/month subscription | ~$1.50 API costs |
| Resumability | Usually none | Full checkpoint |
| Multi-language | Rare | Straightforward |
For non-technical content — marketing books, guides, how-tos — SaaS tools are fine. For code-heavy technical books, the custom pipeline pays for itself on the first book where you catch a broken example before it ships to readers.
## Extending the Loop
The generation loop above is intentionally minimal. Common extensions:
- Temperature control: lower temperature (0.3–0.5) for more consistent code style
- Retry logic: exponential backoff on API rate limits
- Word count enforcement: post-process and flag chapters outside 1000–1400 words
- Cross-chapter consistency: pass previous chapter summaries as context
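The retry extension, for instance, can be sketched as a small wrapper — `with_retries` is a hypothetical helper of mine, not part of the generator above:

```python
import random
import time


def with_retries(call, max_attempts: int = 5, base_delay: float = 1.0):
    """Retry a zero-argument callable with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception as exc:  # in practice, catch anthropic.RateLimitError
            if attempt == max_attempts - 1:
                raise
            # 1s, 2s, 4s, ... plus up to 0.5s of jitter to avoid thundering herd
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            print(f"Retry {attempt + 1}/{max_attempts} in {delay:.1f}s: {exc}")
            time.sleep(delay)
```

Wrapping the `generate_chapter` call in `run()` with this helper turns transient rate-limit errors into short delays instead of a halted pipeline.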
Each extension is a function you own. No SaaS tool gives you that control.
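Word count enforcement, as one example, is a few lines — a minimal sketch, assuming you strip code fences first and accept a naive whitespace split (`check_word_count` is my name, not from the script above):

```python
import re


def check_word_count(content: str, low: int = 1000, high: int = 1400) -> tuple[bool, int]:
    """Count prose words, ignoring fenced code blocks; flag out-of-range chapters."""
    prose = re.sub(r"```.*?```", "", content, flags=re.DOTALL)
    count = len(prose.split())
    return low <= count <= high, count
```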
If you want the complete system — prompts, validation, state machine, Pandoc compilation, Gumroad integration — rather than building from scratch: Python Ebook Automation Pipeline ($12.99, 30-day refund).
📋 Free: AI Publishing Checklist — 7 steps to ship a technical ebook with Python (PDF, free)
Full pipeline + 10 scripts: germy5.gumroad.com/l/xhxkzz — $12.99 launch price