Show Dev: I Built a Python Pipeline That Writes, Validates, and Publishes Bilingual Ebooks
Six weeks ago I had an idea that felt slightly ridiculous.
What if I built an automated pipeline that generates a technical ebook — and then used that pipeline to produce the ebook that documents itself?
The ebook about the pipeline would be the proof that the pipeline works.
🎁 Free resource: AI Publishing Checklist — 7 steps to ship a technical ebook with Python (free, no email required) · Full pipeline: germy5.gumroad.com/l/xhxkzz (pay what you want, min $9.99)
What I built
A Python pipeline that:
- Takes an `outline.json` (10 chapters, each with title, word target, code deliverable)
- Generates each chapter using the Claude API
- Validates every code snippet through two gates before advancing
- Translates each chapter to Spanish with QA checks
- Assembles two EPUBs (EN + ES) with Pandoc
- Creates the Gumroad product listing via API
Total active time per book: 4–6 hours. The pipeline runs the rest unattended.
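To make the input concrete, here is a minimal sketch of loading and sanity-checking an outline. The field names (`chapters`, `title`, `word_target`, `code_deliverable`) and the `load_outline` helper are assumptions for illustration; the post only says each chapter has a title, a word target, and a code deliverable.

```python
import json
from dataclasses import dataclass


@dataclass
class ChapterSpec:
    title: str
    word_target: int
    code_deliverable: str


REQUIRED = {"title", "word_target", "code_deliverable"}  # assumed field names


def load_outline(text: str) -> list[ChapterSpec]:
    """Parse outline JSON and fail early if a chapter is missing a field."""
    data = json.loads(text)
    chapters = []
    for i, raw in enumerate(data["chapters"], start=1):
        missing = REQUIRED - raw.keys()
        if missing:
            raise ValueError(f"Chapter {i} is missing fields: {sorted(missing)}")
        chapters.append(
            ChapterSpec(raw["title"], int(raw["word_target"]), raw["code_deliverable"])
        )
    return chapters


# Hypothetical two-chapter outline (the real one has 10 chapters).
sample = json.dumps({"chapters": [
    {"title": "Pipeline overview", "word_target": 2500, "code_deliverable": "state_machine.py"},
    {"title": "Code validation", "word_target": 3000, "code_deliverable": "validate.py"},
]})
outline = load_outline(sample)
print([c.title for c in outline])
```

Failing on a malformed outline before any API call is made keeps a typo in the JSON from costing a generation run.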
The core: two-gate code validation
Most technical ebooks have code that was never tested. I made that impossible.
Gate 1: AST parsing
```python
import ast

def validate_syntax(code: str) -> bool:
    try:
        ast.parse(code)
        return True
    except SyntaxError as e:
        print(f"Syntax error at line {e.lineno}: {e.msg}")
        return False
```
Gate 2: Subprocess isolation
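```python
import os
import subprocess
import tempfile

def validate_execution(code: str, timeout: int = 30) -> bool:
    # Run the snippet in a throwaway directory so it cannot touch project files.
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "test.py")
        with open(path, "w", encoding="utf-8") as f:
            f.write(code)
        try:
            result = subprocess.run(
                ["python3", path],
                capture_output=True, timeout=timeout, cwd=tmpdir
            )
        except subprocess.TimeoutExpired:
            # A hung snippet fails the gate instead of crashing the pipeline.
            print(f"Timed out after {timeout}s")
            return False
    if result.returncode != 0:
        print(result.stderr.decode(errors="replace"))
        return False
    return True
```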
A chapter only reaches DONE state when both return True. There is no override.
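As a rough sketch of how the two gates might be wired into a single decision per chapter: the `passes_both_gates` and `chapter_state` helpers below are hypothetical names, and sending a failing chapter to NEEDS_REVIEW is an assumption based on the state names used in this post. It uses `sys.executable` so the example runs wherever the current interpreter does.

```python
import ast
import subprocess
import sys
import tempfile
from pathlib import Path


def passes_both_gates(code: str, timeout: int = 30) -> bool:
    """Gate 1: the code must parse. Gate 2: it must run and exit with 0."""
    try:
        ast.parse(code)
    except SyntaxError:
        return False
    with tempfile.TemporaryDirectory() as tmpdir:
        path = Path(tmpdir) / "test.py"
        path.write_text(code, encoding="utf-8")
        try:
            result = subprocess.run(
                [sys.executable, str(path)],
                capture_output=True, timeout=timeout, cwd=tmpdir,
            )
        except subprocess.TimeoutExpired:
            return False
    return result.returncode == 0


def chapter_state(snippets: list[str]) -> str:
    """DONE only if every snippet passes both gates (assumed failure state: NEEDS_REVIEW)."""
    return "DONE" if all(passes_both_gates(s) for s in snippets) else "NEEDS_REVIEW"


print(chapter_state(["print(1 + 1)"]))           # DONE
print(chapter_state(["print(1 + 1)", "1 / 0"]))  # NEEDS_REVIEW: second snippet raises
```

Because the check is `all(...)`, one bad snippet is enough to hold back the whole chapter, which matches the "no override" rule.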
The state machine
```
PENDING → RUNNING → DONE
             ↘ NEEDS_REVIEW → (fix) → PENDING
```
Every state change writes to disk immediately. If the process crashes mid-generation, the next run resets RUNNING chapters to PENDING and skips DONE ones. I've had 3 crashes during production — no data loss, no re-doing finished chapters.
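The crash-recovery behavior can be sketched in a few lines. This is an illustration under assumptions, not the pipeline's actual code: the JSON state file, the `chXX` keys, and the helper names are all hypothetical.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Optional


def save_state(path: Path, state: dict[str, str]) -> None:
    """Write to a temp file, then rename, so a crash cannot leave a half-written file."""
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(state, indent=2), encoding="utf-8")
    os.replace(tmp, path)  # replaces the target in one step on the same filesystem


def load_state(path: Path) -> dict[str, str]:
    """Load state and reset chapters that were mid-run when the process died."""
    state = json.loads(path.read_text(encoding="utf-8"))
    recovered = {ch: ("PENDING" if s == "RUNNING" else s) for ch, s in state.items()}
    if recovered != state:
        save_state(path, recovered)
    return recovered


def next_chapter(state: dict[str, str]) -> Optional[str]:
    """First PENDING chapter, or None when nothing is left to generate."""
    return next((ch for ch, s in state.items() if s == "PENDING"), None)


with tempfile.TemporaryDirectory() as d:
    state_file = Path(d) / "state.json"
    # Simulate a crash: ch02 was RUNNING when the process died.
    save_state(state_file, {"ch01": "DONE", "ch02": "RUNNING", "ch03": "PENDING"})
    state = load_state(state_file)
    print(state)                # {'ch01': 'DONE', 'ch02': 'PENDING', 'ch03': 'PENDING'}
    print(next_chapter(state))  # ch02
```

DONE chapters are never touched on reload, so a restart only repeats the one chapter that was in flight.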
Translation QA
After English generation, the pipeline generates Spanish and checks:
- Code fence count: EN and ES must have the same number of triple-backtick fence pairs. Mismatch = dropped code block = hard failure
- Word ratio: Spanish typically runs 10–15% longer than English. Deviation > 20% flags for review
```python
import re

def validate_translation(en_content: str, es_content: str) -> bool:
    # Hard failure: a fence-count mismatch means a code block was dropped.
    # The pattern `{3} matches one triple-backtick fence marker.
    en_fences = len(re.findall(r'`{3}', en_content))
    es_fences = len(re.findall(r'`{3}', es_content))
    if en_fences != es_fences:
        raise ValueError(f"Fence mismatch: EN={en_fences}, ES={es_fences}")

    en_words = len(en_content.split())
    es_words = len(es_content.split())
    # Soft check: warn for review, but do not fail the chapter.
    ratio = abs(en_words - es_words) / max(en_words, 1)
    if ratio > 0.20:
        print(f"Word ratio warning: {ratio:.2%} deviation")
    return True
```
The economics (stated plainly)
| Item | Value |
|---|---|
| Infrastructure cost | $20/month (Claude Code Pro only) |
| Price | $9.99+ (pay what you want) |
| Break-even | ~2 sales per month at the minimum price (before fees) |
| Time per book | 4–6 hours active |
| Marginal cost, book #10 | Same as book #1 |
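A quick check of the break-even row, using only the two numbers from the table. Platform fees are not stated in this post, so they are left out here, which is an assumption.

```python
import math

monthly_cost = 20.00  # infrastructure cost from the table
min_price = 9.99      # minimum price from the table

gross_two_sales = round(2 * min_price, 2)
sales_to_cover = math.ceil(monthly_cost / min_price)

print(gross_two_sales)  # 19.98
print(sales_to_cover)   # 3
```

Two sales at the minimum price land two cents under the monthly cost, so break-even is two sales only if at least one buyer pays above the minimum. A strict count at $9.99 is three.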
What failed and what I learned
Failure 1: Published 10 articles on one day in April. Dev.to and Google suppressed the batch. Average 11 views/article for those vs. 54 views for the ones I published individually.
Lesson: Space content by at least 24 hours. One article per day maximum.
Failure 2: Translation sometimes produced code with Spanish variable names. Added explicit instruction to the prompt: "All variable names, function names, and comments must remain in English."
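A prompt instruction can also be backed by a check. If identifiers and comments must stay in English, each code block in the Spanish chapter should be identical to its English counterpart. The sketch below is an illustration of that idea, not something the pipeline is stated to do. The helper names are hypothetical.

```python
import re

FENCE = "`" * 3  # built this way so the example contains no literal fence
BLOCK_RE = re.compile(FENCE + r"[^\n]*\n(.*?)" + FENCE, re.DOTALL)


def code_blocks(markdown: str) -> list[str]:
    """Return the body of every fenced code block, in document order."""
    return BLOCK_RE.findall(markdown)


def changed_code_blocks(en: str, es: str) -> list[int]:
    """1-based indexes of code blocks that differ between EN and ES."""
    en_blocks, es_blocks = code_blocks(en), code_blocks(es)
    if len(en_blocks) != len(es_blocks):
        raise ValueError(f"Block count differs: EN={len(en_blocks)}, ES={len(es_blocks)}")
    return [i for i, (a, b) in enumerate(zip(en_blocks, es_blocks), start=1) if a != b]


en = f"Intro\n{FENCE}python\ntotal = 1\n{FENCE}\nMore text\n"
es = f"Introducción\n{FENCE}python\ntotal = 1\n{FENCE}\nMás texto\n"
es_bad = es.replace("total", "suma")  # a translated variable name

print(changed_code_blocks(en, es))      # []
print(changed_code_blocks(en, es_bad))  # [1]
```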
Failure 3: Some generated scripts used pandas or numpy which aren't in the clean subprocess environment. Fixed by adding to the prompt: "Use only Python stdlib. No third-party imports."
Failure 4 (ongoing): 0 sales so far after 16 days. 268 Dev.to views total. The math says I need ~3,000–5,000 views before expecting consistent sales. Working on volume.
The meta-proof
The ebook that documents this pipeline was produced by this pipeline.
Every one of its 10 chapters passed both validation gates before shipping. The Spanish edition was checked by the translation QA script. The EPUB was assembled and validated by epubcheck with zero errors.
I could claim this. Or I could build a system where it's the only possible outcome. I chose the second.
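For reference, the assemble-and-check step might look roughly like this. The file names and titles are made up, and the sketch assumes `pandoc` and an `epubcheck` launcher are both on `PATH` and that `epubcheck` exits non-zero when it finds errors. Only the command construction is exercised below; `build_and_check` is defined but not called.

```python
import subprocess


def pandoc_command(chapters: list[str], output: str, title: str, lang: str) -> list[str]:
    """Build a pandoc invocation that merges chapter files into one EPUB."""
    return [
        "pandoc", *chapters,
        "-o", output,
        "--toc",
        "--metadata", f"title={title}",
        "--metadata", f"lang={lang}",
    ]


def build_and_check(chapters: list[str], output: str, title: str, lang: str) -> bool:
    """Assemble the EPUB, then treat a clean epubcheck exit as a pass."""
    subprocess.run(pandoc_command(chapters, output, title, lang), check=True)
    result = subprocess.run(["epubcheck", output], capture_output=True)
    return result.returncode == 0


# Hypothetical inputs for the two editions.
for lang, title in [("en", "Example Title"), ("es", "Título de ejemplo")]:
    cmd = pandoc_command([f"{lang}/ch01.md", f"{lang}/ch02.md"], f"book-{lang}.epub", title, lang)
    print(" ".join(cmd))
```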
Free 7-step checklist: germy5.gumroad.com/l/vlvhld — free, no email
Full pipeline (10 scripts + complete ebook): germy5.gumroad.com/l/xhxkzz — pay what you want, min $9.99, 30-day refund
Questions? What part of the architecture would you build differently? Drop a comment — genuinely curious.