The outline.json Format That Drives My Automated Python Ebook Pipeline
Everything in the pipeline starts with one file: outline.json.
It's the manifest. It defines what the pipeline generates, validates, translates, and publishes. Change the file, run the pipeline, get a different book.
Here's the full format, every field, and a real working example.
Free: AI Publishing Checklist (7 steps in Python) · Full pipeline: germy5.gumroad.com/l/xhxkzz (pay what you want, min $9.99)
The Schema
{
  "title": "string - full book title",
  "subtitle": "string - used in Gumroad listing and KDP metadata",
  "author": "string",
  "language_primary": "en",
  "language_secondary": "es",
  "target_word_count": 22000,
  "chapters": [
    {
      "number": 1,
      "title": "string - chapter title (used as H1)",
      "slug": "string - used for filename: chapter-01-intro",
      "word_target": 2200,
      "code_file": "string - script_01_intro.py",
      "learning_objective": "string - what the reader can do after this chapter",
      "prerequisites": ["chapter-slug-1"],
      "tags": ["python", "automation"],
      "notes": "string - optional hints for the generation prompt"
    }
  ],
  "style_guide": {
    "voice": "string",
    "audience": "string",
    "code_conventions": ["Use only Python stdlib", "All functions must have docstrings"],
    "avoid": ["passive voice", "marketing language"]
  },
  "gumroad": {
    "price_cents": 999,
    "customizable_price": true,
    "tags": ["python", "ebook", "automation"]
  }
}
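To make the chapter entry concrete in code, one option is a small dataclass. This is only a sketch: `ChapterSpec` and `parse_chapter` are hypothetical names rather than the pipeline's own `Chapter` class, and treating `prerequisites`, `tags`, and `notes` as optional is an assumption based on the working example, where some chapters omit them.

```python
from dataclasses import dataclass, field


@dataclass
class ChapterSpec:
    """Hypothetical typed view of one entry in the "chapters" array."""

    number: int
    title: str
    slug: str
    word_target: int
    code_file: str
    learning_objective: str
    # Assumption: these three may be omitted, as in the working example.
    prerequisites: list[str] = field(default_factory=list)
    tags: list[str] = field(default_factory=list)
    notes: str = ""


def parse_chapter(raw: dict) -> ChapterSpec:
    """Build a ChapterSpec; a missing or unknown key raises TypeError."""
    return ChapterSpec(**raw)
```

Passing the raw dictionary straight into the constructor means a misspelled key fails loudly at load time instead of being silently ignored.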
Field Reference
Top-level fields
| Field | Type | Required | Description |
|---|---|---|---|
| title | string | Yes | Full book title. Used in EPUB metadata and Gumroad listing |
| subtitle | string |  | Subtitle for KDP and Gumroad. Aim for keyword richness |
| author | string |  | Author name as it appears on the cover and EPUB metadata |
| language_primary | string |  | Source language code (en) |
| language_secondary | string | No | Target language for translation (es). Omit to skip translation |
| target_word_count | integer |  | Total book target. Used for validation: the sum of chapter word_target values must be within 15% of it |
| chapters | array | Yes | Array of chapter objects (see below) |
| style_guide | object | Yes | Injected into every chapter prompt to enforce consistent voice |
| gumroad | object |  | Used by gumroad_create.py to build the listing automatically |
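The 15% rule on target_word_count can be sketched as a single function. This is a minimal illustration under assumptions, not the pipeline's actual validator: `check_word_budget` is a hypothetical name, and raising `ValueError` is one possible way to report a failure.

```python
def check_word_budget(outline: dict, tolerance: float = 0.15) -> float:
    """Return the relative gap between summed chapter targets and the book target.

    Raises ValueError when the gap is larger than `tolerance`.
    """
    total = sum(ch["word_target"] for ch in outline["chapters"])
    target = outline["target_word_count"]
    deviation = abs(total - target) / target
    if deviation > tolerance:
        raise ValueError(
            f"chapter targets sum to {total}, which is {deviation:.0%} "
            f"away from target_word_count={target}"
        )
    return deviation
```

Under this check, ten chapters of 2,200 words pass against a 22,000-word target, while five such chapters would be rejected because they cover only half of it.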
Chapter fields
| Field | Type | Required | Description |
|---|---|---|---|
| number | integer |  | Chapter number. Determines processing order and filename prefix |
| title | string |  | Chapter title. Used as H1 in the generated markdown |
| slug | string | Yes | Filename-safe identifier. Output files: {slug}-en.md, {slug}-es.md, {slug}.py |
| word_target | integer | Yes | Target word count for this chapter. Enforced: ±15% tolerance |
| code_file | string | Yes | Output Python script name. This file goes through both validation gates |
| learning_objective | string |  | Injected into prompt: "After this chapter, the reader will be able to..." |
| prerequisites | array | No | List of chapter slugs that must be in DONE state before this chapter can start |
| tags | array | No | Topic tags for this chapter. Used to tune prompt focus |
| notes | string | No | Free-form hints injected into the generation prompt for this chapter only |
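Two of the rules in this table, the prerequisite gate and the ±15% word tolerance, are easy to express as small predicates. The sketch below rests on assumptions: `ready_to_start` and `within_word_target` are hypothetical names, and chapter states are modeled as plain strings keyed by slug, whereas the pipeline itself uses a `ChapterState` type.

```python
def ready_to_start(chapter: dict, states: dict[str, str]) -> bool:
    """True when every prerequisite slug is recorded as DONE.

    `states` maps chapter slug -> state name (assumed to be plain strings here).
    """
    return all(
        states.get(slug) == "DONE"
        for slug in chapter.get("prerequisites", [])
    )


def within_word_target(actual_words: int, word_target: int,
                       tolerance: float = 0.15) -> bool:
    """True when the generated length is within the tolerance of the target."""
    return abs(actual_words - word_target) <= tolerance * word_target
```

A chapter without a prerequisites key is always ready, because `all()` over an empty list is true.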
style_guide fields
| Field | Type | Description |
|---|---|---|
| voice | string | Tone descriptor injected into every prompt: "direct, technical, no marketing language" |
| audience | string | Audience definition: "Python developers with 2+ years experience" |
| code_conventions | array | Rules applied to every code block: ["Use only Python stdlib", "All variable names in English"] |
| avoid | array | Patterns to suppress: ["passive voice", "hedging language", "numbered lists for 2-item sets"] |
Real Working Example
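This is the outline.json used to produce The AI Publishing Pipeline, shown here with its first five chapter entries. The full book has 10 chapters, which is how 2,200-word chapters add up to the 22,000-word target: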
{
  "title": "The AI Publishing Pipeline",
  "subtitle": "Automated Ebook System for Python Developers",
  "author": "German Yamil",
  "language_primary": "en",
  "language_secondary": "es",
  "target_word_count": 22000,
  "chapters": [
    {
      "number": 1,
      "title": "Architecture Overview: The Four-State Pipeline",
      "slug": "chapter-01-architecture",
      "word_target": 2200,
      "code_file": "script_01_state_machine.py",
      "learning_objective": "set up the chapter state machine and understand PENDING, RUNNING, DONE, NEEDS_REVIEW transitions",
      "tags": ["python", "architecture", "state-machine"]
    },
    {
      "number": 2,
      "title": "Code Validation: AST Parsing and Subprocess Isolation",
      "slug": "chapter-02-validation",
      "word_target": 2200,
      "code_file": "script_02_validation.py",
      "learning_objective": "implement two-gate code validation that prevents broken scripts from shipping",
      "prerequisites": ["chapter-01-architecture"],
      "notes": "Show both gates as composable functions. Include a deliberate failure example."
    },
    {
      "number": 3,
      "title": "Crash Recovery: Making Long Runs Resumable",
      "slug": "chapter-03-crash-recovery",
      "word_target": 2200,
      "code_file": "script_03_recovery.py",
      "learning_objective": "implement startup state normalization so any crash is safely recoverable"
    },
    {
      "number": 4,
      "title": "Translation QA: Bilingual Output with Semantic Validation",
      "slug": "chapter-04-translation",
      "word_target": 2200,
      "code_file": "script_04_translation_qa.py",
      "learning_objective": "generate Spanish translations and validate them with code fence diffing and word ratio checks"
    },
    {
      "number": 5,
      "title": "EPUB Assembly: Pandoc, Metadata, and epubcheck",
      "slug": "chapter-05-epub",
      "word_target": 2200,
      "code_file": "script_05_epub_assembly.py",
      "learning_objective": "assemble chapters into a validated EPUB3 file using Pandoc with proper metadata"
    }
  ],
  "style_guide": {
    "voice": "direct, technical, first-person singular, no marketing language",
    "audience": "Python developers with 2+ years experience who want to automate content production",
    "code_conventions": [
      "Use only Python stdlib unless the chapter is specifically about a third-party library",
      "All variable names, function names, and comments must be in English even in translated chapters",
      "Every function must have a docstring",
      "Include inline comments for non-obvious logic"
    ],
    "avoid": [
      "passive voice",
      "phrases like 'it is important to note'",
      "numbered lists for sets of 2 items (use prose instead)",
      "ending sections with 'In summary,...'"
    ]
  },
  "gumroad": {
    "price_cents": 999,
    "customizable_price": true,
    "tags": ["python", "ebook", "automation", "publishing"]
  }
}
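Because prerequisites point at other chapters by slug, a typo in one of them could leave a chapter that never becomes eligible to run. A consistency check along the following lines would catch that before any generation starts. It is a sketch under assumptions: `find_outline_problems` is a hypothetical helper, and requiring each prerequisite to be an earlier chapter is my reading of "number determines processing order", not a rule stated by the pipeline.

```python
def find_outline_problems(chapters: list[dict]) -> list[str]:
    """Return human-readable problems with slugs and prerequisites."""
    problems = []
    seen = set()  # slugs of chapters that come earlier in processing order
    for ch in sorted(chapters, key=lambda c: c["number"]):
        slug = ch["slug"]
        if slug in seen:
            problems.append(f"duplicate slug: {slug}")
        for dep in ch.get("prerequisites", []):
            # Assumption: a prerequisite must be an earlier chapter.
            if dep not in seen:
                problems.append(
                    f"{slug}: prerequisite {dep!r} is not an earlier chapter"
                )
        seen.add(slug)
    return problems
```

An empty list means every slug is unique and every prerequisite resolves to a chapter with a lower number.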
How the Pipeline Uses outline.json
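import json


def load_outline(path: str) -> dict:
    """Load outline.json and check the fields the pipeline cannot run without."""
    with open(path, encoding="utf-8") as f:
        outline = json.load(f)
    # Validate required fields
    assert "title" in outline, "outline is missing 'title'"
    assert "chapters" in outline and len(outline["chapters"]) > 0, "outline has no chapters"
    for ch in outline["chapters"]:
        assert "slug" in ch and "word_target" in ch and "code_file" in ch, (
            "every chapter needs slug, word_target, and code_file"
        )
    return outline


# Chapter, ChapterState, and process_chapter are defined elsewhere in the pipeline.
outline = load_outline("outline.json")
for chapter_def in outline["chapters"]:
    chapter = Chapter.from_dict(chapter_def)
    if chapter.state == ChapterState.DONE:
        continue  # skip already-done chapters
    process_chapter(chapter, outline["style_guide"])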
The style guide is injected into the generation prompt for every chapter. This is what makes the voice consistent across 10 chapters even though each is generated independently.
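To illustrate what that injection could look like, here is a minimal sketch of a prompt builder. `build_chapter_prompt` and the exact prompt wording are assumptions for illustration; only the "After this chapter, the reader will be able to..." phrasing comes from the field reference above.

```python
def build_chapter_prompt(chapter: dict, style_guide: dict) -> str:
    """Combine per-chapter fields with the shared style guide into one prompt."""
    lines = [
        f"Write the chapter titled: {chapter['title']}",
        f"Target length: about {chapter['word_target']} words.",
        "After this chapter, the reader will be able to "
        f"{chapter['learning_objective']}.",
        f"Voice: {style_guide['voice']}",
        f"Audience: {style_guide['audience']}",
        "Code conventions:",
        *[f"- {rule}" for rule in style_guide.get("code_conventions", [])],
        "Avoid:",
        *[f"- {item}" for item in style_guide.get("avoid", [])],
    ]
    # Optional per-chapter fields only appear when the outline provides them.
    if chapter.get("tags"):
        lines.append("Focus topics: " + ", ".join(chapter["tags"]))
    if chapter.get("notes"):
        lines.append(f"Notes: {chapter['notes']}")
    return "\n".join(lines)
```

Since the style guide block is rebuilt from the same dictionary for every chapter, each independent generation call sees identical voice and convention rules.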
Forking for a New Book
To produce a new book:
- Copy the schema above
- Change title, subtitle, author
- Write your 10 chapters: titles, slugs, learning_objective for each
- Update style_guide.audience and code_conventions for your domain
- Run python3 generate_chapters.py --outline outline.json
The pipeline handles everything else.
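If you fork often, the copy-and-edit steps can themselves be scripted. The following is a sketch under assumptions, not part of the pipeline: `fork_outline` is a hypothetical helper, and resetting target_word_count to the sum of the new chapter targets is my own choice to keep the 15% validation satisfied.

```python
import copy


def fork_outline(base: dict, *, title: str, subtitle: str, author: str,
                 chapters: list[dict], audience: str) -> dict:
    """Return a new outline that reuses the base style guide and listing settings."""
    new_outline = copy.deepcopy(base)  # never mutate the template
    new_outline.update(title=title, subtitle=subtitle, author=author)
    new_outline["chapters"] = copy.deepcopy(chapters)
    new_outline["style_guide"]["audience"] = audience
    # Assumption: keep the book target equal to the sum of chapter targets.
    new_outline["target_word_count"] = sum(ch["word_target"] for ch in chapters)
    return new_outline
```

The result can be written out with `json.dump(new_outline, f, indent=2)` and passed to the generate_chapters.py command shown above.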
Full pipeline code: germy5.gumroad.com/l/xhxkzz (pay what you want, min $9.99).