DEV Community

German Yamil

The outline.json Format That Drives My Automated Python Ebook Pipeline

Everything in the pipeline starts with one file: outline.json.

It's the manifest. It defines what the pipeline generates, validates, translates, and publishes. Change the file, run the pipeline, get a different book.

Here's the full format, every field, and a real working example.


🎁 Free: AI Publishing Checklist - 7 steps in Python · Full pipeline: germy5.gumroad.com/l/xhxkzz (pay what you want, min $9.99)


The Schema

{
  "title": "string - full book title",
  "subtitle": "string - used in Gumroad listing and KDP metadata",
  "author": "string",
  "language_primary": "en",
  "language_secondary": "es",
  "target_word_count": 22000,
  "chapters": [
    {
      "number": 1,
      "title": "string - chapter title (used as H1)",
      "slug": "string - used for filename: chapter-01-intro",
      "word_target": 2200,
      "code_file": "string - script_01_intro.py",
      "learning_objective": "string - what the reader can do after this chapter",
      "prerequisites": ["chapter-slug-1"],
      "tags": ["python", "automation"],
      "notes": "string - optional hints for the generation prompt"
    }
  ],
  "style_guide": {
    "voice": "string",
    "audience": "string",
    "code_conventions": ["Use only Python stdlib", "All functions must have docstrings"],
    "avoid": ["passive voice", "marketing language"]
  },
  "gumroad": {
    "price_cents": 999,
    "customizable_price": true,
    "tags": ["python", "ebook", "automation"]
  }
}

Field Reference

Top-level fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| title | string | ✅ | Full book title. Used in EPUB metadata and the Gumroad listing |
| subtitle | string | ✅ | Subtitle for KDP and Gumroad. Aim for keyword richness |
| author | string | ✅ | Author name as it appears on the cover and in EPUB metadata |
| language_primary | string | ✅ | Source language code (en) |
| language_secondary | string | ❌ | Target language for translation (es). Omit to skip translation |
| target_word_count | integer | ❌ | Total book target. Used for validation: the sum of chapter word_target values must be within 15% of this value |
| chapters | array | ✅ | Array of chapter objects (see below) |
| style_guide | object | ❌ | Injected into every chapter prompt to enforce consistent voice |
| gumroad | object | ❌ | Used by gumroad_create.py to build the listing automatically |
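
The target_word_count rule can be sketched in a few lines. This is a minimal illustration, not the pipeline's own code: check_word_budget is a hypothetical name, and it assumes "within 15%" is measured relative to target_word_count.

```python
def check_word_budget(outline: dict, tolerance: float = 0.15) -> tuple[bool, int]:
    """Return (ok, total), where total is the sum of chapter word_target values.

    ok is True when target_word_count is absent (the field is optional) or
    when total lies within +/- tolerance of it. Measuring the tolerance
    against target_word_count is an assumption.
    """
    total = sum(ch["word_target"] for ch in outline["chapters"])
    target = outline.get("target_word_count")
    if target is None:
        return True, total
    return abs(total - target) <= tolerance * target, total


# Ten chapters of 2,200 words each add up to a 22,000-word target exactly.
sample = {
    "target_word_count": 22000,
    "chapters": [{"word_target": 2200} for _ in range(10)],
}
print(check_word_budget(sample))  # (True, 22000)
```

Returning the total alongside the flag makes it easy to report how far off the outline is when the check fails.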

Chapter fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| number | integer | ✅ | Chapter number. Determines processing order and filename prefix |
| title | string | ✅ | Chapter title. Used as H1 in the generated markdown |
| slug | string | ✅ | Filename-safe identifier. Output files: {slug}-en.md, {slug}-es.md, {slug}.py |
| word_target | integer | ✅ | Target word count for this chapter. Enforced: ±15% tolerance |
| code_file | string | ✅ | Output Python script name. This file goes through both validation gates |
| learning_objective | string | ✅ | Injected into prompt: "After this chapter, the reader will be able to..." |
| prerequisites | array | ❌ | List of chapter slugs that must be in DONE state before this chapter can start |
| tags | array | ❌ | Topic tags for this chapter. Used to tune prompt focus |
| notes | string | ❌ | Free-form hints injected into the generation prompt for this chapter only |
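
Two of these fields drive checks rather than prompt text: word_target and prerequisites. The sketch below shows one way both checks could look. The function names and the states mapping (slug to state name, stored as plain strings) are assumptions for illustration, not the pipeline's actual API.

```python
def word_count_ok(actual: int, word_target: int, tolerance: float = 0.15) -> bool:
    """True when actual is within +/- tolerance of the chapter's word_target."""
    return abs(actual - word_target) <= tolerance * word_target


def can_start(chapter: dict, states: dict) -> bool:
    """True when every prerequisite slug is recorded as DONE in states.

    states is a hypothetical {slug: state name} mapping. A chapter with no
    prerequisites key can always start.
    """
    return all(states.get(slug) == "DONE" for slug in chapter.get("prerequisites", []))


chapter = {"slug": "chapter-02-validation", "prerequisites": ["chapter-01-architecture"]}
print(can_start(chapter, {"chapter-01-architecture": "RUNNING"}))  # False
print(can_start(chapter, {"chapter-01-architecture": "DONE"}))     # True
print(word_count_ok(2000, 2200))                                    # True
```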

style_guide fields

| Field | Type | Description |
| --- | --- | --- |
| voice | string | Tone descriptor injected into every prompt: "direct, technical, no marketing language" |
| audience | string | Audience definition: "Python developers with 2+ years experience" |
| code_conventions | array | Rules applied to every code block: ["Use only Python stdlib", "All variable names in English"] |
| avoid | array | Patterns to suppress: ["passive voice", "hedging language", "numbered lists for 2-item sets"] |

Real Working Example

This is the actual outline.json used to produce The AI Publishing Pipeline:

{
  "title": "The AI Publishing Pipeline",
  "subtitle": "Automated Ebook System for Python Developers",
  "author": "German Yamil",
  "language_primary": "en",
  "language_secondary": "es",
  "target_word_count": 22000,
  "chapters": [
    {
      "number": 1,
      "title": "Architecture Overview: The Four-State Pipeline",
      "slug": "chapter-01-architecture",
      "word_target": 2200,
      "code_file": "script_01_state_machine.py",
      "learning_objective": "set up the chapter state machine and understand PENDING, RUNNING, DONE, NEEDS_REVIEW transitions",
      "tags": ["python", "architecture", "state-machine"]
    },
    {
      "number": 2,
      "title": "Code Validation: AST Parsing and Subprocess Isolation",
      "slug": "chapter-02-validation",
      "word_target": 2200,
      "code_file": "script_02_validation.py",
      "learning_objective": "implement two-gate code validation that prevents broken scripts from shipping",
      "prerequisites": ["chapter-01-architecture"],
      "notes": "Show both gates as composable functions. Include a deliberate failure example."
    },
    {
      "number": 3,
      "title": "Crash Recovery: Making Long Runs Resumable",
      "slug": "chapter-03-crash-recovery",
      "word_target": 2200,
      "code_file": "script_03_recovery.py",
      "learning_objective": "implement startup state normalization so any crash is safely recoverable"
    },
    {
      "number": 4,
      "title": "Translation QA: Bilingual Output with Semantic Validation",
      "slug": "chapter-04-translation",
      "word_target": 2200,
      "code_file": "script_04_translation_qa.py",
      "learning_objective": "generate Spanish translations and validate them with code fence diffing and word ratio checks"
    },
    {
      "number": 5,
      "title": "EPUB Assembly: Pandoc, Metadata, and epubcheck",
      "slug": "chapter-05-epub",
      "word_target": 2200,
      "code_file": "script_05_epub_assembly.py",
      "learning_objective": "assemble chapters into a validated EPUB3 file using Pandoc with proper metadata"
    }
  ],
  "style_guide": {
    "voice": "direct, technical, first-person singular, no marketing language",
    "audience": "Python developers with 2+ years experience who want to automate content production",
    "code_conventions": [
      "Use only Python stdlib unless the chapter is specifically about a third-party library",
      "All variable names, function names, and comments must be in English even in translated chapters",
      "Every function must have a docstring",
      "Include inline comments for non-obvious logic"
    ],
    "avoid": [
      "passive voice",
      "phrases like 'it is important to note'",
      "numbered lists for sets of 2 items (use prose instead)",
      "ending sections with 'In summary,...'"
    ]
  },
  "gumroad": {
    "price_cents": 999,
    "customizable_price": true,
    "tags": ["python", "ebook", "automation", "publishing"]
  }
}

How the Pipeline Uses outline.json

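import json

# Chapter, ChapterState, and process_chapter are defined elsewhere in the pipeline.

def load_outline(path: str) -> dict:
    """Load outline.json and check the fields the pipeline cannot run without."""
    with open(path, encoding="utf-8") as f:
        outline = json.load(f)
    # Raise instead of assert: assert statements are stripped under `python -O`.
    if "title" not in outline:
        raise ValueError("outline is missing 'title'")
    if not outline.get("chapters"):
        raise ValueError("outline needs at least one chapter")
    for ch in outline["chapters"]:
        missing = [key for key in ("slug", "word_target", "code_file") if key not in ch]
        if missing:
            raise ValueError(f"chapter is missing required fields: {missing}")
    return outline

outline = load_outline("outline.json")
# style_guide is optional in the schema, so fall back to an empty dict.
style_guide = outline.get("style_guide", {})

for chapter_def in outline["chapters"]:
    chapter = Chapter.from_dict(chapter_def)
    if chapter.state == ChapterState.DONE:
        continue  # skip already-done chapters
    process_chapter(chapter, style_guide)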
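
The loop above relies on Chapter and ChapterState, which are not shown in this post. The following is a rough sketch of what they might look like, so the loop has something concrete to run against. The four state names come from the chapter 1 learning objective in the example outline; everything else, including the optional saved_states argument, is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ChapterState(Enum):
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    DONE = "DONE"
    NEEDS_REVIEW = "NEEDS_REVIEW"


@dataclass
class Chapter:
    number: int
    title: str
    slug: str
    word_target: int
    code_file: str
    learning_objective: str
    state: ChapterState = ChapterState.PENDING

    @classmethod
    def from_dict(cls, data: dict, saved_states: Optional[dict] = None) -> "Chapter":
        """Build a Chapter from one entry of outline["chapters"].

        saved_states is a hypothetical {slug: state name} mapping, for
        example loaded from a progress file. Chapters without an entry
        start as PENDING.
        """
        state_name = (saved_states or {}).get(data["slug"], "PENDING")
        return cls(
            number=data["number"],
            title=data["title"],
            slug=data["slug"],
            word_target=data["word_target"],
            code_file=data["code_file"],
            learning_objective=data["learning_objective"],
            state=ChapterState[state_name],
        )


entry = {
    "number": 3,
    "title": "Crash Recovery: Making Long Runs Resumable",
    "slug": "chapter-03-crash-recovery",
    "word_target": 2200,
    "code_file": "script_03_recovery.py",
    "learning_objective": "implement startup state normalization",
}
print(Chapter.from_dict(entry).state)  # ChapterState.PENDING
```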

The style guide is injected into the generation prompt for every chapter. This is what makes the voice consistent across 10 chapters even though each is generated independently.
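
To make the injection concrete, here is a minimal sketch of turning style_guide and a chapter's learning_objective into prompt text. The function name and the layout of the rendered text are assumptions. Only the "After this chapter, the reader will be able to..." wording comes from the field reference.

```python
def build_prompt_preamble(style_guide: dict, learning_objective: str) -> str:
    """Render style_guide plus a chapter's learning objective as prompt text."""
    lines = []
    if style_guide.get("voice"):
        lines.append(f"Voice: {style_guide['voice']}")
    if style_guide.get("audience"):
        lines.append(f"Audience: {style_guide['audience']}")
    for label, key in (("Code conventions", "code_conventions"), ("Avoid", "avoid")):
        items = style_guide.get(key, [])
        if items:
            lines.append(f"{label}:")
            lines.extend(f"- {item}" for item in items)
    lines.append(f"After this chapter, the reader will be able to {learning_objective}.")
    return "\n".join(lines)


guide = {"voice": "direct, technical", "avoid": ["passive voice"]}
print(build_prompt_preamble(guide, "implement two-gate code validation"))
```

Because the same preamble is prepended to every chapter prompt, chapters generated in separate runs still receive identical voice and convention rules.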

Forking for a New Book

To produce a new book:

  1. Copy the schema above
  2. Change title, subtitle, author
  3. Write your 10 chapters: a title, slug, and learning_objective for each, plus the other required chapter fields
  4. Update style_guide.audience and code_conventions for your domain
  5. Run python3 generate_chapters.py --outline outline.json

The pipeline handles everything else.
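
The forking steps can also be scripted. The sketch below is one hypothetical way to do it: fork_outline and slugify are illustrative names, and the chapter-NN-name and script_NN_name.py patterns are inferred from the example outline rather than required by the pipeline.

```python
import copy
import re


def slugify(text: str) -> str:
    """Lowercase text and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", text.lower()))


def fork_outline(base: dict, title: str, subtitle: str, author: str,
                 chapters: list, word_target: int = 2200) -> dict:
    """Return a new outline that keeps base's settings but swaps book and chapters.

    chapters is a list of (chapter title, learning objective) pairs.
    The naming patterns below are assumptions modeled on the example outline.
    """
    outline = copy.deepcopy(base)  # leave the base outline untouched
    outline.update(title=title, subtitle=subtitle, author=author)
    outline["chapters"] = []
    for number, (ch_title, objective) in enumerate(chapters, start=1):
        name = slugify(ch_title)
        outline["chapters"].append({
            "number": number,
            "title": ch_title,
            "slug": f"chapter-{number:02d}-{name}",
            "word_target": word_target,
            "code_file": f"script_{number:02d}_{name.replace('-', '_')}.py",
            "learning_objective": objective,
        })
    # Keep the book-level target consistent with the chapter targets.
    outline["target_word_count"] = word_target * len(chapters)
    return outline


base = {"title": "Old Book", "style_guide": {"voice": "direct"}}
forked = fork_outline(base, "New Book", "A Subtitle", "Jane Doe",
                      [("Parsing CSV Files", "parse CSV files with the csv module")])
print(forked["chapters"][0]["slug"])  # chapter-01-parsing-csv-files
```

Write the result with json.dump, then adjust style_guide.audience and code_conventions by hand before running the pipeline.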

Full pipeline code: germy5.gumroad.com/l/xhxkzz (pay what you want, min $9.99).

