Continuous Book Publishing with Markdown, Pandoc, and AI-Readable Editions
Writing a book gets harder as the book grows, and not only because there is more to write.
The write–read cycle grows too: every change means rereading more of what already exists.
Every new chapter increases:
- context size
- consistency management
- restructuring costs
- update overhead
At some point, large books start behaving similarly to large software systems.
Software engineering worked out answers to many of these problems years ago:
- modularity
- separation of concerns
- automation
- build pipelines
- continuous delivery
So while writing a playbook on business automation, I started applying the same principles to book publishing itself.
The Idea
Instead of maintaining one giant document, I split the book into modular Markdown files:
001-introduction.md
002-automation.md
003-ai-systems.md
Each chapter becomes an independent unit.
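The zero-padded numeric prefixes do real work here: a build script that sorts files by name will only get chapters in the right order if text order matches chapter order. A small illustration, using made-up file names (the `10-appendix` entries are hypothetical and not part of the book):

```powershell
# Hypothetical file names, used only to illustrate sort order.
$unpadded = '1-introduction.md', '2-automation.md', '10-appendix.md'
$padded   = '001-introduction.md', '002-automation.md', '010-appendix.md'

# Names are compared as text, so '10-...' lands ahead of '2-...'.
$unpaddedSorted = $unpadded | Sort-Object

# With zero padding, text order and chapter order agree.
$paddedSorted = $padded | Sort-Object

"Unpadded: $($unpaddedSorted -join ', ')"
"Padded:   $($paddedSorted -join ', ')"
```

Three digits leave room for up to 999 chapters before the scheme needs widening.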
Then I built a small publishing pipeline around it.
Now the workflow looks like this:
edit chapters
→ press build
→ automatically generate:
- styled PDF edition
- AI-readable markdown edition
Very similar to software deployment pipelines.
Why AI-Readable Books Matter
Books increasingly have two audiences:
humans
+
machines
A growing amount of reading will happen through:
- ChatGPT
- Claude
- Gemini
- AI agents
- RAG systems
- internal knowledge systems
Instead of manually reading hundreds of pages, people will increasingly upload books into AI systems and query them conversationally.
So books need to become:
- machine-readable
- searchable
- modular
- AI-compatible
That is why the pipeline generates both:
- a styled PDF for humans
- a combined Markdown edition for LLMs
from the same source files.
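Producing the combined Markdown edition can be as simple as concatenating the chapters in name order. The following is a minimal sketch under that assumption, not the repository's actual script; the function name `Join-Chapters` and the output path in the usage note are hypothetical.

```powershell
function Join-Chapters {
    param(
        [Parameter(Mandatory)] [string] $ChapterDir
    )
    # Only *.md files, sorted by name so the numeric prefixes define chapter order.
    $files = Get-ChildItem -Path $ChapterDir -Filter '*.md' -File | Sort-Object Name

    # Read each file whole; the [string] cast turns an empty file into ''.
    $parts = foreach ($file in $files) {
        ([string](Get-Content -LiteralPath $file.FullName -Raw -Encoding UTF8)).TrimEnd()
    }

    # A blank line between chapters keeps a heading from fusing with the previous paragraph.
    return ($parts -join "`n`n") + "`n"
}
```

A call such as `Join-Chapters -ChapterDir 'book/chapters' | Set-Content -Path 'book/dist/book.md' -Encoding UTF8 -NoNewline` would then write the single file that can be uploaded to an LLM or indexed by a RAG system.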
Project Structure
book/
chapters/
dist/
scripts/
styles/
Example chapters:
book/chapters/001-introduction.md
book/chapters/002-automation.md
Tooling
The stack is intentionally simple:
- VS Code
- Markdown
- Pandoc
- CSS
- PowerShell
- wkhtmltopdf
No CMS.
No publishing platform.
No vendor lock-in.
Build Script
The build script:
- combines chapters
- converts Markdown → HTML
- injects styles
- generates PDF
- exports combined Markdown edition
Example:
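# Collect only the Markdown chapters; the numeric prefixes make name order equal chapter order
$chapterFiles = Get-ChildItem -Path "book/chapters" -Filter "*.md" -File |
    Sort-Object Name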
Then:
pandoc `
combined.html `
--pdf-engine=wkhtmltopdf `
-o book.pdf
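The two snippets above skip the middle step, turning the sorted chapters into the styled `combined.html`. One way to sketch it is a helper that assembles the Pandoc arguments. The helper name `Get-HtmlBuildArgs`, the stylesheet path, and the default title are assumptions for illustration; the Pandoc options themselves (`--from`, `--to`, `--standalone`, `--metadata`, `--css`, `--output`) are standard.

```powershell
function Get-HtmlBuildArgs {
    param(
        [Parameter(Mandatory)] [string[]] $ChapterFiles,
        [Parameter(Mandatory)] [string]   $Stylesheet,
        [Parameter(Mandatory)] [string]   $OutFile,
        [string] $Title = 'Untitled'
    )
    # Pandoc concatenates multiple input files in the order they are given.
    $pandocArgs = @($ChapterFiles)
    # --standalone emits a full HTML document instead of a fragment.
    $pandocArgs += '--from', 'markdown', '--to', 'html5', '--standalone'
    # A title avoids Pandoc's warning about untitled standalone HTML.
    $pandocArgs += '--metadata', "title=$Title"
    # --css links the stylesheet; it does not inline it.
    $pandocArgs += '--css', $Stylesheet
    $pandocArgs += '--output', $OutFile
    return $pandocArgs
}

# Usage (requires pandoc on PATH; paths are assumed, not taken from the repository):
# $chapters = Get-ChildItem 'book/chapters' -Filter '*.md' -File |
#     Sort-Object Name | ForEach-Object { $_.FullName }
# $pandocArgs = Get-HtmlBuildArgs -ChapterFiles $chapters `
#     -Stylesheet 'book/styles/book.css' -OutFile 'combined.html'
# & pandoc @pandocArgs
```

Because `--css` only writes a link into the HTML, the stylesheet path has to resolve from wherever `combined.html` is later rendered.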
Core Principle
The entire workflow is built around:
content
≠
presentation
Meaning:
- Markdown = ideas
- CSS = styling
- scripts = automation
This keeps the system scalable: a styling change never touches the chapters, and a new chapter never touches the build script.
Why I Think This Matters
I think books will increasingly behave like software systems:
- continuously updated
- modular
- machine-readable
- automatically published
Especially:
- technical books
- playbooks
- operational documentation
- AI-native knowledge systems
Repository
The full automation setup and playbook are available here:
https://github.com/MichaelZelensky/automated-book-publish/
If you are writing books, large documentation systems, or AI-compatible knowledge bases, this approach may help.