DEV Community

Michael Smith

I Built a Git-Tracked Book Production Pipeline


TL;DR

I replaced a chaotic folder full of final_FINAL_v3_REAL.docx files with a largely automated, Git-tracked book production pipeline. Using plain-text source files, version control, and automated build scripts, I now produce print-ready PDFs, ePubs, and web previews from a single command. This article walks through every component, the tools involved, and how you can replicate it for your own book projects.


Why I Needed a Better Book Production System

Anyone who has written a book — or even a long-form technical document — knows the folder graveyard problem. You start with Chapter1.docx, and six months later you're staring at Chapter1_edits_FINAL_editorNotes_USE_THIS_ONE.docx and genuinely cannot remember which version your copy editor approved.

I hit this wall while producing a technical book in early 2025. I had three collaborators, two rounds of professional editing, a layout designer working in parallel, and a publisher asking for both print and digital formats. The file management alone was consuming hours every week.

The solution I landed on, a Git-tracked book production pipeline, did more than solve the version control problem. It led me to treat writing and publishing as a software engineering problem. The results were concrete: fewer errors, faster iteration cycles, and output that I can rebuild from scratch at any time.


What Is a Git-Tracked Book Production Pipeline?

At its core, a Git-tracked book production pipeline treats your manuscript the same way software developers treat source code:

  • Source content lives in plain-text files (Markdown, AsciiDoc, or LaTeX)
  • Version control (Git) tracks every change, who made it, and why
  • Build tools compile those source files into finished formats (PDF, ePub, HTML)
  • Automation handles repetitive tasks like formatting, linting, and file generation

The analogy is close: just as a developer checks in code and a CI/CD pipeline compiles and tests it, I check in manuscript changes and a build pipeline lints them and compiles a formatted book.


The Full Stack: Tools I Actually Use

Here's an honest breakdown of every tool in my pipeline, including what I'd change if I were starting over.

Source Format: Markdown + Pandoc

I write in Markdown using plain .md files. Every chapter is a separate file. Frontmatter metadata (title, author, ISBN, edition) lives in a book.yaml configuration file.

Pandoc is the backbone of the entire system. It converts Markdown to virtually any output format — PDF via LaTeX, ePub, HTML, DOCX — using a single command. Pandoc is free, open source, and frankly one of the most underrated tools in the writing world.

Honest assessment: Pandoc's default output is functional but not beautiful. You'll spend real time customizing LaTeX templates for print-quality PDFs. That investment pays off, but budget a week for it.

Version Control: Git + GitHub

GitHub hosts the repository. Every collaborator works on branches. Editorial changes come in as pull requests, which means I can review a copy editor's suggestions line by line before accepting them — exactly like code review.

The .gitignore file excludes compiled outputs (PDFs, ePubs) from the repository, keeping it lean. Only source files are tracked.

Branching strategy I use:

  • main — always production-ready, tagged at each edition release
  • drafts/chapter-XX — active writing branches
  • editorial/round-1 — editor's tracked changes as a branch
  • layout/print-v1 — layout-specific adjustments
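
As a rough sketch of the "tagged at each edition release" part, the two shell functions below tag an edition and later rebuild it from that tag. The function names, the tag name edition-1, and the ../build-… worktree path are made up for illustration; only the git and make commands themselves are standard.

```shell
#!/bin/sh
# Sketch only: tag naming and paths are hypothetical conventions.

tag_edition() {
  # Annotated tag on the current commit; records tagger, date, and message
  # usage: tag_edition edition-1 "First edition"
  git tag -a "$1" -m "$2"
}

rebuild_edition() {
  # Check the tagged snapshot out into a separate worktree (detached HEAD),
  # so the branch you are writing on stays untouched, then build the PDF there
  # usage: rebuild_edition edition-1
  git worktree add "../build-$1" "$1" && make -C "../build-$1" pdf
}
```

Building from a worktree rather than checking out the tag in place means an unfinished draft branch never has to be stashed just to regenerate an old edition.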

Build Automation: Make + Shell Scripts

A Makefile at the root of the project defines build targets:

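# Chapters are built in filename order, so use zero-padded names (01-intro.md, ...)
CHAPTERS := $(sort $(wildcard chapters/*.md))

.PHONY: all pdf epub html
all: pdf epub html

pdf:
	mkdir -p dist
	pandoc $(CHAPTERS) -o dist/book.pdf --template=templates/print.tex

epub:
	mkdir -p dist
	pandoc $(CHAPTERS) -o dist/book.epub --epub-cover-image=assets/cover.jpg

html:
	mkdir -p dist/preview
	pandoc $(CHAPTERS) -o dist/preview/index.html --template=templates/web.html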

Running make pdf from the terminal produces a fully formatted PDF in seconds. Running make all builds every output format in one invocation; add -j to let Make build the formats in parallel.

Why Make over a fancier tool? It ships with nearly every Unix-like system (Windows users can run it through WSL), it needs nothing beyond a shell, and any collaborator can understand it in ten minutes.

Continuous Integration: GitHub Actions

Every pull request and every push to main triggers a GitHub Actions workflow that:

  1. Installs Pandoc and a minimal LaTeX distribution (TeX Live)
  2. Runs a custom linting script that checks for common errors
  3. Runs the full build
  4. On main, uploads compiled artifacts (PDF, ePub) to a release draft
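
A workflow along these lines could look roughly like the sketch below. The file name, the scripts/lint.sh path, and the Debian package names are assumptions, not the actual configuration; a custom LaTeX template may need additional TeX Live packages. For simplicity this version stores the build output as a workflow artifact and leaves out the release-draft upload. It triggers on pull requests as well as pushes to main, matching the day-to-day workflow described later.

```yaml
# .github/workflows/build.yml (hypothetical file name)
name: build-book
on:
  pull_request:
  push:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install Pandoc and TeX Live
        run: |
          sudo apt-get update
          sudo apt-get install -y pandoc texlive-latex-recommended texlive-fonts-recommended
      - name: Lint
        run: ./scripts/lint.sh   # hypothetical path to the custom linting script
      - name: Build all formats
        run: make all
      - name: Keep the compiled files
        uses: actions/upload-artifact@v4
        with:
          name: book
          path: dist/
```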

That linting script is worth highlighting. It catches things like:

  • Unclosed footnotes
  • Missing image alt text
  • Chapter cross-references pointing to nonexistent anchors
  • Inconsistent character name spellings (yes, this happens)
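
To give a flavor of one such check, here is a minimal sketch of the cross-reference test in shell. It is not the actual script: it assumes headings carry explicit Pandoc-style IDs such as "## Setup {#setup}", that internal links are written as "[text](#setup)", and that grep supports -o and -h (GNU and BSD grep both do). Auto-generated heading IDs are not handled, and the function name is made up.

```shell
#!/bin/sh
# Sketch: report internal links whose target anchor is not defined in any chapter.
# usage: find_broken_refs chapters/*.md
find_broken_refs() {
  # Every anchor defined with {#id} across the given files
  defined=$(grep -ho '{#[A-Za-z0-9_:.-]*}' "$@" | sed 's/^{#//; s/}$//' | sort -u)

  # Every anchor referenced by a link of the form ](#id)
  grep -ho '](#[A-Za-z0-9_:.-]*)' "$@" | sed 's/^](#//; s/)$//' | sort -u |
  while read -r ref; do
    # Exact, whole-line match against the list of defined anchors
    printf '%s\n' "$defined" | grep -qxF -- "$ref" ||
      printf 'broken reference: #%s\n' "$ref"
  done
}
```

A CI step can fail the build whenever the function prints anything, for example by testing whether its output is non-empty.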

Typography and Layout: LaTeX Templates

For print-quality output, I maintain a custom LaTeX template based on the memoir class. This handles:

  • Chapter opening styles
  • Running headers and footers
  • Drop caps and ornamental dividers
  • Proper hyphenation and justification
  • Bleed-ready margins for print-on-demand

This is the hardest part of the pipeline to set up, but once it's done, every future book inherits it.

Spell-Check and Style Linting: Vale

Vale is a command-line prose linter that enforces style rules. I've configured it with:

  • A custom style guide (consistent terminology, preferred spellings)
  • The Microsoft Writing Style Guide rules
  • A dictionary of technical terms specific to my book's subject matter

Vale runs as part of the CI pipeline, and pull requests that introduce style violations are flagged automatically.
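
A configuration in this spirit might look like the following sketch. The style name House and the vocabulary name Book are hypothetical placeholders; Microsoft is the name of the published Vale package, which vale sync downloads into the styles folder. Where the vocabulary's accept.txt file lives depends on the Vale version, so check the documentation for the release you install.

```ini
# .vale.ini (sketch; "House" and "Book" are placeholder names)
StylesPath = styles
MinAlertLevel = suggestion

# Downloaded by `vale sync`
Packages = Microsoft

# Project-specific technical terms
Vocab = Book

[*.md]
BasedOnStyles = Vale, Microsoft, House
```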


How the Workflow Actually Runs Day-to-Day

Here's what a typical writing session looks like:

  1. Pull latest changes from main to sync with any collaborator edits
  2. Create a chapter branch (git checkout -b drafts/chapter-07), or switch to an existing one (git checkout drafts/chapter-07)
  3. Write in Markdown using Obsidian or VS Code with the Markdown preview extension
  4. Run make pdf locally to preview the formatted output
  5. Commit with a meaningful message (git commit -m "Chapter 7: Add section on distributed systems")
  6. Push and open a pull request — this triggers CI, which builds all formats and runs linters
  7. Merge when green — the compiled artifacts are automatically attached to the latest release draft

For editorial rounds, I export a clean DOCX from Pandoc, send it to the copy editor, and receive it back with tracked changes. I then convert those changes manually back to Markdown (or use a diff tool). It's the one part of the pipeline that isn't fully automated, and it's a known pain point.
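
One way to shorten the manual step is to let Pandoc convert the returned DOCX back to Markdown and then diff it against the source chapter. The sketch below is an assumption about how that could be wired up, not the workflow actually used for the book; the function name and file names are placeholders. Pandoc's Markdown output rarely matches hand-written Markdown exactly, so expect some noise in the diff.

```shell
#!/bin/sh
# Sketch: convert an edited DOCX to Markdown so it can be diffed against the source.
# PANDOC can be overridden (for example PANDOC=echo) to preview the command.
docx_to_markdown() {
  # --track-changes=accept applies the editor's insertions and deletions;
  # --wrap=none stops Pandoc from re-wrapping lines, which keeps the diff smaller
  "${PANDOC:-pandoc}" "$1" --from=docx --to=markdown \
    --track-changes=accept --wrap=none -o "$2"
}

# Usage (file names are placeholders):
#   docx_to_markdown chapter-07-edited.docx /tmp/chapter-07-edited.md
#   git diff --no-index --word-diff chapters/07.md /tmp/chapter-07-edited.md
```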


Comparison: Traditional Workflow vs. Git-Tracked Pipeline

| Feature | Traditional (Word/Google Docs) | Git-Tracked Pipeline |
|---|---|---|
| Version history | Manual file naming | Full commit history |
| Collaboration | Email attachments or shared drives | Pull requests, branching |
| Output formats | Manual export, per-format | Single command, all formats |
| Reproducibility | Low (depends on local software) | High (containerizable) |
| Linting/QA | Manual proofreading only | Automated style + structure checks |
| Learning curve | Low | Medium-High |
| Cost | Low to moderate | Low (mostly free tools) |
| Recovery from errors | Difficult | git revert or git checkout |

The Biggest Wins (And Honest Limitations)

What Worked Exceptionally Well

Reproducibility. Three months after finishing the first edition, my publisher asked for a corrected reprint with 12 specific changes. I made the edits, ran make pdf, and had a corrected print-ready file in 20 minutes. With a traditional workflow, that would have meant re-opening InDesign files, re-checking styles, and manually re-exporting.

Collaboration transparency. Every change has an author, a timestamp, and a commit message. When my technical reviewer suggested restructuring Chapter 4, I could see exactly what they changed and why — and revert it if needed.

Multi-format output. The same source files produce the print PDF, the ePub for digital retailers, and an HTML preview site. There's no manual reformatting between formats.

Honest Limitations

The learning curve is real. If you're not comfortable with the command line, Git, and at least basic LaTeX, this pipeline will feel overwhelming at first. I'd estimate 20-40 hours of setup time before it feels natural.

Rich media is harder. Complex layouts — sidebars, callout boxes, two-column layouts — require custom LaTeX or CSS work. Pandoc handles linear prose beautifully but struggles with magazine-style layouts.

Collaborator buy-in. Getting a non-technical editor or co-author to use Git is a significant ask. I've solved this partly by keeping the editorial round as a DOCX exchange, but it's a genuine friction point.


Getting Started: Your First Git-Tracked Manuscript

If you want to replicate this pipeline, here's a practical starting point:

Minimum Viable Setup (Weekend Project)

  1. Install Git and create a GitHub account (free)
  2. Install Pandoc (free, available for all platforms) and a LaTeX distribution such as TeX Live, which Pandoc uses as its default PDF engine
  3. Create a repository with a /chapters folder and a book.yaml
  4. Write one chapter in Markdown
  5. Write a 5-line Makefile that runs pandoc chapters/*.md -o book.pdf
  6. Commit and push
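
Steps 3 to 5 can be sketched as a small scaffolding function. The chapter file name and the metadata values are placeholders, and the generated Makefile is one possible minimal version rather than the one used for the book. It passes book.yaml to Pandoc with --metadata-file.

```shell
#!/bin/sh
# Sketch: create a minimal manuscript layout in the given directory.
# usage: scaffold_book my-book
scaffold_book() {
  dir=$1
  mkdir -p "$dir/chapters" || return 1

  # Book-level metadata (placeholder values)
  cat > "$dir/book.yaml" <<'EOF'
title: "Working Title"
author: "Your Name"
EOF

  # A first chapter; zero-padded names keep chapters in order
  printf '# Chapter One\n\nFirst paragraph.\n' > "$dir/chapters/01-intro.md"

  # Minimal Makefile; \t emits the literal tab Make requires before a recipe line
  printf 'book.pdf: book.yaml chapters/*.md\n\tpandoc --metadata-file=book.yaml chapters/*.md -o book.pdf\n' > "$dir/Makefile"
}
```

After running it, git init inside the new directory and make should be enough to get a first PDF, provided Pandoc and a LaTeX engine are installed.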

That's it. You now have a version-controlled manuscript that produces a PDF on demand. Everything else — CI, linting, custom templates — is an incremental improvement you can add over time.

Recommended Learning Resources

  • Pro Git Book — free, comprehensive, the definitive Git reference
  • Pandoc Documentation — dense but thorough
  • The memoir LaTeX class documentation (search CTAN) — essential for print layout

Key Takeaways

  • A Git-tracked book production pipeline treats your manuscript as source code — versioned, automated, and reproducible
  • Pandoc + Git + Make is a powerful free stack that handles most book production needs
  • GitHub Actions enables automated builds and linting on every push
  • The biggest ROI comes from reproducibility: you can rebuild any version of your book from any point in history
  • The biggest challenge is the learning curve and collaborator friction — plan for it
  • Start small: even a basic Pandoc + Git setup is dramatically better than manual file management
  • Custom LaTeX templates are the hardest part but unlock professional print-quality output

Ready to Build Your Own Pipeline?

If you're writing a book — technical or otherwise — and you're tired of file chaos and manual formatting, I'd genuinely recommend investing the time to set this up. The upfront cost is real, but every subsequent book benefits from the infrastructure you've already built.

Start this weekend: Install Pandoc and a LaTeX distribution, create a GitHub repository, and write your first chapter in Markdown. Run pandoc chapter1.md -o chapter1.pdf and see your formatted output in seconds. That small win will make the rest of the pipeline feel worth building.

Have questions about a specific part of this setup? Drop them in the comments — I read every one and respond to technical questions directly.


Frequently Asked Questions

Q: Do I need to know how to code to use this pipeline?
Not exactly, but you need to be comfortable with the command line, basic Git operations, and editing text configuration files. If you can follow a tutorial and aren't afraid of a terminal window, you can build this. Coding experience helps for customizing LaTeX templates and writing linting scripts, but it's not a hard requirement for the core workflow.

Q: Can I use AsciiDoc or LaTeX instead of Markdown as my source format?
Absolutely. AsciiDoc (processed by Asciidoctor) is arguably better suited for technical books with complex cross-references, admonitions, and code blocks. LaTeX gives you the most control but has the steepest learning curve. Markdown with Pandoc extensions is the easiest entry point, which is why I recommend it for most people starting out.

Q: How do you handle images and figures in the pipeline?
Images live in an /assets folder in the repository. I reference them in Markdown with standard syntax, and Pandoc resolves the paths during the build. For print, I keep source images at 300 DPI minimum. SVG files work well because they scale cleanly to any output size, though Pandoc needs rsvg-convert installed to embed them in a PDF. Large image files can bloat the repository, so I use Git LFS (Large File Storage) for anything over 1 MB.
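
A quick way to find candidates for LFS is to list everything in the assets folder above the size threshold. This is a sketch with a made-up function name; the threshold is taken as 1 MiB (1048576 bytes), and the follow-up commands assume the git-lfs extension is installed.

```shell
#!/bin/sh
# Sketch: list files larger than 1 MiB under a directory (default: assets).
# usage: large_assets [directory]
large_assets() {
  # -size +1048576c is POSIX find syntax for "strictly more than 1048576 bytes"
  find "${1:-assets}" -type f -size +1048576c | sort
}

# Typical follow-up for the file types that show up:
#   git lfs install
#   git lfs track "assets/*.png"
#   git add .gitattributes
```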

Q: What about working with a traditional publisher who expects DOCX or InDesign files?
This is a real tension. Pandoc can export clean DOCX files that most editors can work with. For InDesign, the workflow is more complex — some publishers accept ICML (InCopy Markup Language), which Pandoc can generate. In practice, I maintain the Git pipeline internally and export to whatever format my publisher requires at submission milestones. The source of truth is always the Git repository.

Q: Is this pipeline suitable for fiction books, or is it mainly for technical writing?
It works well for both, though the tooling emphasis differs. Fiction writers typically need less of the technical apparatus (code block formatting, cross-references, complex figures) and more attention to typography and chapter styling. The core Git + Pandoc + Make stack is format-agnostic. Several novelists I know use exactly this approach, with a simpler LaTeX template focused on clean prose presentation rather than technical layout features.
