The problem
Every time I needed to put together a report or proposal, I went through the
same manual process: export this PDF, convert that Word doc, grab slides 3-5 from a PowerPoint, then merge everything in some online tool that wanted me to upload confidential documents to its servers.
It's a solved problem that somehow still takes 15 minutes every time.
So I built PageFuse.
What it does
PageFuse is a CLI tool that assembles pages from multiple document formats
into a single output file.
pagefuse assemble board_pack.pdf cover.pdf:1 financials.docx:all slides.pptx:2-5
That pulls page 1 from a PDF, all pages from a Word doc, and slides 2-5
from a PowerPoint into one PDF. Done in seconds.
---
Page specs
pagefuse assemble out.pdf file.pdf:1 # single page
pagefuse assemble out.pdf file.pdf:1-3 # range
pagefuse assemble out.pdf file.pdf:1,3,5-8 # mixed
pagefuse assemble out.pdf file.pdf:all # all pages
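To make the spec grammar concrete, here is a minimal Python sketch of how a spec like `1,3,5-8` or `all` could be expanded into page numbers. This is not PageFuse's actual parser; the function names `parse_page_spec` and `split_source_arg` are hypothetical, and treating an argument with no `:spec` suffix as `all` is an assumption on my part.

```python
def parse_page_spec(spec: str, page_count: int) -> list[int]:
    """Expand "1", "1-3", "1,3,5-8" or "all" into 1-based page numbers."""
    spec = spec.strip().lower()
    if spec == "all":
        return list(range(1, page_count + 1))

    pages: list[int] = []
    for part in spec.split(","):
        part = part.strip()
        # "5-8" is a range; "3" is a single page (a range of length one).
        first, dash, last = part.partition("-")
        start = int(first)
        end = int(last) if dash else start
        if not 1 <= start <= end <= page_count:
            raise ValueError(f"page range {part!r} is outside 1-{page_count}")
        pages.extend(range(start, end + 1))
    return pages


def split_source_arg(arg: str) -> tuple[str, str]:
    """Split "file.pdf:1-3" into ("file.pdf", "1-3").

    Assumption: an argument without a ":spec" suffix means "all".
    Splitting on the last colon is naive and would misread Windows
    drive letters such as "C:\\docs\\file.pdf".
    """
    path, colon, spec = arg.rpartition(":")
    if not colon:
        return arg, "all"
    return path, spec


if __name__ == "__main__":
    path, spec = split_source_arg("file.pdf:1,3,5-8")
    print(path, parse_page_spec(spec, page_count=10))
```

Keeping the page order exactly as written in the spec (rather than sorting or de-duplicating) means a spec like `3,1` could reorder pages, which seems like the least surprising behavior for an assembly tool.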
---
Config files for repeatable builds
For documents you rebuild regularly — weekly reports, monthly packs,
proposals — save a .fuse config:
output: board_pack.pdf
output: board_pack.docx
from: cover.pdf 1
from: financials.docx all
from: slides.pptx 2-5
Then just:
pagefuse assemble board_pack.fuse
Commit the config to your repo. Run it in a Makefile or CI pipeline.
Same output every time.
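For anyone curious how simple the format is to read, here is a rough Python sketch of a parser for the `key: value` lines shown above. It is an illustration, not PageFuse's own loader: `FuseConfig` and `parse_fuse` are hypothetical names, skipping `#` comment lines is an assumption, and since the sample lists two `output:` lines the sketch simply collects every one it sees.

```python
from dataclasses import dataclass, field


@dataclass
class FuseConfig:
    outputs: list[str] = field(default_factory=list)
    # Each source is (path, page_spec), e.g. ("slides.pptx", "2-5").
    sources: list[tuple[str, str]] = field(default_factory=list)


def parse_fuse(text: str) -> FuseConfig:
    """Parse "output: ..." and "from: <path> <spec>" lines."""
    config = FuseConfig()
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):  # assumption: comments allowed
            continue
        key, colon, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if not colon or not value:
            raise ValueError(f"line {number}: expected 'key: value'")
        if key == "output":
            config.outputs.append(value)
        elif key == "from":
            # Split on the last run of whitespace so the spec is the final token.
            parts = value.rsplit(None, 1)
            if len(parts) != 2:
                raise ValueError(f"line {number}: expected 'from: <path> <spec>'")
            config.sources.append((parts[0], parts[1]))
        else:
            raise ValueError(f"line {number}: unknown key {key!r}")
    return config


if __name__ == "__main__":
    sample = """\
output: board_pack.pdf
from: cover.pdf 1
from: slides.pptx 2-5
"""
    print(parse_fuse(sample))
```

Rejecting unknown keys outright is a deliberate choice here: a config that is committed and run in CI should fail loudly on a typo rather than silently produce a different document.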
---
Split works too
pagefuse split report.pdf cover.pdf:1 body.pdf:2-10 appendix.docx:11-20
Each output can be a different format.
---
Supported formats
┌────────┬─────────────────────────────────────────────────────────────────────────────────────┐
│ │ Formats │
├────────┼─────────────────────────────────────────────────────────────────────────────────────┤
│ Input │ PDF, DOCX, DOC, PPTX, PPT, ODT, ODP, ODS, XLSX, RTF, HTML, Markdown, PNG, JPG, TIFF │
├────────┼─────────────────────────────────────────────────────────────────────────────────────┤
│ Output │ PDF, DOCX, ODT, PPTX, ODP, HTML, PNG, JPG, TIFF │
└────────┴─────────────────────────────────────────────────────────────────────────────────────┘
---
How it's built
- Click — CLI framework
- pikepdf — PDF read/write/assembly (lossless, no re-rendering)
- LibreOffice headless — DOCX/PPTX/ODT/HTML conversion
- img2pdf — lossless image → PDF
- pypdfium2 — PDF → image rendering
- Rich — terminal output
- ThreadPoolExecutor — parallel file loading
PDF-to-PDF assembly is lossless and fast because no rendering is involved.
Non-PDF inputs are converted through LibreOffice headless. When the target
is not a PDF, the pages are assembled into a temporary PDF first, then
converted to the target format.
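The routing described above can be sketched in a few lines of Python. This is a simplified guess at the shape of the pipeline, not the real PageFuse code: `pick_backend`, `libreoffice_command`, and `load_in_order` are hypothetical helpers, and the worker count is arbitrary. The `soffice --headless --convert-to pdf --outdir` invocation is LibreOffice's standard command-line conversion; the sketch only builds the argument list and does not run it.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")
R = TypeVar("R")

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def pick_backend(source: Path) -> str:
    """Choose which tool turns an input into PDF pages."""
    suffix = source.suffix.lower()
    if suffix == ".pdf":
        return "pikepdf"       # pages copied as-is, no rendering
    if suffix in IMAGE_SUFFIXES:
        return "img2pdf"       # lossless image -> PDF
    return "libreoffice"       # office and markup formats


def libreoffice_command(source: Path, outdir: Path) -> list[str]:
    """Build the headless LibreOffice command that writes a PDF into outdir."""
    return [
        "soffice", "--headless",
        "--convert-to", "pdf",
        "--outdir", str(outdir),
        str(source),
    ]


def load_in_order(
    items: Iterable[T], loader: Callable[[T], R], max_workers: int = 4
) -> list[R]:
    """Run loader on every item in parallel, returning results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order the inputs were given,
        # regardless of which worker finishes first.
        return list(pool.map(loader, items))


if __name__ == "__main__":
    inputs = [Path("cover.pdf"), Path("financials.docx"), Path("scan.png")]
    print(load_in_order(inputs, pick_backend))
```

Order preservation matters here because the final document has to follow the order of the sources on the command line, even if a slow conversion finishes last.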
---
Install
pip install pagefuse
# or
pipx install pagefuse
Requires LibreOffice for DOCX/PPTX/ODT/HTML conversion:
sudo apt install libreoffice # Ubuntu/Debian
brew install --cask libreoffice # macOS
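If you script around the tool, a quick pre-flight check can confirm that LibreOffice is reachable before a build starts. The Python sketch below is only an illustration: `find_libreoffice` is a hypothetical helper, and it assumes the binary is exposed on `PATH` as `soffice` or `libreoffice`, which may not hold for every install method.

```python
import shutil
from typing import Optional


def find_libreoffice() -> Optional[str]:
    """Return the path of a LibreOffice binary on PATH, or None if absent."""
    for name in ("soffice", "libreoffice"):
        found = shutil.which(name)
        if found:
            return found
    return None


if __name__ == "__main__":
    binary = find_libreoffice()
    if binary is None:
        print("LibreOffice not found on PATH; non-PDF conversion will not work")
    else:
        print(f"Using LibreOffice at {binary}")
```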
---
What's next
- GUI wrapper
- Homebrew formula
- Watch mode for auto-rebuilding on file change
---
30-day free trial, no credit card: pagefuse.net
Would love feedback on the config format, the feature set, or anything
else. What formats or features would make this useful for your workflow?