I am building a PDF API because every codebase I touch has a haunted PDF service

#api #webdev #saas #showdev

There is a file in almost every backend I have worked on. It generates PDFs. Invoices, mostly. Sometimes reports or certificates. It was written years ago by someone who left, it runs a headless browser nobody fully understands, and it falls over on the last day of the month when finance needs the invoices out.

I have rebuilt this same haunted corner enough times that I decided to fix it once, properly, for everyone. It is called PDFPipe.

The actual problem
Generating a PDF from HTML sounds trivial until you run it in production at any scale. The usual path is a headless browser: Chromium driven by Puppeteer or Playwright, Gotenberg wrapping Chromium in a service, or the older WebKit-based wkhtmltopdf. Then you discover:

Chromium wants 0.5 to 1 GB of RAM per instance and falls over under load.
Cold starts add seconds to every request after a deploy.
Custom fonts do not load, or load inconsistently.
Tables split across page breaks in ugly ways.
You now own a service that needs health checks, retries, and monitoring.

None of this is your product. It is undifferentiated infrastructure that breaks at the worst possible time.

What PDFPipe does
One endpoint. You send HTML or a template plus JSON data, you get back a PDF.

curl https://api.pdfpipe.xyz/v1/pdf \
  -H "Authorization: Bearer pp_live_..." \
  -d '{"html":"<h1>Invoice #4012</h1>","options":{"format":"A4"}}' \
  --output invoice.pdf
That is the whole integration. No browser in your infra, no queue to babysit.
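
For anyone calling this from application code rather than a shell, here is a minimal Python sketch of the same request using only the standard library. The endpoint and payload shape come from the curl example above; the function names, the `Content-Type` header, and the 30-second timeout are my assumptions for illustration, not a published client.

```python
import json
import urllib.request

API_URL = "https://api.pdfpipe.xyz/v1/pdf"  # endpoint from the curl example


def build_pdf_request(html: str, api_key: str, page_format: str = "A4") -> urllib.request.Request:
    """Build (but do not send) the POST request for one HTML-to-PDF render."""
    payload = {"html": html, "options": {"format": page_format}}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            # Assumption: the API accepts a JSON body labelled as JSON.
            "Content-Type": "application/json",
        },
        method="POST",
    )


def render_pdf(html: str, api_key: str, out_path: str) -> None:
    """Send the request and write the returned PDF bytes to out_path."""
    request = build_pdf_request(html, api_key)
    # Assumption: a 30 second timeout is generous enough for one document.
    with urllib.request.urlopen(request, timeout=30) as response:
        with open(out_path, "wb") as fh:
            fh.write(response.read())
```

Splitting the request builder from the network call keeps the payload easy to unit test without hitting the API.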

Three decisions I made on purpose
Flat pricing. A lot of PDF APIs price in credits where one large document burns several. I find that hostile. A document is a document.

A real free tier. The industry standard free tier is around 50 documents a month, which is not enough to ship anything. PDFPipe gives 500.

Security as a feature, not an afterthought. HTML-to-PDF services are a textbook SSRF target. Rendered markup can embed an iframe pointing at file:///etc/passwd or the cloud metadata endpoint at 169.254.169.254 and leak credentials straight into the returned PDF. There are CVEs for exactly this. PDFPipe renders in an isolated sandbox with no credentials and no route to private networks, and every outbound request a page makes is resolved and checked first.
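
To make "resolved and checked first" concrete, here is a minimal Python sketch of that kind of outbound check. It is an illustration of the general technique under my own assumptions, not PDFPipe's actual code: the function names are hypothetical, and the policy (http and https only, every resolved address must be globally routable) is one reasonable choice among several.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}  # assumption: file://, ftp:// etc. are always refused


def is_public_ip(address: str) -> bool:
    """True only for globally routable addresses (not loopback, private, link-local)."""
    ip = ipaddress.ip_address(address)
    # Unwrap IPv4-mapped IPv6 (::ffff:a.b.c.d) so the IPv4 rules apply.
    mapped = getattr(ip, "ipv4_mapped", None)
    if mapped is not None:
        ip = mapped
    return ip.is_global


def check_outbound_url(url: str, resolve=socket.getaddrinfo) -> list[str]:
    """Resolve the URL's host and return the vetted IPs, or raise ValueError."""
    parts = urlsplit(url)
    if parts.scheme not in ALLOWED_SCHEMES or not parts.hostname:
        raise ValueError(f"blocked scheme or missing host: {url!r}")
    port = parts.port or (443 if parts.scheme == "https" else 80)
    infos = resolve(parts.hostname, port, type=socket.SOCK_STREAM)
    addresses = sorted({info[4][0] for info in infos})
    if not addresses:
        raise ValueError(f"host did not resolve: {parts.hostname!r}")
    for address in addresses:
        # Reject if ANY record is non-public; a mixed answer is suspicious.
        if not is_public_ip(address):
            raise ValueError(f"blocked non-public address {address} for {url!r}")
    return addresses
```

A check like this only helps if the renderer then connects to one of the returned addresses rather than resolving the hostname a second time; otherwise a DNS rebinding attack can swap in a private address between the check and the fetch.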

Built for agents too
The newer reason this matters: AI agents are writing more and more application code, and they need tools they can call. PDFPipe ships an MCP server so an agent can generate a document in a single tool call. When your copilot adds invoicing to an app, the goal is for it to reach for this.
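
For a sense of what "a single tool call" means on the wire: MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. The Python sketch below builds one such message. The tool name `generate_pdf` and its argument shape are hypothetical, borrowed from the HTTP payload above; the real tool schema may differ.

```python
import json


def build_tool_call(request_id: int, html: str, page_format: str = "A4") -> str:
    """Serialize an MCP tools/call request for a hypothetical PDF tool."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",  # standard MCP method for invoking a tool
        "params": {
            "name": "generate_pdf",  # hypothetical tool name
            # Assumption: arguments mirror the HTTP API's JSON body.
            "arguments": {"html": html, "options": {"format": page_format}},
        },
    }
    return json.dumps(message)
```

In practice the agent's MCP client library builds and sends this message itself; the sketch is only meant to show how little the agent has to supply.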

Where it is
Private beta opens to the waitlist in batches this summer. I am building it solo. If the haunted PDF file in your codebase sounds familiar, the waitlist is here:

https://pdfpipe.xyz

I would genuinely like to hear how you generate PDFs today and what breaks. Drop it in the comments.