Graham Sutton

Posted on May 6 • Edited on May 11

I Spent a Week Fighting HTML-to-PDF. Here’s What Finally Worked.

#webdev #programming #rust #saas

I’ve been building a property inspection platform with a friend of mine who’s an inspector. One of the requirements was deceptively simple:

Generate PDF reports.

That’s the deliverable inspectors send to insurance companies, banks, clients, etc. So the PDFs need to actually look good. Lots of photos, cover pages, tables of contents, structured sections, all that fun stuff.

My friend uses Spectora right now and honestly their PDFs were pretty good. So I figured, alright, let me build something similar.

I thought this would take maybe a day or two. It ended up taking me about a week, and the PDF output was still brittle.

First, I started with:

wkhtmltopdf
Puppeteer
PDF libraries (DomPDF, mPDF, FPDF, etc.)
CSS @media print layouts

Nothing ever really worked right. Text would randomly overflow pages. I would tweak the CSS, regenerate the PDF, check the result, and repeat, over and over. Headers and footers were also a pain. One thing that drove me insane was that I wanted headers and footers on most pages, but not on the cover page or table of contents.

Sounds reasonable, right?

Turns out with a lot of these tools, it’s basically: headers everywhere or headers nowhere at all.

At one point I had CSS media queries inside of templates inside of rendering configs and I genuinely stopped understanding what controlled what anymore.

The thing that changed my perspective

I started googling around for alternatives. Eventually I stumbled across Carbone.

What caught my attention wasn’t even the API. It was the idea behind it.

Instead of trying to force HTML into behaving like a print layout engine, they just let you design documents in Word.

And honestly, that makes a ton of sense.

Word has spent decades solving pagination, print layout, margins, page breaks, etc. Meanwhile I was knee-deep in HTML hoping the **** document would render neatly this time.

I tried Carbone and the output was actually pretty good.

Then I got to the pricing and concurrency limits:

€29/month
1000 renders
only 2 concurrent documents

That was kind of the moment where I had the thought we have all had:

“Honestly, I kinda want to try building this myself.”

So I built one in Rust

Originally this was not supposed to become a product.

I just wanted PDFs for our inspection app.

I built a very rough prototype in Rust using Handlebars-style templating and started experimenting with:

DOCX templating
image replacement
loops
conditionals
PDF conversion pipelines
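
To make "Handlebars-style templating" a bit more concrete, here is a minimal sketch of the core idea: replacing {{key}} placeholders with values and repeating a row template once per item. It is not Rendrr's actual code. The function names render and render_each are made up for illustration, it works on plain strings rather than DOCX XML, and a real project would more likely reach for an existing templating crate.

```rust
use std::collections::HashMap;

/// Replace every `{{key}}` in `template` with its value from `data`.
/// Unknown keys render as an empty string; an unclosed `{{` is kept verbatim.
fn render(template: &str, data: &HashMap<&str, &str>) -> String {
    let mut out = String::with_capacity(template.len());
    let mut rest = template;
    while let Some(start) = rest.find("{{") {
        // Copy the literal text before the placeholder.
        out.push_str(&rest[..start]);
        let after_open = &rest[start + 2..];
        match after_open.find("}}") {
            Some(end) => {
                let key = after_open[..end].trim();
                out.push_str(data.get(key).copied().unwrap_or(""));
                rest = &after_open[end + 2..];
            }
            None => {
                // No closing braces: emit the remainder untouched.
                out.push_str(&rest[start..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}

/// A loop in its simplest form: render the same row template once per item
/// and concatenate the results.
fn render_each(row_template: &str, rows: &[HashMap<&str, &str>]) -> String {
    rows.iter().map(|row| render(row_template, row)).collect()
}
```

Calling render("Report for {{address}}", &data) with an "address" entry in the map fills in the value, and render_each("- {{room}}\n", &rows) produces one line per room. Conditionals follow the same pattern: find a block marker, decide whether to keep its body, and continue scanning.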

The weird part is it actually started working surprisingly well.

Like, suspiciously well.

I expected this project to collapse under its own complexity almost immediately, but instead I kept adding features and the results kept getting better.

And because I was building it specifically for property inspection reports, I was stress testing it with exactly the kind of PDFs that normally become nightmares:

image-heavy documents
cover pages
tables
long narratives
repeated sections
inconsistent data

At some point I realized:

a) this was solving my own problem really well
b) other developers probably hate this problem too

So I turned it into a real thing.

The biggest thing I learned

The actual PDF rendering isn’t the hard part. The hard part is everything around it. Especially images.

If your document has 40 remote image URLs in it, that's 40 download requests. Each image can be anywhere from several MB to, in extreme cases, GBs, and fetching them all at the same time can run the instance out of memory during rendering.

The rendering itself can be fast, but if the server has to fetch dozens of images over the network first, that becomes the bottleneck very quickly.

I solved this by building a buffer that scans your data for image URLs and pre-fetches them in batches before rendering, which lets you send those 40 images without crashing the instance.
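
Roughly, the batching idea looks like the sketch below. This is an illustration under stated assumptions, not the production code: find_image_urls and plan_batches are hypothetical names, the URL scan is a crude split on quotes and whitespace rather than a real JSON walk, and the actual downloading is left out so the example stays network-free.

```rust
use std::collections::HashSet;

/// Collect unique http(s) URLs ending in a common image extension.
/// Splitting on quotes and whitespace is a crude stand-in for walking JSON.
fn find_image_urls(payload: &str) -> Vec<String> {
    let mut seen = HashSet::new();
    let mut urls = Vec::new();
    for token in payload.split(|c: char| c.is_whitespace() || c == '"') {
        let lower = token.to_ascii_lowercase();
        let is_http = lower.starts_with("http://") || lower.starts_with("https://");
        let is_image = [".jpg", ".jpeg", ".png", ".webp"]
            .iter()
            .any(|ext| lower.ends_with(ext));
        // `insert` returns false for duplicates, so each URL is kept once.
        if is_http && is_image && seen.insert(token.to_string()) {
            urls.push(token.to_string());
        }
    }
    urls
}

/// Split the URLs into groups of at most `batch_size`. Downloading one group
/// at a time caps how many image bodies are in flight at once.
fn plan_batches(urls: &[String], batch_size: usize) -> Vec<Vec<String>> {
    urls.chunks(batch_size.max(1)) // guard against a batch size of 0
        .map(|chunk| chunk.to_vec())
        .collect()
}
```

With 40 URLs and a batch size of 8, plan_batches yields 5 groups, so at most 8 downloads run at once instead of 40. The right batch size depends on image sizes and available memory, which this sketch does not try to measure.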

Another optimization that made a huge difference was simply allowing images to be passed directly as base64 instead of URLs. That sidesteps the download problem altogether and removes a bunch of network overhead.
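
As a sketch of what accepting both forms might look like, the example below classifies an image field as either a remote URL or inline base64 bytes. The ImageSource enum and the classify function are hypothetical, the data URI handling is an assumption about the input shape, and the hand-rolled decoder only exists to keep the example dependency-free (a real service would more likely use an existing base64 crate).

```rust
#[derive(Debug, PartialEq)]
enum ImageSource {
    /// Must be downloaded before rendering.
    Remote(String),
    /// Bytes arrived with the request; no network round trip needed.
    Inline(Vec<u8>),
}

/// Decode standard-alphabet base64, stopping at the first `=` padding byte.
/// Returns None if any other character is outside the alphabet.
fn decode_base64(input: &str) -> Option<Vec<u8>> {
    let mut out = Vec::with_capacity(input.len() / 4 * 3);
    let mut buf: u32 = 0; // bit accumulator
    let mut bits: u32 = 0; // number of valid bits in `buf`
    for b in input.bytes() {
        let v = match b {
            b'A'..=b'Z' => b - b'A',
            b'a'..=b'z' => b - b'a' + 26,
            b'0'..=b'9' => b - b'0' + 52,
            b'+' => 62,
            b'/' => 63,
            b'=' => break,
            _ => return None,
        };
        buf = (buf << 6) | v as u32;
        bits += 6;
        if bits >= 8 {
            bits -= 8;
            out.push((buf >> bits) as u8);
            buf &= (1 << bits) - 1; // keep only the leftover low bits
        }
    }
    Some(out)
}

/// Treat http(s) values as remote images; otherwise try to decode the value
/// as a data URI ("data:image/png;base64,...") or as bare base64.
fn classify(value: &str) -> Option<ImageSource> {
    if value.starts_with("http://") || value.starts_with("https://") {
        return Some(ImageSource::Remote(value.to_string()));
    }
    let encoded = match value.split_once(";base64,") {
        Some((_, data)) => data,
        None => value,
    };
    decode_base64(encoded).map(ImageSource::Inline)
}
```

Anything that comes back as Inline can go straight to the renderer, while Remote values still need to be fetched first.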

Right now I can render around 15 one-page PDFs/sec in production with simple templates (i.e. lots of data and <= 2 remote images).

Why I ended up making Rendrr

The more I worked on this, the more I realized others could benefit from it.

There are tons of HTML-to-PDF solutions.

But if you want:

truly WYSIWYG documents
Word-based templates
APIs
decent performance
sane pricing
concurrency that isn't intentionally limited

then there are surprisingly few options.

That's why I built Rendrr.

(Did I just self-promote? Yes. Yes, I did. Sue me. I work hard, and how else is anyone going to know it exists if I don't talk about it?)

Right now it’s still early access. I’m mostly focused on stability and making the core experience solid before I start piling on more features.

That said, I do have some ideas I’m excited about:

MCP server to generate and edit Word templates (there is a skill already available though that does the same thing!)
live previewing/editing in the dashboard
maybe PowerPoint templating eventually

We'll see what users demand first.

Anyway

This is the first SaaS I've launched, and I would love to get some honest feedback about it. There's a free plan available. If you happen to hit your limit, feel free to reach out to me at graham@rendrr.io and I'll gladly raise it in exchange for telling me what you like or don't like so far.

Thanks for reading!