DEV Community

Graham Sutton

I Spent a Week Fighting HTML-to-PDF. Here’s What Finally Worked.

I’ve been building a property inspection platform with a friend of mine who’s an inspector. One of the requirements was deceptively simple:

Generate PDF reports.

That’s the deliverable inspectors send to insurance companies, banks, clients, etc. So the PDFs need to actually look good. Lots of photos, cover pages, tables of contents, structured sections, all that fun stuff.

My friend uses Spectora right now and honestly their PDFs are pretty good. So I figured, alright, let me build something similar.

I thought this would take maybe a day or two.

I was wrong.

The HTML-to-PDF rabbit hole

I started where everyone starts:

  • wkhtmltopdf
  • headless Chrome
  • random PDF libraries
  • CSS print layouts
  • "surely this StackOverflow answer will solve it"

Every single path turned into pain.

Stuff would randomly overflow pages. CSS that worked perfectly in the browser suddenly looked completely different in the PDF. Headers and footers were especially awful.

One thing that drove me insane:

I wanted headers and footers on most pages, but not on the cover page or table of contents.

Sounds reasonable, right?

Turns out with a lot of these tools, it’s basically:

  • headers everywhere
  • or headers nowhere

And then you end up doing weird hacks trying to overlay white rectangles over content or splitting documents apart and merging them afterward like some kind of PDF necromancer.

At one point I had CSS media queries inside of templates inside of rendering configs and I genuinely stopped understanding what controlled what anymore.

The thing that changed my perspective

Eventually I stumbled across Carbone.

What caught my attention wasn’t even the API. It was the idea behind it.

Instead of trying to force HTML into behaving like a print layout engine, they just let you design documents in Word.

And honestly, that makes a ton of sense.

Word has spent decades solving pagination, print layout, margins, page breaks, etc. Meanwhile I was knee-deep in HTML, hoping the **** document would render neatly this time.

I tried Carbone and the output was actually pretty good.

Then I got to the pricing and concurrency limits:

  • €29/month
  • 1000 renders
  • only 2 concurrent documents

That was kind of the moment where I thought (like we all have at some point):

“Honestly, I kinda want to try building this myself.”

So I built one in Rust

Originally this was not supposed to become a product.

I just wanted PDFs for our inspection app.

I built a very rough prototype in Rust using Handlebars-style templating and started experimenting with:

  • DOCX templating
  • image replacement
  • loops
  • conditionals
  • PDF conversion pipelines
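
For a sense of what "Handlebars-style templating" means at its simplest, here is a minimal sketch of placeholder substitution. It is not the prototype's actual code: the `render` function and the field names are made up for illustration, it works on a plain string rather than the XML inside a DOCX file, and it skips loops and conditionals entirely.

```rust
use std::collections::HashMap;

/// Replace every `{{key}}` placeholder in `template` with its value from `data`.
/// Unknown keys are left in place so missing data is easy to spot in the output.
/// (Hypothetical helper for illustration only.)
fn render(template: &str, data: &HashMap<&str, &str>) -> String {
    let mut out = String::with_capacity(template.len());
    let mut rest = template;

    while let Some(start) = rest.find("{{") {
        // Copy the literal text that comes before the placeholder.
        out.push_str(&rest[..start]);
        let after_open = &rest[start + 2..];

        match after_open.find("}}") {
            Some(end) => {
                // Whitespace inside the braces is ignored: `{{ name }}` == `{{name}}`.
                let key = after_open[..end].trim();
                match data.get(key) {
                    Some(value) => out.push_str(value),
                    // Keep the original placeholder text for unknown keys.
                    None => out.push_str(&rest[start..start + 2 + end + 2]),
                }
                rest = &after_open[end + 2..];
            }
            None => {
                // Unterminated "{{": emit the remainder verbatim and stop.
                out.push_str(&rest[start..]);
                rest = "";
            }
        }
    }

    // Whatever is left has no more placeholders in it.
    out.push_str(rest);
    out
}
```

Calling `render("Report for {{address}}", &data)` with `address` set to `12 Elm St` gives `Report for 12 Elm St`. A real DOCX engine has more to deal with, since the placeholders live inside XML and editors can split text across several runs, which a flat string search like this would not handle.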

The weird part is it actually started working surprisingly well.

Like, suspiciously well.

I expected this project to collapse under its own complexity almost immediately, but instead I kept adding features and the results kept getting better.

And because I was building it specifically for property inspection reports, I was stress testing it with exactly the kind of PDFs that normally become nightmares:

  • image-heavy documents
  • cover pages
  • tables
  • long narratives
  • repeated sections
  • inconsistent data

At some point I realized:

  1. this was solving my own problem really well
  2. other developers probably hate this problem too

So I turned it into a real thing.

The biggest thing I learned

The actual PDF rendering isn’t the hard part.

The hard part is everything around it.

Especially images.

If your document has 40 remote image URLs in it, congratulations, your renderer is now partially a network orchestration engine.

The rendering itself can be fast, but if the server has to fetch dozens of images over the network first, that becomes the bottleneck very quickly.
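
To put rough numbers on that, here is a back-of-the-envelope model. The 40 images come from the example above; the 100 ms per fetch and the concurrency of 8 are assumptions picked for illustration, not measurements, and the function name is hypothetical.

```rust
use std::time::Duration;

/// Estimated wall-clock time to fetch `count` images when at most
/// `concurrency` requests are in flight and each one takes `per_fetch`.
/// Rough model only: it ignores bandwidth limits and latency variance.
fn estimated_fetch_time(count: u32, concurrency: u32, per_fetch: Duration) -> Duration {
    // Treat a concurrency of 0 as 1 to avoid dividing by zero.
    let concurrency = concurrency.max(1);
    // Number of sequential "waves" of requests (ceiling division).
    let waves = (count + concurrency - 1) / concurrency;
    per_fetch * waves
}
```

Under those assumptions, `estimated_fetch_time(40, 1, Duration::from_millis(100))` comes to 4 seconds when fetching one image at a time, and `estimated_fetch_time(40, 8, Duration::from_millis(100))` comes to 500 ms with 8 in flight. Either way the document is waiting on the network before any rendering starts.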

One optimization that made a huge difference was allowing images to be passed directly as base64 instead of URLs. That removes a bunch of network overhead entirely.
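
On the client side, "passing an image as base64" can be sketched like this. The encoder below is standard RFC 4648 base64 written against the standard library only, so the example stays self-contained; in a real project the `base64` crate would be the usual choice. The `image_data_uri` helper and the data-URI shape are assumptions for illustration, not Rendrr's documented request format.

```rust
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Encode bytes as standard (RFC 4648) base64 with `=` padding.
fn base64_encode(input: &[u8]) -> String {
    let mut out = String::with_capacity((input.len() + 2) / 3 * 4);

    for chunk in input.chunks(3) {
        // Pack up to three bytes into 24 bits, padding missing bytes with zero.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;

        // Every chunk yields four output characters, six bits each.
        out.push(ALPHABET[((n >> 18) & 63) as usize] as char);
        out.push(ALPHABET[((n >> 12) & 63) as usize] as char);
        if chunk.len() > 1 {
            out.push(ALPHABET[((n >> 6) & 63) as usize] as char);
        } else {
            out.push('=');
        }
        if chunk.len() > 2 {
            out.push(ALPHABET[(n & 63) as usize] as char);
        } else {
            out.push('=');
        }
    }
    out
}

/// Wrap image bytes in a data URI so they can travel inside the request body.
/// Hypothetical helper: the exact field format an API expects may differ.
fn image_data_uri(mime: &str, bytes: &[u8]) -> String {
    format!("data:{};base64,{}", mime, base64_encode(bytes))
}
```

The trade-off is payload size: base64 turns every 3 bytes into 4 characters, so the request grows by about a third. In exchange, the server never has to make an outbound fetch for that image.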

Right now I can render around 15 one-page PDFs/sec in production if images aren’t the bottleneck.

Why I ended up making Rendrr

The more I worked on this, the more I realized there’s this weird gap in the ecosystem.

There are tons of HTML-to-PDF solutions.

But if you want:

  • truly WYSIWYG documents
  • Word-based templates
  • APIs
  • decent performance
  • sane pricing
  • concurrency that isn't intentionally limited

then there are surprisingly few options.

So I built Rendrr.

(Did I just self-promote? Yes. Yes, I did. Sue me. I work hard, and how else is anyone going to know it exists if I don't talk about it?)

Right now it’s still early access. I’m mostly focused on stability and making the core experience solid before I start piling on more features.

That said, I do have some ideas I’m excited about.

Still figuring out where to take it.

Anyway

If you’ve fought with PDF generation before, I’d genuinely love to hear what your experience was like because this entire project basically came from me repeatedly asking:

“Why is this still so painful?”
