Boris Zabolotskikh
How to Build a Web Page to PDF Converter and Not Lose Your Mind

Have you ever wanted to save an article as a PDF without all the extra junk — just clean text?

Or save only a specific part of a page?

And have everything on one long page, without page breaks?

Yeah. Same here.

So I decided to build my own solution and turn it into a browser extension.

What’s wrong with the standard CTRL+P?

When I save something as a PDF, I want it to look exactly the way it looks on the screen.

I want clickable links.

I want the whole document on one long page, without breaks.

CTRL+P can’t do that. It’s good only for printing on paper.

Which is exactly what it was originally designed for.

It simply doesn’t know how to do anything else.

Available modes in the extension

Full page

Saves the entire page exactly as you see it on the screen.

The text stays selectable, and links remain clickable.

This is not an image with OCR text.

It’s a real PDF with real text and real links.

Export a page element

In this mode, you can select a specific element on the page and export only that.

This is very handy when you want to save just an article, just a code block, or any other element.

I often use this mode when submitting Google Forms: I save the filled-in data so I don’t forget what I sent.

Article

One of the most interesting modes.

It lets you save blog articles in a reader-friendly view. Nothing extra on the page — just the article itself.

In this mode, I made sure that:

  • code blocks wrap lines properly
  • <details> tags are expanded

So the entire article content ends up in the PDF.
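The two fixes above can be sketched in a few lines. This is a minimal illustration, not the extension's actual code: the names `ARTICLE_PRINT_CSS`, `expandDetails`, and `injectArticleCss` are made up for the example, and the CSS is just one plausible way to force wrapping.

```typescript
// CSS that lets long code lines wrap instead of running off the page.
// (Assumed rules; a real site may need more specific selectors.)
const ARTICLE_PRINT_CSS = `
pre, pre code {
  white-space: pre-wrap;
  overflow-wrap: anywhere;
}`;

// Open every collapsed <details> under `root` so its content is rendered.
// Returns how many elements had to be opened.
function expandDetails(root: ParentNode): number {
  let opened = 0;
  root.querySelectorAll<HTMLDetailsElement>("details").forEach((details) => {
    if (!details.open) {
      details.open = true;
      opened++;
    }
  });
  return opened;
}

// Add the wrapping rules to the page and return the <style> element,
// so the caller can remove it again after the export.
function injectArticleCss(doc: Document): HTMLStyleElement {
  const style = doc.createElement("style");
  style.textContent = ARTICLE_PRINT_CSS;
  doc.head.appendChild(style);
  return style;
}
```

In a content script this would be called as `expandDetails(document)` and `injectArticleCss(document)` right before generating the PDF.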

Remove elements

When this mode is enabled, you can click on any element on the page and remove it.

For example: sidebars, menus, ads, and so on.

If you accidentally remove something important, just press CTRL+Z to undo.
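One simple way to get this kind of undo is a stack of removed elements. The sketch below is an assumption about how it could work, not the extension's real implementation: it hides elements with `display: none` instead of detaching them, and `RemovalHistory` and `isUndoShortcut` are hypothetical names.

```typescript
// Remembers removed elements together with the inline `display` value
// they had before, so each removal can be undone in reverse order.
class RemovalHistory {
  private stack: Array<{ el: HTMLElement; display: string }> = [];

  // Hide the element rather than detaching it from the DOM:
  // undo then becomes a cheap style reset and page scripts keep working.
  remove(el: HTMLElement): void {
    this.stack.push({ el, display: el.style.display });
    el.style.display = "none";
  }

  // Restore the most recently removed element.
  // Returns false when there is nothing left to undo.
  undo(): boolean {
    const last = this.stack.pop();
    if (!last) return false;
    last.el.style.display = last.display;
    return true;
  }
}

// True for CTRL+Z (or CMD+Z on macOS) without Shift.
function isUndoShortcut(
  e: Pick<KeyboardEvent, "key" | "ctrlKey" | "metaKey" | "shiftKey">,
): boolean {
  return (e.ctrlKey || e.metaKey) && !e.shiftKey && e.key.toLowerCase() === "z";
}
```

A `keydown` listener would call `undo()` whenever `isUndoShortcut(event)` returns true, and a `click` listener would call `remove()` on the clicked element.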

Export chats from ChatGPT, DeepSeek, and Gemini

Since AI companies don’t really want to add PDF export for chats, I did it in my extension. Why not?

It’s very convenient — one click and you save the currently open conversation.

In every mode, you can choose the layout:

  • single-page PDF
  • multi-page PDF

Which one to pick depends on whether you want to print the document or just read it without page breaks.

You can also adjust the layout to match the screen size or standard formats like A4, A5, and so on.
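To make the layout options concrete, here is a rough sketch of how they could be turned into a paper size in inches, which is the unit Chrome's `Page.printToPDF` expects for `paperWidth` and `paperHeight`. The type and function names are hypothetical, and the sketch assumes the content height was measured after the page was laid out at the target width.

```typescript
type PdfLayout = "single-page" | "multi-page";
type PaperFormat = "screen" | "A4" | "A5";

const CSS_PX_PER_INCH = 96; // CSS defines 1in = 96px
const MM_PER_INCH = 25.4;

// ISO 216 portrait sizes in millimetres.
const ISO_PORTRAIT_MM = {
  A4: { width: 210, height: 297 },
  A5: { width: 148, height: 210 },
};

interface PaperSize {
  paperWidth: number; // inches
  paperHeight: number; // inches
}

function computePaperSize(
  layout: PdfLayout,
  format: PaperFormat,
  viewport: { widthPx: number; heightPx: number },
  contentHeightPx: number, // full scroll height at the target width
): PaperSize {
  const base =
    format === "screen"
      ? {
          width: viewport.widthPx / CSS_PX_PER_INCH,
          height: viewport.heightPx / CSS_PX_PER_INCH,
        }
      : {
          width: ISO_PORTRAIT_MM[format].width / MM_PER_INCH,
          height: ISO_PORTRAIT_MM[format].height / MM_PER_INCH,
        };

  // Single-page: one sheet as tall as the whole content.
  // Multi-page: a regular sheet, and the browser paginates.
  const paperHeight =
    layout === "single-page" ? contentHeightPx / CSS_PX_PER_INCH : base.height;

  return { paperWidth: base.width, paperHeight };
}
```

For example, a 1280 px wide screen layout with 4800 px of content in single-page mode gives a sheet of roughly 13.33 by 50 inches.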

How not to lose your mind

Too many layout variations

Saving websites to PDF is hard.

It’s impossible to account for every layout variation. Something will always break somewhere.

At first, I tried fixing issues site by site, but quickly realized I was tilting at windmills.

So now I only add special fixes for large platforms like Notion.

Atomic CSS

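Atomic CSS frameworks like Tailwind are everywhere now. Their utility classes describe styling rather than meaning, and the same classes are shared by many unrelated elements, so on some sites it’s hard to reliably select elements for export.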

For example, exporting a ChatGPT conversation required quite a bit of work and some tricky selectors just to grab the dialog element.

With Claude Code, I gave up entirely because of the layout.

Gemini was the easiest — the class names were actually human-readable.
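A common way to cope with this is to keep a list of candidate selectors, ordered from most to least stable, and take the first one that matches. The sketch below only illustrates the idea: the selectors in `CONVERSATION_SELECTORS` are hypothetical placeholders, not the real ones for ChatGPT or any other site.

```typescript
// Try each selector in order and return the first element that matches.
function queryFirst(root: ParentNode, selectors: readonly string[]): Element | null {
  for (const selector of selectors) {
    const match = root.querySelector(selector);
    if (match) return match;
  }
  return null;
}

// Hypothetical candidates: semantic attributes first, generic fallbacks last.
// Utility classes are avoided because they tend to change with every redesign.
const CONVERSATION_SELECTORS: readonly string[] = [
  "[data-testid='conversation']",
  "main [role='log']",
  "main",
];
```

Calling `queryFirst(document, CONVERSATION_SELECTORS)` then returns the best available container, or `null` if the layout changed so much that nothing matches.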

Lazy-loading

Lazy-loaded images are a whole separate kind of pain.

Imagine a long page with tons of images. To make sure all images load and end up in the PDF, the best solution I found was this:

  • loop through all images
  • bring each one into view and wait ~100 ms
  • move on to the next

Just to trigger image loading.
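The loop described above might look roughly like this. It is a sketch under assumptions: it uses `scrollIntoView` to bring each image on screen, and the names `delay` and `triggerLazyImages` are invented for the example. The `wait` parameter exists only so the pause can be swapped out in tests.

```typescript
// Resolve after `ms` milliseconds.
const delay = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Visit every image in order, bring it into the viewport and pause briefly,
// so the page's lazy-loading logic gets a chance to start the download.
// Returns the number of images visited.
async function triggerLazyImages(
  images: readonly HTMLImageElement[],
  pauseMs: number = 100,
  wait: (ms: number) => Promise<void> = delay,
): Promise<number> {
  let visited = 0;
  for (const img of images) {
    img.scrollIntoView({ block: "center" });
    await wait(pauseMs);
    visited++;
  }
  return visited;
}
```

In a content script the call would be `await triggerLazyImages(Array.from(document.images))`, followed by scrolling back to the top before the export. Note that this only triggers loading; it does not wait for slow images to finish.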

If you have a more elegant solution — I’d love to hear it in the comments.

Restoring page styles

The “export page element” mode caused problems too.

To export a single element, I hide everything else on the page except that element.

Then I restore the original styles, without reloading the page.

Reloading is not an option.

If you filled out a long form and want to save it as a PDF, and the page reloads in the process — the rage will be real.

I know, I’ve been there.
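Here is one way the hide-and-restore step could be done without a reload. This is a hedged sketch, not the extension's code: `isolateElement` is a hypothetical helper that hides every sibling on the path from the target up to the root and returns a function that puts the saved inline styles back.

```typescript
// Hide everything except `target` (and its ancestors and descendants).
// Returns a function that restores the previous inline `display` values.
function isolateElement(target: HTMLElement): () => void {
  const saved: Array<{ el: HTMLElement; display: string; priority: string }> = [];

  let node: HTMLElement | null = target;
  while (node && node.parentElement) {
    const parent: HTMLElement = node.parentElement;
    for (const sibling of Array.from(parent.children)) {
      if (sibling === node) continue;
      const el = sibling as HTMLElement;
      // Remember the inline value and its priority before overriding it.
      saved.push({
        el,
        display: el.style.getPropertyValue("display"),
        priority: el.style.getPropertyPriority("display"),
      });
      // "important" so the rule also wins over the site's own !important CSS.
      el.style.setProperty("display", "none", "important");
    }
    node = parent;
  }

  return () => {
    for (const { el, display, priority } of saved) {
      if (display) el.style.setProperty("display", display, priority);
      else el.style.removeProperty("display");
    }
    saved.length = 0;
  };
}
```

Usage is `const restore = isolateElement(element)`, then export the PDF, then `restore()`. Because only inline styles are touched, form inputs and other page state stay exactly as the user left them.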

No documentation

Rendering the final PDF is a whole story on its own.

There are no good libraries for this task.

There’s the well-known PDF.js by Mozilla, which includes PDFViewer. Sounds great — except for one thing: there’s basically no documentation.

You have to figure everything out by reading source code and GitHub issues.

There are tons of issues and discussions asking for documentation, but Mozilla doesn’t want to do it.

Fair enough. At least thanks for the library.

Under the hood

Here are the main tools I use:

  • WXT.dev — a framework for building browser extensions. In my opinion, the best option right now: fast HMR, separate browser for testing, great dev speed.

  • PDFViewer — for rendering PDFs. It’s painful, but there are no real alternatives.

  • Chrome Debugger — for converting pages to PDF. Sounds weird at first, but it gives the highest final PDF quality.
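For readers wondering what "Chrome Debugger" means in practice: presumably this is the `chrome.debugger` extension API, which lets an extension send Chrome DevTools Protocol commands such as `Page.printToPDF` to a tab. The sketch below shows that flow under this assumption. `DebuggerApi` and `tabToPdfBase64` are hypothetical names; the interface just describes the three promise-based calls used, so the real `chrome.debugger` object can be passed in.

```typescript
// The subset of chrome.debugger used here (promise-based form, Manifest V3).
interface DebuggerApi {
  attach(target: { tabId: number }, requiredVersion: string): Promise<void>;
  sendCommand(target: { tabId: number }, method: string, params?: object): Promise<unknown>;
  detach(target: { tabId: number }): Promise<void>;
}

// Attach to the tab, ask Chrome to print it to PDF, and always detach again.
// Resolves with the PDF as a base64 string.
async function tabToPdfBase64(
  api: DebuggerApi,
  tabId: number,
  paper: { paperWidth: number; paperHeight: number }, // inches
): Promise<string> {
  const target = { tabId };
  await api.attach(target, "1.3"); // DevTools Protocol version
  try {
    const result = (await api.sendCommand(target, "Page.printToPDF", {
      printBackground: true,
      preferCSSPageSize: false,
      marginTop: 0,
      marginBottom: 0,
      marginLeft: 0,
      marginRight: 0,
      ...paper,
    })) as { data: string };
    return result.data;
  } finally {
    // Detach even if printing failed, so the "debugging" banner goes away.
    await api.detach(target);
  }
}
```

In a background script this would be called as `tabToPdfBase64(chrome.debugger, tab.id, size)`, and the extension needs the `"debugger"` permission in its manifest. Because the browser's own print pipeline produces the file, text stays selectable and links stay clickable.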

What’s next

I keep adding integrations with large websites.

For example, I recently added one-click saving for Reddit posts.

Besides the extension, I also built a web service for saving pages as PDFs. It doesn’t have as many features as the extension, but it works on mobile devices.
