DEV Community

Custodia-Admin

Originally published at pagebolt.dev

How to Generate PDFs from HTML in Node.js: wkhtmltopdf vs Puppeteer vs API

You need to generate PDFs in your Node.js app. Invoices, reports, certificates, contracts.

You search "Node.js HTML to PDF" and find two solutions:

  1. wkhtmltopdf — "Lightweight, easy to set up"
  2. Puppeteer — "Just use Puppeteer, it can do anything"

You pick one. Three days later, you're debugging font rendering, broken CSS, memory leaks, and system dependencies.

There's a third option: an HTTP API. Let me show you all three approaches so you can decide which one fits your app.

The wkhtmltopdf Approach

wkhtmltopdf is a command-line tool that converts HTML to PDF using the Qt WebKit rendering engine. Here's the basic setup:

# Install wkhtmltopdf
apt-get install wkhtmltopdf

# Install the Node.js wrapper (it shells out to the wkhtmltopdf binary)
npm install wkhtmltopdf

Then use it:

const wkhtmltopdf = require('wkhtmltopdf');
const fs = require('fs');

wkhtmltopdf('<h1>Hello World</h1>', (err, stream) => {
  if (err) return console.error(err);
  stream.pipe(fs.createWriteStream('invoice.pdf'));
});

Looks simple. But this is where the pain begins.

The Problems Start

1. System Dependencies

wkhtmltopdf requires a full WebKit rendering engine. On Linux, you need:

apt-get install wkhtmltopdf \
  fontconfig \
  fontconfig-config \
  fonts-liberation \
  fonts-noto-core \
  libjpeg-turbo-progs \
  libopenjp2-7 \
  libpng16-16 \
  libx11-6 \
  libxcb1 \
  libxext6 \
  libxrender1 \
  xfonts-encodings \
  xfonts-utils

Your Docker image bloats to 800MB just for PDF generation.

2. Font Rendering Issues

wkhtmltopdf doesn't render custom fonts well. Your beautiful design in the browser becomes a serif mess in the PDF.

<!-- Your HTML has this -->
<link href="https://fonts.googleapis.com/css2?family=Inter:wght@400;600&display=swap" rel="stylesheet">
<style>body { font-family: 'Inter', sans-serif; }</style>

<!-- wkhtmltopdf renders it as Times New Roman -->

To fix it, you need to install the fonts locally and reference them:

apt-get install fonts-inter fonts-roboto

# Then in HTML
<style>
  @font-face {
    font-family: 'Inter';
    src: url('file:///usr/share/fonts/opentype/inter/Inter-Regular.otf');
  }
</style>

Now you're shipping fonts in your Docker image.
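If you go this route, the @font-face block can be generated rather than hand-written. The sketch below is one way to do it, assuming the font file already exists at the path you pass in; fontFaceRule and withLocalFont are hypothetical helper names, not part of wkhtmltopdf or its Node.js wrapper.

```javascript
const { pathToFileURL } = require('url');

// Build an @font-face rule that points at a font file installed on the server.
// Hypothetical helper: the family name and path are supplied by the caller.
function fontFaceRule(family, absolutePath, weight = 400) {
  const src = pathToFileURL(absolutePath).href; // e.g. file:///usr/share/fonts/...
  return [
    '@font-face {',
    `  font-family: '${family}';`,
    `  font-weight: ${weight};`,
    `  src: url('${src}');`,
    '}',
  ].join('\n');
}

// Wrap body markup in a full document whose <style> declares the local font.
function withLocalFont(bodyHtml, family, absolutePath) {
  return `<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<style>
${fontFaceRule(family, absolutePath)}
body { font-family: '${family}', sans-serif; }
</style>
</head>
<body>${bodyHtml}</body>
</html>`;
}

console.log(
  withLocalFont(
    '<h1>Invoice</h1>',
    'Inter',
    '/usr/share/fonts/opentype/inter/Inter-Regular.otf'
  )
);
```

One caveat worth checking against your installed version: as far as I know, wkhtmltopdf 0.12.6 blocks local file access by default, so file:// URLs like this may also need the --enable-local-file-access flag.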

3. CSS Support Issues

wkhtmltopdf doesn't support modern CSS. Flexbox? Works sometimes. CSS Grid? Doesn't work. Media queries? Sort of works.

Your responsive HTML breaks when converted to PDF.

4. Memory & Concurrency

wkhtmltopdf spawns a WebKit process per PDF. Each process takes 100–200MB of RAM.

// If 10 people request PDFs simultaneously
// You need 1–2GB of RAM just for rendering

In production, you hit memory limits. You need to implement a queue, rate limiting, or dedicated PDF servers.
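To make the queueing idea concrete, here is a minimal in-process limiter, plus the RAM arithmetic from above. It is only a sketch: createLimiter and peakMemoryMB are hypothetical names, and the per-process figures are the article's estimates, not measurements.

```javascript
// At most `limit` tasks run at once; the rest wait in FIFO order.
function createLimiter(limit) {
  let active = 0;
  const waiting = [];

  const next = () => {
    if (active >= limit || waiting.length === 0) return;
    active++;
    const { task, resolve, reject } = waiting.shift();
    Promise.resolve()
      .then(task) // also catches synchronous throws inside the task
      .then(resolve, reject)
      .finally(() => {
        active--;
        next(); // start the next waiting task, if any
      });
  };

  // run(task) returns a promise for the task's eventual result.
  return function run(task) {
    return new Promise((resolve, reject) => {
      waiting.push({ task, resolve, reject });
      next();
    });
  };
}

// Rough peak RAM if every concurrent render spawns its own process.
function peakMemoryMB(concurrentRenders, perProcessMB) {
  return concurrentRenders * perProcessMB;
}

console.log(peakMemoryMB(10, 100), peakMemoryMB(10, 200)); // 1000 2000
```

Each render would then be wrapped as run(() => renderOnePdf(html)), where renderOnePdf stands in for whatever function spawns wkhtmltopdf. A limiter like this lives in a single Node.js process, so it does not survive restarts or coordinate across servers; that gap is what a dedicated queue fills.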

The Puppeteer Approach

Puppeteer can generate PDFs, but it's overkill:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.setContent('<h1>Hello World</h1>');
  await page.pdf({ path: 'invoice.pdf' });
  await browser.close();
})();

This "works," but Puppeteer was designed for automation, not PDF generation. You're paying the full cost of Chrome for a single feature.

Puppeteer PDF problems:

  • Chrome takes 300–500MB of RAM per instance
  • Cold starts take 2–3 seconds (user waits for PDF)
  • Requires its own stack of system libraries and fonts on Linux (Chrome's shared-library dependencies), much like wkhtmltopdf
  • Memory leaks if not managed perfectly
  • PDF styling is slightly different from browser rendering
  • Slow for high-volume PDF generation

For a service that generates 1,000 PDFs/day, Puppeteer costs you a dedicated $400/month server just to avoid timeouts.
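If you do stay on Puppeteer, the usual way to soften the cold-start and leak problems listed above is to launch Chrome once and reuse it, closing each page in a finally block. The sketch below shows that pattern under a few assumptions: createPdfRenderer is a hypothetical name, and the launch function is injected (in a real app it would typically be () => require('puppeteer').launch()).

```javascript
// Reuse one browser across requests instead of launching Chrome per PDF.
// `launch` must return a promise for a Puppeteer-style browser object.
function createPdfRenderer(launch) {
  let browserPromise = null;

  // Launch lazily, and only once, even if several requests arrive together.
  const getBrowser = () => {
    if (!browserPromise) {
      browserPromise = launch().catch((err) => {
        browserPromise = null; // allow a retry on the next request
        throw err;
      });
    }
    return browserPromise;
  };

  async function renderPdf(html, pdfOptions = { format: 'A4' }) {
    const browser = await getBrowser();
    const page = await browser.newPage();
    try {
      await page.setContent(html, { waitUntil: 'networkidle0' });
      // No `path` option, so the PDF bytes are returned to the caller.
      return await page.pdf(pdfOptions);
    } finally {
      await page.close(); // always release the tab, even when rendering throws
    }
  }

  // Call on process shutdown so Chrome does not linger.
  async function shutdown() {
    if (!browserPromise) return;
    const browser = await browserPromise;
    browserPromise = null;
    await browser.close();
  }

  return { renderPdf, shutdown };
}
```

This removes the per-request launch cost but not Chrome's baseline memory footprint, so a concurrency cap is still advisable.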

The API Approach

Here's PDF generation with an HTTP API:

// Requires node-fetch v2: v3 is ESM-only and cannot be loaded with require()
const fetch = require('node-fetch');

async function generatePDF(htmlContent) {
  const response = await fetch('https://api.pagebolt.io/api/v1/generate-pdf', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.PAGEBOLT_API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      html: htmlContent,
      format: 'A4',
      margin: '1cm'
    })
  });

  if (!response.ok) throw new Error(`PDF generation failed: ${response.status}`);
  return response.buffer();
}

That's it. Fewer than 20 lines including error handling. No system dependencies. No browser management.

Real-World Example: Invoice Generator

You're building a SaaS app that generates invoices as PDFs. Your customers request invoices on demand.

With wkhtmltopdf:

const express = require('express');
const wkhtmltopdf = require('wkhtmltopdf');
const Queue = require('bull');
const fs = require('fs');
const path = require('path');

const app = express();
// bull stores its jobs in Redis, so a Redis server must be reachable
const pdfQueue = new Queue('pdf-generation');

// Limit concurrent PDFs to 5 (avoid OOM)
pdfQueue.process(5, async (job) => {
  return new Promise((resolve, reject) => {
    const { html, invoiceId } = job.data;

    wkhtmltopdf(html, (err, stream) => {
      if (err) return reject(err);

      const filepath = path.join(__dirname, 'pdfs', `${invoiceId}.pdf`);
      stream.pipe(fs.createWriteStream(filepath))
        .on('finish', () => resolve(filepath))
        .on('error', reject);
    });
  });
});

app.post('/api/invoice/:id/pdf', async (req, res) => {
  const { id } = req.params;
  const invoice = await db.invoices.findOne({ id });

  if (!invoice) return res.status(404).json({ error: 'Not found' });

  const html = renderInvoiceHTML(invoice);

  try {
    const job = await pdfQueue.add({ html, invoiceId: id });
    const pdf = await job.finished();
    res.download(pdf);
  } catch (error) {
    res.status(500).json({ error: 'PDF generation failed' });
  }
});

app.listen(3000);

That's a queue system, error handling, concurrency limits, and file management: more than 40 lines of infrastructure code, plus a Redis server to back the queue.

With PageBolt API:

const express = require('express');
// Requires node-fetch v2: v3 is ESM-only and cannot be loaded with require()
const fetch = require('node-fetch');

const app = express();

app.post('/api/invoice/:id/pdf', async (req, res) => {
  const { id } = req.params;
  const invoice = await db.invoices.findOne({ id });

  if (!invoice) return res.status(404).json({ error: 'Not found' });

  const html = renderInvoiceHTML(invoice);

  try {
    const response = await fetch('https://api.pagebolt.io/api/v1/generate-pdf', {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${process.env.PAGEBOLT_API_KEY}`,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({
        html,
        format: 'A4',
        margin: '1cm'
      })
    });

    if (!response.ok) throw new Error('PDF generation failed');

    res.setHeader('Content-Type', 'application/pdf');
    res.send(await response.buffer());
  } catch (error) {
    res.status(500).json({ error: error.message });
  }
});

app.listen(3000);

That's 35 lines. No queue. No memory management. No file storage. No infrastructure.

Feature Comparison

| Feature | wkhtmltopdf | Puppeteer | API |
|---|---|---|---|
| Setup time | 1 hour | 30 min | 5 min |
| System dependencies | Heavy (fonts, libs) | Heavy (Chrome) | None |
| Infrastructure cost | $200/mo | $400/mo | $0–50/mo |
| PDF quality | Good | Excellent | Excellent |
| Font support | Limited | Good | Excellent |
| CSS support | ~70% | ~95% | 100% |
| Concurrency | Limited (~5) | Limited (~10) | Unlimited |
| Cold start latency | 100ms | 2–3s | 200ms |
| Memory per PDF | 100–200MB | 300–500MB | ~1MB |
| Custom headers/footers | Yes | Yes | Yes |
| Margins & formatting | Manual | Manual | Built-in |

When to Use Each

Use wkhtmltopdf if:

  • You have low volume (< 100 PDFs/month)
  • You want full control and don't mind infrastructure overhead
  • You're running on-premise (no cloud calls allowed)

Use Puppeteer if:

  • You're already using Puppeteer for screenshots/automation
  • You need to interact with the page before generating PDF (click buttons, wait for JS)
  • You need absolute data residency (no external APIs)

Use an API if:

  • You want simplicity and reliability
  • You're generating PDFs at scale (1,000+/month)
  • You want to focus on your app, not infrastructure
  • You need CSS/font rendering that "just works"

Real-World Cost Comparison

At 10,000 PDFs/month:

| Solution | Infrastructure | Ops | Total |
|---|---|---|---|
| wkhtmltopdf | $200 | $1,000 | $1,200 |
| Puppeteer | $400 | $1,500 | $1,900 |
| API | $0 | $0 | $29 |

At this volume, the API approach is roughly 40–65x cheaper.
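The ratio can be checked directly from the monthly totals. This short sketch just reruns the arithmetic; the dollar figures are the article's estimates, and costRatio and costPerPdf are hypothetical helper names.

```javascript
// Monthly totals at 10,000 PDFs/month, taken from the comparison above (USD).
const monthlyTotals = { wkhtmltopdf: 1200, puppeteer: 1900, api: 29 };

// How many times more expensive a self-hosted option is than the API.
function costRatio(selfHosted, api) {
  return selfHosted / api;
}

// Cost per generated PDF, in dollars.
function costPerPdf(monthlyTotal, pdfsPerMonth) {
  return monthlyTotal / pdfsPerMonth;
}

console.log(costRatio(monthlyTotals.wkhtmltopdf, monthlyTotals.api).toFixed(1)); // 41.4
console.log(costRatio(monthlyTotals.puppeteer, monthlyTotals.api).toFixed(1));   // 65.5
console.log(costPerPdf(monthlyTotals.wkhtmltopdf, 10000).toFixed(2));            // 0.12
console.log(costPerPdf(monthlyTotals.puppeteer, 10000).toFixed(2));              // 0.19
```

Swapping in your own infrastructure and ops numbers is the quickest way to see whether the comparison holds for your workload.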

Start Simple

Here's a compact Express endpoint for generating PDFs that you can build on:

const express = require('express');
// Requires node-fetch v2: v3 is ESM-only and cannot be loaded with require()
const fetch = require('node-fetch');
require('dotenv').config();

const app = express();
app.use(express.json());

async function generatePDF(htmlContent, options = {}) {
  const response = await fetch('https://api.pagebolt.io/api/v1/generate-pdf', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.PAGEBOLT_API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      html: htmlContent,
      format: options.format || 'A4',
      margin: options.margin || '1cm',
      landscape: options.landscape || false
    }),
    timeout: 30000 // node-fetch v2 option, in milliseconds
  });

  if (!response.ok) {
    throw new Error(`PDF generation failed: ${response.status} ${response.statusText}`);
  }

  return response.buffer();
}

app.post('/api/generate-pdf', async (req, res) => {
  // Pass landscape through as well; generatePDF already accepts it
  const { html, format, margin, landscape } = req.body;

  if (!html) {
    return res.status(400).json({ error: 'HTML content required' });
  }

  try {
    const pdfBuffer = await generatePDF(html, { format, margin, landscape });
    res.setHeader('Content-Type', 'application/pdf');
    res.send(pdfBuffer);
  } catch (error) {
    console.error('PDF generation error:', error);
    res.status(500).json({ error: error.message });
  }
});

app.listen(3000, () => console.log('PDF server running on port 3000'));

Deploy this. It works. Move on to building your app.


Try PageBolt Free

100 requests/month. Generate PDFs without infrastructure.

Start your free trial — no credit card, no system dependencies, no headaches.
