You need to generate PDFs in your Node.js app. Invoices, reports, certificates, contracts.
You search "Node.js HTML to PDF" and find two solutions:
- wkhtmltopdf — "Lightweight, easy to set up"
- Puppeteer — "Just use Puppeteer, it can do anything"
You pick one. Three days later, you're debugging font rendering, broken CSS, memory leaks, and system dependencies.
There's a simpler way. Let me show you all three approaches so you can decide which one actually works.
The wkhtmltopdf Approach
wkhtmltopdf is a command-line tool that converts HTML to PDF using WebKit. Here's the basic setup:
# Install wkhtmltopdf
apt-get install wkhtmltopdf
# In Node.js
npm install wkhtmltopdf
Then use it:
const wkhtmltopdf = require('wkhtmltopdf');
const fs = require('fs');
wkhtmltopdf('<h1>Hello World</h1>', (err, stream) => {
if (err) return console.error(err);
stream.pipe(fs.createWriteStream('invoice.pdf'));
});
Looks simple. But this is where the pain begins.
The Problems Start
1. System Dependencies
wkhtmltopdf requires a full WebKit rendering engine. On Linux, you need:
apt-get install wkhtmltopdf \
fontconfig \
fontconfig-config \
fonts-liberation \
fonts-noto-core \
libjpeg-turbo-progs \
libopenjp2-7 \
libpng16-16 \
libx11-6 \
libxcb1 \
libxext6 \
libxrender1 \
xfonts-encodings \
xfonts-utils
Your Docker image bloats to 800MB just for PDF generation.
2. Font Rendering Issues
wkhtmltopdf often fails to load web fonts. Your beautiful design in the browser becomes a serif mess in the PDF.
<!-- Your HTML has this -->
<link href="https://fonts.googleapis.com/css2?family=Inter:wght@400;600&display=swap" rel="stylesheet">
<style>body { font-family: 'Inter', sans-serif; }</style>
<!-- wkhtmltopdf renders it as Times New Roman -->
To fix it, you need to install the fonts locally and reference them:
apt-get install fonts-inter fonts-roboto
# Then in HTML
<style>
@font-face {
font-family: 'Inter';
src: url('file:///usr/share/fonts/opentype/inter/Inter-Regular.otf');
}
</style>
Now you're shipping fonts in your Docker image.
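If you go this route, a small helper can generate the `@font-face` rule instead of hand-writing `file://` URLs. This is only a sketch: `localFontFace` and `withLocalFont` are hypothetical names, and the font path is the one from the snippet above.

```javascript
const { pathToFileURL } = require('node:url');

// Build an @font-face rule that points at a font file on the local disk.
// `family` and `fontPath` are whatever you installed in the image.
function localFontFace(family, fontPath, weight = 400) {
  const src = pathToFileURL(fontPath).href; // absolute path -> file:// URL
  return `@font-face { font-family: '${family}'; font-weight: ${weight}; src: url('${src}'); }`;
}

// Prepend the rule to the HTML before handing it to wkhtmltopdf.
function withLocalFont(html, family, fontPath) {
  return `<style>${localFontFace(family, fontPath)}</style>${html}`;
}

const html = withLocalFont(
  '<h1 style="font-family: Inter">Invoice</h1>',
  'Inter',
  '/usr/share/fonts/opentype/inter/Inter-Regular.otf'
);
console.log(html);
```

Depending on your wkhtmltopdf version, you may also need to pass `--enable-local-file-access` before it will read `file://` resources at all.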
3. CSS Support Issues
wkhtmltopdf is built on an old Qt WebKit engine, so modern CSS support is patchy. Flexbox? Works sometimes, usually only with the legacy -webkit- prefixed syntax. CSS Grid? Doesn't work. Media queries? Sort of works.
Your responsive HTML breaks when converted to PDF.
4. Memory & Concurrency
wkhtmltopdf spawns a WebKit process per PDF. Each process takes 100–200MB of RAM.
// If 10 people request PDFs simultaneously
// You need 1–2GB of RAM just for rendering
In production, you hit memory limits. You need to implement a queue, rate limiting, or dedicated PDF servers.
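The simplest form of that queue is an in-process concurrency cap. Here is a minimal sketch with no dependencies; `createLimiter` is a hypothetical helper, and a timer stands in for the real wkhtmltopdf call. At the 100–200MB per process quoted above, a cap of 5 keeps peak rendering memory to roughly 0.5–1GB.

```javascript
// Minimal in-process concurrency limiter: at most `max` tasks run at once,
// the rest wait in a FIFO queue.
function createLimiter(max) {
  let active = 0;
  const waiting = [];

  const next = () => {
    if (active >= max || waiting.length === 0) return;
    active++;
    const { task, resolve, reject } = waiting.shift();
    Promise.resolve()
      .then(task)
      .then(resolve, reject)
      .finally(() => {
        active--;
        next(); // start the next queued task, if any
      });
  };

  return function limit(task) {
    return new Promise((resolve, reject) => {
      waiting.push({ task, resolve, reject });
      next();
    });
  };
}

// Demo: 10 simultaneous "requests", never more than 5 rendering at once.
const limit = createLimiter(5);
let running = 0;
let peak = 0;
const fakeRender = (id) => async () => {
  running++;
  peak = Math.max(peak, running);
  await new Promise((r) => setTimeout(r, 20)); // stand-in for the PDF render
  running--;
  return `invoice-${id}.pdf`;
};

const demo = Promise.all(
  Array.from({ length: 10 }, (_, i) => limit(fakeRender(i)))
).then((files) => {
  console.log(`rendered ${files.length} PDFs, peak concurrency ${peak}`);
  return { files, peak };
});
```

This only protects a single Node process. Once you run several instances, you need a shared queue instead.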
The Puppeteer Approach
Puppeteer can generate PDFs, but it's overkill:
const puppeteer = require('puppeteer');
(async () => {
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.setContent('<h1>Hello World</h1>');
await page.pdf({ path: 'invoice.pdf' });
await browser.close();
})();
This "works," but Puppeteer was designed for automation, not PDF generation. You're paying the full cost of Chrome for a single feature.
Puppeteer PDF problems:
- Chrome takes 300–500MB of RAM per instance
- Cold starts take 2–3 seconds (user waits for PDF)
- Requires its own long list of system libraries (Chrome's shared-library dependencies), much like wkhtmltopdf
- Memory leaks if not managed perfectly
- PDF styling is slightly different from browser rendering
- Slow for high-volume PDF generation
For a service that generates 1,000 PDFs/day, Puppeteer costs you a dedicated $400/month server just to avoid timeouts.
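If you do stay on Puppeteer, the usual mitigation for cold starts and leaked pages is to launch the browser once and always close each page. The sketch below shows the pattern; `createPdfRenderer` is a hypothetical name, and the browser is injected so the example runs with a stand-in object instead of real Chrome.

```javascript
// Reuse one browser across requests instead of launching Chrome per PDF.
// `launchBrowser` is injected so the sketch runs without Puppeteer installed;
// in a real app it would be: () => require('puppeteer').launch()
function createPdfRenderer(launchBrowser) {
  let browserPromise = null; // launched lazily on first use, then shared

  return async function renderPdf(html) {
    if (!browserPromise) {
      browserPromise = launchBrowser().catch((err) => {
        browserPromise = null; // let the next request retry the launch
        throw err;
      });
    }
    const browser = await browserPromise;
    const page = await browser.newPage();
    try {
      await page.setContent(html);
      return await page.pdf({ format: 'A4' });
    } finally {
      await page.close(); // always release the page, even if rendering throws
    }
  };
}

// Stand-in browser exposing only the Puppeteer methods used above.
let launches = 0;
let openPages = 0;
const fakeLaunch = async () => {
  launches++;
  return {
    newPage: async () => {
      openPages++;
      let content = '';
      return {
        setContent: async (html) => { content = html; },
        pdf: async () => Buffer.from(`%PDF-fake ${content}`),
        close: async () => { openPages--; },
      };
    },
  };
};

const renderPdf = createPdfRenderer(fakeLaunch);
const demo = Promise.all([renderPdf('<h1>A</h1>'), renderPdf('<h1>B</h1>')]);
```

This removes the per-request launch cost, but the shared Chrome process still holds its memory for as long as the server runs.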
The API Approach
Here's PDF generation with an HTTP API:
const fetch = require('node-fetch'); // node-fetch v2; v3 is ESM-only and cannot be loaded with require()
async function generatePDF(htmlContent) {
const response = await fetch('https://api.pagebolt.io/api/v1/generate-pdf', {
method: 'POST',
headers: {
'Authorization': `Bearer ${process.env.PAGEBOLT_API_KEY}`,
'Content-Type': 'application/json'
},
body: JSON.stringify({
html: htmlContent,
format: 'A4',
margin: '1cm'
})
});
if (!response.ok) throw new Error(`PDF generation failed: ${response.status}`);
return response.buffer();
}
That's it. 17 lines including error handling. No system dependencies. No browser management.
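The function above relies on `response.buffer()`, which node-fetch v2 provides. On Node 18+ you could drop the dependency and use the built-in `fetch`, which has no `.buffer()` method. A sketch of that variant, assuming the same endpoint and request fields as above; the `fetchImpl` parameter is my addition, there only so the function can be exercised offline:

```javascript
// Variant of generatePDF for Node 18+, using the built-in fetch.
// Built-in Response objects have no .buffer(), so convert the ArrayBuffer.
async function generatePDF(htmlContent, fetchImpl = fetch) {
  const response = await fetchImpl('https://api.pagebolt.io/api/v1/generate-pdf', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.PAGEBOLT_API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ html: htmlContent, format: 'A4', margin: '1cm' }),
    signal: AbortSignal.timeout(30000) // abort the request after 30 seconds
  });
  if (!response.ok) throw new Error(`PDF generation failed: ${response.status}`);
  return Buffer.from(await response.arrayBuffer());
}

// Offline check with a stand-in for the network call.
const fakeFetch = async () => new Response('%PDF-1.7 fake', { status: 200 });

const demo = generatePDF('<h1>Hello World</h1>', fakeFetch).then((pdf) => {
  console.log(Buffer.isBuffer(pdf), pdf.length);
  return pdf;
});
```

In real use, call `generatePDF(html)` with no second argument, then write the returned Buffer to disk or send it in a response.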
Real-World Example: Invoice Generator
You're building a SaaS app that generates invoices as PDFs. Your customers request invoices on demand.
With wkhtmltopdf:
const express = require('express');
const wkhtmltopdf = require('wkhtmltopdf');
const Queue = require('bull');
const fs = require('fs');
const path = require('path');
const app = express();
const pdfQueue = new Queue('pdf-generation');
// Limit concurrent PDFs to 5 (avoid OOM)
pdfQueue.process(5, async (job) => {
return new Promise((resolve, reject) => {
const { html, invoiceId } = job.data;
wkhtmltopdf(html, (err, stream) => {
if (err) return reject(err);
const filepath = path.join(__dirname, 'pdfs', `${invoiceId}.pdf`);
stream.pipe(fs.createWriteStream(filepath))
.on('finish', () => resolve(filepath))
.on('error', reject);
});
});
});
app.post('/api/invoice/:id/pdf', async (req, res) => {
const { id } = req.params;
const invoice = await db.invoices.findOne({ id });
if (!invoice) return res.status(404).json({ error: 'Not found' });
const html = renderInvoiceHTML(invoice);
try {
const job = await pdfQueue.add({ html, invoiceId: id });
const pdf = await job.finished();
res.download(pdf);
} catch (error) {
res.status(500).json({ error: 'PDF generation failed' });
}
});
app.listen(3000);
That's a queue system (Bull, which also needs a Redis server), error handling, concurrency limits, and file management: about 35 lines, most of it infrastructure code.
With PageBolt API:
const express = require('express');
const fetch = require('node-fetch'); // node-fetch v2 (CommonJS)
const app = express();
app.post('/api/invoice/:id/pdf', async (req, res) => {
const { id } = req.params;
const invoice = await db.invoices.findOne({ id });
if (!invoice) return res.status(404).json({ error: 'Not found' });
const html = renderInvoiceHTML(invoice);
try {
const response = await fetch('https://api.pagebolt.io/api/v1/generate-pdf', {
method: 'POST',
headers: {
'Authorization': `Bearer ${process.env.PAGEBOLT_API_KEY}`,
'Content-Type': 'application/json'
},
body: JSON.stringify({
html,
format: 'A4',
margin: '1cm'
})
});
if (!response.ok) throw new Error('PDF generation failed');
res.setHeader('Content-Type', 'application/pdf');
res.send(await response.buffer());
} catch (error) {
res.status(500).json({ error: error.message });
}
});
app.listen(3000);
That's under 30 lines. No queue. No memory management. No file storage. No infrastructure.
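Both routes call `renderInvoiceHTML`, which the examples leave undefined. Here is a minimal sketch of what it might look like. The invoice shape (`id`, `customer`, `items`) is hypothetical, and the point is mainly that user-supplied text should be escaped before it goes into the HTML.

```javascript
// Escape the characters that are unsafe in HTML text and attributes.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Hypothetical invoice shape:
// { id, customer, items: [{ description, qty, unitPrice }] }
function renderInvoiceHTML(invoice) {
  const rows = invoice.items
    .map((item) => {
      const lineTotal = (item.qty * item.unitPrice).toFixed(2);
      return `<tr><td>${escapeHtml(item.description)}</td><td>${item.qty}</td><td>${lineTotal}</td></tr>`;
    })
    .join('');
  const total = invoice.items
    .reduce((sum, item) => sum + item.qty * item.unitPrice, 0)
    .toFixed(2);
  return `<!DOCTYPE html>
<html>
<head><meta charset="utf-8"><title>Invoice ${escapeHtml(invoice.id)}</title></head>
<body>
<h1>Invoice ${escapeHtml(invoice.id)}</h1>
<p>Billed to: ${escapeHtml(invoice.customer)}</p>
<table><tr><th>Item</th><th>Qty</th><th>Amount</th></tr>${rows}</table>
<p>Total: ${total}</p>
</body>
</html>`;
}

const html = renderInvoiceHTML({
  id: 'INV-1001',
  customer: 'Acme <Widgets> & Co',
  items: [
    { description: 'Consulting', qty: 2, unitPrice: 150 },
    { description: 'Hosting', qty: 1, unitPrice: 29.5 },
  ],
});
console.log(html);
```

A real billing system would likely store amounts as integer cents rather than floating-point numbers. The floats here just keep the sketch short.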
Feature Comparison
| Feature | wkhtmltopdf | Puppeteer | API |
|---|---|---|---|
| Setup time | 1 hour | 30 min | 5 min |
| System dependencies | Heavy (fonts, libs) | Heavy (Chrome) | None |
| Infrastructure cost | $200/mo | $400/mo | $0–50/mo |
| PDF quality | Good | Excellent | Excellent |
| Font support | Limited | Good | Excellent |
| CSS support | ~70% | ~95% | 100% |
| Concurrency | Limited (~5) | Limited (~10) | Unlimited |
| Cold start latency | 100ms | 2–3s | 200ms |
| Memory per PDF | 100–200MB | 300–500MB | ~1MB |
| Custom headers/footers | Yes | Yes | Yes |
| Margins & formatting | Manual | Manual | Built-in |
When to Use Each
Use wkhtmltopdf if:
- You have low volume (< 100 PDFs/month)
- You want full control and don't mind infrastructure overhead
- You're running on-premise (no cloud calls allowed)
Use Puppeteer if:
- You're already using Puppeteer for screenshots/automation
- You need to interact with the page before generating PDF (click buttons, wait for JS)
- You need absolute data residency (no external APIs)
Use an API if:
- You want simplicity and reliability
- You're generating PDFs at scale (1,000+/month)
- You want to focus on your app, not infrastructure
- You need CSS/font rendering that "just works"
Real-World Cost Comparison
At 10,000 PDFs/month:
| Solution | Infrastructure | Ops | API fee | Total |
|---|---|---|---|---|
| wkhtmltopdf | $200 | $1,000 | $0 | $1,200 |
| Puppeteer | $400 | $1,500 | $0 | $1,900 |
| API | $0 | $0 | $29 | $29 |
By these estimates, the API approach is roughly 40–65x cheaper.
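The multiplier follows directly from the totals in the table above. A quick check of the arithmetic, using only the article's own estimates:

```javascript
// Monthly totals from the table above (USD, at 10,000 PDFs/month).
const monthlyCost = {
  wkhtmltopdf: 200 + 1000, // infrastructure + ops
  puppeteer: 400 + 1500,   // infrastructure + ops
  api: 29,
};

// How many times more a self-hosted option costs than the API.
const ratio = (selfHosted) => selfHosted / monthlyCost.api;

console.log(ratio(monthlyCost.wkhtmltopdf).toFixed(1)); // 41.4
console.log(ratio(monthlyCost.puppeteer).toFixed(1));   // 65.5
console.log((monthlyCost.api / 10000).toFixed(4));      // 0.0029 (dollars per PDF)
```

Most of the gap comes from the ops estimates, so plug in your own numbers before relying on the ratio.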
Start Simple
Here's a compact Express endpoint for generating PDFs (add authentication and input limits before exposing it publicly):
const express = require('express');
const fetch = require('node-fetch'); // node-fetch v2 (CommonJS; also provides the `timeout` option used below)
require('dotenv').config();
const app = express();
app.use(express.json());
async function generatePDF(htmlContent, options = {}) {
const response = await fetch('https://api.pagebolt.io/api/v1/generate-pdf', {
method: 'POST',
headers: {
'Authorization': `Bearer ${process.env.PAGEBOLT_API_KEY}`,
'Content-Type': 'application/json'
},
body: JSON.stringify({
html: htmlContent,
format: options.format || 'A4',
margin: options.margin || '1cm',
landscape: options.landscape || false
}),
timeout: 30000
});
if (!response.ok) {
throw new Error(`PDF generation failed: ${response.status} ${response.statusText}`);
}
return response.buffer();
}
app.post('/api/generate-pdf', async (req, res) => {
const { html, format, margin, landscape } = req.body;
if (!html) {
return res.status(400).json({ error: 'HTML content required' });
}
try {
// Forward landscape too, since generatePDF already supports it
const pdfBuffer = await generatePDF(html, { format, margin, landscape });
res.setHeader('Content-Type', 'application/pdf');
res.send(pdfBuffer);
} catch (error) {
console.error('PDF generation error:', error);
res.status(500).json({ error: error.message });
}
});
app.listen(3000, () => console.log('PDF server running on port 3000'));
Deploy this. It works. Move on to building your app.
Try PageBolt Free
100 requests/month. Generate PDFs without infrastructure.
Start your free trial — no credit card, no system dependencies, no headaches.