DEV Community

Digital Trubador

HTML to PDF - The Complete Guide for 2026


Converting HTML to PDF is one of the most common requirements in modern web development. Whether you're building invoices, reports, receipts, or certificates, you need a reliable way to transform HTML into print-ready PDFs.

HTML-to-PDF conversion is the process of transforming HTML markup and CSS styles into a portable, print-ready PDF document, preserving layout, fonts, and images for offline viewing and distribution.

This guide covers HTML-to-PDF conversion in 2026: the main approaches, common pitfalls, performance considerations, and how to choose the right solution for your needs.

Why HTML to PDF?

HTML is the universal language of the web. It offers:

  • Rich formatting: CSS for styling, layouts, and responsive design
  • Dynamic content: Template engines for variable data
  • Familiar tooling: Every developer knows HTML/CSS
  • Reusability: Same HTML can power web pages and PDFs

The challenge? Browsers render HTML to screens, not paper. Converting HTML to PDF requires specialized tools that understand print layouts, page breaks, and PDF specifications.

The Three Approaches

There are three main approaches to HTML-to-PDF conversion, each with tradeoffs:

1. Browser-Based (Headless Chrome/Chromium)

How it works: Launch a headless browser, load HTML, trigger print-to-PDF.

Tools: Puppeteer, Playwright, Selenium

Pros:

  • ✅ Excellent CSS support (same as Chrome)
  • ✅ JavaScript execution
  • ✅ Renders exactly like browser
  • ✅ Handles modern web features (flexbox, grid, animations)

Cons:

  • ❌ Slow (1-3 seconds per PDF)
  • ❌ Memory-intensive (100-300MB per instance)
  • ❌ Requires infrastructure (Docker, K8s)
  • ❌ Complex page break handling

Best for: Complex web layouts, modern CSS, JavaScript-heavy content

2. Library-Based (wkhtmltopdf, PrinceXML)

How it works: Specialized rendering engines that convert HTML to PDF without a full browser.

Tools: wkhtmltopdf, PrinceXML, WeasyPrint, pdfkit

Pros:

  • ✅ Faster than browsers (500ms-2s)
  • ✅ Lower memory usage
  • ✅ Better page break control (CSS Paged Media)
  • ✅ Designed for print

Cons:

  • ❌ Limited CSS support (especially modern features)
  • ❌ No JavaScript execution (mostly)
  • ❌ Inconsistent rendering vs browsers
  • ❌ Licensing costs (PrinceXML: $3,800+)

Best for: Print-focused documents, advanced page layout, when CSS Paged Media features are needed

3. API-Based (Managed Services)

How it works: Cloud-hosted services handle all infrastructure, scaling, and rendering.

Tools: LightningPDF, DocRaptor, PDFShift, CraftMyPDF

Pros:

  • ✅ Zero infrastructure management
  • ✅ Fast (sub-100ms with native engines)
  • ✅ Template marketplaces
  • ✅ Batch processing
  • ✅ Automatic scaling

Cons:

  • ❌ Recurring costs (vs one-time library purchase)
  • ❌ External dependency
  • ❌ Data leaves your infrastructure (some providers state they do not retain data after the response; verify this for the one you choose)

Best for: Production applications, high volumes, teams wanting to focus on product not infrastructure

Comparison Table

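| Approach     | Speed            | Cost      | Maintenance | CSS Support     | Scaling   |
|--------------|------------------|-----------|-------------|-----------------|-----------|
| Puppeteer    | ⭐⭐ (1-3s)       | Free*     | ⭐⭐ High    | ⭐⭐⭐⭐⭐ Full    | Manual    |
| wkhtmltopdf  | ⭐⭐⭐ (500ms)    | Free      | ⭐⭐⭐ Low   | ⭐⭐ Limited     | Manual    |
| PrinceXML    | ⭐⭐⭐ (1-2s)     | $3,800+   | ⭐⭐⭐ Low   | ⭐⭐⭐⭐ Excellent | Manual    |
| LightningPDF | ⭐⭐⭐⭐⭐ (<100ms) | $0.01/doc | ⭐⭐⭐⭐⭐ None | ⭐⭐⭐⭐⭐ Full    | Automatic |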

*Free library, but infrastructure costs $200-500/month for production

Common Pitfalls and Solutions

1. Page Breaks

Problem: Content splits awkwardly across pages

Solution: Use CSS page break properties

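/* Avoid page breaks inside elements */
.invoice-item {
    page-break-inside: avoid; /* legacy name, for older engines */
    break-inside: avoid;      /* current standard name */
}

/* Force page break before element */
.new-section {
    page-break-before: always; /* legacy name */
    break-before: page;        /* current standard name */
}

/* Control orphan/widow lines */
p {
    orphans: 3;
    widows: 3;
}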

2. Missing Fonts

Problem: PDFs show default fonts instead of custom fonts

Solution: Embed fonts with @font-face or use web-safe fonts

@font-face {
    font-family: 'CustomFont';
    src: url('https://yoursite.com/fonts/custom.woff2') format('woff2');
}

body {
    font-family: 'CustomFont', 'Arial', sans-serif;
}

3. Images Not Loading

Problem: Images appear as broken in PDF

Solution: Use absolute URLs and ensure images load before PDF generation

<!-- Bad: Relative path -->
<img src="/images/logo.png">

<!-- Good: Absolute URL -->
<img src="https://yoursite.com/images/logo.png">

<!-- Better: Base64 embedded -->
<img src="data:image/png;base64,iVBORw0KG...">
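
The Base64 option can be produced at render time rather than pasted in by hand. A minimal sketch, assuming Node.js; `toDataUri` and `inlineImageTag` are hypothetical helper names, not part of any library mentioned in this guide:

```javascript
const fs = require('node:fs');

// Encode raw image bytes as a data: URI usable in <img src="...">.
function toDataUri(bytes, mimeType) {
    return `data:${mimeType};base64,${Buffer.from(bytes).toString('base64')}`;
}

// Read an image from disk and return an <img> tag with the bytes embedded.
// altText is inserted as-is, so pass a trusted string.
function inlineImageTag(filePath, mimeType, altText) {
    const src = toDataUri(fs.readFileSync(filePath), mimeType);
    return `<img src="${src}" alt="${altText}">`;
}
```

Base64 makes the payload roughly one third larger, so this tends to suit logos and small icons better than large photos.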

4. CSS Not Applied

Problem: Styles don't render in PDF

Solution: Use inline styles or embedded <style> tags

<!-- External stylesheets may not load -->
<link rel="stylesheet" href="/styles.css">

<!-- Embed styles directly -->
<style>
    body { font-family: Arial; }
    .header { background: #4F46E5; }
</style>
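
If the styles live in a separate file during development, they can be embedded just before conversion. This is a sketch under the assumption that the stylesheet has already been read into a string; `embedStyles` is a hypothetical helper name:

```javascript
// Insert a stylesheet string into an HTML document as an embedded <style> tag.
function embedStyles(html, css) {
    const styleTag = `<style>\n${css}\n</style>`;
    // Prefer the end of <head>; fall back to prepending for HTML fragments.
    if (/<\/head>/i.test(html)) {
        // A replacer function avoids "$" sequences in the CSS being
        // interpreted as replacement patterns.
        return html.replace(/<\/head>/i, () => `${styleTag}</head>`);
    }
    return styleTag + html;
}
```

The result is a single self-contained HTML string, which removes the dependency on the converter being able to fetch `/styles.css`.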

5. Slow Generation

Problem: Each PDF takes 3-5 seconds to generate

Solution: Use native engines for simple documents

// Slow: Always use Chromium (1-3s)
const chromiumPdf = await page.pdf();

// Fast: Use native engine for invoices (<100ms)
const nativePdf = await lightningpdf.generate({
    template: 'invoice',
    engine: 'native' // Sub-100ms for simple layouts
});

Code Examples

Puppeteer (Node.js)

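const puppeteer = require('puppeteer');

async function generatePDF(html) {
    // Pass args: ['--no-sandbox'] only if your container cannot run the
    // Chromium sandbox; it weakens isolation when rendering untrusted HTML.
    const browser = await puppeteer.launch({ headless: true });

    try {
        const page = await browser.newPage();
        // Wait until network activity settles so fonts and images are loaded
        await page.setContent(html, { waitUntil: 'networkidle0' });

        return await page.pdf({
            format: 'A4',
            printBackground: true, // include background colors and images
            margin: { top: '20px', right: '20px', bottom: '20px', left: '20px' }
        });
    } finally {
        await browser.close(); // release Chromium even if rendering fails
    }
}

// Usage (wrapped in an async function: CommonJS has no top-level await)
(async () => {
    const pdf = await generatePDF('<h1>Hello PDF</h1>');
    console.log(`Generated ${pdf.length} bytes`);
})();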

Performance: 2-3 seconds, 200MB memory

wkhtmltopdf (Shell)

# Install
apt-get install wkhtmltopdf

# Generate PDF
wkhtmltopdf \
    --page-size A4 \
    --margin-top 20mm \
    input.html output.pdf

Performance: 500ms-1s, 50MB memory

LightningPDF (Any Language)

# cURL
curl https://lightningpdf.dev/api/v1/pdf/generate \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"html": "<h1>Hello PDF</h1>"}' \
  -o output.pdf

// JavaScript
const response = await fetch('https://lightningpdf.dev/api/v1/pdf/generate', {
    method: 'POST',
    headers: {
        'Authorization': 'Bearer YOUR_API_KEY',
        'Content-Type': 'application/json'
    },
    body: JSON.stringify({ html: '<h1>Hello PDF</h1>' })
});
if (!response.ok) throw new Error(`PDF request failed: ${response.status}`);
// arrayBuffer() is part of the standard fetch API; buffer() is node-fetch only
const pdf = Buffer.from(await response.arrayBuffer());

# Python
import requests

response = requests.post(
    'https://lightningpdf.dev/api/v1/pdf/generate',
    headers={'Authorization': 'Bearer YOUR_API_KEY'},
    json={'html': '<h1>Hello PDF</h1>'}
)
pdf = response.content

Performance: <100ms for simple HTML (native engine), 10 lines of code
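
Whichever tool produces the bytes, it can be worth checking that the result really is a PDF before storing or sending it, since a failed request may hand back an error body instead. A small sketch relying only on the fact that PDF files begin with the signature `%PDF-`; `looksLikePdf` is a hypothetical helper name:

```javascript
// A PDF file starts with the ASCII signature "%PDF-" followed by a version.
function looksLikePdf(bytes) {
    const header = Buffer.from(bytes).subarray(0, 5).toString('latin1');
    return header === '%PDF-';
}
```

This is only a sanity check on the first five bytes. It does not prove the document is well-formed or that it contains the expected content.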

Advanced Techniques

Headers and Footers

<style>
    @page {
        margin: 2cm;
        @top-center {
            content: "Company Report";
        }
        @bottom-right {
            content: "Page " counter(page) " of " counter(pages);
        }
    }
</style>
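
The `@page` margin boxes above come from CSS Paged Media and are the native mechanism in print-oriented engines. Puppeteer additionally offers its own route through the `displayHeaderFooter`, `headerTemplate`, and `footerTemplate` options of `page.pdf()`, where Chromium fills in elements carrying the classes `pageNumber` and `totalPages`. A sketch of building those options; `buildPdfOptions` is a hypothetical helper, and the font size and padding are assumptions chosen for readability:

```javascript
// Build page.pdf() options with a running header and a "Page X of Y" footer.
// Chromium injects values into elements with class "pageNumber"/"totalPages".
function buildPdfOptions(headerText) {
    const baseStyle = 'font-size:10px; width:100%; padding:0 1cm;';
    return {
        format: 'A4',
        displayHeaderFooter: true,
        headerTemplate:
            `<div style="${baseStyle} text-align:center;">${headerText}</div>`,
        footerTemplate:
            `<div style="${baseStyle} text-align:right;">` +
            'Page <span class="pageNumber"></span> of <span class="totalPages"></span>' +
            '</div>',
        // Header and footer are drawn inside the margins, so leave room for them.
        margin: { top: '2cm', bottom: '2cm', left: '2cm', right: '2cm' }
    };
}

// Usage with an existing Puppeteer page (not executed here):
//   const pdf = await page.pdf(buildPdfOptions('Company Report'));
```

The templates are rendered separately from the page, so styles from the document itself do not apply to them; inline styles as shown are the usual workaround.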

Conditional Page Breaks

/* Keep tables together */
table {
    page-break-inside: avoid;
}

/* Start chapters on new page */
.chapter {
    page-break-before: always;
}

/* Prevent lonely headings */
h2, h3 {
    page-break-after: avoid;
}

Responsive Print Layouts

@media print {
    /* Hide navigation in PDFs */
    nav { display: none; }

    /* Adjust colors for print */
    body { color: #000; background: #fff; }

    /* Show URLs for links */
    /* Show URLs for external links (skips in-page anchors like "#top") */
    a[href^="http"]::after { content: " (" attr(href) ")"; }
}

Performance Optimization

1. Minimize HTML Size

<!-- Bad: Lots of unused CSS -->
<link rel="stylesheet" href="bootstrap.min.css">

<!-- Good: Only styles you need -->
<style>
    .invoice { font-family: Arial; }
</style>

2. Preload Images

// Wait for images to load before generating PDF
await page.evaluate(() => {
    return Promise.all(
        Array.from(document.images)
            .map(img => img.complete ? Promise.resolve() :
                new Promise(resolve => {
                    img.onload = resolve;
                    img.onerror = resolve; // don't hang forever on a broken image
                }))
    );
});

3. Reuse Browser Instances

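// Bad: Launch a new browser for each PDF (slow)
for (const html of documents) {
    const browser = await puppeteer.launch();
    await generatePDF(browser, html);
    await browser.close(); // launch overhead paid on every PDF
}

// Good: Launch once and reuse (fast)
// generatePDF(browser, html) here is a variant that opens a page on the
// shared browser and closes that page when it is done.
const browser = await puppeteer.launch();
try {
    for (const html of documents) {
        await generatePDF(browser, html);
    }
} finally {
    await browser.close(); // always release the browser, even on errors
}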
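
A shared browser also makes it possible to render several documents at once, as long as the number of open pages is capped so memory stays bounded. The following is a sketch of a generic concurrency limiter; `mapWithConcurrency` is a hypothetical helper, and the limit of 4 in the usage comment is an arbitrary assumption:

```javascript
// Run async jobs with at most `limit` in flight at once, preserving result order.
async function mapWithConcurrency(items, limit, worker) {
    const results = new Array(items.length);
    let next = 0;

    // Each runner claims the next unprocessed index until the list is exhausted.
    async function runner() {
        while (next < items.length) {
            const index = next++;
            results[index] = await worker(items[index], index);
        }
    }

    const runnerCount = Math.max(1, Math.min(limit, items.length));
    await Promise.all(Array.from({ length: runnerCount }, () => runner()));
    return results;
}

// Usage with a shared Puppeteer browser (not executed here):
//   const pdfs = await mapWithConcurrency(documents, 4,
//       html => generatePDF(browser, html));
```

If any job rejects, the returned promise rejects with that error while jobs already in flight continue to run, so callers that need cleanup should still close the browser in a `finally` block.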

4. Use Native Engines for Simple Documents

// `chromium` and `native` stand in for your two rendering paths

// Slow: Chromium for everything (1-3s each)
const slowInvoicePDF = await chromium.generate(invoiceHTML);
const slowReceiptPDF = await chromium.generate(receiptHTML);

// Fast: Native engine for simple docs (<100ms each)
const invoicePDF = await native.generate(invoiceHTML);   // 80ms
const receiptPDF = await native.generate(receiptHTML);   // 75ms

Choosing the Right Solution

Use Puppeteer/Playwright if you:

  • Need exact browser rendering
  • Already have infrastructure
  • Generate <100 PDFs/day
  • Have JavaScript-heavy content
  • Need full control over rendering

Use wkhtmltopdf if you:

  • Need simple, fast conversion
  • Don't require modern CSS
  • Have static content
  • Want free, open-source solution
  • Can tolerate rendering differences

Use PrinceXML if you:

  • Need advanced print features (running headers, footnotes)
  • Generate professional publications (books, magazines)
  • Require PDF/A compliance
  • Have budget for commercial license

Use LightningPDF if you:

  • Generate invoices, receipts, reports at scale
  • Want sub-100ms generation
  • Need template marketplace
  • Want batch processing (1000s of PDFs per call)
  • Prefer managed service over self-hosting
  • Need predictable $0.01/doc pricing

Real-World Use Case: E-commerce Invoices

Requirement: Generate 10,000 invoices/month

Option 1: Puppeteer (Self-Hosted)

Cost: $300/month infrastructure + 40 hours setup
Time: 2s × 10,000 ≈ 5.6 hours/month
Code: 500+ lines (API, queue, storage, scaling)
Maintenance: 5 hours/month

Option 2: LightningPDF

Cost: $29/month (Pro plan)
Time: 0.08s × 10,000 = 13 minutes/month
Code: 10 lines
Maintenance: 0 hours (managed service)

Winner: LightningPDF saves about $270/month and 5 hours of monthly maintenance, avoids roughly 40 hours of one-time setup, and generates each PDF about 25x faster.
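
The figures in the two options can be reproduced with a few lines of arithmetic. This sketch simply plugs in the numbers quoted above; the variable and function names are made up for illustration:

```javascript
// Inputs copied from the two options above.
const invoicesPerMonth = 10000;
const selfHosted = { secondsPerPdf: 2, monthlyCost: 300 };
const managed = { secondsPerPdf: 0.08, monthlyCost: 29 };

// Total render time and effective per-document cost for one option.
function monthlyTotals(option) {
    const seconds = option.secondsPerPdf * invoicesPerMonth;
    return {
        renderHours: seconds / 3600,
        renderMinutes: seconds / 60,
        costPerDoc: option.monthlyCost / invoicesPerMonth
    };
}

const selfHostedTotals = monthlyTotals(selfHosted);
const managedTotals = monthlyTotals(managed);

console.log(selfHostedTotals.renderHours.toFixed(2));            // 5.56 (hours)
console.log(managedTotals.renderMinutes.toFixed(1));             // 13.3 (minutes)
console.log(selfHosted.monthlyCost - managed.monthlyCost);       // 271 (dollars)
console.log(Math.round(selfHosted.secondsPerPdf / managed.secondsPerPdf)); // 25
console.log(managedTotals.costPerDoc.toFixed(4));                // 0.0029 (dollars)
```

Note that the 40 hours of setup is a one-time cost, so it belongs next to the monthly figures rather than inside them.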

Security Considerations

Input Sanitization

// Sanitize user-provided HTML
const sanitizeHTML = require('sanitize-html');

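const cleanHTML = sanitizeHTML(userHTML, {
    // Allow row and cell tags too, otherwise tables lose their structure
    allowedTags: ['h1', 'h2', 'p', 'strong', 'em',
                  'table', 'thead', 'tbody', 'tr', 'th', 'td'],
    allowedAttributes: { 'td': ['colspan'], 'th': ['colspan'] }
});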

const pdf = await generatePDF(cleanHTML);
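
Sanitizing applies when users supply whole HTML fragments. When they only supply values (a customer name, an invoice number) that are dropped into your own template, escaping each value is usually enough. A minimal sketch; `escapeHtml` and `renderInvoice` are hypothetical helpers and the template fields are assumptions:

```javascript
// Escape the five characters that are significant in HTML text and attributes.
function escapeHtml(value) {
    return String(value)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#39;');
}

// Hypothetical invoice template: every interpolated field is escaped.
function renderInvoice({ customer, number }) {
    return `<h1>Invoice #${escapeHtml(number)}</h1>` +
           `<p>Billed to: ${escapeHtml(customer)}</p>`;
}
```

With this in place, a customer name such as `<script>x</script>` is printed literally in the PDF instead of being interpreted as markup by the rendering engine.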

Resource Limits

// Prevent malicious HTML from consuming resources
await page.setContent(html, {
    timeout: 5000, // 5s max
    waitUntil: 'domcontentloaded' // Don't wait for everything
});
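
The timeout above limits how long content loading may take. To cap the whole job, including the PDF rendering step, one option is to race the work against a timer. A sketch, assuming a generic promise-based job; `withTimeout` is a hypothetical helper, not a Puppeteer API:

```javascript
// Reject if `promise` has not settled within `ms` milliseconds.
function withTimeout(promise, ms, label = 'operation') {
    let timer;
    const timeout = new Promise((_, reject) => {
        timer = setTimeout(
            () => reject(new Error(`${label} timed out after ${ms}ms`)), ms);
    });
    // Whichever settles first wins; the timer is always cleared afterwards.
    return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Usage (not executed here):
//   const pdf = await withTimeout(generatePDF(html), 10000, 'PDF render');
```

Racing does not cancel the underlying work, so on timeout the caller should still close the page or browser to reclaim its resources.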

Sandboxing

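// Keep Chromium's sandbox enabled: do NOT pass '--no-sandbox' or
// '--disable-setuid-sandbox' when rendering untrusted HTML
const browser = await puppeteer.launch({
    args: [
        '--disable-dev-shm-usage' // use /tmp instead of a small /dev/shm in containers
    ]
});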

Testing Your PDFs

Visual Regression Testing

const { toMatchImageSnapshot } = require('jest-image-snapshot');
expect.extend({ toMatchImageSnapshot });

test('invoice renders correctly', async () => {
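    const pdf = await generatePDF(invoiceHTML);
    // convertPDFToImage is a placeholder: rasterize the first page to a PNG
    // buffer with whatever tool your project uses
    const image = await convertPDFToImage(pdf);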
    expect(image).toMatchImageSnapshot();
});

Content Validation

const pdfParse = require('pdf-parse');

test('invoice contains correct data', async () => {
    const pdf = await generateInvoice({ total: 1000 });
    const data = await pdfParse(pdf);
    expect(data.text).toContain('$1,000.00');
    expect(data.text).toContain('Invoice #');
});
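
The test above expects a total of 1000 to appear as `$1,000.00`, which implies the invoice template formats currency before rendering. One way that might be done is with the built-in `Intl.NumberFormat`; `formatTotal` is a hypothetical helper name:

```javascript
// Format a number as US dollars, e.g. for the total line on an invoice.
const usd = new Intl.NumberFormat('en-US', { style: 'currency', currency: 'USD' });

function formatTotal(amount) {
    return usd.format(amount);
}

console.log(formatTotal(1000));   // $1,000.00
console.log(formatTotal(1234.5)); // $1,234.50
```

Formatting in one helper keeps the template and the test in agreement about thousands separators and decimal places.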

Conclusion

HTML-to-PDF conversion in 2026 offers multiple approaches:

  • Browser-based (Puppeteer): Best for complex layouts, JavaScript-heavy content
  • Library-based (wkhtmltopdf): Good for simple, static content
  • API-based (LightningPDF): Best for production apps wanting speed, templates, and zero maintenance

For most production applications, an API like LightningPDF is the right choice:

  • 10-25x faster than DIY
  • Template marketplace (save 10+ hours per template)
  • Batch processing
  • Zero infrastructure management
  • Predictable $0.01/doc pricing

Ready to start? Try LightningPDF free — 50 PDFs/month, full template marketplace, no credit card required.

Frequently Asked Questions

What is the best way to convert HTML to PDF?

For most production applications, a managed API like LightningPDF is the best way to convert HTML to PDF. It offers sub-100ms generation for simple documents, full CSS support including flexbox and grid, zero infrastructure management, and predictable per-document pricing starting at $0.003 each.

Is Puppeteer good for HTML to PDF conversion?

Puppeteer provides excellent CSS support since it uses real Chrome rendering, but it is slow at 1 to 3 seconds per PDF, requires significant infrastructure with 100 to 300 megabytes of memory per instance, and needs ongoing maintenance. It works well for low-volume use cases under 100 PDFs per day but becomes expensive to scale.

How do I fix page breaks in HTML to PDF?

Use CSS properties such as break-inside: avoid on table rows and cards and break-after: avoid on headings, and set orphans and widows to 3 on paragraphs. LightningPDF also offers an automatic page-break-mode that handles these issues without manual CSS, including repeating table headers across pages.

How fast can HTML be converted to PDF?

Speed varies dramatically by approach. LightningPDF's native engine converts HTML to PDF in under 100 milliseconds for structured documents like invoices. Browser-based tools like Puppeteer take 1 to 3 seconds, wkhtmltopdf takes 500 milliseconds to 1 second, and PrinceXML takes 1 to 2 seconds per document.

Do I need a headless browser to convert HTML to PDF?

No, you do not need a headless browser. LightningPDF's native Go engine converts HTML to PDF without any browser, achieving sub-100ms speeds. For complex layouts requiring full browser rendering, the API handles Chromium internally so you never need to manage browser instances yourself.
