Chris

Originally published at html2pdf.app

How to convert HTML to PDF using Puppeteer

Whether you're looking to generate invoices, create reports, or preserve web content for offline use, Puppeteer is a powerful tool that can automate the process seamlessly. In this guide, we will explore how to convert HTML to PDF using Puppeteer, an open-source Node.js library developed by Google.

Setting Up Puppeteer

To get started, you'll need to have Node.js installed on your machine. Open your terminal or command prompt and create a new directory for your project. Navigate into the project directory and initialize a new Node.js project by running the following command:

npm init -y

Next, install Puppeteer as a dependency by executing the following command:

npm install puppeteer

Puppeteer is now installed in your project, allowing you to programmatically control a headless Chrome or Chromium browser.

Writing the conversion script

Create a new JavaScript file, such as convert-html-to-pdf.js, in your project directory. Open the file in your preferred text editor and begin by importing the Puppeteer library:

const puppeteer = require('puppeteer');

Initializing Puppeteer and Converting HTML to PDF

Inside the convert-html-to-pdf.js file, add the following code to initialize Puppeteer and convert the HTML to PDF:

(async () => {
  let browser;
  try {
    browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.goto('file:///path/to/your/html/file.html', { waitUntil: 'networkidle0' });

    // Render with the regular screen CSS instead of the print styles
    await page.emulateMediaType('screen');

    await page.pdf({ path: 'output.pdf', format: 'A4' });

    console.log('Conversion complete. PDF file generated successfully.');
  } catch (error) {
    console.error('An error occurred:', error);
  } finally {
    // Close the browser even if the conversion failed, so no process is left running
    if (browser) {
      await browser.close();
    }
  }
})();

The example above converts an existing HTML file to PDF, but you may need to convert HTML markup that is passed to Puppeteer as a string instead. In that case, use page.setContent() in place of page.goto(). Check the example below:

const html = '<html...';
await page.setContent(html, { waitUntil: 'networkidle0' });
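To see how the two lines above fit into a complete step, here is a minimal sketch. The helper name renderHtmlToPdf is an assumption made for illustration, and page is expected to be a Puppeteer page created with browser.newPage(). page.pdf() resolves with the PDF data, and when no path option is given nothing is written to disk, which is handy if you want to send the PDF in an HTTP response.

```javascript
// Hypothetical helper (not part of Puppeteer): renders an HTML string
// to PDF data using an already opened Puppeteer page.
async function renderHtmlToPdf(page, html) {
  // Load the markup directly instead of navigating to a file or URL
  await page.setContent(html, { waitUntil: 'networkidle0' });

  // Use the regular screen styles rather than the print styles
  await page.emulateMediaType('screen');

  // No `path` option here: the PDF data is returned to the caller
  return page.pdf({ format: 'A4' });
}

// Usage, inside the async function of the conversion script:
//   const pdfData = await renderHtmlToPdf(page, '<h1>Hello</h1>');
```
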

Converting URL to PDF

You can easily adapt the code to convert a URL instead of an HTML string or a local file by passing the URL to the page.goto() method:

await page.goto('https://example.com', { waitUntil: 'networkidle0' });

It is important to understand the waitUntil parameter and its values in order to choose an appropriate one. It determines when the navigation is considered finished:

  • load: when the load event is fired.
  • domcontentloaded: when the DOMContentLoaded event is fired.
  • networkidle0: when there are no more than 0 network connections for at least 500 ms.
  • networkidle2: when there are no more than 2 network connections for at least 500 ms.

For more details on the options that can be passed to the page.goto() method, check the Puppeteer documentation.
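Pages that keep polling or streaming in the background may never reach zero open connections, so networkidle0 can run into the navigation timeout. One possible way to handle this, sketched below, is to retry with the more lenient networkidle2. The helper name gotoWithFallback is hypothetical, and the check on error.name assumes that Puppeteer reports timeouts as errors named 'TimeoutError'.

```javascript
// Hypothetical helper: waits for a fully idle network first, then retries
// with the more lenient 'networkidle2' if the first attempt times out.
async function gotoWithFallback(page, url, timeout = 30000) {
  try {
    return await page.goto(url, { waitUntil: 'networkidle0', timeout });
  } catch (error) {
    // Assumption: timeout errors carry the name 'TimeoutError'
    if (error.name !== 'TimeoutError') {
      throw error;
    }
    // Allow up to 2 open connections on the second attempt
    return page.goto(url, { waitUntil: 'networkidle2', timeout });
  }
}

// Usage, inside the async function of the conversion script:
//   await gotoWithFallback(page, 'https://example.com');
```
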

Breaking down the code

We wrapped the code in an async immediately invoked function expression (IIFE) so that we can use await, which is not available at the top level of a CommonJS script. Inside the function, we launch a new instance of the Puppeteer-controlled browser.

We open a new page and navigate to the desired HTML file using the page.goto() method. Make sure to replace 'file:///path/to/your/html/file.html' with the actual path to your HTML file. By specifying { waitUntil: 'networkidle0' }, we wait until there have been no network connections for at least 500 ms, so stylesheets, images, and fonts have had a chance to load before the PDF is generated.
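Writing the file:// URL by hand is error-prone, especially for paths that contain spaces or when the script runs on Windows. As a small sketch, Node's built-in url and path modules can build the URL from an ordinary file path. The helper name toFileUrl and the file name invoice.html are only examples.

```javascript
const path = require('node:path');
const { pathToFileURL } = require('node:url');

// Hypothetical helper: turns a relative or absolute file path
// into a file:// URL that can be passed to page.goto().
function toFileUrl(filePath) {
  // path.resolve() makes the path absolute, based on the current working directory
  return pathToFileURL(path.resolve(filePath)).href;
}

// Prints a file:// URL that points to invoice.html in the current directory
console.log(toFileUrl('invoice.html'));
```
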

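Calling page.emulateMediaType('screen') is optional. By default, page.pdf() renders the page with the print CSS media type, so this call is needed only if you want the PDF to use the regular screen styles.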

Finally, we call page.pdf() to convert the HTML to PDF and specify the output path and format. Adjust the path and format properties as needed. After successful conversion, we log a message and close the browser instance.
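Besides path and format, page.pdf() accepts further options such as landscape, printBackground, and margin. The sketch below shows one way to keep a set of defaults and override them per document. The helper name buildPdfOptions and the default values are assumptions chosen for illustration, not recommendations.

```javascript
// Hypothetical helper: merges caller overrides into default page.pdf() options.
function buildPdfOptions(overrides = {}) {
  const defaults = {
    path: 'output.pdf',
    format: 'A4',
    printBackground: true, // CSS backgrounds are left out unless this is set
    margin: { top: '20mm', right: '15mm', bottom: '20mm', left: '15mm' },
  };

  return {
    ...defaults,
    ...overrides,
    // Merge margins separately so that overriding one side keeps the others
    margin: { ...defaults.margin, ...overrides.margin },
  };
}

const options = buildPdfOptions({ format: 'Letter', landscape: true, margin: { top: '10mm' } });

// Usage, inside the async function of the conversion script:
//   await page.pdf(options);
```
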

Running the conversion script

Save the convert-html-to-pdf.js file, navigate to your project directory in the terminal or command prompt, and run the following command:

node convert-html-to-pdf.js

Puppeteer will launch a headless browser, load the HTML file, convert it to PDF, and save the output as output.pdf in the project directory.
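If you would rather not hard-code the file names, the script could read them from the command line. This is a minimal sketch that assumes the input path comes first and the output path is optional. The helper name parseCliArgs is hypothetical.

```javascript
// Hypothetical helper: reads the input and output paths from the command line,
// e.g. `node convert-html-to-pdf.js report.html report.pdf`
function parseCliArgs(argv) {
  // argv[0] is the node binary and argv[1] is the script path
  const [input, output = 'output.pdf'] = argv.slice(2);
  if (!input) {
    throw new Error('Usage: node convert-html-to-pdf.js <input.html> [output.pdf]');
  }
  return { input, output };
}

// Usage, at the top of the conversion script:
//   const { input, output } = parseCliArgs(process.argv);
```
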

Using html2pdf.app API

Managing Puppeteer can be a challenging task, especially if you need to scale the application. A large number of requests can degrade server performance, since PDF conversion is resource-intensive.
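If you do run Puppeteer yourself, one common way to keep the load bounded is to cap how many conversions run at the same time. The following is a rough sketch of such a limiter in plain JavaScript. The name createLimiter and the limit of 2 are assumptions, and a production setup would likely also need timeouts and a queue size limit.

```javascript
// Hypothetical helper: returns a function that runs async tasks
// with at most `limit` of them in progress at once.
function createLimiter(limit) {
  let active = 0;
  const queue = [];

  const next = () => {
    if (active >= limit || queue.length === 0) return;
    active += 1;
    const { task, resolve, reject } = queue.shift();
    Promise.resolve()
      .then(task)
      .then(resolve, reject)
      .finally(() => {
        // Free the slot and start the next queued task, if any
        active -= 1;
        next();
      });
  };

  return (task) =>
    new Promise((resolve, reject) => {
      queue.push({ task, resolve, reject });
      next();
    });
}

// Usage: wrap each conversion so that no more than 2 run in parallel
//   const limit = createLimiter(2);
//   const pdfData = await limit(() => page.pdf({ format: 'A4' }));
```
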

Make your life easier by using an HTML to PDF API for the conversion.

The following code shows how simple it is to do with our API:

import axios from 'axios';
import fs from 'fs';

axios.post('https://api.html2pdf.app/v1/generate', {
  html: 'https://example.com',
  apiKey: '{your-api-key}',
}, {responseType: 'arraybuffer'}).then((response) => {
  fs.writeFileSync('./document.pdf', response.data);
}).catch((err) => {
  console.log(err.message);
});

Conclusion

By following the steps outlined in this guide, you can easily convert HTML files to PDF using Puppeteer. This versatile library opens up a world of possibilities for automating document generation, report creation, and more.

If you would rather not deal with edge cases and want fast results, try a ready-made solution for HTML to PDF conversion, such as html2pdf.app.

Happy coding!
