Generating PDFs (invoices, manuals and more) from web pages using Puppeteer/Playwright

This article was originally published on theheadless.dev

Puppeteer and Playwright can be used to create PDFs from webpages. This opens up interesting automation scenarios for tasks such as archiving, generating invoices, writing manuals, books and more.

This article introduces this functionality and shows how we can customise the PDF to fit our needs.

Generating a PDF file

After loading a page, we use the page.pdf() command to convert it to a PDF.

With Puppeteer:

const puppeteer = require('puppeteer');
(async () => {
  const browser = await puppeteer.launch()
  const page = await browser.newPage()
  await page.goto('https://theheadless.dev/posts')
  await page.pdf({ path: 'hd-posts.pdf' })
  await browser.close()
})()

With Playwright:

const { chromium } = require('playwright');
(async () => {
  const browser = await chromium.launch()
  const page = await browser.newPage()
  await page.goto('https://theheadless.dev/posts')
  await page.pdf({ path: 'hd-posts.pdf' })
  await browser.close()
})()

Note that we need to pass the path option to have the PDF file actually saved to disk.
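When no path is given, page.pdf() still resolves with the PDF bytes, which is handy if the file should go to an HTTP response or object storage rather than the local disk. The following is a minimal sketch under that assumption; pdfToBuffer is a hypothetical helper name, and page is assumed to be an already-loaded Puppeteer or Playwright page.

```javascript
// Hypothetical helper (not part of Puppeteer or Playwright).
// `page` is assumed to be an already-loaded Puppeteer or Playwright page.
async function pdfToBuffer(page) {
  // No `path` option: nothing is written to disk, the bytes are only returned.
  const pdf = await page.pdf()
  // Normalise to a Node.js Buffer, since some versions return a plain Uint8Array.
  return Buffer.from(pdf)
}
```

Inside an async function this could be called as const buffer = await pdfToBuffer(page), after which the buffer can be written with fs.writeFileSync or sent elsewhere.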

WARNING:
This feature is currently only supported in Chromium headless in both Puppeteer and Playwright.

Tweaking the result

It is important to take a quick look at the official docs for page.pdf() (Puppeteer or Playwright), as it is almost certain that we will want to tweak the appearance of our page in the resulting PDF.

In certain cases, our webpage might look significantly different in our PDF compared to our browser. Depending on the case, it can pay off to experiment with the following:

  1. We might need to set the printBackground option to true if graphical components, such as background colors and images, appear to be missing in the generated PDF.
  2. By default, page.pdf() will generate a PDF with adjusted colors for printing. Setting the CSS property -webkit-print-color-adjust: exact will force rendering of the original colors.
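  3. Calling page.emulateMediaType('screen') in Puppeteer, or page.emulateMedia({ media: 'screen' }) in Playwright, switches the CSS media type of the page from print to screen, so the PDF is rendered with the screen styles instead of the print stylesheet.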
  4. Setting either width and height or format to the appropriate value might be needed for the page to be displayed optimally.
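The tweaks above can be gathered in one place. The following is a rough sketch rather than a definitive recipe: buildPdfOptions and exactColorsCss are hypothetical names, and the A4 default is an assumption chosen for illustration.

```javascript
// Hypothetical helper (not part of Puppeteer or Playwright): collects the
// tweaks listed above into one options object for page.pdf().
function buildPdfOptions({ path, format = 'A4', width, height } = {}) {
  const options = {
    path,
    // Tweak 1: keep background colors and images.
    printBackground: true
  }
  // Tweak 4: either an explicit width and height, or a named paper format.
  if (width && height) {
    options.width = width
    options.height = height
  } else {
    options.format = format
  }
  return options
}

// Tweak 2: CSS that asks Chromium to keep the original colors.
// It can be injected with page.addStyleTag({ content: exactColorsCss }).
const exactColorsCss = 'html { -webkit-print-color-adjust: exact; }'

console.log(buildPdfOptions({ path: 'hd-posts.pdf' }))
```

The result can then be passed straight to the PDF call, for example await page.pdf(buildPdfOptions({ path: 'hd-posts.pdf' })). Switching the media type (tweak 3) is left to the caller because the method name differs between the two libraries.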

Customising header and footer

We can also have custom headers and footers added to our pages, displaying values such as title, page number and more. Let's see how this looks on your favourite website:

With Puppeteer:

const puppeteer = require('puppeteer')
const fs = require('fs');
(async () => {
  const browser = await puppeteer.launch()
  const page = await browser.newPage()
  const navigationPromise = page.waitForNavigation()
  // Load the HTML templates for the header and footer
  const templateHeader = fs.readFileSync('template-header.html', 'utf-8')
  const templateFooter = fs.readFileSync('template-footer.html', 'utf-8')
  // Render with the screen styles rather than the print stylesheet
  await page.emulateMediaType('screen')
  await page.goto('https://theheadless.dev/posts')
  await navigationPromise
  // Click the .accept button and wait until it is hidden, so it does not appear in the PDF
  await page.waitForSelector('.accept', { visible: true })
  await page.evaluate(() => document.querySelector('.accept').click())
  await page.waitForSelector('.accept', { hidden: true })
  await page.pdf({
    path: 'hd-posts.pdf',
    displayHeaderFooter: true,
    headerTemplate: templateHeader,
    footerTemplate: templateFooter,
    // Leave room for the header and footer
    margin: {
      top: '100px',
      bottom: '40px'
    },
    printBackground: true
  })
  await browser.close()
})()

With Playwright:

const { chromium } = require('playwright')
const fs = require('fs');
(async () => {
  const browser = await chromium.launch()
  const page = await browser.newPage()
  const navigationPromise = page.waitForNavigation()
  // Load the HTML templates for the header and footer
  const templateHeader = fs.readFileSync('template-header.html', 'utf-8')
  const templateFooter = fs.readFileSync('template-footer.html', 'utf-8')
  await page.goto('https://theheadless.dev/posts')
  await navigationPromise
  // Click the .accept button and wait until it is hidden, so it does not appear in the PDF.
  // Playwright expresses visibility through the state option.
  await page.waitForSelector('.accept', { state: 'visible' })
  await page.evaluate(() => document.querySelector('.accept').click())
  await page.waitForSelector('.accept', { state: 'hidden' })
  await page.pdf({
    path: 'hd-posts.pdf',
    displayHeaderFooter: true,
    headerTemplate: templateHeader,
    footerTemplate: templateFooter,
    // Leave room for the header and footer
    margin: {
      top: '100px',
      bottom: '40px'
    },
    printBackground: true
  })
  await browser.close()
})()

We are including the following template files for our header...

<html>
  <head>
    <style type="text/css">
      #header {
        padding: 0;
      }
      .content {
        width: 100%;
        background-color: #777;
        color: white;
        padding: 5px;
        -webkit-print-color-adjust: exact;
        vertical-align: middle;
        font-size: 15px;
        margin-top: 0;
        display: inline-block;
      }
      .title {
        font-weight: bold;
      }
      .date {
        text-align:right;
      }
    </style>
  </head>
  <body>
    <div class="content">
        <span class="title"></span> -
        <span class="date"></span>
        <span class="url"></span>
    </div>
  </body>
</html>

...and footer:

<html>
  <head>
    <style type="text/css">
      #footer {
        padding: 0;
      }
      .content-footer {
        width: 100%;
        background-color: #777;
        color: white;
        padding: 5px;
        -webkit-print-color-adjust: exact;
        vertical-align: middle;
        font-size: 15px;
        margin-top: 0;
        display: inline-block;
        text-align: center;
      }
    </style>
  </head>
  <body>
    <div class="content-footer">
      Page <span class="pageNumber"></span> of <span class="totalPages"></span>
    </div>
  </body>
</html>

The first page of the generated PDF looks as follows:

generated pdf screenshot example

TIP:
Chromium applies a default padding to the header and footer. You will need to override it in your template CSS, as the #header and #footer rules in the templates above do.
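For short templates it can be convenient to build the string in code instead of reading it from a file. The sketch below is one way to do that, with the padding override included; buildFooterTemplate and the .bar class are hypothetical names, while pageNumber and totalPages are the same classes used in the footer template above.

```javascript
// Hypothetical helper: returns a footer template string for page.pdf().
function buildFooterTemplate({ fontSize = '10px' } = {}) {
  return `
    <style>
      /* Override the default padding, as in the template files above. */
      #footer { padding: 0; }
      .bar { width: 100%; font-size: ${fontSize}; text-align: center; }
    </style>
    <div class="bar">
      Page <span class="pageNumber"></span> of <span class="totalPages"></span>
    </div>`
}

console.log(buildFooterTemplate({ fontSize: '12px' }))
```

The returned string can be passed as footerTemplate, together with displayHeaderFooter: true and a bottom margin large enough to hold it.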

Further considerations

We can easily transform existing web pages into PDF format, just as we have shown in our example. An even more interesting use case is generating a brand new document: we can use our existing HTML and CSS skills to produce high-quality PDFs, often eliminating the need for LaTeX or similar tools.

See points 2 and 3 of the following section for practical examples of this approach.
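As a rough illustration of generating a new document rather than printing an existing page, the sketch below renders an invoice to an HTML string and hands it to the browser with page.setContent(). The invoice shape and the names escapeHtml, renderInvoiceHtml and invoiceToPdf are all assumptions made for this example.

```javascript
// Escape text so it can be placed safely inside HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
}

// Hypothetical invoice shape:
// { number, items: [{ description, quantity, unitPrice }] }
function renderInvoiceHtml(invoice) {
  const rows = invoice.items
    .map(item => {
      const lineTotal = (item.quantity * item.unitPrice).toFixed(2)
      return `<tr><td>${escapeHtml(item.description)}</td><td>${item.quantity}</td><td>${lineTotal}</td></tr>`
    })
    .join('\n')
  const total = invoice.items.reduce(
    (sum, item) => sum + item.quantity * item.unitPrice,
    0
  )
  return `<html><body>
<h1>Invoice ${escapeHtml(invoice.number)}</h1>
<table>
${rows}
</table>
<p class="total">Total: ${total.toFixed(2)}</p>
</body></html>`
}

// `page` is assumed to be a Puppeteer or Playwright page from browser.newPage().
async function invoiceToPdf(page, invoice, path) {
  // Load the generated markup instead of navigating to a URL.
  await page.setContent(renderInvoiceHtml(invoice))
  return page.pdf({ path, format: 'A4', printBackground: true })
}
```

Keeping the HTML rendering separate from the browser call means the markup can be checked in a unit test without launching Chromium.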

Further reading

  1. Pocket Admin's article on generating PDF from HTML.
  2. Florian Mößle's guide to generating invoices with Puppeteer.
  3. A great example of Puppeteer's PDF generation feature: Li Haoyi's Hands On Scala book. See the build pipeline behind it.

Banner image: "Students working with a printing press, Working Men's College" by State Library Victoria Collections is licensed under CC BY-NC 2.0
