In September 2019 I was contacted by a company to build a report API. The company builds a product that measures well-being and stress levels in organizations by sending out surveys to employees.
Some of the company's clients requested a feature to generate pdf reports based on these surveys. Each survey includes a number of questions with data over multiple periods which will be displayed in charts. Chart data can be displayed in two ways: survey data for periods over time and a summary of all periods.
I was given pretty much free rein in how to solve the problem. Here's how I did it.
Note: The code snippets are trimmed for brevity and do not represent real data or architecture.
Requirements
- API endpoint should be available on Azure cloud
- Endpoint should receive survey data and a template name
- Return pdf document with questions asked along with charts of responses
Working with pdf files and generating them dynamically on the server can be challenging. There are libraries like PDFKit (https://pdfkit.org) you can use, but you have to tell them exactly what to draw and where, much like the canvas API:
const pdfKitDoc = new PDFDocument()
questions.forEach((question, i) => {
  pdfKitDoc
    // what's the height of each question?
    .text(question.name, 0, i * ?)
    // draw charts and calculate these x and y values somehow
    .moveTo(100, 150)
    .lineTo(100, 250)
    .lineTo(200, 250)
    .fill('#FF3300')
})
This is not a fun way to build charts.
Instead, I opted to use React as a templating engine to render static html. With React it's easy to change styling such as margins, padding and text, and we don't have to worry about positioning or text flow. We also benefit from the huge ecosystem, which includes fantastic libraries for building charts.
Templates can now look like this:
const Template = ({ questions }) => (
  <Layout>
    {questions.map(question => {
      const { type, data } = question.chart
      return (
        <Question key={question.id}>
          <QuestionHeader title={question.name} />
          <Chart type={type} data={data} />
        </Question>
      )
    })}
  </Layout>
)
One limitation is that we cannot use canvas to draw the charts, since canvas depends on a DOM environment running JavaScript. With this approach we only get to render static html. Luckily, Nivo (https://nivo.rocks) provides beautiful charts with SVG support.
Note: It might be possible to use something like jsdom to work around this but that seems unnecessarily complex. Besides, SVG is great for drawing charts.
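To illustrate why SVG fits this setup: an SVG chart is just markup, so it can be produced as a string on the server with no DOM or canvas involved. The sketch below is not how Nivo works internally. It is a hand-rolled, dependency-free example, and `Bar`, `escapeXml` and `renderBarChartSvg` are hypothetical names made up for illustration.

```typescript
// Hypothetical data shape for this sketch; not taken from the original project.
interface Bar {
  label: string
  value: number
}

// Escape text so labels cannot break the generated markup.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
}

// Build a horizontal bar chart as a plain SVG string.
// Each bar is scaled relative to the largest value in the data set.
function renderBarChartSvg(bars: Bar[], width = 300, barHeight = 20): string {
  const gap = 4
  const max = Math.max(1, ...bars.map(bar => bar.value))
  const height = bars.length * (barHeight + gap)
  const rects = bars.map((bar, i) => {
    const y = i * (barHeight + gap)
    const barWidth = Math.round((bar.value / max) * width)
    return (
      `<rect x="0" y="${y}" width="${barWidth}" height="${barHeight}" fill="rgb(255, 51, 0)">` +
      `<title>${escapeXml(bar.label)}: ${bar.value}</title></rect>`
    )
  })
  return `<svg xmlns="http://www.w3.org/2000/svg" width="${width}" height="${height}">${rects.join('')}</svg>`
}
```

The returned string can be dropped straight into static html, which is the property that makes SVG charts work with server-side rendering.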
To render these templates we use ReactDOMServer.renderToStaticMarkup:
import React from 'react'
import ReactDOMServer from 'react-dom/server'
export function renderTemplate({ data, language, title }) {
  return ReactDOMServer.renderToStaticMarkup(
    React.createElement(Template, { data, language, title })
  )
}
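renderToStaticMarkup returns only the markup for the element it is given and does not add a doctype. So unless the Layout component renders the html, head and body tags itself, the result may need to be wrapped in a full document before a browser loads it. A minimal sketch, assuming Layout renders only body content; `wrapHtmlDocument` is a hypothetical helper, not part of the original project:

```typescript
// Hypothetical helper: wraps rendered markup in a complete HTML document.
// The markup is assumed to be trusted output from the template renderer.
function wrapHtmlDocument(markup: string, title: string, language = 'en'): string {
  // Escape the title, since it may come from user-provided survey data.
  const safeTitle = title.replace(/&/g, '&amp;').replace(/</g, '&lt;')
  return [
    '<!DOCTYPE html>',
    `<html lang="${language}">`,
    '<head>',
    '<meta charset="utf-8">',
    `<title>${safeTitle}</title>`,
    '</head>',
    `<body>${markup}</body>`,
    '</html>'
  ].join('\n')
}
```

Declaring the charset explicitly helps avoid garbled non-ASCII characters in question titles.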
We now need to convert this html page into a pdf file. For this we can use Google Puppeteer.
Generating pdf with Puppeteer
Puppeteer is a Node library that controls a headless Chrome browser. It can be told to visit sites and read data from the DOM, and is commonly used for scraping or running end-to-end tests. It can also be used to create pdf files.
It works like this:
import puppeteer from 'puppeteer'
export async function renderPDF(html: string) {
  const browser = await puppeteer.launch({
    args: ['--no-sandbox', '--disable-setuid-sandbox']
  })
  const page = await browser.newPage()
  // pass the html string as a data:text/html URL so we don't have to visit a url
  await page.goto(`data:text/html,${html}`, { waitUntil: 'networkidle0' })
  const pdf = await page.pdf({ format: 'A4' })
  await browser.close()
  return pdf
}
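One thing to watch for with data URLs: characters such as `#` and `%` have special meaning in a URL (`#` starts the fragment part), so raw html placed directly after `data:text/html,` may not be read as page content in full. The post does not say whether this is related to the hex color problem described in the next paragraph, so treat that as an open question. A minimal sketch of encoding the html first; `toDataUrl` is a hypothetical helper:

```typescript
// Hypothetical helper: percent-encode the html so characters like '#' and '%'
// are carried as data instead of being interpreted as URL syntax.
function toDataUrl(html: string): string {
  return `data:text/html;charset=utf-8,${encodeURIComponent(html)}`
}
```

It would be used as `page.goto(toDataUrl(html), { waitUntil: 'networkidle0' })`. Puppeteer also has `page.setContent(html)`, which avoids the URL altogether and may be simpler if no navigation is needed.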
Sometimes (quite often), things don't go as smoothly as expected. It turns out Puppeteer has a bug that causes an empty pdf to be rendered if any hex colors are used in SVG. To solve this, I replaced all occurrences of hex colors in the html with rgb values using a regex.
// https://github.com/sindresorhus/hex-rgb
import hexRgb from 'hex-rgb'
export function hexToRgb(str: string) {
  // match 6 digit or 3 digit hex colors only
  const hexTest = /#(?:[a-f\d]{6}|[a-f\d]{3})\b/gi
  return str.replace(hexTest, hexColor => {
    const { red, green, blue } = hexRgb(hexColor)
    return `rgb(${red}, ${green}, ${blue})`
  })
}
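To make the replacement step concrete, here is a dependency-free sketch of the same idea, with the parsing done by hand instead of through hex-rgb. It only handles 3 and 6 digit colors, and `hexToRgbString` and `replaceHexColors` are hypothetical names, not the project's code:

```typescript
// Convert a single '#rgb' or '#rrggbb' color to an 'rgb(r, g, b)' string.
function hexToRgbString(hex: string): string {
  const digits = hex.slice(1)
  // Expand shorthand such as 'f30' to 'ff3300'.
  const full = digits.length === 3
    ? digits.split('').map(c => c + c).join('')
    : digits
  const red = parseInt(full.slice(0, 2), 16)
  const green = parseInt(full.slice(2, 4), 16)
  const blue = parseInt(full.slice(4, 6), 16)
  return `rgb(${red}, ${green}, ${blue})`
}

// Replace every 3 or 6 digit hex color in a string of html.
// Caveat: anything that merely looks like a color, such as href="#fff",
// is rewritten as well.
function replaceHexColors(html: string): string {
  const hexColor = /#(?:[a-f\d]{6}|[a-f\d]{3})\b/gi
  return html.replace(hexColor, match => hexToRgbString(match))
}

// replaceHexColors('<rect fill="#FF3300" stroke="#fff"/>')
// returns '<rect fill="rgb(255, 51, 0)" stroke="rgb(255, 255, 255)"/>'
```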
Mapping data
Each question can be configured to accept different types of answers. These types are:
- binary for yes/no
- single choice
- multi choice
- range choice
- text for comments
These types need to be represented differently in the report. The chart to use depends both on the answer type and on whether the template shows data over a period of time or an aggregated summary.
// Questions have different answer types and should use different types of charts depending on template
const chartMappers = {
  binary: {
    summary: (responses) => createGroupedBar(responses),
    periodic: (responses) => createPeriodicLine(responses)
  },
  single: {...},
  multi: {...},
  scale: {...},
  text: {...}
}
const templateMappers = {
  summary: periods => mergePeriods(periods),
  periodic: periods => flattenPeriods(periods)
}
export function mapSurveyToCharts({ survey, template }) {
  return {
    questions: survey.questions.map(question => {
      const responses = templateMappers[template](question.periods)
      const chart = chartMappers[question.Type][template](responses)
      return {
        name: question.Title,
        chart: chart
      }
    })
  }
}
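The post does not show mergePeriods or flattenPeriods, so the following is only a guess at what such helpers could look like. The `Period`, `AnswerCount` and `PeriodPoint` shapes are assumptions made up for this sketch: each period is taken to hold a label and a list of answer counts.

```typescript
// All shapes here are assumptions for illustration; the real survey data may differ.
interface AnswerCount {
  answer: string
  count: number
}
interface Period {
  label: string
  answers: AnswerCount[]
}
interface PeriodPoint extends AnswerCount {
  period: string
}

// Summary template: add up the counts for each answer across all periods,
// keeping answers in the order they first appear.
function mergePeriods(periods: Period[]): AnswerCount[] {
  const totals: Record<string, number> = Object.create(null)
  const order: string[] = []
  periods.forEach(period => {
    period.answers.forEach(({ answer, count }) => {
      if (!(answer in totals)) {
        totals[answer] = 0
        order.push(answer)
      }
      totals[answer] += count
    })
  })
  return order.map(answer => ({ answer, count: totals[answer] }))
}

// Periodic template: keep every period and tag each count with its period label.
function flattenPeriods(periods: Period[]): PeriodPoint[] {
  const points: PeriodPoint[] = []
  periods.forEach(period => {
    period.answers.forEach(({ answer, count }) => {
      points.push({ period: period.label, answer, count })
    })
  })
  return points
}
```

With two periods where "yes" was answered 2 and then 3 times, mergePeriods yields a single "yes" entry with count 5, while flattenPeriods keeps one entry per period so a line chart can plot the change over time.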
Wrapping it up
We now have all the pieces we need and just have to put everything together:
export async function generateReport({ survey, template, language = 'en_US' }) {
  const data = mapSurveyToCharts({ survey, template })
  const html = renderTemplate({ data, language })
  /*
    Puppeteer is having issues with rendering SVGs with hex colors. Replace all with rgb(R, G, B).
    https://github.com/GoogleChrome/puppeteer/issues/2556
  */
  const replacedHTML = hexToRgb(html)
  const pdf = await renderPDF(replacedHTML)
  return pdf
}
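How generateReport gets triggered is not shown in the post. One way to think about it is a small, framework-agnostic handler that validates the request body, calls the generator and returns the pdf with suitable headers. Everything in this sketch is an assumption: the `ReportRequest` and `HttpResponse` shapes, the function names, and the idea that the two valid template names are `summary` and `periodic`. Wiring it to Azure Functions, Express or plain Node http is left out.

```typescript
// Assumed request shape, loosely based on the requirements list.
interface ReportRequest {
  survey: { questions: unknown[] }
  template: 'summary' | 'periodic'
  language?: string
}

interface HttpResponse {
  status: number
  headers: Record<string, string>
  body: Uint8Array | string
}

// Validate the raw JSON body; returns an error message instead of throwing.
function parseReportRequest(rawBody: string): ReportRequest | { error: string } {
  let parsed: any
  try {
    parsed = JSON.parse(rawBody)
  } catch (err) {
    return { error: 'Body is not valid JSON' }
  }
  if (!parsed || !parsed.survey || !Array.isArray(parsed.survey.questions)) {
    return { error: 'Missing survey.questions' }
  }
  if (parsed.template !== 'summary' && parsed.template !== 'periodic') {
    return { error: 'template must be "summary" or "periodic"' }
  }
  return { survey: parsed.survey, template: parsed.template, language: parsed.language }
}

// Pass the real generateReport in as `generate`; it is injected here so the
// handler can be exercised without starting a browser.
async function handleReportRequest(
  rawBody: string,
  generate: (request: ReportRequest) => Promise<Uint8Array>
): Promise<HttpResponse> {
  const request = parseReportRequest(rawBody)
  if ('error' in request) {
    return { status: 400, headers: { 'Content-Type': 'text/plain' }, body: request.error }
  }
  const pdf = await generate(request)
  return {
    status: 200,
    headers: {
      'Content-Type': 'application/pdf',
      'Content-Disposition': 'attachment; filename="report.pdf"'
    },
    body: pdf
  }
}
```

A download button in a UI could then POST the survey data to whatever route wraps this handler and save the response as a file.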
Have you solved this in another way? Something that doesn't make sense? Would love to hear your thoughts and feedback!
Top comments (2)
Hi Carl, this is an excellent approach. I have the same need: a report control interface to download multiple charts and some pages of a web app as a PDF. In my case I have a download button in the UI to trigger the PDF generation. Could you kindly point me in the right direction on how you triggered the generateReport function?
Hi Carl, it would be really helpful if you could share your project setup. How are you handling jsx import and all.