This post was originally published on my blog
Intro
Recently I had to create two serverless functions for a client: one that generates a PDF document from an existing HTML layout, and one that merges it with other PDF documents provided by users in an upload form.
In this article, we will use examples based on real-world applications, going through project configuration, AWS configuration, and project deployment.
Content
- Setting Up
- Setting up serverless configuration
- Setting up a Lambda Layer
- Working with Puppeteer
- Uploading PDF to S3
- Deploying to AWS
TL;DR:
- Lambda function Github Repo
- Login demo app Github Repo
Setting Up
Serverless Framework
We will be using the Serverless Framework to easily deploy our resources to the cloud.
Open up a terminal and type the following command to install Serverless globally using npm.
npm install -g serverless
Initial Project Setup
Create a new serverless project:
serverless create --template aws-nodejs --path pdf-generator
This is going to create a new folder named pdf-generator containing two files: handler.js and serverless.yml. For now, we will leave the files as-is.
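For reference, the scaffolded handler.js is just a minimal stub, roughly along these lines (the exact contents depend on your framework version):
// Rough shape of the handler.js generated by the aws-nodejs template;
// the exact message and export name vary between Serverless versions.
"use strict";

module.exports.hello = async (event) => {
  return {
    statusCode: 200,
    body: JSON.stringify({ message: "Go Serverless! Your function executed successfully!", input: event }),
  };
};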
Installing Dependencies
We will need the following dependencies to work with Puppeteer in our project:
- chrome-aws-lambda: Chromium Binary for AWS Lambda and Google Cloud Functions.
- puppeteer-core: Puppeteer-core is intended to be a lightweight version of Puppeteer for launching an existing browser installation or for connecting to a remote one.
- aws-sdk: AWS SDK Library to interact with AWS Services.
- serverless-webpack: A Serverless v1.x & v2.x plugin to build your lambda functions with Webpack.
- node-loader: A Webpack loader that lets us bundle native Node modules with the .node extension.
npm install chrome-aws-lambda puppeteer-core
npm install -D aws-sdk node-loader serverless-webpack
Configuring Webpack
Once we have our project dependencies installed, we are going to configure Webpack to package our code and reduce the size of our cloud function. This will save us a lot of problems, since an unbundled function that ships Chromium can easily reach hundreds of megabytes, and AWS rejects packages that exceed the Lambda size limits.
Create the file webpack.config.js in our project root, and add the following code:
module.exports = {
  target: "node",
  mode: "development",
  module: {
    rules: [
      {
        test: /\.node$/,
        loader: "node-loader",
      },
    ],
  },
  externals: ["aws-sdk", "chrome-aws-lambda"],
};
In the code above we set the following options for Webpack:
- We use development mode, so our code isn't minified and we can trace errors in AWS CloudWatch
- We load native Node modules into our bundle using node-loader
- We exclude aws-sdk and chrome-aws-lambda from our bundle, since AWS provides a built-in aws-sdk library, and for chrome-aws-lambda we are going to use a Lambda Layer because Webpack can't bundle the library as-is
Setting up serverless configuration
Next, we are going to configure our serverless.yml
file, for now, we will add some environment variables, a lambda layer to use chrome-aws-lambda
, and add Webpack to the list of plugins.
First, we define global variables to use across all of our functions.
custom:
  app_url: https://puppeteer-login-demo.vercel.app
  app_user: admin@admin.com
  app_pass: 123456789
Here we are defining custom properties that we can access in our configuration file using the ${self:someProperty} syntax; in our case, we can access our properties as ${self:custom.someProperty}.
Now we define our environment variables inside our function so that our handler can access them.
functions:
  generate-pdf:
    handler: handler.handler
    environment:
      APP_URL: ${self:custom.app_url}
      APP_USER: ${self:custom.app_user}
      APP_PASS: ${self:custom.app_pass}
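These values are then exposed to the handler as ordinary environment variables, for example:
// Inside handler.js, the variables defined above are read from process.env
const { APP_URL, APP_USER, APP_PASS } = process.env;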
Now add the plugins section at the end of our file, so we can use Webpack with our lambdas.
plugins:
  - serverless-webpack

package:
  individually: true
So far our serverless.yml should look like the following:
service: pdf-generator
frameworkVersion: '2'

custom:
  app_url: https://puppeteer-login-demo.vercel.app
  app_user: admin@admin.com
  app_pass: 123456789

provider:
  name: aws
  stage: dev
  region: us-east-1
  runtime: nodejs12.x
  lambdaHashingVersion: 20201221

functions:
  generate-pdf:
    handler: handler.handler
    environment:
      APP_URL: ${self:custom.app_url}
      APP_USER: ${self:custom.app_user}
      APP_PASS: ${self:custom.app_pass}

plugins:
  - serverless-webpack

package:
  individually: true
Setting up a Lambda Layer
To use the chrome-aws-lambda library we need to treat it as an external dependency; for this, we can create our own Lambda Layer or use a community-hosted one. Here I'll explain both options, and you can decide whichever one you want to use.
Own Hosted Layer
First, we have to package the library as a zip file; open up a terminal and type:
git clone --depth=1 https://github.com/alixaxel/chrome-aws-lambda.git && \
cd chrome-aws-lambda && \
make chrome_aws_lambda.zip
The above will create a chrome_aws_lambda.zip file, which can be uploaded to your Layers console.
Community Hosted Layer
This repository hosts a community Lambda Layer that we can use directly in our function. At the time of writing, the latest version is 24:
arn:aws:lambda:us-east-1:764866452798:layer:chrome-aws-lambda:24
Now we have to add this layer to our serverless.yml file and specify that our function is going to use it; in this case, we are going to use the community version.
functions:
  generate-pdf:
    handler: handler.handler
    layers:
      - arn:aws:lambda:us-east-1:764866452798:layer:chrome-aws-lambda:24
Working with Puppeteer
Now that our project is configured, we are ready to start developing our lambda function.
First, we load the chromium library and create a new browser instance in our handler.js file to work with Puppeteer.
"use strict";
const chromium = require("chrome-aws-lambda");
exports.handler = async (event, context) => {
  let browser = null;

  try {
    browser = await chromium.puppeteer.launch({
      args: chromium.args,
      defaultViewport: chromium.defaultViewport,
      executablePath: await chromium.executablePath,
      headless: chromium.headless,
      ignoreHTTPSErrors: true,
    });

    const page = await browser.newPage();
  } catch (e) {
    console.log(e);
  } finally {
    if (browser !== null) {
      await browser.close();
    }
  }
};
In this example, we will use an app that requires a login to view the report we want to convert to PDF, so first we are going to navigate to the login page and use the environment variables to simulate a login and access the report.
await page.goto(`${process.env.APP_URL}/login`, {
  waitUntil: "networkidle0",
});

await page.type("#email", process.env.APP_USER);
await page.type("#password", process.env.APP_PASS);

await page.click("#loginButton");
await page.waitForNavigation({ waitUntil: "networkidle0" });
In the above code we carry out the following steps:
- Navigate to the login page
- Find the inputs with IDs email and password and type the user and password credentials from the environment variables
- Click the button with ID loginButton
- Wait for the next page to be fully loaded (in our example we are redirected to a Dashboard); see the note below on ordering the click and the navigation wait
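One caveat with the snippet above: starting waitForNavigation only after the click can, on a very fast redirect, miss the navigation and hang until the timeout. The pattern recommended in the Puppeteer documentation is to start waiting before triggering the click:
// Start waiting for the navigation before triggering it,
// so a fast post-login redirect cannot be missed.
await Promise.all([
  page.waitForNavigation({ waitUntil: "networkidle0" }),
  page.click("#loginButton"),
]);
Either form works for this demo app; the combined version is just more robust.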
Now we are logged in, so we can navigate to the report URL that we want to convert to a PDF file.
await page.goto(`${process.env.APP_URL}/invoice`, {
  waitUntil: ["domcontentloaded", "networkidle0"],
});
Here we go to the invoice page and wait until the content is fully loaded.
Now that we are on the page we want to convert, we create our PDF file and keep it in a buffer so we can upload it to AWS S3 later.
const buffer = await page.pdf({
  format: "letter",
  printBackground: true,
  margin: { top: "0.5cm", right: "0.5cm", bottom: "0.5cm", left: "0.5cm" },
});
In the above code we added a few options to the pdf method:
- format: the paper size of our file
- printBackground: print background graphics
- margin: add a 0.5cm margin on all sides of the print area
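If you also need page numbers or a custom header and footer, the pdf method supports header and footer templates as well. The snippet below is purely illustrative and not part of the invoice report we are generating:
// Optional: render a centered footer with page numbers.
// displayHeaderFooter, headerTemplate and footerTemplate are standard Puppeteer pdf options.
const bufferWithFooter = await page.pdf({
  format: "letter",
  printBackground: true,
  displayHeaderFooter: true,
  headerTemplate: "<span></span>", // keep the header empty
  footerTemplate:
    '<div style="font-size:8px; width:100%; text-align:center;">' +
    'Page <span class="pageNumber"></span> of <span class="totalPages"></span></div>',
  margin: { top: "1cm", right: "0.5cm", bottom: "1cm", left: "0.5cm" },
});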
So far our handler.js should look like this:
"use strict";
const chromium = require("chrome-aws-lambda");
exports.handler = async (event, context) => {
  let browser = null;

  try {
    browser = await chromium.puppeteer.launch({
      args: chromium.args,
      defaultViewport: chromium.defaultViewport,
      executablePath: await chromium.executablePath,
      headless: chromium.headless,
      ignoreHTTPSErrors: true,
    });

    const page = await browser.newPage();

    await page.goto(`${process.env.APP_URL}/login`, {
      waitUntil: "networkidle0",
    });

    await page.type("#email", process.env.APP_USER);
    await page.type("#password", process.env.APP_PASS);

    await page.click("#loginButton");
    await page.waitForNavigation({ waitUntil: "networkidle0" });

    await page.goto(`${process.env.APP_URL}/invoice`, {
      waitUntil: ["domcontentloaded", "networkidle0"],
    });

    const buffer = await page.pdf({
      format: "letter",
      printBackground: true,
      margin: { top: "0.5cm", right: "0.5cm", bottom: "0.5cm", left: "0.5cm" },
    });
  } catch (e) {
    console.log(e);
  } finally {
    if (browser !== null) {
      await browser.close();
    }
  }
};
Uploading PDF to S3
We can now generate our PDF file with Puppeteer; next, we are going to configure our function to create a new S3 bucket and upload our file to it.
First, we are going to define in our serverless.yml file the resources for creating and using our S3 bucket.
service: pdf-generator
frameworkVersion: '2'

custom:
  app_url: https://puppeteer-login-demo.vercel.app
  app_user: admin@admin.com
  app_pass: 123456789
  bucket: pdf-files

provider:
  name: aws
  stage: dev
  region: us-east-1
  iam:
    role:
      statements:
        - Effect: Allow
          Action:
            - s3:PutObject
            - s3:PutObjectAcl
          Resource: "arn:aws:s3:::${self:custom.bucket}/*"
  runtime: nodejs12.x
  lambdaHashingVersion: 20201221

functions:
  generate-pdf:
    handler: handler.handler
    timeout: 25
    layers:
      - arn:aws:lambda:us-east-1:764866452798:layer:chrome-aws-lambda:24
    environment:
      APP_URL: ${self:custom.app_url}
      APP_USER: ${self:custom.app_user}
      APP_PASS: ${self:custom.app_pass}
      S3_BUCKET: ${self:custom.bucket}

plugins:
  - serverless-webpack

package:
  individually: true

resources:
  Resources:
    FilesBucket:
      Type: AWS::S3::Bucket
      Properties:
        BucketName: ${self:custom.bucket}
Here we defined the FilesBucket resource that Serverless is going to create, and we also defined the permissions our Lambda has over the bucket; for now, we just need permission to put files.
Now, in our handler.js, we load the AWS library and instantiate a new S3 object.
const AWS = require("aws-sdk");
const s3 = new AWS.S3({ apiVersion: "2006-03-01" });
Now, we just need to save our buffer variable to our S3 bucket.
const s3result = await s3
  .upload({
    Bucket: process.env.S3_BUCKET,
    Key: `${Date.now()}.pdf`,
    Body: buffer,
    ContentType: "application/pdf",
    ACL: "public-read",
  })
  .promise();

await page.close();
await browser.close();

return s3result.Location;
Here we uploaded our file to the bucket, closed our Chromium session, and returned the new file URL.
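As written, the handler returns the bare file URL, which is fine for direct invocations. If you later expose the function through API Gateway, you would typically wrap it in a standard Lambda proxy response instead, something like:
// Hypothetical API Gateway-style response wrapping the uploaded file URL
return {
  statusCode: 200,
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({ url: s3result.Location }),
};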
Deploying to AWS
First, we need to add our AWS credentials to Serverless in order to deploy our functions; please visit the Serverless documentation to select the appropriate auth method for you.
Now, open the package.json file to add our deployment commands.
"scripts": {
  "deploy": "sls deploy",
  "remove": "sls remove"
},
Here we added two new commands, deploy and remove. Open up a terminal and type:
npm run deploy
Now our function is bundled and deployed to AWS Lambda!
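To try it out, you can invoke the function with the Serverless CLI (sls invoke -f generate-pdf) or from another Node script through the AWS SDK. The sketch below assumes the default Serverless naming convention of service-stage-functionName, i.e. pdf-generator-dev-generate-pdf:
// Invoke the deployed function and print the returned PDF URL.
// The function name assumes the default <service>-<stage>-<function> pattern.
const AWS = require("aws-sdk");
const lambda = new AWS.Lambda({ region: "us-east-1" });

lambda
  .invoke({ FunctionName: "pdf-generator-dev-generate-pdf" })
  .promise()
  .then((res) => console.log("PDF URL:", JSON.parse(res.Payload.toString())))
  .catch(console.error);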