Is there anyone who hasn't heard of Puppeteer? It's a powerful browser automation tool that simulates human interaction with web pages. Typical uses include taking website screenshots, generating PDFs, automated testing, web scraping, and monitoring pages for content changes.
In many cases, we need to run Puppeteer online. For instance:
- In a CI/CD pipeline, calling a hosted Puppeteer instance to run automated tests.
- Using cron to regularly check website availability.
- Running large-scale, distributed web crawlers.
Since Puppeteer tasks are usually short-lived and triggered only occasionally, hosting them on an always-on server (like DigitalOcean) is not cost-effective: the server is billed around the clock, whether or not Puppeteer is running.
The ideal approach is to deploy Puppeteer using a serverless model. Because serverless services charge only for actual invocations, this is usually much cheaper for intermittent workloads.
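To make the cost argument concrete, here is a rough back-of-the-envelope sketch. Every price and workload figure in it is a hypothetical placeholder, not any provider's actual rate, so substitute real numbers from the relevant pricing pages before drawing conclusions.

```javascript
// Rough cost comparison: always-on server vs. pay-per-use serverless.
// Every number passed in below is a made-up placeholder, not a real price.

// An always-on server costs the same no matter how much work it does.
function serverMonthlyCost(flatMonthlyPrice) {
  return flatMonthlyPrice;
}

// A serverless platform bills only for the time the function actually runs.
function serverlessMonthlyCost({ invocations, secondsPerInvocation, pricePerSecond }) {
  return invocations * secondsPerInvocation * pricePerSecond;
}

const server = serverMonthlyCost(12); // hypothetical: $12 per month
const serverless = serverlessMonthlyCost({
  invocations: 10000,       // hypothetical: 10,000 screenshots per month
  secondsPerInvocation: 3,  // hypothetical: 3 seconds per screenshot
  pricePerSecond: 0.00005,  // hypothetical: $0.00005 per second of compute
});

console.log(`server: $${server.toFixed(2)}, serverless: $${serverless.toFixed(2)}`);
```

With these placeholder figures the serverless bill stays below the flat server price until roughly 80,000 invocations per month, which is why the model suits occasional workloads in particular.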
Currently, only a few platforms support running Puppeteer in a serverless manner: Leapcell, AWS Lambda, and Cloudflare Browser Rendering.
This article will explore these platforms: how to use them to complete a typical Puppeteer task, and their pros and cons.
The Task
We'll use a common Puppeteer use case as an example: capturing a screenshot of a web page.
The task involves the following steps:
- Visit a specified URL
- Take a screenshot of the page
- Return the image
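In all three implementations below, the URL comes straight from the caller, so it may be worth validating it before handing it to the browser. The following is a minimal sketch of one way to do that; `parseTargetUrl` and the protocol allow-list are assumptions made for illustration and are not part of any of the platforms' examples.

```javascript
// Validate a caller-supplied URL before passing it to page.goto().
// The allow-list of protocols is an assumption made for this sketch.
const ALLOWED_PROTOCOLS = new Set(['http:', 'https:']);

function parseTargetUrl(raw) {
  if (typeof raw !== 'string' || raw.length === 0) return null;
  let parsed;
  try {
    parsed = new URL(raw); // throws a TypeError on malformed input
  } catch {
    return null;
  }
  // Return the normalized URL, or null for protocols such as file: or data:.
  return ALLOWED_PROTOCOLS.has(parsed.protocol) ? parsed.toString() : null;
}

console.log(parseTargetUrl('https://example.com')); // https://example.com/
console.log(parseTargetUrl('file:///etc/passwd'));  // null
console.log(parseTargetUrl('not a url'));           // null
```

A handler could call this first and return an error response when the result is `null`, instead of launching a browser for a request that cannot succeed.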
Leapcell
Code Example:
const puppeteer = require('puppeteer');
const { Hono } = require('hono');
const { serve } = require('@hono/node-server');

// Launch a browser, capture the page at `url`, and return the PNG bytes.
const screenshot = async (url) => {
  const browser = await puppeteer.launch({ args: ['--single-process', '--no-sandbox'] });
  try {
    const page = await browser.newPage();
    await page.goto(url);
    return await page.screenshot();
  } finally {
    // Close the browser even if navigation or capture fails.
    await browser.close();
  }
};

const app = new Hono();

app.get('/', async (c) => {
  const url = c.req.query('url');
  if (url) {
    const img = await screenshot(url);
    return c.body(img, { headers: { 'Content-Type': 'image/png' } });
  } else {
    return c.text('Please add a ?url=https://example.com/ parameter');
  }
});

const port = 8080;
serve({ fetch: app.fetch, port }).on('listening', () => {
  console.log(`Server is running on port ${port}`);
});
Since Leapcell supports deployments in various languages, an ordinary Node.js application like this one meets the requirement easily.
Local Development and Debugging
Local debugging is very straightforward. Run it just like any other Node.js application, with node index.js, and you're done!
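With the server running locally, a screenshot is requested by passing the target page in the `url` query parameter. Because the target is itself a URL, it has to be percent-encoded, or its own query string will be mixed into the request. The sketch below shows one way to build the request address; `buildScreenshotRequest` is a hypothetical helper, and the default base assumes the port 8080 used in the example code.

```javascript
// Build the request URL for the locally running screenshot service.
// The default base address assumes port 8080 from the example code.
function buildScreenshotRequest(targetUrl, base = 'http://localhost:8080/') {
  const request = new URL(base);
  // searchParams.set() percent-encodes the value, so '&' and '=' inside
  // the target URL do not leak into the outer query string.
  request.searchParams.set('url', targetUrl);
  return request.toString();
}

const requestUrl = buildScreenshotRequest('https://example.com/?a=1&b=2');
console.log(requestUrl);

// The service reads the parameter back as the original, undamaged URL.
console.log(new URL(requestUrl).searchParams.get('url'));
```

The resulting address can be opened in a browser or fetched with any HTTP client to receive the PNG.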
Deployment
Deployment requires specifying the build command, run command, and service port (as shown in the screenshot below).
Specific deployment parameters and procedures are detailed in the official documentation.
Once deployed, your application is available online.
Summary
✅ Pros:
- Consistent local and cloud environments, making debugging easier.
- Supports the official Puppeteer library.
❌ Cons:
- Slightly more complex setup: you have to write your own HTTP handler.
AWS Lambda
Code Example:
const chromium = require('chrome-aws-lambda');

exports.handler = async (event) => {
  let browser = null;
  try {
    // Launch the slimmed-down Chromium shipped with chrome-aws-lambda.
    browser = await chromium.puppeteer.launch({
      args: chromium.args,
      defaultViewport: chromium.defaultViewport,
      executablePath: await chromium.executablePath,
      headless: chromium.headless,
    });
    const page = await browser.newPage();
    await page.goto(event.url);
    // page.screenshot() produces PNG data by default.
    const screenshot = await page.screenshot();
    return {
      statusCode: 200,
      headers: { 'Content-Type': 'image/png' },
      body: screenshot.toString('base64'),
      isBase64Encoded: true,
    };
  } catch (error) {
    return {
      statusCode: 500,
      body: 'Failed to capture screenshot.',
    };
  } finally {
    // Always close the browser, whether or not the capture succeeded.
    if (browser !== null) {
      await browser.close();
    }
  }
};
AWS Lambda requires using puppeteer-core along with a third-party Chromium library, such as alixaxel/chrome-aws-lambda.
This is because AWS imposes a 250MB limit on the unzipped size of a Lambda deployment package, layers included. The Chromium bundled with Puppeteer can easily exceed this limit (around 170MB on macOS, 282MB on Linux, and 280MB on Windows), so a slimmed-down version of Chromium must be used.
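To see whether a package would fit before uploading it, the unzipped size of a directory can be measured locally. This is a minimal sketch using only Node.js built-ins; the helper names are hypothetical, and treating 1MB as 1024 × 1024 bytes is an assumption of the sketch.

```javascript
const fs = require('fs');
const path = require('path');

// The 250MB limit quoted above, in bytes (1MB taken as 1024 * 1024 here).
const LAMBDA_LIMIT_BYTES = 250 * 1024 * 1024;

// Recursively sum the size of all regular files under `dir`.
// Symbolic links and other special entries are skipped.
function directorySizeBytes(dir) {
  let total = 0;
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      total += directorySizeBytes(fullPath);
    } else if (entry.isFile()) {
      total += fs.statSync(fullPath).size;
    }
  }
  return total;
}

function fitsLambdaLimit(sizeBytes, limitBytes = LAMBDA_LIMIT_BYTES) {
  return sizeBytes <= limitBytes;
}

// Measure the directory given on the command line, if any.
const target = process.argv[2];
if (target) {
  const size = directorySizeBytes(target);
  console.log(`${target}: ${(size / 1024 / 1024).toFixed(1)} MB, fits: ${fitsLambdaLimit(size)}`);
}
```

Saved under a name of your choice and run with the path to node_modules as its argument, the script reports the size and whether it stays under the limit.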
Local Development and Debugging
Local debugging requires complex configuration due to differences in the runtime environment, as you can see in the documentation for alixaxel/chrome-aws-lambda.
Deployment
To deploy, you need to upload your node_modules as a ZIP file. Depending on your use case, you might also need to configure Lambda Layers. The main business logic can be written directly in the AWS console and run after saving.
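The handler returns the image as a base64 string with isBase64Encoded set to true. When the function is invoked directly, the caller receives that object as-is and has to decode the body itself. The sketch below shows one way to do so and to sanity-check the result; `decodeScreenshot` and `looksLikePng` are hypothetical helpers, and the result object is a stand-in rather than real handler output.

```javascript
// The first eight bytes of every PNG file.
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Turn the handler's return value into raw image bytes.
function decodeScreenshot(result) {
  if (result.statusCode !== 200) {
    throw new Error(`Handler failed with status ${result.statusCode}: ${result.body}`);
  }
  return result.isBase64Encoded
    ? Buffer.from(result.body, 'base64')
    : Buffer.from(result.body);
}

// Check that the decoded bytes start with the PNG signature.
function looksLikePng(bytes) {
  return bytes.length >= PNG_SIGNATURE.length &&
    bytes.subarray(0, PNG_SIGNATURE.length).equals(PNG_SIGNATURE);
}

// Stand-in result object; a real one would come from invoking the handler.
const fakeResult = {
  statusCode: 200,
  isBase64Encoded: true,
  body: PNG_SIGNATURE.toString('base64'),
};

const image = decodeScreenshot(fakeResult);
console.log(looksLikePng(image)); // true
```

The decoded buffer can then be written to disk or passed on to whatever consumes the screenshot.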
Summary
✅ Pros:
- The implementation code is relatively simple.
❌ Cons:
- Relies on a third-party Chromium library, which can introduce potential risks.
- Complex local debugging.
- The deployment process is cumbersome: you have to package and upload a ZIP file, and possibly configure Lambda Layers.
Cloudflare Browser Rendering
Code Example:
import puppeteer from '@cloudflare/puppeteer';

export default {
  async fetch(request, env) {
    const { searchParams } = new URL(request.url);
    let url = searchParams.get('url');
    if (url) {
      url = new URL(url).toString(); // normalize
      const browser = await puppeteer.launch(env.MYBROWSER);
      const page = await browser.newPage();
      await page.goto(url);
      const img = await page.screenshot();
      await browser.close();
      return new Response(img, {
        headers: {
          'content-type': 'image/png',
        },
      });
    } else {
      return new Response('Please add a ?url=https://example.com/ parameter');
    }
  },
};
Cloudflare Browser Rendering is a relatively new serverless Puppeteer solution. Similar to AWS Lambda, it does not support the official Puppeteer library. Instead, it uses a version of Puppeteer provided by Cloudflare.
While Cloudflare's library is more secure than any third-party option, its slow update cycle can be frustrating. For example, it once went more than five months without an update!
Additionally, Cloudflare Browser Rendering has some limitations:
- Only available to users on the Workers Paid plan.
- Each Cloudflare account can create a maximum of 2 browsers per minute, with no more than 2 browsers running concurrently.
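Code that launches browsers has to stay within these two limits or the launch will be rejected. The following is a minimal sketch of how a caller might track them; `BrowserBudget` is a hypothetical helper, its defaults mirror the limits quoted above, and because its state lives in memory it only coordinates callers inside a single process.

```javascript
// Tracks the two limits quoted above: launches per minute and concurrent browsers.
// State lives in memory, so this only coordinates callers inside one process.
class BrowserBudget {
  constructor({ maxPerMinute = 2, maxConcurrent = 2 } = {}) {
    this.maxPerMinute = maxPerMinute;
    this.maxConcurrent = maxConcurrent;
    this.launchTimes = []; // timestamps (ms) of recent launches
    this.running = 0;      // browsers currently open
  }

  // Returns true and records the launch if a browser may be started at `now`.
  tryAcquire(now = Date.now()) {
    // Forget launches that fall outside the 60-second window.
    this.launchTimes = this.launchTimes.filter((t) => now - t < 60000);
    if (this.running >= this.maxConcurrent) return false;
    if (this.launchTimes.length >= this.maxPerMinute) return false;
    this.launchTimes.push(now);
    this.running += 1;
    return true;
  }

  // Call after browser.close() to free a concurrency slot.
  release() {
    if (this.running > 0) this.running -= 1;
  }
}

const budget = new BrowserBudget();
console.log(budget.tryAcquire(0));     // true  (first launch)
console.log(budget.tryAcquire(1000));  // true  (second launch)
console.log(budget.tryAcquire(2000));  // false (two browsers already running)
budget.release();
console.log(budget.tryAcquire(3000));  // false (two launches within the last minute)
console.log(budget.tryAcquire(60500)); // true  (the launch at t=0 has aged out)
```

When `tryAcquire` returns false, the caller can queue the request or respond with a retry hint instead of attempting a launch.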
Local Development and Debugging
Local debugging requires complex configuration.
Deployment
To deploy, simply write the function online, save, and run.
Summary
✅ Pros:
- The implementation code is relatively simple.
❌ Cons:
- Depends on Cloudflare's Puppeteer library, which has an unstable update cycle.
- Complex local debugging.
- A paywall and other restrictions prevent flexible use.
Conclusion
This article compared three major serverless platforms for deploying Puppeteer: Leapcell, AWS Lambda, and Cloudflare Browser Rendering. Each platform has its own advantages and disadvantages.
However, all things considered, if you plan to deploy your Puppeteer project online, Leapcell is an excellent choice.