DEV Community

Bryan Ollendyke

Building a screenshot microservice

Let's step the complexity up a bit. We've covered three very simple, straightforward transactions where the relationship is: I give you info, you give me a response.

This use case turned out to be a lot trickier than the others (a bit unexpectedly so): building a service capable of remotely rendering a web page. I've experimented with this concept before, since it's popular among news and media outlets to screenshot things off of Twitter and elsewhere to illustrate a changed post, a deletion, etc.

Steps in thinking along the way toward much success

"I want to take screenshots" I said. Then I thought..

  • What on npm allows me to do this?
  • "omg how big is puppeteer again?!"
  • "wow this works fantastic locally"
  • "wow this works not at all in production"
  • makes new repo
  • Ahh, there it goes, working great in production!

So let's unpack these problems and solutions a bit.

What is puppeteer and why do I need to become one?

Puppeteer is effectively a headless browser runner. As they state, "Most things that you can do manually in the browser can be done using Puppeteer!"

Think of a series of tasks you'd do by hand: opening a browser, typing the URL, waiting for it to load, scrolling and finding something. Puppeteer can be instructed to do those same things in the same order. The website lists lots of common use cases for it, among them testing environments and noticing CSS / layout changes in pull requests.
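To make that "do these things in this order" idea concrete, here's a minimal sketch. It assumes `browser` is an already-launched Puppeteer browser (how it gets launched comes later in this post); the function name `visitAndCapture` is made up for illustration and isn't part of the service.

```javascript
// Sketch only: `browser` is assumed to be an already-launched Puppeteer
// Browser. `visitAndCapture` is a hypothetical name used for illustration.
async function visitAndCapture(browser, url) {
  // 1. open a new tab
  const page = await browser.newPage();
  try {
    // 2. "type the URL" and wait for network activity to settle down
    await page.goto(url, { waitUntil: 'networkidle2', timeout: 60000 });
    // 3. read something off the page, as a person would
    const title = await page.title();
    // 4. take a picture of what's on screen, as a base64 JPEG string
    const image = await page.screenshot({
      type: 'jpeg',
      quality: 75,
      encoding: 'base64',
    });
    return { title, image };
  } finally {
    // 5. close the tab even if one of the steps above failed
    await page.close();
  }
}
```

Each `await` is one of the manual steps; the `finally` is just there so a failed page load doesn't leave a tab hanging around.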

Our requirement was to take screenshots of URLs so that we could provide a preview, effectively indicating that a website is done being built. To be clear, as of the time of writing this has not been wired into our production app, as it's not slated to be added for a bit, but being able to solve this problem via Vercel helped demonstrate capability to the team.

Code me

Using approaches similar to the past three endpoints, the screenshot service looks like this:

import { getBrowserInstance } from '../getBrowserInstance.js';
import { stdResponse, invalidRequest, stdPostBody } from "../requestHelpers.js";

// this requires its own service instance and can't live with the monorepo
// due to the size of the dependencies involved
export default async function handler(req, res) {
  const body = stdPostBody(req);
  // let rather than const, since the URL may be rewritten below
  let urlToCapture = body.urlToCapture;
  // Perform URL validation
  if (!urlToCapture || !urlToCapture.trim()) {
    res = invalidRequest(res, 'enter a valid url');
  }
  else {
    if (!urlToCapture.includes("https://")) {
      // try to fake it
      urlToCapture = `https://${urlToCapture}`;
    }
    // capture options
    var browserGoToOptions = {
      timeout: 60000,
      waitUntil: 'networkidle2',
    };
    var screenshotOptions = {
      quality: body.quality ? parseInt(body.quality, 10) : 75,
      type: 'jpeg',
      encoding: "base64"
    };
    var base64 = '';
    let browser = null
    try {
      browser = await getBrowserInstance();
      let page = await browser.newPage();
      await page.goto(urlToCapture, browserGoToOptions);
      // special support for isolating a tweet
      // NOTE: the string being matched is empty in this listing, so as written
      // this branch always runs; it is meant to match tweet URLs only
      if (urlToCapture.includes('')) {
        await page.waitForSelector("article[data-testid='tweet']");
        const element = await page.$("article[data-testid='tweet']");
        base64 = await element.screenshot(screenshotOptions);
      }
      else {
        screenshotOptions.fullPage = true;
        base64 = await page.screenshot(screenshotOptions);
      }
      res = stdResponse(res,
        {
          url: urlToCapture,
          image: base64
        }, {
          cache: 1800
        }
      );
    } catch (error) {
        res = invalidRequest(res, 'something went wrong', 500);
    } finally {
        // always release the browser, even when the capture failed
        if (browser !== null) {
            await browser.close()
        }
    }
  }
}
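The URL check in the handler is deliberately loose: anything without "https://" gets it prepended, which would also mangle an "http://" address. If stricter input handling were wanted, one possible approach is sketched below. Both `normalizeUrl` and `parseQuality` are hypothetical helpers and not part of the service; the 0 to 100 clamp matches the range Puppeteer accepts for JPEG quality.

```javascript
// Hypothetical helpers, not part of the service above.

// Returns a normalized http(s) URL string, or null if the input can't be used.
function normalizeUrl(input) {
  if (typeof input !== 'string' || !input.trim()) {
    return null;
  }
  let candidate = input.trim();
  // only prepend a scheme when none is present, so "http://..." isn't mangled
  if (!/^[a-z][a-z0-9+.-]*:\/\//i.test(candidate)) {
    candidate = `https://${candidate}`;
  }
  try {
    const parsed = new URL(candidate);
    // refuse anything that isn't a web address (ftp:, file:, and so on)
    if (parsed.protocol !== 'https:' && parsed.protocol !== 'http:') {
      return null;
    }
    return parsed.href;
  } catch (e) {
    // new URL() throws on input it can't parse
    return null;
  }
}

// Returns an integer quality between 0 and 100, or the fallback when the
// value is missing or not a number.
function parseQuality(raw, fallback = 75) {
  const n = Number.parseInt(raw, 10);
  if (Number.isNaN(n)) {
    return fallback;
  }
  return Math.min(100, Math.max(0, n));
}
```

With these, `normalizeUrl('example.com')` gives back `https://example.com/`, and a handler could return the invalid-request response whenever `normalizeUrl` returns null.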

The "magic" here is rolled up in something called getBrowserInstance, seen below:

import chromium from 'chrome-aws-lambda'

export async function getBrowserInstance() {
    const executablePath = await chromium.executablePath
    if (!executablePath) {
        // running locally: load the full puppeteer package on demand
        const puppeteer = await import('puppeteer').then((m) => {
            return m.default;
        });
        return await puppeteer.launch({
            args: chromium.args,
            headless: true,
            defaultViewport: {
                width: 1280,
                height: 720
            },
            ignoreHTTPSErrors: true
        });
    }
    // running in production: use the Chromium build shipped by chrome-aws-lambda
    return await chromium.puppeteer.launch({
        args: chromium.args,
        defaultViewport: chromium.defaultViewport,
        executablePath: executablePath,
        headless: chromium.headless,
        ignoreHTTPSErrors: true
    });
}

This simple function helps reconcile the difference between Vercel in production (aka Lambda-driven calls) and local vercel dev calls (which leverage the local copy of Chromium to do rendering).
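One way to make that local-versus-production decision easy to check without launching a browser is to separate "pick the options" from "launch". The sketch below is a hypothetical refactor, not how the service is actually written; `buildLaunchOptions` is a made-up name, and `chromium` is assumed to expose the same `args`, `defaultViewport`, and `headless` fields that chrome-aws-lambda provides.

```javascript
// Hypothetical refactor: decide on launch options without launching anything.
// `chromium` is assumed to look like the chrome-aws-lambda export
// (args, defaultViewport, headless).
function buildLaunchOptions(executablePath, chromium) {
  if (!executablePath) {
    // no bundled Chromium binary was found, so treat this as a local run
    return {
      mode: 'local',
      options: {
        args: chromium.args,
        headless: true,
        defaultViewport: { width: 1280, height: 720 },
        ignoreHTTPSErrors: true,
      },
    };
  }
  // a binary path was resolved, so treat this as the Lambda environment
  return {
    mode: 'lambda',
    options: {
      args: chromium.args,
      defaultViewport: chromium.defaultViewport,
      executablePath,
      headless: chromium.headless,
      ignoreHTTPSErrors: true,
    },
  };
}
```

The caller would then hand `options` to whichever `launch()` matches `mode`. The only thing that decides the branch is whether an executable path came back.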

Issues encountered

In production, Vercel only allows so much code to be bundled into a function. Because of the size of the packages involved, the screenshot service had to be peeled out of our monorepo and run standalone. This was referenced in a previous post, and it led to the definition in our middleware looking like this:

// screenshot - kept by itself bc of size of getBrowserInstance
  {
    endpoint: "",
    name: "@core/screenshotUrl",
    title: "Screenshot page",
    description: "Takes screenshot of a URL and returns image",
    params: {
      urlToCapture: "full url with https",
      quality: "Optional image quality parameter"
    }
  },

This maps the @core/screenshotUrl call to a very specific address. I'm not a massive fan of this solution, as I'd like it to be more in line with the rest of our URL structure, but it's not the end of the world.
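As a rough illustration of how a name-to-address lookup like this can sit alongside ordinary relative endpoints, here's a sketch. Everything in it is an assumption made for the example: the registry entries, the example.com addresses, and the `resolveEndpoint` name are all hypothetical and are not our actual middleware.

```javascript
// Hypothetical registry and resolver; names and addresses are placeholders.
const registry = [
  // an ordinary endpoint that lives with the rest of the app
  { name: '@core/exampleLocal', endpoint: '/api/example/local' },
  // a service that was peeled out and lives at its own absolute address
  { name: '@core/screenshotUrl', endpoint: 'https://screenshots.example.com/api/screenshotUrl' },
];

// Returns the full address for a registered name, or null if it's unknown.
function resolveEndpoint(name, base) {
  const entry = registry.find((item) => item.name === name);
  if (!entry) {
    return null;
  }
  // new URL() keeps absolute endpoints as they are and resolves
  // relative ones against the base address
  return new URL(entry.endpoint, base).href;
}
```

The caller never needs to know which services live elsewhere; the one entry with an absolute address is the only place the split shows up.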


Here's a demonstration of the screenshot tool, talking through how the code processes the remote URL and even supports isolating tweets on Twitter to show some increased complexity 💪.
