Stop Self-Hosting Headless Chrome Just to Take a Screenshot

#javascript #webdev #devops #api

Every few months a teammate opens a pull request titled something like "add OG image generation," and every few months it turns into a saga. The feature itself is one line of intent: render a webpage to an image. The implementation is where the weekend goes.

If you have ever shipped screenshots, PDFs, or Open Graph images from a real production environment, you already know the shape of this problem. Let me walk through why it is harder than it looks, and a pattern that keeps it boring.

The trap of "just use Puppeteer"

The first version is always clean. You install Puppeteer, launch a headless browser, navigate to a URL, and capture the buffer.

import puppeteer from "puppeteer";

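const browser = await puppeteer.launch();
let image;
try {
  const page = await browser.newPage();
  // networkidle0: resolve once there have been no network connections for 500 ms
  await page.goto("https://example.com", { waitUntil: "networkidle0" });
  image = await page.screenshot({ type: "png" });
} finally {
  // Always close the browser, or a failed navigation leaks a Chromium process
  await browser.close();
}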

On your laptop this is flawless. Then you deploy it.

In a serverless function, the Chromium binary blows past your bundle size limit, so you reach for a slimmed build. Cold starts now add seconds because the browser has to boot. Under any real concurrency you hit the memory ceiling and functions start getting killed. Text renders in a fallback face or as empty boxes, because the slim runtime image lacks the system fonts, emoji, and CJK glyphs your laptop had, so you ship font files too. And the next time a dependency bump touches the Chromium version, something silently breaks.

None of this is your product. It is infrastructure you did not want to own, sitting on the critical path of a feature your users barely think about.

Treat rendering as an HTTP call

The pattern that has saved me the most time is simple: do not run the browser yourself. Make rendering a stateless HTTP request. You send a URL and a format, you get back bytes.

const res = await fetch("https://api.captureapi.dev/v1/screenshot", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${process.env.CAPTURE_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    url: "https://example.com",
    format: "png",
  }),
});

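// fetch rejects only on network errors, so check the HTTP status explicitly.
if (!res.ok) {
  throw new Error(`Render failed: ${res.status} ${res.statusText}`);
}

const image = await res.arrayBuffer();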

No browser in your bundle. No memory tuning. No cold-start penalty from booting Chromium. The render layer becomes a dependency you call, not a service you babysit.

The three outputs you actually need

In practice, most teams need the same three things, and they need them from one place rather than three separate hacks:

Screenshots for previews, thumbnails, monitoring, and dashboards.
PDFs for invoices, reports, and exportable documents, generated from the same rendered page.
Open Graph images so links to your app look right when shared on social platforms.

Wiring up three different home-grown solutions for these is how you end up with three different sets of Chromium bugs. One endpoint that switches on a format parameter keeps the surface area small.
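As a rough illustration of keeping that surface small, the sketch below builds every request through one function that varies only the format field. The helper name buildRenderRequest and the "pdf" value are assumptions made for this example. Only "png" and the /v1/screenshot endpoint appear in the request shown earlier, so check the API's documentation for the formats it actually accepts.

```javascript
// Hypothetical helper: one request builder for every output type.
// "png" comes from the earlier example; "pdf" is an assumed value.
const SUPPORTED_FORMATS = new Set(["png", "pdf"]);

function buildRenderRequest(url, format, apiKey) {
  if (!SUPPORTED_FORMATS.has(format)) {
    throw new Error(`Unsupported format: ${format}`);
  }
  // new URL() throws a TypeError on malformed input, so bad URLs fail locally.
  const target = new URL(url).href;

  return {
    endpoint: "https://api.captureapi.dev/v1/screenshot",
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ url: target, format }),
    },
  };
}

// Usage (makes a network call, so it is left commented out):
// const { endpoint, init } = buildRenderRequest(
//   "https://example.com", "pdf", process.env.CAPTURE_API_KEY);
// const res = await fetch(endpoint, init);
```

Keeping validation and headers in one place means a new output type is a one-line change to the allowed set rather than a new code path.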

Two things that matter at scale

When you go from one screenshot to thousands, two features stop being nice-to-haves.

Batch processing. When you need to regenerate every OG image after a template change, or screenshot a list of pages on a schedule, firing one request per URL and managing the concurrency yourself reintroduces the exact problem you were trying to escape. Sending a list of URLs in a single batch request pushes that concurrency management to the service.
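To make that cost concrete, here is a minimal sketch of the bookkeeping you take on when you fan out one request per URL yourself. It is client-side code, not the batch API; mapWithConcurrency and renderOne are hypothetical names chosen for this example.

```javascript
// Hypothetical helper: run `worker` over `items` with at most `limit` in flight.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0; // index of the next unclaimed item

  // Each runner claims items one at a time until none are left.
  async function runner() {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i], i);
    }
  }

  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results; // same order as `items`
}

// Usage, assuming a renderOne(url) function that wraps the fetch call:
// const images = await mapWithConcurrency(urls, 5, (url) => renderOne(url));
```

Even this short version leaves out retries, rate-limit handling, and per-item error reporting (the first rejected call rejects the whole run), which is the work a batch endpoint is meant to absorb.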

Edge caching. Many render requests are repeats: the same blog post, the same product page, the same OG image requested again and again. If every one of those triggers a full page render, you are paying for work you already did. Caching the rendered result at the edge lets repeat requests come back fast without re-rendering.
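The principle can be sketched in-process. This is not how CaptureAPI's edge cache is implemented; it only shows the idea of deriving a deterministic key from the render parameters and reusing the result. The names cacheKey and withCache are hypothetical, and a real cache would also need an expiry policy.

```javascript
// Hypothetical sketch: build a deterministic key from the render parameters.
function cacheKey({ url, format, ...options }) {
  // URL parsing lowercases the host and adds a missing trailing slash.
  const normalizedUrl = new URL(url).href;
  // Sort option names so { a, b } and { b, a } map to the same key.
  const sortedOptions = Object.keys(options)
    .sort()
    .map((name) => [name, options[name]]);
  return JSON.stringify([normalizedUrl, format, sortedOptions]);
}

// Wrap any async render function so identical requests render only once.
function withCache(render, cache = new Map()) {
  return function cachedRender(params) {
    const key = cacheKey(params);
    if (!cache.has(key)) {
      // Cache the promise itself so concurrent duplicates share one render,
      // and evict it on failure so errors are not served from the cache.
      const pending = Promise.resolve()
        .then(() => render(params))
        .catch((err) => {
          cache.delete(key);
          throw err;
        });
      cache.set(key, pending);
    }
    return cache.get(key);
  };
}

// Usage, assuming a renderOne(params) function that wraps the fetch call:
// const cachedRender = withCache(renderOne);
// const image = await cachedRender({ url: "https://example.com", format: "png" });
```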

Where this leaves you

The honest takeaway is not that rendering is impossible to self-host; plenty of teams do it. The takeaway is that the time you spend on Chromium binaries, memory limits, font handling, and cache invalidation is time you are not spending on the thing you are actually building.

If you would rather make rendering a single HTTP call and move on, CaptureAPI handles screenshots, PDFs, and Open Graph images from one endpoint, with batch processing and edge caching built in. It is free to try, so you can drop it into a function and see whether "rendering as an HTTP call" feels as boring as it should: https://captureapi.dev

Full disclosure: I build CaptureAPI, the rendering API described above.