DEV Community

How to use Puppeteer inside a Docker container

Axel Navarro on March 30, 2022

Introduction Puppeteer is a Node.js library which provides a high-level API to control Chromium (or Firefox) browsers over the DevTools ...
 
Brian Kwendo • Edited

After running the code, I ran into multiple errors where the browser would not launch, caused by a wrong executablePath, among other things. The code below helped resolve the issue:

            const executablePath: string = await new Promise(resolve => locateChrome((arg: any) => resolve(arg))) || '';

            const browser = await puppeteer.launch({
                executablePath,
                args: ['--no-sandbox', '--disable-setuid-sandbox'],

            });

Goon Nguyen

Where does that locateChrome function come from, though?
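
The thread does not say where locateChrome comes from. As a hedged alternative that needs no extra helper, the sketch below probes a few install locations and returns the first one that exists. Only /usr/bin/google-chrome appears in this thread; the other candidate paths and the name findExecutable are assumptions for illustration.

```typescript
import { existsSync } from "node:fs";

// Hypothetical candidate list: only /usr/bin/google-chrome comes from this
// thread; the other paths are assumptions about common install locations.
const DEFAULT_CANDIDATES: string[] = [
  "/usr/bin/google-chrome",
  "/usr/bin/google-chrome-stable",
  "/usr/bin/chromium",
  "/usr/bin/chromium-browser",
];

// Returns the first path that exists on disk, or undefined if none does.
function findExecutable(
  candidates: string[] = DEFAULT_CANDIDATES
): string | undefined {
  return candidates.find((candidate) => existsSync(candidate));
}

// Usage sketch (assumes puppeteer is installed and imported elsewhere):
//   const executablePath = findExecutable();
//   if (!executablePath) throw new Error("No Chrome binary found");
//   const browser = await puppeteer.launch({ executablePath, args: ["--no-sandbox"] });
```

Failing early with a clear message is easier to debug than the ENOENT error Puppeteer reports when the path is wrong.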

Евгений Косицын
 
Ariel Erviti

Hi there, I know this is an old post, but it's still valid. Here is a config that works for the oraclelinux image, which is based on Red Hat.

FROM oraclelinux:7-slim

RUN yum -y install oracle-nodejs-release-el7 oracle-instantclient-release-el7 wget unzip && \
    yum-config-manager --disable ol7_developer_nodejs\* && \
    yum-config-manager --enable ol7_developer_nodejs16 && \
    yum-config-manager --enable ol7_optional_latest && \
    yum -y install nodejs node-oracledb-node16 && \
    rm -rf /var/cache/yum/*

RUN wget https://dl.google.com/linux/direct/google-chrome-stable_current_x86_64.rpm && \
    yum install -y google-chrome-stable_current_x86_64.rpm

WORKDIR /srv/app/

COPY . /srv/app/.

RUN npm install

EXPOSE 3006

CMD ["node", "index.js"]

And the launch:

        const browser = await puppeteer.launch({
            executablePath: '/usr/bin/google-chrome',
            args: [
                '--disable-gpu',
                '--disable-dev-shm-usage',
                '--disable-setuid-sandbox',
                '--no-sandbox'
            ]
        });
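
The launch flags above can be kept in one place, with the binary path made configurable. This is a minimal sketch, not puppeteer's own API: the LaunchSketch type, buildLaunchOptions, and the CHROME_PATH variable are hypothetical names, and the comments give the usual reasons these flags are passed in containers.

```typescript
// Minimal shape of the options used in this thread; puppeteer's own
// launch options type has many more fields.
interface LaunchSketch {
  executablePath: string;
  args: string[];
}

function buildLaunchOptions(
  env: Record<string, string | undefined>
): LaunchSketch {
  return {
    // CHROME_PATH is a hypothetical variable name chosen for this sketch.
    executablePath: env["CHROME_PATH"] ?? "/usr/bin/google-chrome",
    args: [
      // Chrome's sandbox usually cannot start when running as root in a container.
      "--no-sandbox",
      "--disable-setuid-sandbox",
      // Write shared memory files to /tmp instead of the small default /dev/shm.
      "--disable-dev-shm-usage",
      // No GPU is normally available inside the container.
      "--disable-gpu",
    ],
  };
}

// Usage sketch (assumes puppeteer is installed and imported elsewhere):
//   const browser = await puppeteer.launch(buildLaunchOptions(process.env));
```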
 
Md Rijwan Razzaq Matin • Edited

Hi there,
I used your Dockerfile content along with mine, as I am trying to generate a PDF file for a service that I'm building with TypeScript. Everything works locally, but I can't deploy it to AWS because it exceeds the Lambda limit. Now I am trying to dockerize it; it gets deployed but throws the following error:

"Failed to launch the browser process! spawn /usr/bin/google-chrome ENOENT\n\n\nTROUBLESHOOTING: https://github.com/puppeteer/puppeteer/blob/main/docs/troubleshooting.md\n"

Here is my Dockerfile:

FROM node:slim

# We don't need the standalone Chromium
ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD true

# Install Google Chrome Stable and fonts
# Note: this installs the necessary libs to make the browser work with Puppeteer.
RUN apt-get update && apt-get install gnupg wget -y && \
    wget --quiet --output-document=- https://dl-ssl.google.com/linux/linux_signing_key.pub | gpg --dearmor > /etc/apt/trusted.gpg.d/google-archive.gpg && \
    sh -c 'echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list' && \
    apt-get update && \
    apt-get install google-chrome-stable -y --no-install-recommends && \
    rm -rf /var/lib/apt/lists/*

FROM public.ecr.aws/lambda/nodejs:14.2022.09.09.11
ARG FUNCTION_DIR="/var/task"

# Create function directory
RUN mkdir -p ${FUNCTION_DIR}

# Copy package.json
COPY package.json ${FUNCTION_DIR}

# Install NPM dependencies for function
RUN npm install

# Copy handler function and tsconfig
COPY . ${FUNCTION_DIR}

# Compile ts files
RUN npm run build

# Set the CMD to your handler
CMD [ "dist/src/generate.pdf" ]

And here is my code:

export async function generatePdf(
  file: FileType,
  options?: OptionsProps,
  callback?: CallBackType
) {
  let args = ['--no-sandbox', '--disable-setuid-sandbox', '--disable-gpu'];

  if (options?.args) {
    args = options.args;
    delete options.args;
  }

  const browser = await puppeteer.launch({
    headless: false,
    args: args,
    ignoreDefaultArgs: ['--disable-extensions'],
    executablePath: '/usr/bin/google-chrome',
  });

  const page = await browser.newPage();
 
Axel Navarro • Edited

Hi! You're using a multi-stage build in Docker here. You take the node:slim image and install Chrome there, but then you start a new stage with FROM public.ecr.aws/lambda/nodejs:14. Only the last stage ends up in the final image, and that stage has neither apt nor Chrome, because it is based on Amazon Linux and uses yum as its package manager (like RHEL).

You can check approaches like github.com/shelfio/chrome-aws-lamb... or github.com/alixaxel/chrome-aws-lambda, which explain how to use Puppeteer (pptr) inside Lambdas.

Also, I found this at stackoverflow.com/a/66099373, but I haven't tested it:

FROM public.ecr.aws/lambda/nodejs:14

RUN yum install -y wget unzip libX11

RUN wget https://dl.google.com/linux/direct/google-chrome-stable_current_x86_64.rpm && \
    yum install -y google-chrome-stable_current_x86_64.rpm

RUN CHROME_DRIVER_VERSION=`curl -sS https://chromedriver.storage.googleapis.com/LATEST_RELEASE` && \
    wget -O /tmp/chromedriver.zip https://chromedriver.storage.googleapis.com/$CHROME_DRIVER_VERSION/chromedriver_linux64.zip && \
    unzip /tmp/chromedriver.zip chromedriver -d /usr/local/bin/
 
Md Rijwan Razzaq Matin

I just tried this.

My Dockerfile:

FROM public.ecr.aws/lambda/nodejs:14

RUN yum install -y wget unzip libX11

RUN wget https://dl.google.com/linux/direct/google-chrome-stable_current_x86_64.rpm && \
    yum install -y google-chrome-stable_current_x86_64.rpm

RUN CHROME_DRIVER_VERSION=`curl -sS https://chromedriver.storage.googleapis.com/LATEST_RELEASE` && \
    wget -O /tmp/chromedriver.zip https://chromedriver.storage.googleapis.com/$CHROME_DRIVER_VERSION/chromedriver_linux64.zip && \
    unzip /tmp/chromedriver.zip chromedriver -d /usr/local/bin/

ARG FUNCTION_DIR="/var/task"

# Create function directory
RUN mkdir -p ${FUNCTION_DIR}

# Copy package.json
COPY package.json ${FUNCTION_DIR}

# Install NPM dependencies for function
RUN npm install

# Copy handler function and tsconfig
COPY . ${FUNCTION_DIR}

# Compile ts files
RUN npm run build

# Set the CMD to your handler
CMD [ "dist/src/generate.pdf" ]

And my code:

export async function generatePdf(
  file: FileType,
  options?: OptionsProps,
  callback?: CallBackType
) {
  let args = [
    '--no-sandbox',
    '--disable-setuid-sandbox',
    '--disable-dev-shm-usage',
    '--disable-gpu',
  ];

  if (options?.args) {
    args = options.args;
    delete options.args;
  }

  const browser = await puppeteer.launch({
    args: args,
    executablePath: '/usr/bin/google-chrome',
  });

And I get this error: "Protocol error (Target.setAutoAttach): Target closed."

2022-09-15T05:46:28.039Z    a251301b-87b7-4e34-bf7c-c1d0a42ae6f5    ERROR   ProtocolError: Protocol error (Target.setAutoAttach): Target closed.
    at /var/task/node_modules/puppeteer/lib/cjs/puppeteer/common/Connection.js:104:24
    at new Promise (<anonymous>)
    at Connection.send (/var/task/node_modules/puppeteer/lib/cjs/puppeteer/common/Connection.js:100:16)
    at ChromeTargetManager.initialize (/var/task/node_modules/puppeteer/lib/cjs/puppeteer/common/ChromeTargetManager.js:253:82)
    at Browser._attach (/var/task/node_modules/puppeteer/lib/cjs/puppeteer/common/Browser.js:219:73)
    at Function._create (/var/task/node_modules/puppeteer/lib/cjs/puppeteer/common/Browser.js:201:23)
    at ChromeLauncher.launch (/var/task/node_modules/puppeteer/lib/cjs/puppeteer/node/ChromeLauncher.js:92:50)
    at processTicksAndRejections (internal/process/task_queues.js:95:5)
    at async generatePdf (/var/task/dist/src/helpers/makePdf.js:21:21)
    at async Runtime.pdf [as handler] (/var/task/dist/src/generate.js:21:29) {
  originalMessage: ''
}
 
ChobotX

Any solution to this? Having the exact same error.

Axel Navarro

You should install all these X Window System dependencies in your Docker image:
alsa-lib
atk
cups-libs
ipa-gothic-fonts
libXcomposite
libXcursor
libXdamage
libXext
libXi
libXrandr
libXScrnSaver
libXtst
pango
xorg-x11-fonts-100dpi
xorg-x11-fonts-75dpi
xorg-x11-fonts-cyrillic
xorg-x11-fonts-misc
xorg-x11-fonts-Type1
xorg-x11-utils

Md Rijwan Razzaq Matin

Also, would you have a look here, please? I'm so stuck!
stackoverflow.com/questions/737184...

Md Rijwan Razzaq Matin

I updated my Dockerfile.
I'm using your build stage and copying from it into my own build:

ARG FUNCTION_DIR="/function"
FROM node:slim as build-image
ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD true
ARG FUNCTION_DIR
RUN apt-get update && apt-get install gnupg wget -y && \
    wget --quiet --output-document=- https://dl-ssl.google.com/linux/linux_signing_key.pub | gpg --dearmor > /etc/apt/trusted.gpg.d/google-archive.gpg && \
    sh -c 'echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list' && \
    apt-get update && \
    apt-get install google-chrome-stable -y --no-install-recommends && \
    rm -rf /var/lib/apt/lists/*
RUN mkdir -p ${FUNCTION_DIR}/
COPY . ${FUNCTION_DIR}
RUN ls ${FUNCTION_DIR}
WORKDIR ${FUNCTION_DIR}
RUN npm install

FROM public.ecr.aws/lambda/nodejs:latest
ARG FUNCTION_DIR
WORKDIR ${FUNCTION_DIR}
COPY --from=build-image ${FUNCTION_DIR} ${FUNCTION_DIR}
RUN ls ${FUNCTION_DIR}
COPY package.json ${FUNCTION_DIR}
RUN npm install
COPY . ${FUNCTION_DIR}

RUN npm run build
RUN ls ${FUNCTION_DIR}/node_modules
RUN node node_modules/puppeteer/install.js
CMD [ "/function/dist/api/generate.pdf" ]

But I'm getting this error:
"Failed to launch the browser process! spawn /usr/bin/google-chrome ENOENT\n\n\nTROUBLESHOOTING: https://github.com/puppeteer/puppeteer/blob/main/docs/troubleshooting.md\n"

Axel Navarro

Why aren't you using Amazon ECS?

Md Rijwan Razzaq Matin

Why? So it would work out with ECS but not with Lambda?

Axel Navarro

ECS was made to work with Docker images, and it fits your needs. You can run Docker images on Lambda, but I haven't seen that done before; you may pay more if you use Lambda because the price is based on CPU and RAM consumption.
The resources required to wake up a container are much greater than those needed to just run and interpret some code, so you could end up paying more for resources that don't necessarily translate into performance.

Evans Tjabadi

@navarroaxel - nice article - helped me even though I am using the python version

Arun Sharma

Thanks! It works fine for me.

Cory

Thank you for this, it got me unstuck. Do you know if installing the latest Google Chrome could lead to problems if using an older version of Puppeteer? How to avoid this?

Axel Navarro

Yep, puppeteer is tested with a specific version of Chromium, details here: pptr.dev/chromium-support.

Also in each release you can see that version: github.com/puppeteer/puppeteer/rel....

You can check the available versions by starting a container with:

docker run --rm -it node:18-slim bash

And then these commands for chromium or google-chrome-stable:

$ apt-get update && apt list --all-versions chromium

# Add the apt repo for Google Chrome
$ apt update && apt install curl gnupg -y \
  && curl --location --silent https://dl-ssl.google.com/linux/linux_signing_key.pub | apt-key add - \
  && sh -c 'echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list' \
  && apt update

$ apt list --all-versions google-chrome-stable

For node:18-slim you'll see this output:

google-chrome-stable/stable 104.0.5112.101-1 amd64
chromium/stable-security 104.0.5112.101-1~deb11u1 amd64
chromium/stable 103.0.5060.53-1~deb11u1 amd64

Just look for a puppeteer version that works fine with the given Chromium version.
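
To make that comparison less manual, the major version can be pulled out of a line like the ones in the output above and checked against the Chromium major version listed for your Puppeteer release. This is a minimal sketch with hypothetical function names; the expected major is something you look up yourself in the release notes.

```typescript
// Extracts the major version from strings such as
// "google-chrome-stable/stable 104.0.5112.101-1 amd64".
function chromeMajorVersion(text: string): number | undefined {
  const match = text.match(/(\d+)\.\d+\.\d+\.\d+/);
  return match ? Number(match[1]) : undefined;
}

// True when the installed browser has the major version you expect.
function matchesExpectedMajor(text: string, expectedMajor: number): boolean {
  return chromeMajorVersion(text) === expectedMajor;
}

// Example with the apt output quoted in this thread:
//   chromeMajorVersion("google-chrome-stable/stable 104.0.5112.101-1 amd64") // → 104
```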

Lam Woon Cherk

This saved the day! :)

Andrej Hýll

I made an account just to thank you.

I have been trying to run @unlighthouse and Puppeteer in Docker for two days now, with nothing working. I installed Chrome in twenty different ways with no luck; Puppeteer couldn't spawn Chrome. Miraculously, this exact snippet worked! The dopamine rush and ecstasy were something indescribable.

GreggHume

How ridiculously hard it is to run Puppeteer on a server. My mind is bending right now.

Brandon Copley • Edited

SOLVED: If you're on an M1 Mac, you have to add --platform linux/amd64 to your docker build command.

docker build --platform linux/amd64 . -t your_image_name

When I run this same Dockerfile, I receive the following error:

#5 6.185 E: Unable to locate package google-chrome-stable
------
executor failed running [/bin/sh -c apt-get update && apt-get install gnupg wget -y &&   wget --quiet --output-document=- https://dl-ssl.google.com/linux/linux_signing_key.pub | gpg --dearmor > /etc/apt/trusted.gpg.d/google-archive.gpg &&   sh -c 'echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list' &&   apt-get update &&   apt-get install google-chrome-stable -y --no-install-recommends &&   rm -rf /var/lib/apt/lists/*]: exit code: 100

The entire Dockerfile is:

FROM node:slim

# We don't need the standalone Chromium
ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD true

# Install Google Chrome Stable and fonts
# Note: this installs the necessary libs to make the browser work with Puppeteer.
RUN apt-get update && apt-get install gnupg wget -y && \
  wget --quiet --output-document=- https://dl-ssl.google.com/linux/linux_signing_key.pub | gpg --dearmor > /etc/apt/trusted.gpg.d/google-archive.gpg && \
  sh -c 'echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list' && \
  apt-get update && \
  apt-get install google-chrome-stable -y --no-install-recommends && \
  rm -rf /var/lib/apt/lists/*
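
The --platform fix is consistent with the apt source line in this Dockerfile, which is declared with [arch=amd64]: on an arm64 build (the default on an M1 Mac), apt has no matching google-chrome-stable package to offer. As a small, hedged runtime check, the sketch below maps Node's process.arch value to that constraint. The function names are hypothetical, and the amd64-only assumption is taken from the source line in this thread, not from Google's current packaging.

```typescript
// The apt source in this thread is declared with [arch=amd64], so apt can
// only resolve google-chrome-stable on an amd64 userland (Node reports "x64").
function canInstallChromeDeb(nodeArch: string): boolean {
  return nodeArch === "x64";
}

// Produces a short hint that can be logged at container start.
function describeArch(nodeArch: string): string {
  return canInstallChromeDeb(nodeArch)
    ? "amd64 userland: the google-chrome-stable apt package should resolve"
    : `${nodeArch} userland: rebuild with --platform linux/amd64`;
}

// Usage sketch: console.log(describeArch(process.arch));
```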
 
kamal

Did you fix this issue?

Ivan Jijon • Edited

Wonderful, thanks!
I had to add ENV PUPPETEER_SKIP_DOWNLOAD true to the env variables.
ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD true wasn't enough.

Vivek

Getting this error:
Error: Failed to launch the browser process!
[74:122:0316/214552.693705:ERROR:bus.cc(407)] Failed to connect to the bus: Failed to connect to socket /run/dbus/system_bus_socket: No such file or directory
[0316/214552.821672:ERROR:scoped_ptrace_attach.cc(27)] ptrace: Function not implemented (38)
Assertion failed: p_rcu_reader->depth != 0 (/qemu/include/qemu/rcu.h: rcu_read_unlock: 102)

TROUBLESHOOTING: pptr.dev/troubleshooting

at ChildProcess.onClose (/work/node_modules/@puppeteer/browsers/lib/cjs/launch.js:277:24)
at ChildProcess.emit (node:events:530:35)
at ChildProcess._handle.onexit (node:internal/child_process:294:12)
{"success":false,"data":"{\"status\":\"\",\"headers\":[],\"content\":\"\",\"trace\":\"PUPPET_LOG: INPUT_JSON = {\"url\":\"example.com\",\"user_agent\"...} StartLoading > ERROR > Cannot read properties of undefined (reading 'newPage')\"}"}TypeError: Cannot read properties of undefined (reading 'close')
at closeBrowser (/work/download_page_html.js:330:24)
at killProcess (/work/download_page_html.js:345:8)
at outputResult (/work/download_page_html.js:378:3)
at /work/download_page_html.js:367:4
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)

Vivek

My Dockerfile looks like this:

FROM --platform=linux/amd64 node:20

ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD true
ENV PUPPETEER_SKIP_DOWNLOAD true

RUN apt-get update && apt-get install curl gnupg -y \
    && curl --location --silent dl-ssl.google.com/linux/linux_sign... | apt-key add - \
    && sh -c 'echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list' \
    && apt-get update \
    && apt-get install google-chrome-stable -y --no-install-recommends \
    && rm -rf /var/lib/apt/lists/*
WORKDIR /work

COPY package.json ./

RUN npm install

RUN npx puppeteer browsers install chrome

COPY app.js download_page_html.js crawler-browser.js start-crawler-browser.sh start-download-page-html.sh ./

EXPOSE 3000

And I am using an Apple M1 laptop.

Captain Fim • Edited

When I try to use the docker file above to build an image, I get

#5 7.862 E: Unable to locate package google-chrome-stable

Axel Navarro • Edited

It works for me using this Dockerfile (gist.github.com/navarroaxel/3f4492...). I built it with:

docker build --tag node-chrome .

Are you using another base image instead of FROM node:slim AS app?

Rajesh Pal

If you are working with Puppeteer and suffer from zombie processes, use the Dockerfile below; it will not create zombie processes.

FROM node:18-slim

# -y is needed because a Docker build cannot answer apt's confirmation prompt
RUN apt-get update && apt-get upgrade -y

RUN apt-get update && apt-get install curl gnupg -y \
    && curl --location --silent dl-ssl.google.com/linux/linux_sign... | apt-key add - \
    && sh -c 'echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" >> /etc/apt/sources.list.d/google.list' \
    && apt-get update \
    && apt-get install google-chrome-stable -y --no-install-recommends \
    && rm -rf /var/lib/apt/lists/*

RUN apt-get update && \
    apt-get upgrade -y && apt-get install -y vim

ADD ./puppetron.tar /usr/share/
WORKDIR /usr/share/puppetron

ENV PUPPETEER_SKIP_CHROMIUM_DOWNLOAD=true
ENV SERVICE_PATH=/usr/share/puppetron

CMD node main.js;

And the path of the browser:

executablePath: '/usr/bin/google-chrome',
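
A complementary, code-side precaution (not something the Dockerfile above does by itself) is to make sure the browser is always closed, even when a task throws, so that no Chrome process is left behind by an early return or an exception. The sketch below is a minimal illustration; withBrowser and the Closable interface are hypothetical names, and the helper takes the launch function as a parameter so it does not depend on puppeteer's types.

```typescript
// Minimal interface so the sketch does not depend on puppeteer's types.
interface Closable {
  close(): Promise<void>;
}

// Launches a browser, runs the task, and always closes the browser afterwards.
async function withBrowser<B extends Closable, T>(
  launch: () => Promise<B>,
  task: (browser: B) => Promise<T>
): Promise<T> {
  const browser = await launch();
  try {
    return await task(browser);
  } finally {
    // Runs on success and on error, so the browser process is not left behind.
    await browser.close();
  }
}

// Usage sketch (assumes puppeteer is installed and imported elsewhere):
//   const title = await withBrowser(
//     () => puppeteer.launch({ executablePath: "/usr/bin/google-chrome", args: ["--no-sandbox"] }),
//     async (browser) => {
//       const page = await browser.newPage();
//       await page.goto("https://example.com");
//       return page.title();
//     }
//   );
```

Because the browser only exists inside the helper, a failed launch surfaces as a rejected promise instead of a later call to close on an undefined value.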

septianmuh • Edited

This tutorial is great, but I have an issue: when I try to install Chromium in node:18-alpine3.16, I add a command like this: RUN apk add --no-cache chromium
But it still does not work; Chromium is not installed in the container. Does anyone have a tutorial or some advice?

Jhoan López

Thank you, after so much searching I found the correct solution.

Illés Tamás

What is your solution? Can you post it?

Лев Хотылев

Can anyone share a Dockerfile that works with Puppeteer under Python? I can't find one that works correctly anywhere.

Puria Kordrostami

thanks for updating the article. 💗