Dustin Ingram for Google Cloud

Using Headless Chrome with Cloud Run

I often see folks trying to use headless Chrome with services like Google Cloud Functions. The phrase "Headless Chrome" might sound very spooky, but it just means the regular Chrome browser, run without a GUI and driven programmatically instead.

Unfortunately, the necessary Chrome binaries are not installed in the Cloud Functions runtime, and there isn't a way to modify the runtime besides installing Python dependencies.

However, one alternative would be to use Cloud Run, which lets you fully customize the runtime, including installing Chrome! So let's do that.

Configuring: Dockerfile

First, we'll create a Dockerfile. This uses the official Python base image, installs some additional dependencies, installs Chrome, and installs the dependencies for our application.



# Use the official Python image.
# https://hub.docker.com/_/python
FROM python:3.7

# Install the libraries Chrome depends on, plus wget to download it below.
# Running update and install in one layer avoids a stale package index.
RUN apt-get update && apt-get install -y wget gconf-service libasound2 libatk1.0-0 libcairo2 libcups2 libfontconfig1 libgdk-pixbuf2.0-0 libgtk-3-0 libnspr4 libpango-1.0-0 libxss1 fonts-liberation libappindicator1 libnss3 lsb-release xdg-utils

# Install Chrome
RUN wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
RUN dpkg -i google-chrome-stable_current_amd64.deb; apt-get -fy install

# Install Python dependencies.
COPY requirements.txt requirements.txt
RUN pip install -r requirements.txt

# Copy local code to the container image.
ENV APP_HOME /app
WORKDIR $APP_HOME
COPY . .

# Run the web service on container startup. Here we use the gunicorn
# webserver, with one worker process and 8 threads.
# For environments with multiple CPU cores, increase the number of workers
# to be equal to the cores available.
CMD exec gunicorn --bind :$PORT --workers 1 --threads 8 main:app


Configuring: requirements.txt

This Dockerfile uses a requirements.txt file that pins specific versions of all our Python dependencies. We'll need selenium as well as the version of the chromedriver-binary project that corresponds to the version of Chrome we've installed:



# requirements.txt

Flask==1.0.2
gunicorn==19.9.0
selenium==3.141.0
chromedriver-binary==77.0.3865.40.0

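Because the Dockerfile downloads whatever the current stable Chrome happens to be, the pinned chromedriver-binary version can drift out of step with it over time. Here is a minimal sketch of a coarse check that compares only the major version numbers. The helper names are hypothetical, and the sample string in the usage note is an assumption about what `google-chrome --version` prints inside the container:

```python
import re


def major_version(version_text):
    """Return the leading major version number found in a version string."""
    match = re.search(r"(\d+)\.\d+", version_text)
    if match is None:
        raise ValueError(f"no version number found in {version_text!r}")
    return int(match.group(1))


def versions_match(chrome_version_output, chromedriver_pin):
    """True when Chrome and the pinned chromedriver share a major version."""
    return major_version(chrome_version_output) == major_version(chromedriver_pin)
```

For example, `versions_match("Google Chrome 77.0.3865.90", "77.0.3865.40.0")` compares 77 with 77. A mismatch is a hint to update the pin in requirements.txt.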

Configuring: main.py

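Finally, we'll write a Python application using Flask and Selenium. It starts a headless Chrome browser when the app loads and returns a screenshot of a page on each request: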



# main.py

from flask import Flask, send_file
from selenium import webdriver
import chromedriver_binary  # Adds chromedriver binary to path

app = Flask(__name__)

# The following options are required to make headless Chrome
# work in a Docker container
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument("--headless")
chrome_options.add_argument("--disable-gpu")
chrome_options.add_argument("--window-size=1024,768")
chrome_options.add_argument("--no-sandbox")

# Initialize a new browser
# `options=` replaces the deprecated `chrome_options=` keyword
browser = webdriver.Chrome(options=chrome_options)


@app.route("/")
def hello_world():
    browser.get("https://www.google.com/search?q=headless+horseman&tbm=isch")
    browser.save_screenshot("spooky.png")
    return send_file("spooky.png")

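One thing to keep in mind: the Dockerfile starts gunicorn with 8 threads, and all of them share the single `browser` object and the single spooky.png file, so two simultaneous requests could step on each other. Below is a minimal sketch of one way to guard against that, assuming we serialize access with a lock and keep the image in memory. `capture_png` and `browser_lock` are hypothetical names, not part of the app above:

```python
import io
import threading

# One lock shared by all request threads, so that only one request
# drives the shared browser at a time.
browser_lock = threading.Lock()


def capture_png(browser, url, lock=browser_lock):
    """Load `url` in `browser` and return the screenshot as an in-memory PNG."""
    with lock:
        browser.get(url)
        # Selenium returns the screenshot as raw PNG bytes, so nothing
        # has to be written to disk.
        png_bytes = browser.get_screenshot_as_png()
    return io.BytesIO(png_bytes)
```

Inside the route, this could be used as `return send_file(capture_png(browser, url), mimetype="image/png")`. The trade-off is that requests queue up behind the lock, which seems reasonable for a single shared browser.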

Testing & Deploying

If we have Docker installed locally, we can run this to test it:



$ docker build -t my_screenshot_service .
$ docker run --rm -p 8080:8080 -e PORT=8080 my_screenshot_service


And view it at http://localhost:8080
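If you'd rather check the local container from a script than from a browser, here is a small sketch that fetches the response and checks that it starts with the PNG file signature. The function names and the 60-second timeout are assumptions chosen for illustration:

```python
import urllib.request

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def looks_like_png(data):
    """Return True if `data` starts with the 8-byte PNG file signature."""
    return data.startswith(PNG_SIGNATURE)


def fetch_screenshot(url="http://localhost:8080", timeout=60):
    """Request the running service and return the response body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()
```

With the container running, `looks_like_png(fetch_screenshot())` should be true if the service returned an image rather than an error page.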

Otherwise, we can deploy it directly to Cloud Run:



$ gcloud builds submit --tag gcr.io/YOUR_PROJECT/my_screenshot_service
$ gcloud beta run deploy my-screenshot-service --image gcr.io/YOUR_PROJECT/my_screenshot_service --region us-central1 --platform managed


And that's it!

A few notes:

  • We're using --no-sandbox so that Chrome can run inside the Docker container. This turns off Chrome's sandbox, so only point such a service at URLs you trust.
  • Be careful when exposing such a service to user input: for example, if the URL we were screenshotting was supplied by the user, they could potentially take a screenshot of any file on the filesystem as well (for instance, via a file:// URL)!
  • Be sure to create a new service account with no permissions and use it as the identity of the service, for better security. See https://cloud.google.com/run/docs/securing/service-identity for an example.
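To make the second note concrete, here is a minimal sketch of a check that could run before handing a user-supplied URL to the browser. It only accepts absolute http and https URLs, which rules out file:// and similar schemes. `is_safe_url` and `ALLOWED_SCHEMES` are hypothetical names, and this is only a first line of defense: it does not stop requests to internal network addresses.

```python
from urllib.parse import urlparse

ALLOWED_SCHEMES = {"http", "https"}


def is_safe_url(url):
    """Accept only absolute http(s) URLs that include a host name."""
    try:
        parsed = urlparse(url)
    except ValueError:
        # urlparse rejects some malformed inputs outright.
        return False
    return parsed.scheme in ALLOWED_SCHEMES and bool(parsed.hostname)
```

A route that takes the URL as a query parameter could return an HTTP 400 response whenever `is_safe_url` is false, and only then call `browser.get`.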

Top comments (15)

kev26

If my local code is on a Windows machine (C:\Users\KEV\pathmycode), how do I set the path in this case? It works when I try it on Linux, but on Windows I can't set it. I'm a newbie, thanks for your help!

Johannes Eklund

I ran into issues when deploying it to Cloud Run. The fix for me was to add "wget" to the installed packages; otherwise Chrome couldn't be downloaded.

So this was the whole line I ran in the Dockerfile:
RUN apt-get install -y wget gconf-service libasound2 libatk1.0-0 libcairo2 libcups2 libfontconfig1 libgdk-pixbuf2.0-0 libgtk-3-0 libnspr4 libpango-1.0-0 libxss1 fonts-liberation libappindicator1 libnss3 lsb-release xdg-utils

Shreyas Ashtamkar

If you are using anything after Python 3.7 you will hit that issue, since those images are based on Debian Bullseye, where that package is unavailable.

Try this:
FROM python:3.8-buster

worked for me.

Alex512-byte

I am getting the following error when I try to implement this, is anyone else?

Update failed with error code BUILD_USER_ERROR
unable to stream build output: The command '/bin/sh -c apt-get install -y wget gconf-service libasound2 libatk1.0-0 libcairo2 libcups2 libfontconfig1 libgdk-pixbuf2.0-0 libgtk-3-0 libnspr4 libpango-1.0-0 libxss1 fonts-liberation libappindicator1 libnss3 lsb-release xdg-utils' returned a non-zero code: 100. Please fix the Dockerfile and try again..
Exited with code 1.

Derek Baker

If anyone is interested in an updated version of this content that outlines some pitfalls involved when running locally (outside of Docker), I threw this repo on GitHub: github.com/derek-baker/scraping-se...

Matti Jensen

Local Docker: I tried python:3.8-buster, which solves the issue that Alex512-byte has.
But running the container still fails:

[2022-01-10 15:59:26 +0000] [1] [INFO] Using worker: threads
/usr/local/lib/python3.8/os.py:1023: RuntimeWarning: line buffering (buffering=1) isn't supported in binary mode, the default buffer size will be used
return io.open(fd, *args, **kwargs)
[2022-01-10 15:59:26 +0000] [9] [INFO] Booting worker with pid: 9
[2022-01-10 15:59:27 +0000] [9] [ERROR] Exception in worker process
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/gunicorn/arbiter.py", line 583, in spawn_worker
worker.init_process()

Dustin, would you be able to update/fix the guide? Thanks!

Ahmet Alp Balkan • Edited

Nice post, thanks! When I deployed this by copy-pasting, I got HTTP 500. Logs show "Memory limit of 256M exceeded with 282M used. Consider increasing the memory limit." You might want to add this argument (and maybe make the URL configurable with ?url=).

chongleichen

This comment is for anyone finding that Chrome keeps crashing inside their container: it could be due to the machine you're running on. For me it was an M1 Mac. I found later that it works fine on Cloud Run itself. There may be some compatibility issues with M1 Macs that I need to figure out.

manish srivastava

Thanks Dustin, for this article

Emre

Saved my day, thanks!

Esteban Beltrán

I did not know which version of Chrome I had installed. Installing chromedriver-binary-auto helped: it automatically detects the required version.

pip install chromedriver-binary-auto

Thanks