Today I will tell the story of how we made Jupyter Docker Stacks able to build aarch64 images with few (almost no) compromises.
If you want to read how it is implemented now, you can skip to the last section of this post.
Jupyter Docker Stacks are a set of ready-to-run Docker images containing Jupyter applications and interactive computing tools. You can use a stack image to do any of the following (and more):
- Start a personal Jupyter Server with the JupyterLab frontend (default)
- Run JupyterLab for a team using JupyterHub
- Start a personal Jupyter Notebook server in a local Docker container
- Write your own project Dockerfile
More info in the documentation.
First, a short dive into history
Jupyter Docker Stacks images were first introduced on 2015-07-19 and consisted of three images: minimal-notebook, r-notebook, and scipy-notebook.
From the very beginning these images were not independent: minimal-notebook was the parent image for the r-notebook and scipy-notebook images.
This is how it looked back then:
A lot of things have changed since then and this is how it looks now:
The oldest version of the minimal-notebook image on DockerHub seems to have been pushed on Oct 24, 2015. We won't talk here about switching the CI system, adding new images, or changing the base image. Instead, we will focus on how to make a set of Docker images support a new architecture.
The build process was mostly based on a Makefile. We were simply running docker build commands in the right order on one machine. This is essentially how it looked:
OWNER?=jupyter

ALL_IMAGES:= \
    minimal-notebook \
    scipy-notebook

build/%:
    docker build -t $(OWNER)/$(notdir $@):latest ./$(notdir $@) --build-arg OWNER=$(OWNER)

test/%:
    python3 -m tests.run_tests --short-image-name "$(notdir $@)" --owner "$(OWNER)"

build-test-all: $(foreach I, $(ALL_IMAGES), build/$(I) test/$(I) )

push/%:
    docker push --all-tags $(DARGS) $(OWNER)/$(notdir $@)
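For a sense of how this was driven, a local run would be kicked off roughly like this (a hypothetical invocation of the targets above):

```bash
# Build and test every image in dependency order
make build-test-all OWNER=jupyter

# Push all tags of a single image
make push/minimal-notebook OWNER=jupyter
```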
Since around 2019, our users had been actively asking whether it was possible to build ARM images as well. As of 2022, we've fully accomplished this.
Unique aspects of our Docker images
Before we get to the implementation, there are some specifics of our images that are not common among other Docker images:
- these images are the final product of this repository, so we have to test the images themselves
- we have many images, and they depend on each other
- we do not build our own software; these images mostly install third-party software and configure it
- we tag the images by the version of this software, and we tag images a lot (for example, if we install a specific version of spark in the pyspark-notebook image, we will use the installed version as a tag; a sketch of this follows below)
These things make it more difficult to implement multi-arch images.
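To make the tagging point concrete, here is a minimal sketch of how a version-derived tag can be produced; the command and the spark-${SPARK_VERSION} tag format are illustrative assumptions, not the exact logic of our tagging scripts:

```bash
# Ask the freshly built image which pyspark version it ships,
# then apply that version as an additional tag.
SPARK_VERSION=$(docker run --rm jupyter/pyspark-notebook:latest \
    python -c "import pyspark; print(pyspark.__version__)")
docker tag jupyter/pyspark-notebook:latest "jupyter/pyspark-notebook:spark-${SPARK_VERSION}"
```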
The process
First, we made sure our Docker images were building just fine on the aarch64 platform.
This is an incomplete list of the things we did:
- Do not inherit from an image with a fixed SHA tag; this won't work. Instead of FROM ubuntu:focal-20210609@sha256:376209... it became FROM ubuntu:focal-20210609 in this Pull Request. This is slightly less secure, but it seems to be a working solution. We've also recently moved to a simple FROM ubuntu:focal to automatically get the latest base image whenever we build our images.
- Get rid of architecture hardcoding in the Dockerfiles. This is how we make it work for the micromamba installer:
RUN set -x && \
    arch=$(uname -m) && \
    if [ "${arch}" = "x86_64" ]; then \
        # Should be simpler, see <https://github.com/mamba-org/mamba/issues/1437>
        arch="64"; \
    fi && \
    wget "https://micromamba.snakepit.net/api/micromamba/linux-${arch}/latest"
- Some Python packages we were installing had no pre-built packages for linux-aarch64. We asked upstream to add the missing support.
- When adding this support to an external package took a lot of time, we simply didn't install that package on the aarch64 platform. For example, we still only install r-tidymodels on x86_64:
# `r-tidymodels` is not easy to install under arm
RUN set -x && \
    arch=$(uname -m) && \
    if [ "${arch}" == "x86_64" ]; then \
        mamba install --quiet --yes \
            'r-tidymodels' && \
        mamba clean --all -f -y && \
        fix-permissions "${CONDA_DIR}" && \
        fix-permissions "/home/${NB_USER}"; \
    fi;
- If it is not currently possible to provide the image without substantial changes, we don't provide an aarch64 image. For now, there is no aarch64 tensorflow Linux package available on conda-forge, so we don't provide this image.
The first try
I will briefly describe how we initially implemented multi-arch images and the reasons why it didn’t work well in the end.
Initially, we decided to use the docker buildx build command to build the images, as this is the recommended way to build multi-platform images.
To make this work, we installed QEMU on the GitHub runner and enabled buildx builds.
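Enabling emulated multi-platform builds on a runner roughly comes down to the following commands; this is a general sketch (in CI this is usually done via dedicated setup actions), not the exact steps from our workflow:

```bash
# Register QEMU binfmt handlers so the host can execute foreign-architecture binaries
docker run --privileged --rm tonistiigi/binfmt --install all

# Create and select a buildx builder that supports multi-platform builds
docker buildx create --name multiarch --use
docker buildx inspect --bootstrap
```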
This is how our Makefile looked (a simplified version):
OWNER?=jupyter

MULTI_IMAGES:= \
    base-notebook \
    minimal-notebook

AMD64_ONLY_IMAGES:= \
    r-notebook \
    scipy-notebook

build-multi/%:
    docker buildx build $(DARGS) -t $(OWNER)/$(notdir $@):latest ./$(notdir $@) --build-arg OWNER=$(OWNER)
    docker buildx build $(DARGS) -t build-multi-tmp-cache/$(notdir $@):latest ./$(notdir $@) --build-arg OWNER=$(OWNER) --platform "linux/amd64,linux/arm64"

build-all-multi: $(foreach I, $(MULTI_IMAGES), build-multi/$(I)) $(foreach I, $(AMD64_ONLY_IMAGES), build/$(I))

push-multi/%:
    docker buildx build $(DARGS) $($(subst -,_,$(notdir $@))_EXTRA_TAG_ARGS) -t $(OWNER)/$(notdir $@):latest ./$(notdir $@) --build-arg OWNER=$(OWNER) --platform "linux/amd64,linux/arm64"

push-all-multi: $(foreach I, $(MULTI_IMAGES), push-multi/$(I)) $(foreach I, $(AMD64_ONLY_IMAGES), push/$(I))
There were some limitations to docker buildx build (we were using docker/buildx 0.5.1 back then):
- Can’t
--load
and--push
at the same time - Can’t
--load
multiple platforms
What does the --load option mean?
It means that the built image can be referenced by the docker CLI, for example, when using the docker tag or docker push commands.
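In practice, these limitations force a choice between a locally usable single-platform image and a pushed multi-platform image. A rough sketch with illustrative image names:

```bash
# Build for the current platform and --load the result into the local Docker
# daemon, so it can later be referenced by `docker tag` / `docker push`:
docker buildx build --load -t jupyter/base-notebook:latest ./base-notebook

# Build for several platforms and --push straight to the registry; such a build
# cannot be loaded into the local daemon in one go:
docker buildx build --push --platform "linux/amd64,linux/arm64" \
    -t jupyter/base-notebook:latest ./base-notebook
```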
Workarounds for these limitations:
- We always build a dedicated image for the current platform, named OWNER/<stack>-notebook, so we can always reference that image during tests, etc.
- We also always build a multi-platform image during build-multi. It is inaccessible to docker tag and docker push, but it helps us test the build on the other platform and provides cached layers for later.
- We let push-multi rebuild the multi-platform image with --push. We can rely on the cached layers from build-multi, even though we never tagged the multi-platform image.
A few things went wrong, and they were more or less expected:
- We expected the builds to be slower due to the use of QEMU (but when we started to add aarch64 support for more images, we saw around 10x slowdowns, and our builds were taking ~3 hours).
- We were tagging aarch64 images as if they had the same tags as x86_64 images.
- We did not test aarch64 images.
- We were not creating build manifests for these images.
The worst thing was that no one knew how to overcome these issues with this approach. There were also some things that didn't work and were completely unexpected:
- There is a QEMU bug where a child process hangs when forking, due to a glib allocation issue. As a workaround, set G_SLICE=always-malloc in the QEMU guest, which meant we had to add some workarounds to our Dockerfiles (a minimal sketch follows after this list). You can also take a look at the issue where it first appeared and how it was mitigated in the upstream project.
- Some software doesn't work properly under QEMU. We were not able to run Julia in an emulated environment.
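For illustration only, the workaround boils down to exporting a single environment variable before the step that hangs. The wrapper below is a hypothetical sketch, not the exact change from our Dockerfiles:

```bash
#!/bin/bash
# Run a command with glib's slice allocator disabled, which avoids the
# child-process hang when forking under QEMU emulation.
set -euo pipefail
export G_SLICE=always-malloc
exec "$@"
```

Any build step that hangs under emulation can then be run through such a wrapper.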
So, we decided to use another approach.
How we did it
So we decided to:
- use self-hosted aarch64 GitHub runners and not use QEMU at all
- use simple docker build commands
- use many GitHub jobs to achieve better concurrency
- pass the images between different jobs using the docker save and docker load commands, uploading/downloading the corresponding files with the well-made GitHub actions actions/upload-artifact and actions/download-artifact (a sketch of this hand-off follows after this list)
- use aarch64 runners only for the things that must be done on the aarch64 platform. For example, we build the image, test it, compute tags, and create build manifests on our self-hosted runners, but the actual tagging and pushing of build manifests is done on GitHub-provided runners
- heavily rely on GitHub reusable workflows and local actions, so that there is almost zero code duplication
- make each individual GitHub workflow do as little as possible
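Here is a minimal sketch of that image hand-off between jobs; the image and file names are illustrative, and the artifact upload/download itself is handled by the GitHub actions mentioned above:

```bash
# In the job that builds the image: serialize it to a tarball,
# which is then uploaded with actions/upload-artifact.
docker save jupyter/base-notebook --output base-notebook.tar

# In a later job: after actions/download-artifact has fetched the tarball,
# load the image back into the local Docker daemon.
docker load --input base-notebook.tar
```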
It took me a few months of work, 137 commits and lots of hours reading about Docker and GitHub workflows, runners and actions. If you want to see the implementation, here is the PR.
Before we dive into the details, I would like to point out that there are a few things to know about self-hosted GitHub runners:
- The workflows run on the host machine by default (not inside an isolated environment), so you need to make sure you don't carry the environment over between different runs, especially when a workflow fails. I recommend cleaning your environment at the start of the workflow; for example, we run docker system prune --all --force whenever we need to use docker and its cache from previous runs is not desirable.
- Each self-hosted runner installation only allows one run at a time. You can install several runners on one machine if you want to, but you'll have to deal with build caches and artefacts.
- I would strongly suggest requiring approval for all outside collaborators. Because PRs run in the GitHub self-hosted environment as well, letting them run without approval will harm the security of your project.
We have several GitHub workflows which let us build all our images:
- docker.yml is the root workflow; it runs a set of jobs (which call reusable workflows) and manages their order. This file can be seen both as a config and as a GitHub-based build system.
- docker-build-test-upload.yml is a reusable workflow that downloads the parent image, builds a new image, then tests and uploads it. This is the only place where self-hosted runners are used. We also compute tags and write build manifest files in this workflow. Note: this workflow only manipulates single-platform images.
- docker-tag-push.yml is used to download built images (from GitHub artifacts), tag these images (using the already computed tags), and then push these images (with the correct tags). Note: this workflow only manipulates single-platform images. We also add an arch prefix to all the tags. For example, we push the jupyter/base-notebook:aarch64-python-3.10.8 tag (and most likely, though not guaranteed, jupyter/base-notebook:x86_64-python-3.10.8).
- docker-merge-tags.yml is the place where we actually create multi-platform images and merge tags. To do this, we mostly use docker manifest subcommands (a sketch follows after this list).
- docker-wiki-update.yml downloads build manifests and pushes them to the GitHub Wiki page.
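Merging the per-arch tags essentially looks like this; a minimal sketch using the tags mentioned above, while the merged tag name is an illustrative assumption (the real workflow computes the tag list automatically):

```bash
# Combine the per-architecture tags into a single multi-platform tag...
docker manifest create jupyter/base-notebook:python-3.10.8 \
    jupyter/base-notebook:x86_64-python-3.10.8 \
    jupyter/base-notebook:aarch64-python-3.10.8

# ...and push the resulting manifest list to the registry.
docker manifest push jupyter/base-notebook:python-3.10.8
```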
Now you know the high-level implementation of our build process and how you can build multi-arch Docker images.
Comments
Q: Could you please tell me, is it possible to use this approach when I want to build my image for 3 architectures: x64, arm, and arm64?

A: Yes, you can definitely use this approach.
You will need to add both arm and arm64 self-hosted runners to GitHub.
Here is the documentation, which will guide you through the process.
Then you will need to write a script to merge tags for the different architectures.
github.com/jupyter/docker-stacks/b...
Here is how I made it work in our project; I think you can use it as an example.