Jan Schulte for Outshift By Cisco

Ship smaller Docker images - Best practices

As a follow-up to Building Secure Docker Images for Production, this blog post looks at what you can do to optimize your Docker images for build time and size.

Special thanks to Ed Warnicke for the suggestion and input leading to this blog post.

The problem

You're working on a new microservice. The first production deployment is scheduled. Production deployments require a Docker image.
Time to start building a simple Dockerfile:

FROM rust:1.69.0

WORKDIR /usr/app
# Create a skeleton project so dependencies can be compiled (and cached)
# before the real sources are copied in
RUN USER=root cargo new --bin hello
WORKDIR /usr/app/hello
COPY ./Cargo.toml ./Cargo.toml

# Build dependencies against the dummy main.rs generated by cargo new
RUN cargo build --release
RUN rm src/*.rs
COPY ./src ./src
# Remove the dummy build's artifacts so the real sources get recompiled
RUN rm ./target/release/deps/hello*
RUN cargo build --release
CMD [ "./target/release/hello" ]

The main goal is to get the application to compile and run. We use one of the official Docker images with the toolchain we need already pre-installed.

A quick docker run confirms this image works. But should you push it to a registry as is? Probably not.
A glance at docker images shows why:

$ docker images
REPOSITORY TAG    IMAGE ID     CREATED     SIZE
largeimage latest da536ad251e2 6 weeks ago 2.25GB

This image is over 2GB. That's a lot of bytes for an application that compiles down to a single binary.

If we took inventory of what's in the image, we'd find we're shipping the entire Rust toolchain and any intermediate build files in target/.
Neither is needed in production.
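
One quick way to take that inventory is docker history, which lists every layer in the image together with its size:

$ docker history largeimage

Each row corresponds to one instruction in the Dockerfile, so the oversized layers point straight at the instructions that produced them.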

Large images are problematic

There are many reasons why we don't want to ship large images.

  1. We waste storage space. Even if storage is generally affordable, it's unnecessary.
  2. We ship unnecessary files, such as compilers and intermediate/temporary build files. A production image does not need a compiler; if the application needs to be recompiled, that happens in the CI/CD pipeline.
  3. We take a significant hit on pod startup time. Large images take longer to pull, which slows down scale-up and healing when replicas die.

How Docker builds an image

Before diving into a practical example, we must understand how Docker approaches building images.
We could easily assume Docker images work like virtual machine images: install what we need and ship the resulting large file.

The reality couldn't be more different.
If you run docker build several times in a row, you'll notice the first invocation takes a while to finish, while all subsequent runs only take a fraction of that time.

Why is that?

When Docker builds a new image, it creates it layer by layer.
Most Dockerfile instructions (RUN, COPY, ADD) each produce an additional layer. If an instruction and its inputs haven't changed, Docker reuses the cached layer instead of rebuilding it.

Consider this Dockerfile:

FROM ubuntu:latest
RUN apt-get update
RUN apt-get install -y vim
RUN rm -rf /var/lib/apt/lists/*

We use ubuntu, update the package sources, install vim, and, once completed, clean up the apt package lists.

docker build -t layers .
[+] Building 23.1s (9/9) FINISHED
 => [internal] load .dockerignore                                                                                            0.1s
 => => transferring context: 2B                                                                                              0.0s
 => [internal] load build definition from Dockerfile                                                                         0.1s
 => => transferring dockerfile: 168B                                                                                         0.0s
 => [internal] load metadata for docker.io/library/ubuntu:latest                                                             3.0s
 => [auth] library/ubuntu:pull token for registry-1.docker.io                                                                0.0s
 => [1/4] FROM docker.io/library/ubuntu:latest@sha256:0bced47fffa3361afa981854fcabcd4577cd43cebbb808cea2b1f33a3dd7f508       6.2s
 => [2/4] RUN apt-get update                                                                                                 5.6s
 => [3/4] RUN apt-get install -y vim                                                                                         7.4s
 => [4/4] RUN rm -rf /var/lib/apt/lists/*                                                                                    0.3s
 => exporting to image                                                                                                       0.6s
 => => exporting layers                                                                                                      0.6s
 => => writing image sha256:6dcadf381f8e3b7ee143e818884a9f2a773a23a11bdac78831671b8fcb10d233                                 0.0s
 => => naming to docker.io/library/layers                                                                                    0.0s
$ docker images
REPOSITORY   TAG       IMAGE ID       CREATED          SIZE
layers       latest    6dcadf381f8e   18 seconds ago   179MB

We want to confirm the number of layers present:

$ docker inspect 6dcadf381f8e | jq ".[0].RootFS.Layers"

docker inspect outputs a lot of information, but we're only interested in the number of layers. Therefore, we're using jq to reduce the output to the essential bits and pieces:

[
  "sha256:59c56aee1fb4dbaeb334aef06088b49902105d1ea0c15a9e5a2a9ce560fa4c5d",
  "sha256:c15e21155336e02611e896a2a73e93db8d27c903aa6fff59b1cc5956669b4119",
  "sha256:811f5dabaddf01eb6d50d6d54da46f18acbeaba9925051ad2ac3d69b91af500f",
  "sha256:5f70bf18a086007016e948b04aed3b82103a36bea41755b6cddfaf10ace3c6ef"
]

The result: Four layers.

Let's retry this experiment with the following Dockerfile:

FROM ubuntu:latest
RUN apt-get update && \
    apt-get install -y vim && \
    rm -rf /var/lib/apt/lists/*

Result:

docker inspect fc57434f273e | jq ".[0].RootFS.Layers"
[
  "sha256:59c56aee1fb4dbaeb334aef06088b49902105d1ea0c15a9e5a2a9ce560fa4c5d",
  "sha256:136403671b74fc503d5d4c2e08c8ae99ab461390f448983adb9f1e86197e80cf"
]

We still run the same commands, but we now end up with only two layers!

But why should you care about the number of layers?

Fewer layers pay off in two ways. First, layers are additive: files deleted in a later layer still occupy space in the layer that created them, so a cleanup like rm -rf /var/lib/apt/lists/* only shrinks the image when it runs in the same RUN instruction as the apt-get update that created those files. Second, caching: when a layer near the top of the Dockerfile changes, all the layers that follow are invalidated and have to be rebuilt.
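
Layer ordering matters for the same reason: put the instructions that change least often at the top. Here is a minimal sketch of this idea, reusing the Rust example from earlier (the file names are assumptions):

FROM rust:1.69.0
WORKDIR /usr/app/hello
# Dependency manifests change rarely; copying them first keeps the
# expensive dependency-compilation layer cached across source edits.
COPY Cargo.toml ./
RUN mkdir src && echo "fn main() {}" > src/main.rs && cargo build --release
# Source code changes often; only the layers from here down are rebuilt.
COPY src ./src
RUN touch src/main.rs && cargo build --release
CMD [ "./target/release/hello" ]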

A Case Study

The following content is based on a GitHub issue (govpp#16) and the Dockerfile below:

ARG VPP_VERSION=e416893a597959509c7f667c140c271c0bb78c14
ARG UBUNTU_VERSION=20.04
ARG GOVPP_VERSION=v0.3.5

FROM ubuntu:${UBUNTU_VERSION} as vppbuild
ARG VPP_VERSION
RUN apt-get update
RUN DEBIAN_FRONTEND=noninteractive TZ=US/Central apt-get install -y git make python3 sudo asciidoc
RUN git clone https://github.com/FDio/vpp.git
WORKDIR /vpp
RUN git checkout ${VPP_VERSION}
COPY patch/ patch/
RUN test -x "patch/patch.sh" && ./patch/patch.sh || exit 1
RUN DEBIAN_FRONTEND=noninteractive TZ=US/Central UNATTENDED=y make install-dep
RUN make pkg-deb
RUN ./src/scripts/version > /vpp/VPP_VERSION

#------

FROM vppbuild as version
CMD cat /vpp/VPP_VERSION

#------

FROM ubuntu:${UBUNTU_VERSION} as vppinstall
COPY --from=vppbuild /var/lib/apt/lists/* /var/lib/apt/lists/
COPY --from=vppbuild [ "/vpp/build-root/libvppinfra_*_amd64.deb", "/vpp/build-root/vpp_*_amd64.deb", "/vpp/build-root/vpp-plugin-core_*_amd64.deb", "/vpp/build-root/vpp-plugin-dpdk_*_amd64.deb", "/pkg/"]
RUN VPP_INSTALL_SKIP_SYSCTL=false apt install -f -y --no-install-recommends /pkg/*.deb ca-certificates iputils-ping iproute2 tcpdump iptables; \
    rm -rf /var/lib/apt/lists/*; \
    rm -rf /pkg

#------

FROM ubuntu:${UBUNTU_VERSION} as vpp
COPY --from=vppinstall / /

#------

FROM vpp as vpp-dbg
WORKDIR /pkg/
COPY --from=vppbuild ["/vpp/build-root/libvppinfra-dev_*_amd64.deb", "/vpp/build-root/vpp-dbg_*_amd64.deb", "/vpp/build-root/vpp-dev_*_amd64.deb", "./" ]
RUN VPP_INSTALL_SKIP_SYSCTL=false apt install -f -y --no-install-recommends ./*.deb

#------

FROM golang:1.15.3-alpine3.12 as binapi-generator
ENV GO111MODULE=on
ENV CGO_ENABLED=0
ENV GOBIN=/bin
ARG GOVPP_VERSION
RUN go get git.fd.io/govpp.git/cmd/binapi-generator@${GOVPP_VERSION}

#------

FROM alpine:3.12 as gen
COPY --from=vpp /usr/share/vpp/api/ /usr/share/vpp/api/
COPY --from=binapi-generator /bin/binapi-generator /bin/binapi-generator
COPY --from=vppbuild /vpp/VPP_VERSION /VPP_VERSION
WORKDIR /gen
CMD VPP_VERSION=$(cat /VPP_VERSION) binapi-generator ${PKGPREFIX+-import-prefix ${PKGPREFIX}}

The following sections highlight some interesting aspects you can apply to your next Docker image.

Let's break it down.

Multiple FROM statements

When perusing this file, several lines start with FROM. Traditionally, a Dockerfile contains a single FROM statement at the top; multiple FROM statements mean we're looking at a multi-stage build.
A multi-stage build pays off when a project has a complex build process, but we only want to ship a lean image containing the artifact plus its runtime dependencies.
Multi-stage builds let us split the build process into several stages while shipping only the output of the final stage.
The Dockerfile above heavily uses build stages to optimize and control what ends up in the final image.
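
A handy side effect: individual stages can be built on their own with docker build's --target flag. For instance, to build and run just the version stage defined above (the image tag here is illustrative):

$ docker build --target version -t vpp-version .
$ docker run --rm vpp-version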

Quoting Ed Warnicke:

"Building vpp means bloating an image up with a bunch of build dependencies, build artifacts, etc. Building vpp installation on top of that would lead to a multi-GB image, which is undesirable. So we isolate that work in the 'vppbuild' stage."

Source: govpp#16

Looking at the first stage in the Dockerfile, we see the installation of build dependencies and any other necessary steps of the build process.
Subsequent stages copy only the files they need from their predecessors.

The result: The final Docker image does not contain any trace of the build environment or the installation steps.
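
Applied to the Rust image from the beginning of this post, the same idea might look like this. This is a sketch, not the original article's code; debian:bullseye-slim is assumed as the runtime base because rust:1.69.0 is itself built on Debian bullseye:

# Stage 1: build with the full Rust toolchain
FROM rust:1.69.0 as builder
WORKDIR /usr/app/hello
COPY . .
RUN cargo build --release

# Stage 2: ship only the compiled binary and its runtime libraries
FROM debian:bullseye-slim
COPY --from=builder /usr/app/hello/target/release/hello /usr/local/bin/hello
CMD [ "hello" ]

The Rust toolchain and the target/ directory stay behind in the builder stage, which is never shipped.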

Reusing /var/lib/apt/lists

Looking more closely at the vppinstall stage, we see COPY --from, which copies files and directories from an earlier build stage.
Example:

FROM ubuntu:${UBUNTU_VERSION} as vppinstall
COPY --from=vppbuild /var/lib/apt/lists/* /var/lib/apt/lists/ #(1)
COPY --from=vppbuild [ "/vpp/build-root/libvppinfra_*_amd64.deb", "/vpp/build-root/vpp_*_amd64.deb", "/vpp/build-root/vpp-plugin-core_*_amd64.deb", "/vpp/build-root/vpp-plugin-dpdk_*_amd64.deb", "/pkg/"]

Instead of copying build artifacts, it copies apt indices (the result of apt-get update).

Copying these files saves some time, as we can avoid re-running apt-get update, which usually has to be executed before apt-get install.
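
If you build with BuildKit, a cache mount achieves something similar without copying the lists between stages at all. A sketch, assuming BuildKit is enabled (the installed package is just an example):

# syntax=docker/dockerfile:1
FROM ubuntu:20.04
RUN --mount=type=cache,target=/var/lib/apt/lists,sharing=locked \
    apt-get update && \
    apt-get install -y --no-install-recommends ca-certificates

The package lists persist in a build cache between builds and never end up in an image layer.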

Only install necessary packages when apt-get install'ing

A few things happen behind the scenes when you run apt-get install <package>. To shed some more light on the behavior, open a terminal and run this:

$ docker run --rm -it ubuntu:jammy /bin/bash
root@1a3eebf01ee7:/# apt-get update
<...omitted...>
root@1a3eebf01ee7:/# apt-get install vim
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
  libexpat1 libgpm2 libmpdec3 libpython3.10 libpython3.10-minimal libpython3.10-stdlib libreadline8 libsodium23 libsqlite3-0
  media-types readline-common vim-common vim-runtime xxd
Suggested packages:
  gpm readline-doc ctags vim-doc vim-scripts
The following NEW packages will be installed:
  libexpat1 libgpm2 libmpdec3 libpython3.10 libpython3.10-minimal libpython3.10-stdlib libreadline8 libsodium23 libsqlite3-0
  media-types readline-common vim vim-common vim-runtime xxd
0 upgraded, 15 newly installed, 0 to remove and 0 not upgraded.

In the output, notice how apt distinguishes between additional packages that will be installed and suggested packages. If we want vim to work correctly, there's no way around the required packages.

Suggested packages, on the other hand, may not be needed.
Besides suggested packages, sometimes we also encounter recommended packages.

Debian defines them as such:

Recommends
    This declares a strong, but not absolute, dependency. The Recommends
    field should list packages that would be found together with this one
    in all but unusual installations.

Suggests
    This is used to declare that one package may be more useful with one
    or more others. Using this field tells the packaging system and the
    user that the listed packages are related to this one and can perhaps
    enhance its usefulness, but that installing this one without them is
    perfectly reasonable.

Source: https://lists.debian.org/debian-mentors/2007/08/msg00037.html

For instance, the package virtualbox comes with recommended packages.

While this behavior can make sense in a desktop environment, it bloats Docker images.
To make sure we only install what is needed, we can use --no-install-recommends:

RUN VPP_INSTALL_SKIP_SYSCTL=false apt install -f -y --no-install-recommends /pkg/*.deb ca-certificates iputils-ping iproute2 tcpdump iptables; \
    rm -rf /var/lib/apt/lists/*; \
    rm -rf /pkg

This flag skips recommended packages. (Suggested packages are not installed by default in the first place; apt merely lists them.)
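
If you'd rather not repeat the flag on every install, apt can also be configured image-wide. A sketch; the config file name is arbitrary:

FROM ubuntu:jammy
# Make --no-install-recommends the default for every apt-get install in this image
RUN echo 'APT::Install-Recommends "false";' > /etc/apt/apt.conf.d/90no-recommends
RUN apt-get update && \
    apt-get install -y vim && \
    rm -rf /var/lib/apt/lists/*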

What are your thoughts?

What other steps do you take to ship smaller Docker images?
Share your tips and tricks in the comments.

Top comments (1)

Kyle Quest

Have you tried using SlimToolkit (aka DockerSlim) with your images? It's supposed to be the easy way to minify container images. Always trying to expand support for new application types and designs, so if it doesn't work I'll be happy to help with it. There's a whole bunch of examples here for different application stacks and base images: github.com/slimtoolkit/examples