In this post, you’ll discover how to create small and secure container images using the Docker multi-stage build feature with just one Dockerfile.
Thanks to Docker, creating images has never been simpler: you just put a standard Dockerfile into your source folder, specify a base image, add your code, and build your image by running the docker “build” command, and shazam 🏄! Your container image is built! That’s awesome: we can even dockerize legacy applications more easily, without re-architecting them.
The downside of this simplicity is that it’s easy to build huge containers full of things you really don’t need. Most Docker images use Debian or Ubuntu as the base image. While this is great for compatibility and easy onboarding, these base images can add overhead to your container. For example, the image for a simple hello world app in Node.js is almost 700MB, even though, as you probably already know, the app itself is only a few MB in size. The additional overhead is wasted space and a great hiding place for security vulnerabilities and bugs.
The process for creating images is different depending on whether you are using an interpreted language or a compiled language. So, let’s dive in!
As you know, interpreted languages send the source code through an interpreter that runs the code directly. This gives you the benefit of skipping the compilation step, but it has the downside of requiring you to ship the interpreter along with the code. Luckily, most of these languages offer pre-built Docker images that include a lightweight environment, which lets you run much smaller containers.
Let’s take this Node.js app as an example and, instead of building with the “node:onbuild” Docker base image, use the smaller “node:alpine” version. This version removes many files and programs, leaving only what you need to run your app. Alpine Linux is a small and lightweight Linux distribution that is very popular with Docker users because it’s compatible with a lot of apps, while still keeping containers small. Why Alpine images are so small, and their other advantages, are explained in detail here.
As you can see, going from the default node image to a smaller base image such as Alpine can cut the size of your container by a factor of 10.
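As a rough sketch of what such a Dockerfile might look like: the entry point server.js, the port 3000 and the /app directory are assumptions made for illustration, not taken from the original app.

```dockerfile
# Small Alpine-based Node.js runtime instead of the Debian-based default image
FROM node:alpine

# Assumed working directory inside the container
WORKDIR /app

# Copy the manifest first so the dependency layer is cached between builds
COPY package*.json ./
# Install runtime dependencies only (skips devDependencies)
RUN npm install --omit=dev

# Copy the rest of the application source
COPY . .

# Assumed port and entry point; adjust to your app
EXPOSE 3000
CMD ["node", "server.js"]
```

Copying package.json before the source code means the npm install layer is rebuilt only when the dependencies change.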
Now think about statically compiled languages, where the source code is turned into compiled code beforehand. The compilation step often requires tools that are not actually needed to run the code. As a result, the image ends up 4–10x bigger than what is needed to run our application, not to mention the long time the build takes.
A common workaround is to use the Builder Pattern. It involves using two or more Docker images. The code is built in the first container (development, build tools), and then the compiled code is packaged in the final container (production, runtime) without the compiler and tools required to compile the code. As a drawback, you have to maintain two or more Dockerfiles, which is not ideal, and you need to orchestrate them with additional tools such as bash scripts or YAML files.
At DockerCon 2017, a new feature called multi-stage builds was introduced in Docker (version 17.05 or higher). It lets you create multiple intermediate images from the same Dockerfile. This gives us the benefits of the builder pattern without the hassle of maintaining separate files.
With multi-stage builds you can use multiple FROM statements in your Dockerfile. Each FROM instruction can use a different base image, and each of them begins a new stage of the build. You can selectively copy artifacts from one stage to another using the COPY --from instruction, leaving behind everything you don’t want in the final image.
This is very useful, for example, to keep your application’s build dependencies out of your final image, which gives you a much smaller image. You can read more about multi-stage builds here. Some of the benefits are:
- One Dockerfile
- One syntax to learn
- Same build (development & production)
- Works on a local machine and on a CI server
- Can create multi-stage pipelines
So let’s tackle a Go Dockerfile as an example:
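A minimal sketch of such a Dockerfile follows. It assumes a Go module (go.mod plus a main package) at the root of the build context; the paths /src, /out/app and /app are illustrative choices, and only the stage name build-env comes from this post.

```dockerfile
# Stage 0, named "build-env": contains the full Go toolchain
FROM golang:alpine AS build-env
WORKDIR /src
COPY . .
# Disable cgo so the binary is statically linked and runs on a minimal base image
RUN CGO_ENABLED=0 go build -o /out/app .

# Final stage: a small runtime image that holds only the compiled binary
FROM alpine
# Copy the artifact from the named build stage; the compiler is left behind
COPY --from=build-env /out/app /app
ENTRYPOINT ["/app"]
```

Only the last stage ends up in the tagged image, so the Go toolchain and the source code are not shipped.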
As you can see, it is possible to name the build stages and then reference them. By default, the stages are not named, and you refer to them by number, starting with 0 for the first FROM instruction. You are not limited to copying from stages you’ve created earlier in your Dockerfile. You can use the COPY --from instruction to copy from a separate image, either using the local image name, a tag available locally or on a Docker registry, or a tag ID.
COPY --from=sampleapp:latest …/config.json app/config.json
Having a single binary in the production image is great, but what about development? You will probably need your build dependencies to be present, and it’s recommended to use the same Dockerfile for both production and development. The trick is to use the --target flag of the build command, which lets you specify the stage at which you want to stop your build. For example, the following command assumes you are using the previous Dockerfile but stops at the stage named build-env:
$ docker build --target build-env -t <image:version> .
This can be very useful for debugging a specific build stage, and we can even create pipelines like this:
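One way such a pipeline could look is sketched below. The stage names test and release and all paths are hypothetical, and the sketch again assumes a Go module at the root of the build context.

```dockerfile
# Stage "build-env": compile the application
FROM golang:alpine AS build-env
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /out/app .

# Stage "test": start from the build stage and run the test suite
FROM build-env AS test
RUN CGO_ENABLED=0 go test ./...

# Stage "release": ship only the compiled binary
FROM alpine AS release
COPY --from=build-env /out/app /app
ENTRYPOINT ["/app"]
```

Building with --target test stops after the tests have run, while building without a target produces the release image. Note that when BuildKit is enabled, only the stages the target depends on are built, so the test stage is skipped unless you target it explicitly.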
Now, do small containers actually have a measurable advantage? The answer is yes. Small containers shine in two areas: performance and security. For performance, consider the time it takes to build the image, push it to a registry, and then pull it down from the registry. To see the security improvements, you can scan your containers with, for example, Google’s vulnerability scanning.
Until Docker 17.05, the builder pattern was an effective workaround, but from that version on, multi-stage builds are a great way to create small images. With Docker multi-stage builds we can also create advanced pipelines. The main info was retrieved from here. If you also want to play with the multi-stage feature online, you can use this.
If you have any questions or concerns, or if you simply want to add your thoughts, you are welcome to do so in the comments box below. You can also reach me on LinkedIn 😊
Top comments (3)
What ssh client or package are you using to get that pretty command prompt with those nice colors like in your first screenshot?
Hi Jonathan, I use ZSH! You can take a look here
FROM instruction in the Go Dockerfile screenshot is mentioned as "goland" instead of "golang". Just an observation :)