In software development and deployment, every minute counts, and time spent waiting on builds or other slow steps adds up quickly. When you use Docker to manage your software builds and deployments, you want the process to be as simple as possible so that your developers don't get bogged down in lengthy builds. That’s why multi-stage builds are such a helpful feature of Docker.
There is rarely one right way to do things in software development; it comes down to finding the best solution for your situation and organization. The same holds for creating container images efficiently, where a number of different techniques exist. One such approach is using multi-stage builds with Docker, which can help reduce the size of your container images. This article explains what multi-stage builds are and how they can help speed up your development process.
Dockerfile
Containers allow you to package an application with all of its necessary parts, such as libraries and other dependencies, and ship it as one unit. The whole application can be built into an image and pushed to an image registry such as Docker Hub. To create an image, you need a Dockerfile. A Dockerfile is a plain text document that contains the instructions for creating a Docker image. It usually starts with a FROM instruction, which names the base image that your image builds on. After that, you add your own customizations, depending on the application you are working on. The Docker Engine reads the Dockerfile and executes the instructions in order. A well-written Dockerfile produces an image that can be deployed quickly and carries as few dependencies as possible.
Docker Build
Docker is a containerization platform that allows developers to create portable, self-sufficient containers. The Docker build process starts from a base image, which forms the bottom layers of the final image and typically contains a minimal operating system userland plus the packages it ships with. Subsequent instructions in the Dockerfile, such as copying files or installing packages, add layers on top of that base. The Dockerfile specifies all of these steps and serves as the input to the docker build command, which creates an image from it. The command can be run with the -t option to give the resulting image a name and a tag, for example to mark a version.
docker build is thus a single command that generates an image with the configuration and dependencies specified in the Dockerfile.
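As a minimal sketch of the tagging step described above, the helper below wraps docker build with the -t option. The function name build_image and the example values myapp and 1.0 are placeholders chosen for illustration; they are not part of the original article.

```shell
# Build the Dockerfile in the current directory and tag the result.
# Usage (hypothetical values): build_image myapp 1.0
build_image() {
  name="$1"
  tag="$2"
  # -t sets the image name and tag; the trailing "." is the build context
  docker build -t "${name}:${tag}" .
}
```

Calling build_image myapp 1.0 from the directory that holds the Dockerfile is equivalent to running docker build -t myapp:1.0 . by hand.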
Multi-Stage Docker Builds
Every microservice should be its own separate container. If you only use a single-stage Docker build, you’re missing out on some powerful features of the build process. On the other hand, a multi-stage Docker build has many advantages over a single-stage build for deploying microservices.
A multi-stage build lets you break the process of building a Docker image into multiple stages within a single Dockerfile. Each stage begins with its own FROM instruction and can use a different base image. Typically, the first stage contains everything needed to build your application, such as compilers and build tools. A later stage then starts from a smaller base image and copies in only the build output, using COPY --from, before configuring it for deployment. Only the final stage ends up in the resulting image, so the image includes just the dependencies that are necessary to run the application. This is why multi-stage builds are commonly used to optimize container images and make them smaller.
As mentioned above, multi-stage builds let you create optimized Docker images with only the dependencies necessary to run your application. Combined with Docker’s layered images, this can save significant space on your Docker host and in the image registry. Smaller images are also quicker to push, pull and deploy than images that carry everything needed to build your application.
Maintaining two Dockerfiles, one for development and one for production, is not considered ideal in the DevOps world. This is where multi-stage Docker builds come in handy: you can keep one optimized Dockerfile for all environments (Dev, Staging and Production).
Multi-Stage Docker Build Examples
Java Example
To understand the concept of multi-stage Docker builds better, let us consider a simple Java Hello World application.
Add the following code to a file named HelloWorld.java:
class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello world!");
    }
}
Then, create a Dockerfile with the following content:
FROM openjdk:11-jdk
COPY HelloWorld.java .
RUN javac HelloWorld.java
CMD java HelloWorld
Build the image with the following command:
docker build -t helloworld:huge .
Let’s modify our Dockerfile with the following content to show how a multi-stage Docker build works:
FROM openjdk:11-jdk AS build
COPY HelloWorld.java .
RUN javac HelloWorld.java
FROM openjdk:11-jre AS run
COPY --from=build HelloWorld.class .
CMD java HelloWorld
Build the image with the following command:
docker build -t helloworld:small .
Now, let’s compare both images. List the images you created with the following command:
docker images
You should see a clear difference in size between the two images, because helloworld:small is based on the JRE image rather than the full JDK. In this way, you can separate the build and runtime environments in the same Dockerfile. The runtime stage uses the build stage as a source of artifacts [COPY --from=build HelloWorld.class .] and leaves everything else behind. This helps minimize the size of Docker images.
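To make the size comparison easier to read, the listing can be narrowed to one repository and reduced to the name, tag and size columns. The sketch below uses the --format option of docker images; the function name image_sizes is a placeholder introduced here for illustration.

```shell
# Print "repository:tag size" for every local image in one repository.
# Usage: image_sizes helloworld
image_sizes() {
  repo="$1"
  # --format takes a Go template; .Repository, .Tag and .Size are
  # placeholders supported by "docker images"
  docker images --format '{{.Repository}}:{{.Tag}} {{.Size}}' "$repo"
}
```

Running image_sizes helloworld after both builds should list helloworld:huge and helloworld:small next to their sizes.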
Node.js Example
Let’s look at a simple Node.js application that has a simple Dockerfile:
FROM node:14-alpine
WORKDIR /app
COPY package.json .
RUN npm install --production
COPY . .
EXPOSE 3002
CMD [ "node", "app.js" ]
Let’s build the image with the following command:
docker build -t [DockerHub username]/[image name]:[tag] .
Push the image to Docker Hub with the command:
docker push [DockerHub username]/[image name]:[tag]
I pushed the image to Docker Hub, where its size shows as 48.81 MB.
Now, let’s apply the multi-stage approach and modify our existing Dockerfile:
FROM node:14-alpine AS base
WORKDIR /app
COPY package.json .
RUN npm install
COPY . .
FROM alpine:latest
# Plain alpine has no Node.js runtime, so install it (the version may differ from 14)
RUN apk add --no-cache nodejs
COPY --from=base /app /app
WORKDIR /app
EXPOSE 3002
CMD [ "node", "app.js" ]
Let’s build and push this image with commands similar to the ones used above. Just make sure to give the image a different name.
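Now, compare the image sizes. The image built from the usual Dockerfile is 48.81 MB, and the one created with the multi-stage Docker build is 7.12 MB. Keep in mind that the 7.12 MB figure was measured with a final stage based on plain alpine, which does not include the Node.js runtime needed to run the app; once Node.js is installed in the final stage, the image will be larger than that. Even so, the multi-stage approach lets you leave build-time tooling out of the final image.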
Another scenario where multi-stage Docker builds are useful is when you want to split the Dockerfile for different environments.
A normal Dockerfile looks as below:
FROM node:14-alpine
WORKDIR /src
COPY package.json package-lock.json /src/
RUN npm install --production
COPY . /src
EXPOSE 3000
CMD ["node", "bin/www"]
We will create three simple stages from the above Dockerfile.
- Base stage: This stage holds what all environments share with the original Dockerfile
- Production stage: This stage adds what is useful for the production environment
- Dev stage: This stage adds components useful for the Dev environment
The modified Dockerfile looks as below:
FROM node:14-alpine as base
WORKDIR /src
COPY package.json package-lock.json /src/
EXPOSE 3000
FROM base as production
ENV NODE_ENV=production
RUN npm ci
COPY . /src
CMD ["node", "bin/www"]
FROM base as dev
ENV NODE_ENV=development
RUN npm install -g nodemon && npm install
COPY . /src
CMD ["nodemon", "bin/www"]
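The Dockerfile above defines both a production and a dev stage, but docker build builds the last stage (dev here) unless told otherwise. The sketch below selects a stage with the --target option. The function name build_stage and the image name myapp are placeholders for illustration and do not come from the original example.

```shell
# Build one named stage of a multi-stage Dockerfile and tag the image
# after that stage. "myapp" is a hypothetical image name.
# Usage: build_stage production
build_stage() {
  stage="$1"
  # --target stops the build at the given stage, which becomes the image
  docker build --target "$stage" -t "myapp:${stage}" .
}
```

With this helper, build_stage production yields myapp:production and build_stage dev yields myapp:dev, both from the same Dockerfile.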
Some notable advantages of using a multi-stage build:
- Optimizes the overall size of the Docker image
- Removes the burden of maintaining multiple Dockerfiles for different environments
- Makes it easy to debug a particular build stage
- Lets you use a previous stage as the base for a new stage
- Lets you reuse cached layers to make the overall build quicker
- Reduces the risk of vulnerabilities, since a smaller image contains fewer packages
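To illustrate the debugging point from the list above, one option is to build an intermediate stage on its own and open a shell inside it. This is only a sketch: the function name debug_stage and the debug: image prefix are made up for this example, and it assumes the stage's base image ships the sh shell.

```shell
# Build a single stage and start an interactive shell in it for inspection.
# The container is removed on exit (--rm).
# Usage: debug_stage build
debug_stage() {
  stage="$1"
  docker build --target "$stage" -t "debug:${stage}" . &&
    docker run --rm -it "debug:${stage}" sh
}
```

For the Java example, debug_stage build would drop you into the JDK stage, where you can check that HelloWorld.class was produced before it is copied into the runtime stage.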
Deploying Applications with Harness
Sign up for a free trial of Harness and select the TryNextGen tab for a seamless experience. Create a new project and select the Continuous Delivery module. Start creating a new pipeline, and add all the details that your pipeline needs.
Note: For Harness to do its magic, you need something called a ‘Delegate’ to be running on your Kubernetes cluster.
What is Harness Delegate?
The Harness Delegate is a service/software you need to install/run on the target cluster [Kubernetes cluster in our case] to connect your artifacts, infrastructure, collaboration, verification and other providers with the Harness Manager. When you set up Harness for the first time, you install a Harness Delegate.
We will not dig deeper into the Delegate in this article, as it deserves a separate blog post of its own. For now, just know that the Delegate performs all deployment operations for you. If you want to know more about the Delegate, you can read here.
Next, specify the service, infrastructure and deployment strategy for your application. Once everything is set, save the configuration and run to deploy the application.
There you go! Once the pipeline runs successfully, you should see your application deployed on the specified Kubernetes cluster. You can verify this with the kubectl command ‘kubectl get pods’. Would you like to try Harness CD? Sign up for the Harness CD free trial.
Conclusion
Multi-stage builds help you build optimized Docker images that can run anywhere. If streamlining software delivery is one of your goals, then you should definitely understand how multi-stage Docker builds work. Deployments can be faster with this approach because smaller images are quicker to transfer, and stages can be reused to save time and effort. Multi-stage builds are a great way to simplify your image creation process and save developers time.
In the cloud-native world, security is of high importance. One excellent benefit of multi-stage Docker builds is that they reduce the number of dependencies and unnecessary packages in the image, which reduces the attack surface. In addition, they keep the image clean and lean by including only the things required to run your application in production. Otherwise, we end up building and pushing large images with vulnerabilities that give attackers an easy way into our applications. Try using multi-stage Docker builds for optimized images and better security. I hope this article helped you learn more about multi-stage Docker builds and why we should use them.
Top comments (1)
In the Node alpine multi-stage build example you provided, you are using only alpine as the base image in the second stage, and at the end you are running a node command. If the base image is only alpine and does not include node, how will it get the node binary to run node commands?