Sometimes, using a custom Dockerfile results in massive images. And yes, since an image is a stack of layers that reproduces an isolated environment, the huge size makes sense.
Today, I am going to show you a way to save a lot of space with a simple and cool Docker trick.
Dockerfiles
When it comes to creating a custom image, you need to write a Dockerfile. As you might know, each instruction in that file is a build step, and instructions such as RUN and COPY each add a new layer. Let’s see a quick example.
# Pull the latest Ubuntu image
FROM ubuntu:latest
# Update the apt package lists
RUN apt update
# Install figlet (-y answers the confirmation prompt, since the build is non-interactive)
RUN apt install -y figlet
# Run figlet when the container starts
CMD ["figlet", "-c", "Hello from dockerized ubuntu!"]
If you try to build and run the previous Dockerfile
docker build -t hello .
docker run hello
The output will be the message rendered as a large ASCII-art banner, centered in the terminal.
Well, the output is pretty, but let’s see how much space my hello image uses.
docker image ls
The output is
REPOSITORY TAG IMAGE ID CREATED SIZE
hello latest 3d0b79dcdd1f 10 minutes ago 109MB
That is 109MB just to print a hello world in my console using Ubuntu, which is actually huge. Now think about how much space the Docker image of a real-life application will use.
Single-stage Dockerfile
For this example, we will use a super simple Go REST API. The code is already done, and you can find it here.
Since the scope of this blog post is how to reduce the size of your Docker images, I will not explain the Go code, but I promise to create a post about that in the future.
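If you do not have the repository at hand, here is a minimal sketch of what such an API could look like, just enough to follow along. It is an assumption based on the response shown later in this post, not the actual code from the repository; the names Song, songs, and songsHandler are hypothetical.

```go
package main

import (
	"encoding/json"
	"log"
	"net/http"
)

// Song mirrors the JSON objects returned by the /songs endpoint.
type Song struct {
	ID     int    `json:"id"`
	Name   string `json:"name"`
	Artist string `json:"artist"`
	Album  string `json:"album"`
}

// songs returns hard-coded sample data.
// Assumption: the real project may store or load its data differently.
func songs() []Song {
	return []Song{
		{ID: 1, Name: "Alabaster", Artist: "Foals", Album: "Total Life Forever"},
		{ID: 2, Name: "Bravery", Artist: "Human Tetris", Album: "River Pt. 1"},
		{ID: 3, Name: "Lately", Artist: "Metronomy", Album: "Metronomy Forever"},
		{ID: 4, Name: "Paranoid Android", Artist: "Radiohead", Album: "OK Computer"},
	}
}

// songsHandler answers GET /songs with the song list encoded as JSON.
func songsHandler(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodGet {
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	if err := json.NewEncoder(w).Encode(songs()); err != nil {
		log.Printf("encoding songs: %v", err)
	}
}

func main() {
	http.HandleFunc("/songs", songsHandler)
	log.Println("listening on :8080")
	// ListenAndServe blocks until the server fails, so log.Fatal reports the reason.
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

To build it with the Dockerfiles in this post, the file needs to sit next to a go.mod (for example, one created with go mod init), because the build runs go mod download.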
In the repository for this post you will find two Dockerfiles. Let’s take a look at Dockerfile first:
FROM golang:latest
WORKDIR /server
COPY . .
RUN go mod download
RUN go build -o server .
CMD ["./server"]
The image built from this Dockerfile will:
- Pull the latest golang Docker image
- Create a directory named /server and make it the working directory
- Copy all the files in the current directory into our container
- Execute go mod download to install all the project’s dependencies
- Execute go build -o server . to build the app and name the binary server
- Finally, execute the server binary every time we run a container from this image
Quite straightforward, right? Let’s build the image and take a look at its size. For this, you can run the following commands (Docker image names must be lowercase, so I tag it goserver).
docker build -t goserver .
docker image ls
The output should be something similar to
REPOSITORY TAG IMAGE ID CREATED SIZE
hello latest 3d0b79dcdd1f 12 minutes ago 109MB
goserver latest 34e3ad5acd86 21 minutes ago 1.02GB
1.02GB is being used to isolate my server environment. It can be shocking, but let’s take a look at what the image contains:
- All my project dependencies
- All Go dependencies
- A lightweight Linux distro
- The code of my app
- The final binary of my app
So, it actually stores many things that we only use once. For example, since I have already compiled my project, storing all the project dependencies, the Go dependencies, and the code of my app is unnecessary.
Then we can get rid of those items and keep only:
- A lightweight Linux distro
- The final binary of my app
But, how do we achieve this?
For this, Docker has a solution: a different strategy for writing Dockerfiles that splits the build into stages, so that the final image keeps only the layers it needs.
Multi-stage Dockerfiles
In the project’s repository you have another Dockerfile, called Dockerfile.multi. The .multi extension is not actually necessary; I just used it to differentiate it from the previous Dockerfile. Let’s take a look at it:
# Build stage
FROM golang:latest AS builder
WORKDIR /build
COPY . .
RUN go mod download
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o server .
# Run stage
FROM alpine:latest
WORKDIR /server
COPY --from=builder /build/server .
CMD ["./server"]
It might look really complex, since it uses FROM twice, but it produces a much smaller image and is easy to read (with practice).
What will this Dockerfile do when we build the image?
- Pull the latest golang Docker image and give it the builder alias
- Create a directory named /build and make it the working directory
- Copy all the files in the current directory into our container
- Execute go mod download to install all the project’s dependencies
- Execute go build -o server . to build the app and name the binary server, with some Go flags and environment variables set (CGO_ENABLED=0 disables cgo so the binary is statically linked, and GOOS=linux targets Linux) so that our binary can run in a different environment, such as the alpine image of the second stage
OK, let’s take a break here. At this point in the Dockerfile, we have:
- Copied the code
- Installed the project dependencies
- Compiled our project
Many of the files that we want to get rid of were used in this first stage, named builder.
Ok, ready? Let's see the Second stage.
- Pull the latest alpine Docker image, a lightweight Linux distro
- Create a directory named /server and make it the working directory
- Copy the final server binary from the previous builder stage, using the --from= flag of COPY
- Finally, execute the server binary
It was a longer road, but it is worth it. Let’s see the space used by our images using the same two commands as before, with a couple of modifications: a new lowercase tag and the -f flag to select Dockerfile.multi.
docker build -t golight -f Dockerfile.multi .
docker image ls
And, the output is
REPOSITORY TAG IMAGE ID CREATED SIZE
hello latest 3d0b79dcdd1f 12 minutes ago 109MB
goserver latest 34e3ad5acd86 21 minutes ago 1.02GB
golight latest 309467e3d24e 6 minutes ago 17.5MB
It worked! Now our server image uses only 17.5MB, even less than the hello image. But why?
Multi-stage Dockerfiles allow us to copy files from specific points in the build of our images into the final image.
Therefore, we can get rid of many unnecessary things and save a lot of disk space without sacrificing functionality.
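To put those numbers in perspective, the saving can be worked out with a few lines of Go. This is only a quick sketch: the helper name savings is mine, and it assumes that the 1.02GB reported by Docker means 1020MB (decimal units).

```go
package main

import "fmt"

// savings returns how many times smaller the light image is
// and the percentage of disk space saved.
func savings(heavyMB, lightMB float64) (ratio, percent float64) {
	ratio = heavyMB / lightMB
	percent = (1 - lightMB/heavyMB) * 100
	return ratio, percent
}

func main() {
	// Sizes taken from the docker image ls output; 1.02GB is assumed to be 1020MB.
	ratio, percent := savings(1020, 17.5)
	// prints: 58.3x smaller, 98.3% of the space saved
	fmt.Printf("%.1fx smaller, %.1f%% of the space saved\n", ratio, percent)
}
```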
Don’t believe me? Let’s take a brief look at the execution of a container using each image.
We will use the heavier image first
docker run -p 8080:8080 goserver:latest
This container creates a web server that listens on :8080. Currently our API only has one endpoint, so let’s perform a GET request to http://127.0.0.1:8080/songs
curl -X GET http://127.0.0.1:8080/songs
And the output should be the following:
[
{
"id": 1,
"name": "Alabaster",
"artist": "Foals",
"album": "Total Life Forever"
},
{
"id": 2,
"name": "Bravery",
"artist": "Human Tetris",
"album": "River Pt. 1"
},
{
"id": 3,
"name": "Lately",
"artist": "Metronomy",
"album": "Metronomy Forever"
},
{
"id": 4,
"name": "Paranoid Android",
"artist": "Radiohead",
"album": "OK Computer"
}
]
OK! The heavy container is working properly and using 1.02GB of my disk. Now let’s try the same using the lighter image (remember to stop the current container first).
docker run -p 8080:8080 golight:latest
This command creates a container from the lighter image this time. The behavior is the same; therefore, we can perform the same GET request.
curl -X GET http://127.0.0.1:8080/songs
And the output is the same as before, give it a try!
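If you prefer to check the endpoint from Go instead of curl, something like the following sketch could do it. The Song type and the helper names decodeSongs and fetchSongs are my own assumptions, shaped after the JSON shown earlier; the program expects one of the two containers to be running on port 8080.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// Song mirrors the JSON objects returned by the /songs endpoint.
type Song struct {
	ID     int    `json:"id"`
	Name   string `json:"name"`
	Artist string `json:"artist"`
	Album  string `json:"album"`
}

// decodeSongs parses the JSON array returned by the endpoint.
func decodeSongs(data []byte) ([]Song, error) {
	var songs []Song
	if err := json.Unmarshal(data, &songs); err != nil {
		return nil, err
	}
	return songs, nil
}

// fetchSongs performs the same GET request as the curl command.
func fetchSongs(url string) ([]Song, error) {
	resp, err := http.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status: %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	return decodeSongs(body)
}

func main() {
	songs, err := fetchSongs("http://127.0.0.1:8080/songs")
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	for _, s := range songs {
		fmt.Printf("%d: %s by %s (%s)\n", s.ID, s.Name, s.Artist, s.Album)
	}
}
```

Run it once against each container: if both images behave the same, the printed list is identical.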
Conclusion
Docker is a great tool and widely used in the market these days, although it has some disadvantages.
Of course, massive images are still lighter than virtual machines, but we can use Docker in a better way to exploit its full potential.
Multi-stage Dockerfiles work best with compiled languages, since you can use one stage to build and compile and another to actually execute the binary, yet they are not limited to them.
Please stay tuned for more Docker posts!