Learn how to significantly reduce the size of your Docker images with practical, step-by-step techniques. We'll take a simple app and apply optimizations like multi-stage builds, choosing the right base image, and more.
Introduction
As developers, we love using Docker for its portability and consistency. However, it's easy to end up with large, bloated Docker images. Why does image size matter?
- Faster CI/CD Pipelines: Smaller images are quicker to build, push, and pull, speeding up your development and deployment cycles.
- Lower Costs: Reduced image size means less storage space used in your container registry and on your servers.
- Improved Security: A smaller image has a smaller attack surface, with fewer packages and libraries for vulnerabilities to hide in.
In this guide, we'll walk through a hands-on lab to demonstrate how to slash your Docker image sizes. We'll start with simple "Hello World" applications in Python and Node.js and apply a series of powerful optimization techniques.
Our Sample Applications
For this lab, we'll use two basic applications.
Python (app.py):
print("Hello, from Python!")
With a requirements.txt that lists no dependencies:
# No dependencies for this simple app
Node.js (app.js):
console.log("Hello, from Node.js!");
And a standard package.json:
{
  "name": "docker-opt-lab",
  "version": "1.0.0",
  "description": "",
  "main": "app.js",
  "scripts": {
    "start": "node app.js"
  },
  "author": "",
  "license": "ISC"
}
The Starting Point: The Unoptimized Image
Let's start with a basic, unoptimized Dockerfile for each application, using Debian-based images: the slim variant for Python and the full image for Node.js.
Python (Dockerfile.python-debian):
FROM python:3.9-slim-buster
WORKDIR /app
COPY . .
CMD ["python", "app.py"]
Node.js (Dockerfile.node-debian):
FROM node:16-buster
WORKDIR /app
COPY . .
CMD ["node", "app.js"]
After building these, we get our baseline sizes. For example, the python-debian image might be around 115MB, and the node-debian image could be about 940MB. Let's see how we can shrink these.
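One way to reproduce these baseline numbers is sketched below. It assumes the Docker CLI is installed, a daemon is running, and the images are tagged python-debian and node-debian. The helper names (build_command, image_size_mb, measure) are made up for this example and are not part of any Docker tooling.

```python
import subprocess


def build_command(tag, dockerfile, context="."):
    """Return the `docker build` argument list for one Dockerfile variant."""
    return ["docker", "build", "-f", dockerfile, "-t", tag, context]


def image_size_mb(tag):
    """Ask the local Docker daemon for an image's size, in decimal MB."""
    result = subprocess.run(
        ["docker", "image", "inspect", "--format", "{{.Size}}", tag],
        check=True, capture_output=True, text=True,
    )
    # `.Size` is reported in bytes.
    return int(result.stdout.strip()) / 1_000_000


def measure(variants):
    """Build each (tag, dockerfile) pair and print its size. Needs Docker."""
    for tag, dockerfile in variants:
        subprocess.run(build_command(tag, dockerfile), check=True)
        print(f"{tag}: {image_size_mb(tag):.0f} MB")


# Usage (requires a running Docker daemon):
# measure([("python-debian", "Dockerfile.python-debian"),
#          ("node-debian", "Dockerfile.node-debian")])
```

The exact figures you see will vary with the base image digest and your platform, so treat the sizes in this guide as ballpark values.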
Technique 1: Choose Your Base Image Wisely (Alpine vs. Debian)
One of the easiest wins in image optimization is choosing a smaller base image. The alpine variants are significantly smaller than their Debian-based counterparts because they're based on Alpine Linux, a minimal Linux distribution.
Let's switch our base images to alpine.
Python (Dockerfile.python-alpine):
FROM python:3.9-alpine
WORKDIR /app
COPY . .
CMD ["python", "app.py"]
Node.js (Dockerfile.node-alpine):
FROM node:16-alpine
WORKDIR /app
COPY . .
CMD ["node", "app.js"]
Just by switching to alpine, the Python image might drop to around 48MB and the Node.js image to 115MB. That's a huge reduction!
Note: While alpine images are small, they use musl libc instead of glibc. This can sometimes lead to compatibility issues with certain packages, so always test your application thoroughly.
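To put the size drop quoted above into numbers, here is a small worked calculation. It uses only the approximate sizes from this guide; the function name percent_reduction is just an illustrative helper.

```python
def percent_reduction(before_mb, after_mb):
    """Size saved, as a percentage of the original size."""
    return (before_mb - after_mb) / before_mb * 100


# Approximate sizes quoted in this guide, in MB: (Debian-based, Alpine-based)
sizes = {"Python": (115, 48), "Node.js": (940, 115)}

for name, (debian, alpine) in sizes.items():
    saved = percent_reduction(debian, alpine)
    print(f"{name}: {debian} MB -> {alpine} MB ({saved:.0f}% smaller)")
```

With these figures, the Python image shrinks by roughly 58% and the Node.js image by roughly 88%.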
Technique 2: Slim Down with Multi-Stage Builds
For applications that have a build step (like installing dependencies or compiling code), multi-stage builds are a game-changer. The idea is to use one stage (the "builder") to perform all the build-related tasks and then copy only the necessary artifacts into a clean, lightweight final stage.
Let's see this with our Node.js application.
Node.js (Dockerfile.node-multistage):
# --- Build Stage ---
FROM node:16-buster AS builder
WORKDIR /app
COPY package.json .
RUN npm install --production
# --- Production Stage ---
FROM node:16-slim
WORKDIR /app
COPY --from=builder /app/node_modules ./node_modules
COPY app.js .
CMD ["node", "app.js"]
How does this work?
- The builder stage uses the full node:16-buster image, which includes npm and all the tools needed to install dependencies.
- We copy package.json and run npm install.
- The production stage starts from a fresh, slim base image (node:16-slim).
- We use COPY --from=builder to copy only the node_modules folder from the builder stage into our final image. We also copy our application code.
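The final image contains only the slim Node.js runtime, the installed node_modules, and app.js, without the compilers and other build tooling that ship in the full image. That makes it far smaller than the ~940MB baseline, though it will not undercut the Alpine image unless the final stage also uses an Alpine base such as node:16-alpine.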
Technique 3: Don't Forget the .dockerignore File
The .dockerignore file works much like .gitignore. It lets you specify files and directories to exclude from the Docker build context, so an instruction like COPY . . cannot pull them into the image. This keeps sensitive files and unnecessary clutter out of your image and makes the context faster to send to the Docker daemon.
A good .dockerignore file might look like this:
.git
.gitignore
Dockerfile*
*.md
README.md
This ensures that your Git history, .md files, and the Dockerfiles themselves aren't copied into the image, keeping it clean and lean.
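As a rough illustration of which files those patterns filter out, the sketch below approximates the matching with Python's fnmatch module. This is an assumption-laden simplification: Docker's real matcher follows Go's filepath.Match rules and also supports ** and ! exceptions, none of which are modeled here, and the file list is a hypothetical project root for this lab.

```python
from fnmatch import fnmatchcase

# Patterns copied from the .dockerignore above.
IGNORE_PATTERNS = [".git", ".gitignore", "Dockerfile*", "*.md", "README.md"]


def is_ignored(name, patterns=IGNORE_PATTERNS):
    """True if a top-level file or directory name matches any pattern."""
    return any(fnmatchcase(name, pattern) for pattern in patterns)


# Hypothetical contents of the project root.
project_files = [
    ".git", ".gitignore", "Dockerfile.python-debian", "Dockerfile.node-alpine",
    "README.md", "app.py", "app.js", "package.json", "requirements.txt",
]

# Only these entries would remain available to COPY.
kept = [name for name in project_files if not is_ignored(name)]
print(kept)
```

Under these assumptions, only app.py, app.js, package.json, and requirements.txt stay in the build context.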
Technique 4: Layering and Cleanup
Each RUN instruction in a Dockerfile creates a new layer. You can reduce the number of layers by chaining commands together with &&. This also allows you to clean up temporary files in the same layer they were created in.
For example, instead of this:
RUN apt-get update
RUN apt-get install -y curl
Do this:
RUN apt-get update && \
    apt-get install -y curl && \
    rm -rf /var/lib/apt/lists/*
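By running rm -rf /var/lib/apt/lists/* in the same RUN command, we delete the package lists that apt-get update downloaded before the layer is committed, which reduces the final image size. Deleting them in a later RUN would not help, because the files would still be stored in the earlier layer.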
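The effect of same-layer cleanup can be shown with a toy model. In the sketch below, an image's size is simply the sum of its layer sizes, and a later layer can hide files but cannot shrink an earlier one. All sizes are made-up illustrative numbers, not measurements, and the image_size helper is hypothetical.

```python
def image_size(layers):
    """Total size in MB of a list of (instruction, layer_size_mb) pairs."""
    return sum(size for _, size in layers)


# Cleanup in its own RUN: the package lists stay stored in the first layer.
separate_runs = [
    ("RUN apt-get update", 18),               # writes the package lists
    ("RUN apt-get install -y curl", 12),      # adds curl
    ("RUN rm -rf /var/lib/apt/lists/*", 0),   # only records the deletion
]

# Cleanup chained into one RUN: the lists are gone before the layer is saved.
single_run = [
    ("RUN apt-get update && apt-get install -y curl && "
     "rm -rf /var/lib/apt/lists/*", 12),
]

print("separate RUNs:", image_size(separate_runs), "MB")
print("single RUN:   ", image_size(single_run), "MB")
```

In this model the three-instruction version keeps the 18MB of package lists forever, while the chained version never stores them.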
Conclusion
Let's look at the potential impact of these optimizations.
- Unoptimized (Debian): ~115MB (Python), ~940MB (Node.js)
- Alpine: ~48MB (Python), ~115MB (Node.js)
- Multi-stage (Node.js): Far smaller than the Debian baseline, depending on dependencies. To go below the Alpine figure, the final stage needs an Alpine base as well.
- With .dockerignore and cleanup: Every little bit helps!
By applying these simple but effective techniques, you can drastically reduce the size of your Docker images. This leads to a more efficient, secure, and cost-effective development process.
Happy optimizing!