Introduction
A few months ago, while working on a critical deployment for a client, we faced an unexpected issue: the deployment took forever to complete. The culprit? Bloated Docker images. The process was not only frustrating but also led to downtime we couldn’t afford.
This experience taught me an important lesson: small changes can make a big impact. By optimizing Docker images, we managed to cut deployment times in half, save storage costs, and improve our CI/CD pipeline's overall efficiency. Today, I’ll share the strategies we used to achieve this transformation.
Why Optimize Docker Images?
If you've ever experienced sluggish builds, long deployment times, or a cluttered registry filled with oversized images, you’re not alone. Here’s why reducing image sizes is crucial:
- Faster Builds: Your development cycles become quicker, letting you focus on what matters.
- Efficient Storage: Smaller images save disk space in your Docker registries and on your machines.
- Quicker Deployments: Deploying a smaller image over a network is much faster.
- Enhanced Security: Fewer components mean fewer vulnerabilities.
The Day We Shrunk Our Docker Images
I remember the first time I ran docker images after our optimization efforts. Seeing the "before" and "after" sizes felt like stepping on the weighing scale after weeks of gym sessions: you notice the difference, and it feels rewarding.
Here are the exact steps we followed to make that transformation happen:
7 Effective Ways to Optimize Docker Images
1. Choose a Minimal Base Image
Instead of starting with ubuntu:latest or other large images, we switched to alpine. This one change reduced the image size from 800MB to less than 30MB.
Example:
FROM alpine:latest
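A variation worth considering is pinning the base image to a specific release so that rebuilds stay reproducible. The sketch below is only an illustration: the 3.20 tag and the ca-certificates package are assumptions, not what we shipped, so substitute whatever release and packages you have actually tested.

```dockerfile
# Pin the base image to a specific release instead of the moving "latest" tag.
# The 3.20 tag is an example only; use the release you have tested.
FROM alpine:3.20

# --no-cache keeps the apk package index out of the image layer.
# ca-certificates is a placeholder for whatever your app really needs.
RUN apk add --no-cache ca-certificates
```

With a pinned tag, a base image update becomes a deliberate change in the Dockerfile rather than something that happens silently on the next build.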
2. Use Multi-Stage Builds
In many projects, such as a React application, we might have build dependencies (like Node.js and npm) that are only required during the build process but not needed in the production image. By using multi-stage builds, we can separate the build environment from the runtime environment, resulting in a much smaller image.
Example:
In this example, we’ll use a multi-stage build for a React app:
# Build Stage
FROM node:16 AS builder
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build
# Runtime Stage
FROM nginx:alpine
COPY --from=builder /app/build /usr/share/nginx/html
CMD ["nginx", "-g", "daemon off;"]
In the above Dockerfile:
- The first stage uses the official node:16 image to install dependencies, build the React app, and generate static files.
- The second stage uses the smaller nginx:alpine image to serve the built React app.
This multi-stage approach ensures that only the necessary build artifacts (the build directory) are included in the final image, keeping the image size minimal and optimized for production.
3. Remove Unnecessary Files
While debugging, we often included temporary files in our builds. By adding a .dockerignore file, we ensured these files never made it into the image.
Example .dockerignore:
node_modules
*.log
.git
4. Combine and Minimize Layers
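Each filesystem-changing instruction in a Dockerfile (e.g., RUN, COPY, ADD) creates a new layer in the Docker image. Too many layers can bloat your image, especially when one layer creates files that a later layer deletes, because the files still take up space in the earlier layer. By combining multiple instructions into a single RUN statement, you can reduce the number of layers and optimize the image.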
Example:
Instead of writing:
RUN apt-get update
RUN apt-get install -y curl vim
RUN apt-get clean
RUN rm -rf /var/lib/apt/lists/*
Combine them into one:
RUN apt-get update && apt-get install -y curl vim \
&& apt-get clean && rm -rf /var/lib/apt/lists/*
This approach minimizes the number of layers and ensures temporary files (e.g., cache) are removed within the same layer, keeping the image smaller and cleaner.
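If the chain of && gets long, a heredoc can express the same single-layer RUN more readably. This is a minimal sketch, assuming BuildKit is enabled and the Dockerfile frontend is version 1.4 or later (the syntax line below requests the latest 1.x). The ubuntu:22.04 base is an assumption added only so the snippet is complete.

```dockerfile
# syntax=docker/dockerfile:1
# Assumed base image, chosen only to make the example self-contained.
FROM ubuntu:22.04

# Everything inside the heredoc runs as one script in one layer.
# "set -e" stops at the first failing command, as the && chain would.
RUN <<EOF
set -e
apt-get update
apt-get install -y curl vim
apt-get clean
rm -rf /var/lib/apt/lists/*
EOF
```

The resulting layer should be equivalent to the combined one-liner above. The difference is readability, which matters more as the list of commands grows.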
5. Avoid Installing Unnecessary Dependencies
Initially, our Docker images had extra libraries "just in case." Over time, we realized that this led to bloated images and unnecessary security risks. By specifying only the dependencies that are actually needed for runtime, we kept the image smaller and more secure.
For example, instead of installing a large number of libraries for every project, we focused on minimal dependencies and avoided unnecessary packages.
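As a rough illustration for a Node.js service, the sketch below installs the full dependency set only in the build stage and production dependencies only in the runtime stage. The build script, the dist directory, and server.js are hypothetical names, and node:16-alpine is used for both stages on the assumption that your native modules work on Alpine.

```dockerfile
# Build stage: devDependencies (bundlers, compilers, test tools) are needed here.
FROM node:16-alpine AS builder
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
# Assumes package.json defines a "build" script that writes to ./dist.
RUN npm run build

# Runtime stage: same base image, so native modules link against the same libc.
FROM node:16-alpine
WORKDIR /app
ENV NODE_ENV=production
COPY package.json package-lock.json ./
# --omit=dev skips devDependencies, so only runtime packages are installed.
RUN npm ci --omit=dev
# "dist" and "server.js" are hypothetical names for the build output.
COPY --from=builder /app/dist ./dist
CMD ["node", "dist/server.js"]
```

The same idea applies to system packages: on Debian or Ubuntu bases, adding --no-install-recommends to apt-get install keeps optional packages out of the image.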
6. Use docker-slim
A game-changer for our process was docker-slim. This tool automatically analyzes your images and reduces their size by removing parts the application does not appear to use, such as unused files, binaries, and libraries. Because the result depends on that analysis, verify that the slimmed image still behaves as expected.
We saw an image size reduction of up to 80% using docker-slim, making it an invaluable tool in our optimization strategy.
Command to slim down an image:
docker-slim build <image-name>
7. Regularly Audit and Prune Images
Docker images accumulate over time, and unused images or layers can take up valuable space. Regularly auditing and pruning unused images helps maintain a clean environment.
You can reclaim this space by running these commands:
Command to remove stopped containers, unused networks, dangling images, and leftover build cache:
docker system prune -f
Command to remove all images not used by any container:
docker image prune -a -f
By incorporating regular pruning into your workflow, you ensure that your Docker environment stays lean and efficient.
Measuring Success
After implementing these optimizations, we used docker images to compare sizes. The results were stunning:
- Before Optimization: 1.2GB
- After Optimization: 250MB
Not only did our deployments become faster, but our cloud storage costs also went down significantly.
Conclusion
Optimizing Docker images might seem like a minor task, but the benefits it brings to your workflows are immense. Whether you’re a solo developer or part of a large team, these strategies can make a real difference.
So, what are you waiting for? Dive into your Dockerfile, start optimizing, and enjoy the perks of leaner, faster deployments.
Top comments (25)
You said moving from ubuntu:latest to alpine:latest "reduced the image size from 800MB to less than 30MB". The current version of Ubuntu (sha fec8bfd95b54) is only 78 MB, and Alpine is 7.8 MB. I think there would have had to be other changes to have such a reduction.
On the fourth point, BuildKit supports heredocs. Rather than endless && \ statements, this would be more readable, especially for more complex runs.
I'd suggest using a somewhat bigger base image in the build phase, then using alpine or slim in the application phase and moving the deps to it. As far as I know, and from experience, directly using a small image at the beginning may cause some dependency installations to fail.
Thank you for sharing your insight! You’re absolutely right—starting with a slightly larger base image in the build phase can help avoid issues with dependency installations, especially when using minimal images like Alpine. Transitioning to a smaller image (like Alpine or a slim variant) in the runtime phase is indeed a smart way to balance compatibility and optimization.
For Node.js projects specifically, using the node image directly in the build phase is a great option since it comes pre-configured for most setups. Similarly, other tech stacks might benefit from base images tailored to their needs during the build phase.
Installing dependencies on a non-Alpine image can pull in library versions that are incompatible with Alpine.
Alpine uses musl libc, whereas other base images use glibc.
Another point: bumping the version in your package.json will invalidate the Docker cache layer.
There are techniques to avoid this, such as a temporary step that sets the package.json version to 1.0, for instance, before install.
Thank you for diving into these technical details—great points!
You’re absolutely correct about the musl libc vs. glibc difference. Alpine’s musl libc can indeed cause compatibility issues with some libraries, which is why choosing the right base image for dependency installation is crucial. Multi-stage builds help here as long as both stages share the same libc: native dependencies built against glibc generally won’t run on Alpine, so build them on an Alpine-based image if the runtime is Alpine, or pair a glibc-based build stage with a glibc-based slim runtime image.
The point about package.json is spot on as well. When the version changes, it invalidates the Docker cache, which can significantly increase build times. I really like the technique you mentioned, where a temporary package.json with a static version is used during installation—it’s a clever way to maintain cache efficiency.
Good article.
Will try to implement some of these techniques at my workplace.
Thank you! Glad you found it helpful—let me know how it works out for you! 🚀
FROM alpine:latest: you should specify the exact version of the alpine image if you don't want to have problems in the future.
You’re absolutely right! Using alpine:latest can lead to unexpected issues if the base image changes, so specifying an exact version is a best practice for production.
However, in multi-stage builds, using latest for the build stage can sometimes be beneficial to leverage the latest tools and updates.
so it depends :)
Multi-stage builds are also awesome for optimizing Docker stuff: docs.docker.com/build/building/mul...
Useful. Thanks.
Article does a great job in listing the best practices while building docker images. Article is neat and precise. Keep up the great work.
Thank you so much for the encouraging words! 😊 I'm glad you found the article helpful. Stay tuned for more content like this!
Very helpful. My trainer didn't teach me about these; he skipped them. Thanks, bro.
Glad you found it helpful! Sometimes trainers skip over these finer details, but exploring them yourself makes the learning even more rewarding. Happy optimizing, bro! 🚀
Using cache can also speed up your build process.
Great point! I’ve actually written a blog on using Docker layer caching to speed up builds—check it out here: link 😊