These are my notes from this talk on Dockerfile best practices.
Areas of improvement
- Incremental build time
- Image size
- Maintainability
- Security
- Consistency / Repeatability
Incremental build time
- Order is important: if you change any line or stage of your Dockerfile, the cache for every subsequent step is busted. So order your steps from least to most frequently changing to optimize caching.
Let's look at this sample Dockerfile:
FROM ubuntu:18.04
COPY . /app
RUN apt-get update
RUN apt-get -y install openjdk-8-jdk
This Dockerfile copies the application files right at the start. Because each step's cache depends on the previous steps, whenever something in that content changes, every step after the COPY is invalidated: the cache is busted and those steps have to run again.
This is a problem here: if you change your application code and rebuild the image to reflect the latest change, every command below the COPY has to run again.
A better approach is to run the COPY command last.
FROM ubuntu:18.04
RUN apt-get update
RUN apt-get -y install openjdk-8-jdk
COPY . /app
Only copy what's needed
Avoid COPY . if possible. When you copy files into your image, be very specific about what you copy, because any change to those files will bust the cache.
Identify cacheable units
Sometimes you want things to be cached together as a single unit. For example, change this
RUN apt-get update
RUN apt-get -y install openjdk-8-jdk
to
RUN apt-get update \
&& apt-get -y install \
openjdk-8-jdk
This prevents using an outdated package cache.
- Fetch dependencies in a separate step: This is also about identifying the cacheable units.
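For a Maven-based Java project, that separate dependency step might look like the sketch below. The file names and the Maven setup are illustrative assumptions, not from the talk:

```dockerfile
# Sketch for a hypothetical Maven project: copy only the build manifest
# first, so the downloaded dependencies stay cached until pom.xml changes.
FROM maven:3-jdk-8
WORKDIR /app
COPY pom.xml .
RUN mvn dependency:go-offline   # fetch dependencies; this layer is cached
COPY src/ ./src/
RUN mvn -q package              # only this step re-runs on code changes
```

A code change under src/ now invalidates only the last two steps; the dependency download stays cached.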
Reduce Image size
Remove unnecessary dependencies: Don't install debugging tools or other unnecessary dependencies. You can also use the --no-install-recommends flag. Don't ship your build tools into production, as you won't need them at runtime.
Remove package manager cache: You don't need the package cache after installing the packages, so it's good to remove it as well:
RUN apt-get update \
&& apt-get -y install --no-install-recommends \
openjdk-8-jdk \
&& rm -rf /var/lib/apt/lists/*
Maintainability
Use official images where possible: Official images are pre-configured for container use and built by smart people. They can save you a lot of time in maintenance. They also let you share layers between images, since they use exactly the same base image.
For the sample Dockerfile above, instead of starting from a plain Debian/Ubuntu base and installing the dependencies yourself, simply start from:
FROM openjdk:8
Use more specific tags: The latest tag is a rolling tag. Be specific to prevent unexpected changes in your base image.
Look for minimal flavors: Maybe you don't need everything that is in the bigger variants.
REPOSITORY   TAG            SIZE
-----------------------------------------
openjdk      8              624MB
openjdk      8-jre          443MB
openjdk      8-jre-slim     204MB
openjdk      8-jre-alpine   83MB
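Putting both ideas together, a specific tag on a minimal flavor might look like this (the exact tag is just an example):

```dockerfile
# Avoid rolling tags such as openjdk:latest; pin a specific, minimal variant.
FROM openjdk:8-jre-slim
```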
Reproducible
The Dockerfile is the blueprint of your image; the source code is the source of truth for your application.
Make the Dockerfile your blueprint
- It describes the build environment
- Correct versions of build tools installed
- Prevent inconsistencies between environments
- There may be system dependencies
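A sketch of what pinning the build environment can look like; the version strings below are hypothetical examples and will differ on your distribution:

```dockerfile
# Pin the base image tag and (where your distro supports it) package
# versions, so every machine builds with the same tools.
# The openjdk version string here is an example, not a real pin.
FROM ubuntu:18.04
RUN apt-get update \
    && apt-get -y install --no-install-recommends \
       openjdk-8-jdk=8u162-b12-1ubuntu0.18.04.1 \
    && rm -rf /var/lib/apt/lists/*
```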
Multi-stage builds
Use Cases
- Separate build from runtime environment
- Slight variations on images (DRY)
- Build dev/test/lint-specific environments
- builder: all build dependencies
- build: builder + build artifacts
- cross: same as build but for different envs
- dev: builder + dev/debug tools
- lint: minimal lint dependencies
- test: all test dependencies + build artifacts to be tested
- release: final minimal image with build artifacts
- Delinearizing your dependencies (concurrency)
- Platform-specific stages
When you name a stage, you can build just that stage with --target:
FROM image_or_stage AS stage_name
$ docker build --target stage_name .
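Putting the stage names from the list above into a minimal sketch (the Maven setup and the artifact name app.jar are assumptions, not from the talk):

```dockerfile
# builder: all build dependencies
FROM maven:3-jdk-8 AS builder
WORKDIR /app
COPY pom.xml .
RUN mvn dependency:go-offline
COPY src/ ./src/
RUN mvn -q package

# release: final minimal image with just the build artifact
FROM openjdk:8-jre-slim AS release
COPY --from=builder /app/target/app.jar /app.jar
CMD ["java", "-jar", "/app.jar"]
```

$ docker build --target builder -t myapp:builder .
$ docker build -t myapp .
The first command stops after the builder stage (useful for dev/debug); the second runs the full build and ends at the release stage.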
Top comments (3)
Good article - thank you.
I'm a bit focused on reducing image sizes, as that's generally a very good thing.
The image size may have only a limited effect on the RAM footprint of a running container. Doubtless Alpine images are smaller, build faster and load faster, but do they actually cost less to host in the cloud?
I'm partly looking to see why Ubuntu- and Debian-based images remain so popular relative to Alpine.
Hey Alastair, thanks for the comment. I have no idea about the costs.
Ubuntu and Debian may be preferable for several reasons:
Also, these threads might be relevant:
Hey, thank you for taking a moment to respond.
Familiarity seems a reason for inertia, when actually in most cases the effort to transition is a modest one-off cost. If the Alpine RAM footprint is really half the size of Ubuntu/Debian, then the reduction in deployment hosting costs is recurring and therefore significant, especially for small startups.
There is also a relationship between executable size and CPU cache size that influences performance quite markedly.
Concerning security, everyone should always be "all ears" as the picture evolves; just ask OpenBSD about their SSL hiccup. Also, given that your links are between 3 and 5 years old, it would be interesting to know how this has evolved.
Currently seeing little reason to switch focus away from Alpine; and if the moment comes, my familiarity with Ubuntu and Debian is still alive on my desktop.
Thanks again for your response.