TechWorld with Nana

Posted on Nov 21, 2021 • Edited on May 1, 2022

Top 8 Docker Best Practices for using Docker in Production ✅

#beginners #devops #docker #tutorial

Docker adoption rises constantly 📈 and many are familiar with it, but not everyone is using Docker according to the best practices. 👀

Before moving on, if you don't know what Docker is, you can learn everything you need to get started in this free Docker Crash Course 🐳

Why use Best Practices? 🤷‍♀️

So, in my new video 'Top 8 Docker Production Best Practices' I want to show you 8 ways you can use Docker the right way in your projects to:

✅ improve security,
✅ optimize the image size,
✅ take advantage of some of the useful Docker features
✅ and also write cleaner and more maintainable Dockerfiles

1️⃣ Best Practice

Use an official and verified Docker image as a base image, whenever available.

Let's say you are developing a Node.js application and want to build and run it as a Docker image.

Instead of taking a base operating system image and installing Node.js, npm and whatever other tools you need for your application, use the official node image for your application.
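As a minimal sketch of the difference, the Dockerfile below starts from the official node image instead of installing Node.js on top of a bare OS image. The /app directory and the server.js entry point are hypothetical names used only for illustration, and the image tag is left off here on purpose, since choosing a tag is the topic of the next best practice.

```dockerfile
# Instead of starting from a bare OS image and installing Node.js by hand, e.g.
#   FROM ubuntu
#   RUN apt-get update && apt-get install -y nodejs npm
# start from the official image, which already ships Node.js and npm.
FROM node

# Hypothetical application layout, for illustration only
WORKDIR /app
COPY . .
RUN npm install

# Assumes the application entry point is server.js
CMD ["node", "server.js"]
```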

Improvements:

Cleaner Dockerfile
Official and verified image, which is already built with the best practices

2️⃣ Best Practice

Use specific Docker image versions

Okay, so we have selected the base image. But if the Dockerfile doesn't specify a tag, then every time we build our application's image from it, it will use the latest tag of the node image.

Now why is this a problem? 🤔
❌ - you might get a different image version than in the previous build
❌ - the new image version may break stuff
❌ - latest tag is unpredictable, causing unexpected behavior

So instead of a random latest image tag, you want to pin the version: just like you deploy your own application with a specific version, you want to use the official image with a specific version.
And the rule here is: the more specific the better.
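In a Dockerfile, pinning could look like the sketch below. The version 17.0.1 is only an example (it is the one mentioned in the comments under this post); use whichever version your application is actually tested against.

```dockerfile
# Unpinned: resolves to node:latest, which can point to a different
# image from one build to the next
# FROM node

# Pinned to an exact version, so every build starts from the same base
FROM node:17.0.1
```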

Improvements:

Transparency to know exactly what version of the base image you're using

3️⃣ Best Practice

Use Small-Sized Official Images

When choosing a Node.js image, you will see there are actually multiple official images: not only with different version numbers, but also with different operating system distributions.

So the question is: Which one do you choose and why is it important? 🤷🏻‍♂️

1) Image Size
❌ Well, if the image is based on a full-blown OS distribution like Ubuntu or CentOS, you will have a bunch of tools already packaged in the image. So the image size will be larger, but you don't need most of these tools in your application images.

✅ In contrast, smaller images need less storage space in the image repository as well as on the deployment server, and of course you can transfer them faster when pulling from or pushing to the repository.

2) Security Issue
❌ In addition to that, with lots of tools installed inside, you need to consider the security aspect, because such base images usually contain hundreds of known vulnerabilities and basically create a larger attack surface for your application image.

This way you basically end up introducing unnecessary security issues from the beginning to your image! 🙉

✅ In comparison, by using smaller images with leaner OS distributions, which only bundle the necessary system tools and libraries, you're also minimizing the attack surface and making sure that you build more secure images.

So the best practice here would be to select an image with a specific version based on a leaner OS distribution, like Alpine for example.

Alpine has everything you need to start your application in a container, but is much more lightweight. And for most of the images that you look at on Docker Hub, you will see a version tag based on the Alpine distribution.

It is one of the most common and popular base images for Docker containers.
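Combining a specific version with a leaner distribution could look like this sketch. The tag 17.0.1-alpine is an assumption based on the usual naming scheme of the official node image; check the tag list on Docker Hub for the exact variant you need.

```dockerfile
# Full image based on a larger OS distribution
# FROM node:17.0.1

# Same Node.js version, but built on the much smaller Alpine distribution
FROM node:17.0.1-alpine
```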

4️⃣ Best Practice

Optimize caching for image layers when building an image

So what are image layers and what does caching an image layer mean? 🤔

1) What are Image Layers?
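A Docker image is built based on a Dockerfile.
And in a Dockerfile each command or instruction adds a layer to the image. Instructions such as RUN, COPY and ADD add filesystem content, while the others only record metadata.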

So when we use a base image of node alpine like in the above example it already has layers, because it was already built using its own Dockerfile. Plus, in our Dockerfile on top of that we have a couple of other commands that each will add a new layer to this image.

2) Now what about caching?
Each layer will get cached by Docker. 👍
So when you rebuild your image, if neither your Dockerfile nor the files it copies have changed, Docker will just use the cached layers to build the image.

Advantages of cached image layers:
✅ - Faster image building
✅ - Faster pulling and pushing of new image versions:
If I pull a new image version of the same application and let's say 2 new layers have been added in the new version: Only the newly added layers will be downloaded, the rest are already locally cached by Docker.

3) Optimize the Caching
So to optimize the caching, you need to know that:
Once a layer changes, all following or downstream layers have to be re-created as well. In other words: when you change the contents of one line in the Dockerfile, caches of all the following lines or layers will be busted and invalidated. 😣

So the rule here and the best practice is:
Order your commands in the Dockerfile from the least to the most frequently changing commands to take advantage of caching and this way optimize how fast the image gets built. 🚀
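For a Node.js application, this ordering rule could be sketched as follows. The /app directory and the server.js entry point are hypothetical; the idea is that the dependency manifests change far less often than the source code, so the expensive npm install layer stays cached across most rebuilds.

```dockerfile
FROM node:17.0.1-alpine
WORKDIR /app

# 1) Changes rarely: the dependency manifests
COPY package*.json ./

# 2) Re-runs only when the manifests above have changed
RUN npm install

# 3) Changes most often: the application source code.
#    Editing a source file invalidates the cache only from this line on.
COPY . .

# Assumes the application entry point is server.js
CMD ["node", "server.js"]
```

If the COPY of the whole project came before npm install, every source code change would invalidate the install layer as well.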

5️⃣ Best Practice

Use a .dockerignore file

Now usually when we build the image, we don't need everything we have in the project to run the application inside. We don't need the auto-generated folders, like target or build folders, we don't need the readme file etc.

So how do we exclude such content from ending up in our application image? 🤔
👉 Using a .dockerignore file.

It's pretty straightforward. We basically just create this .dockerignore file and list all the files and folders that we want to be ignored and when building the image, Docker will look at the contents and ignore anything specified inside.

Improvements:

Reduced image size

6️⃣ Best Practice

Make use of Multi-Stage Builds

But now let's say there are some contents (like development and testing tools and libraries) in your project that you NEED for building the image - so during the build process - but you DON'T NEED them in the final image itself to run the application.

If you keep these artifacts in your final image even though they're absolutely unnecessary for running the application, it will again result in an increased image size and increased attack surface. 🧐

So how do we separate the build stage from the runtime stage?
In other words, how do we exclude the build dependencies from the image, while still having them available while building the image? 🤷‍♀️

Well, for that you can use what's called multi-stage builds 💡

The multi-stage builds feature allows you to use multiple temporary images (stages) during the build process, but keep only the last stage as the final artifact.

So the earlier stages are discarded and don't end up in the final image.
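A multi-stage build for a Node.js application could be sketched like this. It assumes that package.json defines a "build" script which writes its output to /app/dist and that dist/server.js is the entry point; both are hypothetical and only serve the illustration.

```dockerfile
# Stage 1: build stage, contains dev dependencies and build tooling
FROM node:17.0.1-alpine AS build
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
# Assumes a "build" script in package.json that writes to /app/dist
RUN npm run build

# Stage 2: runtime stage, only this one becomes the final image
FROM node:17.0.1-alpine
WORKDIR /app
COPY package*.json ./
# Install runtime dependencies only, without devDependencies
RUN npm install --omit=dev
# Copy just the build output from the first stage
COPY --from=build /app/dist ./dist
CMD ["node", "dist/server.js"]
```

Everything that exists only in the build stage (dev dependencies, source files, intermediate artifacts) is left behind when the final image is created.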

Improvements:

Separation of Build Tools and Dependencies from what's needed for runtime
Fewer dependencies and reduced image size

7️⃣ Best Practice

Use the Least Privileged User

Now, when we create this image and eventually run it as a container, which operating system user will be used to start the application inside? 🤔
By default, when a Dockerfile does not specify a user, it uses a root user. 🙉 But in reality there is mostly no reason to run containers with root privileges.

❌ This basically introduces a security issue, because when the container starts on the host, it will potentially have root access on the Docker host.
So running an application inside the container with a root user will make it easier for an attacker to escalate privileges on the host and basically get hold of the underlying host and its processes, not only the container itself 🤯 Especially if the application inside the container is vulnerable to exploitation.

✅ To avoid this, the best practice is to simply create a dedicated user and a dedicated group in the Docker image to run the application and also run the application inside the container with that user.

You can use the USER instruction with the username, and then start the application as usual.

Tip: Some images already have a generic user bundled in, which you can use. So you don't have to create a new one. For example the Node.js image already bundles a generic user called node, which you can simply use to run the application inside the container. 👍
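On an Alpine-based image, creating and using a dedicated user could be sketched as follows. The names appuser and appgroup, the /app directory and the server.js entry point are hypothetical, and the addgroup/adduser options shown are the BusyBox ones that Alpine ships; other distributions use different commands.

```dockerfile
FROM node:17.0.1-alpine

# Create a dedicated system group and a system user that belongs to it
# (BusyBox syntax as used on Alpine)
RUN addgroup -S appgroup && adduser -S -G appgroup appuser

WORKDIR /app
COPY package*.json ./
RUN npm install

# Copy the application files and make the dedicated user their owner
COPY --chown=appuser:appgroup . .

# Every instruction after this line, and the container itself, runs as appuser
USER appuser

# With the official node image you could skip the user creation above
# and use the bundled user instead:
# USER node

CMD ["node", "server.js"]
```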

8️⃣ Best Practice

Scan your Images for Security Vulnerabilities

Finally, how do you make sure and validate that the image you build has few or no security vulnerabilities? 🧐

So my final best practice is: once you build the image, scan it for security vulnerabilities using the docker scan command. 🔍

In the background Docker actually uses a service called Snyk to do the vulnerability scanning of the images. The scan uses a database of vulnerabilities, which gets constantly updated.

In the output of the docker scan command you see:
1) the type of vulnerability,
2) a URL for more information,
3) and, what's very useful and interesting, which version of the relevant library actually fixes that vulnerability. So you can update your libraries to get rid of these issues. 👍

Automate the scanning 🚀
In addition to scanning your images manually with the docker scan command on the CLI, you can also configure Docker Hub to scan the images automatically when they get pushed to the repository. And of course you can integrate this check in your CI/CD pipeline when building your Docker images.

So these are 8 production best practices that you can apply today to make your Docker images leaner and more secure! 🚀😊 Hope it is helpful for some of you! Of course there are many more best practices related to Docker, but I think applying these will already give you great results when using Docker in production.

Do you know some other best practices, which you think are super important and have to be mentioned?
Please share them in the comments for others 🙌 👍


Top comments (12)

Andrei Dascalu • Nov 21 '21

Just an observation.
While latest tag is definitely a bad practice, that doesn't make fixed versions a "best practice". It can be a decent practice and a good rule of thumb but consider this:
The vast majority of docker image offerings for software that follows semver also versions their docker images accordingly. This means there will be rolling tags for major versions that will contain updated minor versions and patches as well as rolling tags for minor versions that contain updated patch versions only.
It can be a good idea to allow your build to at least follow the newest minor versions rolling tag since patches bring goodies like security updates in a non-breaking way.

Muayyad Alsadi • Nov 22 '21

there is a good point about layers, another example is to clear package manager cache after you finish in one layer that is:

FROM registry.fedoraproject.org/fedora-minimal:35
RUN \
  microdnf module enable -y nodejs:14 && \
  microdnf -y install nodejs zopfli findutils busybox && \
  microdnf clean all

because if you add a file in one layer and remove it in another layer, it would still count and be carried in the archive; it would just be carried with a flag that it's removed.

regarding: Use specific Docker image versions
pinning the exact version is a security risk, one might pin only the major version allowing it to receive security updates, so instead of node:17.0.1 just node:17. It's less likely to break an application depending on 17-specific features, and it would be able to receive security fixes from 17.0.2.

Use .dockerignore file

even better, use buildah (podman build) which does not need to archive and create and send the archive to the docker daemon.

another workaround, create a directory called containers and put the docker file inside it, where only the needed files are inside that directory.

Make use of Multi-Stage Builds

this is very important, as someone who was part of that proposal, I'm very sad this feature is rarely used.

The compiler, git, intermediate files, ...etc should never be part of final image.

Apostol Faliagas • Nov 22 '21

"by using smaller images with leaner OS distributions, which only bundle the necessary system tools and libraries, you're also minimizing the attack surface and making sure that you build more secure images"
Are you serious?

Roy Ben-Yosef • Nov 26 '21

Yes, that's accurate. Having less tools means you are less exposed to vulnerabilities. That is a perfect example of smaller attack surface.

Roy Ben-Yosef • Nov 26 '21

For example, let's say you followed the advice for not using root. But now you have something bad in your container that wants to gain local privilege escalation (e.g. through some vulnerability take over the host). Oftentimes, local tools may contain vulnerabilities that can allow such lpe.
So yes, I think that is a really good example for reducing attack surface.

AlgoT • Nov 22 '21

If you have a criticism at least provide a constructive explanation on what you see as a misunderstanding.

Muayyad Alsadi • Nov 22 '21

Thank you for this useful information. But many of those points are not best practices, they are just that you picked docker as your favorite vendor and orange is your favorite color. Official means different things to different people, in your article official means docker's official and having docker as your favorite vendor.
If I want centos, I would use quay.io/centos/centos:stream8 that's what official for me. If I want mysql and bitnami is my favorite vendor then official means to me docker.io/bitnami/mysql:8.0
Vendoring is about picking a vendor, to whom you open tickets or have a phone call to.

Moreover, I believe Docker's official images are very bad. When docker employed the late Ian Murdock (father of debian), debian and ubuntu were the official base images; when he passed away and they hired the father of alpine, alpine became the official base image.

It's even worse, they are the same people who have 80% of their official images vulnerable and they have forgotten the root password of alpine empty and wide open.

I had many cases where "npm install" broke on alpine.

As a fedora contributor, I prefer a minimal fedora image from "registry.fedoraproject.org/fedora-minimal:35" and microdnf the exact node version I want from the dnf module of the version I want (microdnf module enable -y nodejs:14). I would trust that. and If I want ubuntu base image, I would trust NodeSource as node vendor in my docker file. I would recommend against the official docker hub "node" (I don't trust them)

In summary, your preferred vendor is not a best practice. Pick the vendor you trust and the ones that you are comfortable filing tickets with and getting them solved.

I'll post other points in different comment.

Cody Antonio Gagnon • Dec 7 '21

Very happy to have learned about these best practices, Nana! I was able to implement all of them and learned a lot along the way!

One thing I would recommend that I also learned about (plus some other great tips) was that you can ensure the correct Docker image is downloaded by utilizing the SHA-256 hashing feature as outlined in this article

Shivashankar Sukumar • Feb 2 '22 • Edited

Hi Nana...I follow your devops, docker videos. I was trying this example "https://www.youtube.com/watch?v=6YisG2GcXaw&list=PLy7NrYWoggjzfAHlUusx2wuDwfCrmJYcs&index=8" in AWS-EC2 ubuntu machine. I had set up and linked mongodb with mongo express. I even started the server.js using npm install command and it is listening in port 3000. When I tried to access the publicip:3000, the page is not loading and displays blank page. I attached the image. Please help me out.