I often see terrible Dockerfiles in widespread use. They have a negative impact on security, productivity, and overall cost.
Today, I want to show you how to go from a basic Dockerfile to a very efficient one, step by step.
Getting Started
We start with a very short Dockerfile (sketched just after the list below), which has the advantage of being simple while working perfectly fine. It does the job of:
Using the latest version of the official NodeJS image as a base
Defining a working directory (/app)
Copying source files
Installing dependencies
Exposing port (8000)
Running the app (yarn start)
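A minimal Dockerfile covering those steps looks roughly like this (a sketch based on the list above; the port and commands are the ones mentioned in this article):

```dockerfile
# Base image: latest official NodeJS image
FROM node:latest

# Working directory for the app
WORKDIR /app

# Copy the source files, then install dependencies
COPY . .
RUN yarn install

# Expose the port and start the app
EXPOSE 8000
CMD ["yarn", "start"]
```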
But is it efficient? Obviously not. If it were, there would be no point to the article you're about to read.
Parent Image
Here, there are three important points.
Using a fixed version
Maybe it seems like a good idea to use the latest version of an image as a base (to stay up to date), but it's not. In practice, it's often a source of issues.
If you use the latest version of an image (latest), you don't have control over when it gets updated. It's possible (and it will happen) that a new version of the image is published that is incompatible with your app.
Between two builds, your application will suddenly go from working to failing, for no apparent reason.
That's why we prefer using a fixed version instead of latest. But that also means it's your responsibility to update your base images from time to time.
Long-Term-Support
Usually, software is released with specific LTS (Long-Term Support) versions. Creators provide better support for those versions, so they should be preferred.
NodeJS provides a list of LTS versions on its official website.
Alpine
Docker images are built on top of a given distribution, such as Ubuntu and Debian. Among those distributions is Alpine.
It's known for being very lightweight compared to the others, which is a huge help in keeping the image efficient.
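Putting these three points together, the parent image declaration could look like this (the exact version number is only an example; check the NodeJS website for the current LTS):

```dockerfile
# A fixed LTS version on the Alpine variant, instead of "latest"
FROM node:20-alpine
```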
Lower Privileges
The default user is root, which has unlimited access. For security, it's a better idea to run as a user with limited privileges.
Fortunately, the node image comes with a user called node.
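Switching to it takes a single instruction:

```dockerfile
# Drop root privileges and run as the built-in node user
USER node
```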
Working Directory
If you don't set one, the image's default working directory is used. It's a good idea to define a specific one for your app.
The most common practice is to use /usr/src/app.
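In the Dockerfile, that's a single instruction:

```dockerfile
# Dedicated directory for the application
WORKDIR /usr/src/app
```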
Caching
Docker uses a caching system, which is often overlooked. You can think of each instruction in a Dockerfile as a layer, which is cached.
When one layer changes, all subsequent layers are invalidated. When the image is rebuilt, instead of being retrieved from the cache, those layers' commands are simply re-run.
One layer changes frequently: the source code. So it's best to copy the source code as late as possible.
The most common mistake is to copy the list of dependencies at the same time as the source code, then install the dependencies.
In this case, every time the source code changes and the image is rebuilt, the dependencies are reinstalled. That's why we prefer to copy the files defining the dependencies first, then install them, and finally copy the source code.
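Assuming a Yarn project (the app is started with yarn start), that ordering looks like this:

```dockerfile
# Dependency manifests first: this layer and the install below stay cached
# as long as package.json / yarn.lock don't change
COPY package.json yarn.lock ./
RUN yarn install

# Source code last, since it changes most often
COPY . .
```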
Configure the Working Directory
Previously, we defined a new working directory and set up a user with limited privileges. That means we need to create the directory first and give the node user the necessary access to it.
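One way to do that, while the build is still running as root, is the following (a sketch using the directory chosen earlier):

```dockerfile
# Create the app directory and hand it over to the unprivileged node user
RUN mkdir -p /usr/src/app && chown -R node:node /usr/src/app

# Only then drop privileges and move into the directory
USER node
WORKDIR /usr/src/app
```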
Update Package List
With Alpine, we aren't using APT but APK to manage packages. No matter which tool you use, you need to update its package list from the remote repositories.
That way, you ensure no out-of-date packages are installed. It's a must if you want to avoid packages with security and performance issues.
Adding Packages
I recommend using the --no-cache flag to avoid generating a cache you won't use but that would still make your image heavier.
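For example, to install Tini (which we will need later for the entrypoint):

```dockerfile
# --no-cache fetches a fresh package index from the remote repositories
# and avoids storing it in the image
RUN apk add --no-cache tini
```

Note that --no-cache pulls a fresh index on its own, so the install never runs against a stale package list.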
Multiple RUN
Inside Best practices for writing Dockerfiles, we learn to:
"always combine RUN apt-get update with apt-get install in the same RUN statement. Using apt-get update alone in a RUN statement causes caching issues and subsequent apt-get install instructions to fail."
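In other words, using apt-get as in the quote (curl is only an example package):

```dockerfile
# Bad: the update layer may come from the cache, so the install below
# can run against a stale package list
RUN apt-get update
RUN apt-get install -y curl

# Good: update and install in the same RUN statement
RUN apt-get update && apt-get install -y curl
```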
For more information, I highly recommend giving the best practices guide a read.
Entrypoint
I like to keep a basic Dockerfile that doesn't start my application directly. Instead, I use Tini, which keeps the container alive and improves how processes are managed (signals are forwarded properly and zombie processes are reaped).
Then, you can either use multi-stage builds to augment your basic stage, or start and configure your application from outside the Dockerfile.
Locally, you can use the Docker CLI. And when you deploy your app, any container management system lets you configure startup commands.
That way, you can have a single Dockerfile that is used in different configurations, instead of multiple configurations with barely any difference.
It becomes the source of truth about which environment your app runs on.
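With Tini installed earlier, the end of the Dockerfile can look like this (a sketch; the start command itself is supplied from outside):

```dockerfile
# Tini runs as PID 1, forwards signals, and reaps zombie processes;
# the actual application command is passed in at run time
ENTRYPOINT ["/sbin/tini", "--"]
```

Locally, you then append the start command to docker run; in a deployment, your container management system provides it.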
Final Result
Now, you might be wondering: what's the difference?
Apart from better security and faster rebuilds, one of my real-world projects is 1GB lighter with the improved image.
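For reference, here is the improved Dockerfile with all the steps put together (a sketch: the NodeJS version, port, and installed packages are examples to adapt to your project):

```dockerfile
# Fixed LTS version, Alpine variant
FROM node:20-alpine

# System packages, installed as root in a single RUN with --no-cache
RUN apk add --no-cache tini

# Create the app directory, give it to the node user, then drop privileges
RUN mkdir -p /usr/src/app && chown -R node:node /usr/src/app
USER node
WORKDIR /usr/src/app

# Dependencies first, to keep this layer cached while the code changes
COPY --chown=node:node package.json yarn.lock ./
RUN yarn install

# Source code last
COPY --chown=node:node . .

EXPOSE 8000

# Tini keeps processes well managed; the start command comes from outside
ENTRYPOINT ["/sbin/tini", "--"]
```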
Do you want to learn more backend skills you can effectively use in a professional environment?
Cover photo by Thais Morais on Unsplash