Mateusz Cholewka

Posted on Nov 13, 2021 • Edited on Apr 4, 2023

Here are the Dockerfile tips you can apply to get your builds faster and safer

#devops #docker #php #javascript

Nowadays we use Docker a lot in web development. It's easy to use, scales well, and gives us an immutable environment for running an application, from local development to production deployment.
To get the best experience with Docker, you should apply a few practices that keep your image builds fast and light.

In this article, I want to show you some of those practices based on this example:

FROM php:7-fpm
WORKDIR /app

COPY . .

ADD https://deb.nodesource.com/setup_12.x .
RUN bash setup_12.x

RUN curl -sS https://getcomposer.org/installer | php -- --install-dir=/usr/local/bin/ --filename=composer

RUN apt update && \
    apt install -y \
    curl \
    git \
    htop \
    libicu-dev \
    libgd-dev \
    mariadb-client \
    libonig-dev \
    vim \
    unzip \
    nodejs

RUN apt purge -y --auto-remove
RUN npm install -g yarn

RUN docker-php-ext-install \
    exif \
    gd \
    intl \
    mbstring \
    mysqli \
    opcache \
    pdo_mysql \
    sockets

ENV COMPOSER_ALLOW_SUPERUSER 1
RUN composer install

RUN yarn install

RUN yarn run build

Base your builds on a specific image version

The first thing to change is the base image tag. This Dockerfile uses PHP 7, but the tag is not precise enough: php:7-fpm follows the newest 7.x release available at build time. Here is the first improvement that we can make.

When you use dependency managers like yarn or composer, you probably use lock files. They pin exactly the same dependency versions on every install. So why not do the same for all dependencies?

The first such dependency is the image tag we base our image on.

FROM php:7-fpm
...

We can change it to:

FROM php:7.4.25-fpm
...

That should save you from situations where your image stops working after a few months because of differences in newer PHP versions.
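
If you want the pinned version in one visible place, a minimal sketch is to pass it through a build argument. The PHP_VERSION name is my own assumption and not part of the original Dockerfile:

```dockerfile
# Hypothetical build argument; ARG is allowed before FROM for exactly this use
ARG PHP_VERSION=7.4.25
FROM php:${PHP_VERSION}-fpm
WORKDIR /app
```

The default keeps the build reproducible, and you can still try another version with docker build --build-arg PHP_VERSION=<version> when you test an upgrade.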

COPY your code last

Docker images are built from layers. Every layer can be cached, and that cache can be reused in later builds if nothing has changed. Docker can use the cache for a layer only if all of the previous layers were loaded from cache too.

...
COPY . /app/
...

You should order your build steps by how often they change. Your application code is probably the thing that changes most often, so you should put it as late as possible.

FROM php:7.4.25-fpm
WORKDIR /app
## remove COPY from here
...
## rest of commands
...
COPY . .
## final commands

Do not use ADD for remote dependencies

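The ADD instruction in a Dockerfile lets you copy files from remote locations by URL. It can also unpack local tar archives automatically, which is convenient, but for remote files it has one problem: it caches poorly. Docker has to fetch or re-check the remote file on every build before it can decide whether the cache is still valid, and the downloaded file ends up in its own layer.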

ADD https://deb.nodesource.com/setup_12.x ./node_setup.bash
RUN bash node_setup.bash && \
    rm node_setup.bash

Ok, that's better.

The setup script is not needed after the installation, so it can be removed. The problem is that layers in a Docker image work like commits in git. When you add something to a repository in one commit, you can delete it in the next one, but because git works incrementally, both versions are kept in history and the repository size grows.
To avoid this in Docker images, you should create and remove unneeded files in the same instruction.

RUN curl -sS -o node_setup.bash https://deb.nodesource.com/setup_12.x && \
    bash node_setup.bash && \
    rm node_setup.bash

Better, but still not the best.

RUN curl -sS https://deb.nodesource.com/setup_12.x | bash -

You can do all of this in a one-line command using a pipe. In this example, the file content is fetched and passed directly to bash, which executes it.
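
One caveat with the pipe: by default the shell reports only the exit status of the last command, so a failed download may go unnoticed. A possible sketch, assuming bash is available in the base image (the pipe example already relies on it), is to turn on pipefail with the SHELL instruction:

```dockerfile
# Make every following RUN use bash with pipefail,
# so the build fails when curl fails, not only when bash fails
SHELL ["/bin/bash", "-o", "pipefail", "-c"]

# -f makes curl return an error on HTTP failures instead of piping an error page
RUN curl -fsS https://deb.nodesource.com/setup_12.x | bash -
```

The SHELL instruction affects all later RUN instructions in the same build stage, so it is worth placing it near the top of the file.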

Using composer in Dockerfile

Here we have composer installed in our container, so it will be present in every environment. Keeping it in the final image is not the best idea, because it's not necessary and may add some vulnerabilities. A better option is to use composer with a multistage build, which I want to describe in the next article.

...
RUN curl -sS https://getcomposer.org/installer | php -- --install-dir=/usr/local/bin/ --filename=composer
...

This line is OK: it will be cached and does not leave any garbage behind.
Maybe we should also use the hash check that you can find in the official install instructions.
You can also use this trick:

...
COPY --from=composer:2.1.11 /usr/bin/composer /usr/bin/composer
...

That will copy the composer binary from the official composer image.
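
If you stay with the installer script, the hash check mentioned above could look roughly like this. It is a sketch based on Composer's documented programmatic installation, assuming curl is present in the image (the original RUN line already uses it); the EXPECTED and ACTUAL variable names are mine:

```dockerfile
# Sketch: verify the installer against the published SHA-384 signature
# before running it, and remove the installer in the same layer
RUN EXPECTED="$(curl -sS https://composer.github.io/installer.sig)" && \
    curl -sS -o composer-setup.php https://getcomposer.org/installer && \
    ACTUAL="$(php -r "echo hash_file('sha384', 'composer-setup.php');")" && \
    [ "$EXPECTED" = "$ACTUAL" ] && \
    php composer-setup.php --install-dir=/usr/local/bin --filename=composer && \
    rm composer-setup.php
```

If the two hashes differ, the test command fails, the chain stops, and the build is aborted before the installer is executed.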

Installing apt packages

Next, we have some packages installed with the apt package manager. Let's check whether all of them are needed.

Git may be required for pulling packages or building some binaries from source, but I can't see any reason to keep it here. Let's remove it for now.

htop may be useful for debugging, but not in the final image; we can install it when we really need it. Vim is useless too, because you shouldn't make any changes in a running container. It's stateless, so your changes will disappear on restart. mariadb-client is probably required only for development as well.

The rest of the packages may be required, but there is one more problem. Docker uses layers for caching, and every layer is built from a single instruction. The cache is invalidated only if the instruction, or one of the previous instructions, has changed. So if you never change this instruction, newer package versions may never be installed, and the installed versions may vary depending on the build environment.

If you pin a specific version of every package, you can be sure that every image built from this Dockerfile has the same package versions, and the cache will be invalidated correctly.

You can do this by specifying the version after the = sign. To find out which version to pin, go into your currently running container, or a container started from your base image, and check with the list command:

$ apt list libonig-dev
Listing... Done
libonig-dev/stable,now 6.9.6-1.1 amd64 [installed]

In this example the currently installed version is 6.9.6-1.1, so let's check the rest and pin them all.

RUN apt update && \
    apt install -y \
    libicu-dev=67.1-7 \
    libgd-dev=2.3.0-2 \
    libonig-dev=6.9.6-1.1 \
    unzip=6.0-26 \
    nodejs=12.22.7-deb-1nodesource1

RUN apt purge -y --auto-remove

Of course, you need to keep them up to date manually. It's good to check them frequently.

There is one more thing to do. After the install command, there is a command that cleans up the system. It's very good that it is there, but it runs in a separate instruction. As we remember, if we remove something in a later layer, it still exists in the previous layers of the final image. So let's do the cleanup in the same instruction. That should decrease your final image size.

RUN apt update && \
    apt install -y \
    libicu-dev=67.1-7 \
    libgd-dev=2.3.0-2 \
    libonig-dev=6.9.6-1.1 \
    unzip=6.0-26 \
    nodejs=12.22.7-deb-1nodesource1 && \
    apt purge -y --auto-remove
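
A possible variant of the same instruction, not taken from the original Dockerfile: apt-get is the interface meant for scripts (apt itself warns that its CLI is not stable), and the downloaded package lists can be deleted in the same layer. The --no-install-recommends flag is an assumption here; check that your application does not rely on a recommended package before using it:

```dockerfile
# Sketch: same pinned packages, with the package lists removed in the same layer
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
    libicu-dev=67.1-7 \
    libgd-dev=2.3.0-2 \
    libonig-dev=6.9.6-1.1 \
    unzip=6.0-26 \
    nodejs=12.22.7-deb-1nodesource1 && \
    apt-get purge -y --auto-remove && \
    rm -rf /var/lib/apt/lists/*
```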

Composer dependencies

Let's move on to the next lines. There is another RUN instruction, which installs all of our composer dependencies. The first thing missing here is that we install everything, including dev dependencies that are not necessary in the running environment. So let's add some flags.

RUN composer install --optimize-autoloader --no-dev

These flags install all dependencies except the dev ones, with autoloader optimization.

As you remember, we have to move the COPY instruction for our code from the beginning of the file to as near the end as possible. This is the line where we need our project files. But do we need the entire codebase? How often do you change the dependencies in your project? Surely less often than your application code. So do we need to pull our dependencies every time we change something in our code? Probably not 😃

So the only files we need here are the composer files.

COPY composer.json .
COPY composer.lock .
RUN composer install --no-dev --no-scripts

Now the cache will work for our composer dependencies.
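
The two COPY lines can also be merged into one, which saves a layer. A small sketch, assuming both files sit in the project root; the extra --no-autoloader and --no-interaction flags are my additions and only make sense if the autoloader is generated later, once the code is in the image:

```dockerfile
# Copy only the dependency manifests first, so this layer stays cached
# until composer.json or composer.lock changes
COPY composer.json composer.lock ./

# --no-autoloader skips autoloader generation at this point,
# --no-interaction prevents composer from waiting for input during the build
RUN composer install --no-dev --no-scripts --no-autoloader --no-interaction
```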

The code

OK, now we need our code, because the build steps come next. Let's paste our COPY instruction from the beginning here.

COPY . .

And now we need to generate the autoloader file with all our project files:

RUN composer dumpautoload --optimize

Node dependencies

For node the situation is the same as for composer: first copy the package files, then install all dependencies.

RUN yarn install

RUN yarn run build
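
Written out with the copy step, this could look like the sketch below. It assumes that package.json and yarn.lock live in the project root, that node_modules is excluded through .dockerignore, and that yarn is the 1.x version installed with npm install -g yarn:

```dockerfile
# Copy only the manifests, so the install layer is reused
# until package.json or yarn.lock changes
COPY package.json yarn.lock ./

# --frozen-lockfile (Yarn 1.x) fails the build instead of silently updating yarn.lock
RUN yarn install --frozen-lockfile

# Copy the rest of the code only after the dependencies are installed
COPY . .
RUN yarn run build
```

With this layout the dependencies are cached, but node_modules stays in the image.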

Do we need all dependencies or only non-dev dependencies? Maybe we don't need any node dependencies in the container, because we use them only to build our frontend. So why not install everything and remove it after the build?

RUN yarn install && \
    yarn run build && \
    rm -rf node_modules && \
    yarn cache clean

Now the image contains no unnecessary node dependencies. The problem is that we cannot cache those dependencies. There are two ways to solve this. The first is a multistage build, but that's a topic for another article, which will be available soon. The second is to move the entire frontend build to the nginx Dockerfile.

Values for now

With all those changes applied, let's check how much build time we gained.

Old image build 4m28s* 901MB

New image build 3m57s* 711MB

So we save almost 200MB in the final image. The build time is not much better than before, but let's check how the cache works now:

Old image with cache 4m35s*

New image with cache 25.1s*

So yeah, the cache works better for our new image.

Do you really need node for running PHP application?

In our example Dockerfile we build the frontend app in the backend container and then copy it to the frontend container:

FROM nginx:latest

COPY --from=backend /app/public /app/public

COPY docker/nginx/default.conf /etc/nginx/default.conf

Why not build our app directly in the frontend image?

FROM nginx:1.21.4
WORKDIR /app

COPY docker/nginx/default.conf /etc/nginx/default.conf

RUN curl -sS https://deb.nodesource.com/setup_12.x | bash -

RUN apt install -y nodejs=12.22.7-deb-1nodesource1 && \
    apt purge -y --auto-remove

COPY . .

RUN npm install -g yarn

RUN yarn install && \
    yarn run build && \
    rm -rf node_modules && \
    yarn cache clean

And our backend Dockerfile:

FROM php:7.4.25-fpm
WORKDIR /app

COPY --from=composer:2.1.11 /usr/bin/composer /usr/bin/composer

RUN apt update && \
    apt install -y \
    libicu-dev=67.1-7 \
    libgd-dev=2.3.0-2 \
    libonig-dev=6.9.6-1.1 \
    unzip=6.0-26 && \
    apt purge -y --auto-remove

RUN docker-php-ext-install \
    exif \
    gd \
    intl \
    mbstring \
    mysqli \
    opcache \
    pdo_mysql \
    sockets

ENV COMPOSER_ALLOW_SUPERUSER 1

COPY composer.json .
COPY composer.lock .
RUN composer install --no-dev --no-scripts

COPY . .
RUN composer dumpautoload --optimize

So right now our backend image builds in 3m8s* without cache and in 6s* with cache, and its size is 597MB.

The frontend image builds in 57s* and its size is 310MB.

You can build them in parallel, so the total time can be as low as the build time of the slower image.

Multistage builds

All of those changes can work even better with a feature called multistage builds.
This topic should be available soon in the next article on my blog 😃

Edit: It's now available

*All the times in this article were measured on my Mac with an Intel i5 and 16GB of RAM.

Please remember to use a non-root user in your Docker images.
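
As a rough sketch for the backend image, assuming the www-data user that ships with the Debian-based php images is suitable for your setup, the end of the Dockerfile could hand the files to that user and switch to it. Depending on your php-fpm configuration, log and socket paths may need adjusting for an unprivileged user:

```dockerfile
# Sketch: give the application files to the unprivileged user
# (www-data is assumed to exist, as it does in the Debian-based php images)
COPY --chown=www-data:www-data . .

# Every later RUN, CMD, and ENTRYPOINT runs as this user
USER www-data
```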

Originally posted on mateuszcholewka.com

Top comments (3)

Giovanni Lenoci • Nov 15 '21

And use alpine images 😉

Rokas Lakstauskas • Nov 19 '21

dude, this is great article.
but
github repo with code would be just icing on the cake, I instantly want to try code out
and maybe docker-compose.yml + launch instructions?

anyway it's not a critique, maybe my two dimes on what would be good to add.

Mateusz Cholewka • Nov 22 '21

I like this idea, thank you for this comment 😃
I will try to prepare something like this soon