Adnan Rahić

Posted on Nov 15, 2018 • Originally published at blog.sourcerer.io on Oct 18, 2018

A crash course on optimizing your Docker images for production

#docker #devops #node #showdev

Disclaimer: Zeet is sponsoring this blogpost for the next month. I tried it out the other day. It's like serverless but for running entire back ends. You can host and scale apps automagically. Pretty neat.

Don’t you hate it when deploying your app takes ages? Over a gigabyte for a single container image isn’t exactly best practice. Pushing billions of bytes around every time you deploy a new version doesn’t sound quite right to me.

TL;DR

This article will show you a few simple steps for optimizing your Docker images, making them smaller, faster and better suited for production.

The goal is to show you the size and performance difference between using default Node.js images and their optimized counterparts. Here’s the agenda.

Why Node.js?
Using the default Node.js image
Using the Node.js Alpine image
Excluding development dependencies
Using the base Alpine image
Using multistage builds

Let’s jump in.

Why Node.js?

Node.js is currently the most versatile and beginner-friendly environment to get started on the back end, and I write it as my primary language, so you’ll have to put up with it. Sue me, right. 😙

As an interpreted language, JavaScript doesn’t have a compiled target, like Go for example. There’s not much you can do to strip the size of your Node.js images. Or is there?

I’m here to prove that wrong. Picking the right base image for the job, only installing production dependencies for your production image, and of course, using multistage builds are all ways you can drastically cut down the weight of your images.

In the examples below, I used a simple Node.js API I wrote a while back.

Using the default Node.js image

Starting out, of course, I used the default Node.js image, pulling it from Docker Hub. Oh, how clueless I was.

FROM node
WORKDIR /usr/src/app
COPY package.json package-lock.json ./
RUN npm install
COPY . .
EXPOSE 3000
CMD ["node", "app.js"]

Want to guess the size? My jaw dropped. 727MB for a simple API!?

Don’t do this, please. You don’t need to do this, honestly, just don’t.

Using the Node.js Alpine image

The easiest and quickest way to drastically cut down the image size is by choosing a much smaller base image. Alpine is a tiny Linux distro that does the job. Just choosing the Alpine version of the Node.js image makes a huge improvement.

# adding the alpine tag
FROM node:alpine
WORKDIR /usr/src/app
COPY package.json package-lock.json ./
RUN npm install
COPY . .
EXPOSE 3000
CMD ["node", "app.js"]

Almost six times smaller! Down to 123.1MB. That’s more like it.
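One caveat worth sketching: `node:alpine` is a floating tag, so the base image can change from one build to the next. Pinning a version keeps builds and image sizes reproducible. The `20-alpine` tag below is only an example I picked for illustration, not the version used for the measurements in this article.

```dockerfile
# Pin the Node.js major version instead of relying on the floating alpine tag.
# The exact tag (20-alpine) is an assumption; use the version your app targets.
FROM node:20-alpine
WORKDIR /usr/src/app
COPY package.json package-lock.json ./
RUN npm install
COPY . .
EXPOSE 3000
CMD ["node", "app.js"]
```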

Excluding development dependencies

Hmm… But there has to be something else we can do. Well, we are installing all dependencies, even though we only need production dependencies for the final image. How about we change that?

FROM node:alpine
WORKDIR /usr/src/app
COPY package.json package-lock.json ./
# Only install prod deps
RUN npm install --production
COPY . .
EXPOSE 3000
CMD ["node", "app.js"]

There we go. We shaved another 30MB off! Down to 91.6MB now. We’re getting somewhere.
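On recent npm releases the same idea is usually spelled `npm ci --omit=dev`. Here is a minimal sketch, assuming the base image ships a recent npm (older releases only know `--production`) and that `package-lock.json` is committed, since `npm ci` refuses to run without it. I have not measured the resulting size for this API.

```dockerfile
FROM node:alpine
WORKDIR /usr/src/app
COPY package.json package-lock.json ./
# npm ci installs exactly what package-lock.json describes;
# --omit=dev skips devDependencies (assumes a recent npm release).
RUN npm ci --omit=dev
COPY . .
EXPOSE 3000
CMD ["node", "app.js"]
```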

This had me quite proud of myself, and I was ready to call it a day. But then it hit me. What if I start with the raw Alpine image? Maybe it would be smaller if I grab the base Alpine image and install Node.js myself. I was right!

Using the base Alpine image

You’d think a move like this one would make little to no difference, but it shaved another 20MB off of the previous version.

# base alpine
FROM alpine
WORKDIR /usr/src/app
# install Node.js and npm
RUN apk add --no-cache --update nodejs nodejs-npm
COPY package.json package-lock.json ./
RUN npm install --production
COPY . .
EXPOSE 3000
CMD ["node", "app.js"]

Down to 70.4MB now. That’s a whopping 10 times smaller than where we started!
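If the `apk add` line fails on a recent Alpine release, the likely cause is the package name. As far as I know, newer Alpine repositories ship npm as a separate `npm` package rather than `nodejs-npm`. A sketch under that assumption (check the package index for the Alpine version you pull; I have not re-measured the image size with it):

```dockerfile
FROM alpine
WORKDIR /usr/src/app
# Assumption: on recent Alpine releases npm is packaged as "npm", not "nodejs-npm".
# --no-cache already fetches a fresh package index, so --update is not needed.
RUN apk add --no-cache nodejs npm
COPY package.json package-lock.json ./
RUN npm install --production
COPY . .
EXPOSE 3000
CMD ["node", "app.js"]
```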

Not much more we can do now, right? Right…?

Using multistage builds

Well, actually, there is. Let’s talk a bit about layers.

Every Docker image is built from layers, and the layers come from the instructions in the Dockerfile. Here’s the file from above:

# base alpine
FROM alpine
WORKDIR /usr/src/app
# install Node.js and npm
RUN apk add --no-cache --update nodejs nodejs-npm
COPY package.json package-lock.json ./
RUN npm install --production
COPY . .
EXPOSE 3000
CMD ["node", "app.js"]

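The FROM instruction brings in the layers of the base image. Instructions that change the filesystem, like RUN and COPY, each add a new layer on top, while instructions like EXPOSE and CMD only add metadata. All image layers are read-only. When you start a container, Docker adds a thin writable layer on top of them. Because the image layers are read-only, they can be shared, meaning one image can back many containers.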

What’s going on here is that Docker uses storage drivers to manage the read-only image layers and the writable container layer. The writable layer is ephemeral and gets deleted once the container is deleted. Really cool stuff. But why is this important?
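One concrete reason layers matter for size: a file written in one layer still takes up space in the image even if a later instruction deletes it. A minimal sketch of the usual workaround is to clean up inside the same RUN instruction. It reuses the Alpine Dockerfile from earlier, and I have not measured how much it saves for this API.

```dockerfile
FROM node:alpine
WORKDIR /usr/src/app
COPY package.json package-lock.json ./
# Install and clean up in a single RUN so the npm cache never lands in a layer.
# A separate "RUN npm cache clean --force" would not shrink the image,
# because the earlier layer would still contain the cached files.
RUN npm install --production && npm cache clean --force
COPY . .
EXPOSE 3000
CMD ["node", "app.js"]
```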

Whatever ends up in a layer stays in the image, so the fewer layers and the less build tooling the final image carries, the smaller it is. This is where multistage builds step in.

# Stage 1: install the production dependencies
FROM alpine AS builder
WORKDIR /usr/src/app
RUN apk add --no-cache --update nodejs nodejs-npm
COPY package.json package-lock.json ./
RUN npm install --production

# Stage 2: the final image, without npm
FROM alpine
WORKDIR /usr/src/app
RUN apk add --no-cache --update nodejs
COPY --from=builder /usr/src/app/node_modules ./node_modules
COPY . .
EXPOSE 3000
CMD ["node", "app.js"]

We’re using the first image only to install the dependencies, then in our final image, we copy over all node_modules without building or installing anything. We can even skip installing npm in the final image as well!

Want to guess the final size? Go ahead!

I’d say we’ve done well. Getting it down to 48.6MB, a 15x improvement, is something to be proud of.

The verdict

Don’t be naive. There’s absolutely no reason to have gigabyte-sized images in production. A great first step is to use a tiny base image. Start small, baby steps are fine.

Choosing optimized base images will get you a long way. If you really need the boost in deployment speed and are plagued with slow CI/CD pipelines, check out multistage builds. You won’t want to do it any other way in the future.

Note: I did leave out a sample where development dependencies are included for running tests before deploying to production, as it wasn’t relevant to the final size reduction for running in production. Of course, it’s a valid use-case! Feel free to add your ideas in the comments below. I’d love to hear what you think!
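For reference, here is a rough sketch of what such a setup could look like. It assumes `package.json` defines a `test` script, and the stage names `deps` and `test` are my own picks. Treat it as one possible layout, not the setup used for the numbers above.

```dockerfile
# Stage 1: production dependencies only
FROM alpine AS deps
WORKDIR /usr/src/app
RUN apk add --no-cache --update nodejs nodejs-npm
COPY package.json package-lock.json ./
RUN npm install --production

# Stage 2: add development dependencies and run the tests
# (assumes a "test" script exists in package.json)
FROM deps AS test
RUN npm install
COPY . .
RUN npm test

# Stage 3: final image, built from the production dependencies only
FROM alpine
WORKDIR /usr/src/app
RUN apk add --no-cache --update nodejs
COPY --from=deps /usr/src/app/node_modules ./node_modules
COPY . .
EXPOSE 3000
CMD ["node", "app.js"]
```

With BuildKit, stages that the final image doesn’t depend on are skipped, so in CI you would run `docker build --target test .` first and only then build the final image. Nothing from the test stage ends up in the production image.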

If you want to check out any of my previous DevOps-related articles about Docker and Kubernetes, feel free to head over to my profile.

Hope you guys and girls enjoyed reading this as much as I enjoyed writing it. Do you think this tutorial will be of help to someone? Do not hesitate to share. If you liked it, smash the unicorn below so other people will see this here on DEV.to.

Top comments (9)

Vinay Hegde • Nov 16 '18

Great article for people like me grappling with the nuances of Docker, Adnan!

Quick question - Is the final code snippet you've shared (starting with FROM alpine AS builder) and ending with (CMD ["node", "app.js"]) part of the same Dockerfile?

If yes, won't that add more layers & thus increase overall size? Please correct me if I'm wrong.

Adnan Rahić • Nov 17 '18

On the contrary. It'll reduce the number of layers. The builder image is an intermediary image and will not be part of the final image. The bottom part of the Dockerfile is where you create a fresh image and copy over the node_modules. In doing so, you drastically reduce the number of layers and exclude build dependencies.

Vinay Hegde • Nov 17 '18

Thanks for the explanation that builder images are intermediate and not a part of the final one.

While I'll surely dabble with it myself for better clarity, from your experience so far - is this reduction possible for only Node apps or can be replicated for virtually any application?

Adnan Rahić • Nov 17 '18

It has nothing to do with the programming language or runtime. It solely has to do with the way you structure and build your images.

Joe Hobot • Nov 15 '18

Sweet, really nice written post.

Adnan Rahić • Nov 16 '18

Thanks Joe!

Antonio • May 16 '19

Great post. I'm starting to play with Docker so this is very helpful! Thanks.

Brodan • Nov 16 '18

Great timing on this post! I just joined a new company/team and one of my first tasks involves configuring/optimizing Docker! Looking forward to using what I learned from here!

Adnan Rahić • Nov 17 '18

That's awesome! Good luck. 😀