Docker Best Practices: Images

By Grigor Khachatryan

Understand build context

When you issue a docker build command, the directory you pass to it (usually the current working directory, written as .) is called the build context, and its contents are sent to the Docker daemon. To exclude files not relevant to the build without restructuring your source repository, use a .dockerignore file. This file supports exclusion patterns similar to those in .gitignore files. For information on creating one, see the Docker documentation on the .dockerignore file.

Non-root user

By default, Docker runs the container's processes as root, which can pose a security risk if a process inside the container is compromised. Run the container as an unprivileged user wherever possible.
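As a minimal sketch of the idea, assuming an Alpine-based image: the user and group name app, the working directory, and the ./app binary are hypothetical and only illustrate where the USER instruction goes.

```dockerfile
# Assumed base image; any image with addgroup/adduser works similarly
FROM alpine:3.19

# Create a system group and user (the name "app" is hypothetical)
RUN addgroup -S app && adduser -S -G app app

WORKDIR /home/app

# Copy files so the unprivileged user owns them
COPY --chown=app:app . .

# Every instruction after this line, and the container itself, runs as "app"
USER app

CMD ["./app"]
```

Placing USER after the steps that need root (such as installing packages) keeps those steps working while the final process still runs unprivileged.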

Minimizing number of layers

Try to reduce the number of layers that will be created in your Dockerfile. The instructions RUN, COPY, ADD create layers. Other instructions create temporary intermediate images, and do not directly increase the size of the build.

Example:

Suppose you need to get a zip file and extract it and remove the zip file. There are two possible ways to do this.

COPY <filename>.zip <copy_directory>/
RUN unzip <copy_directory>/<filename>.zip -d <copy_directory>
RUN rm <copy_directory>/<filename>.zip

or in one RUN block:

RUN curl -fsSL <file_download_url> -o <copy_directory>/<filename>.zip \
&& unzip <copy_directory>/<filename>.zip -d <copy_directory> \
&& rm <copy_directory>/<filename>.zip

The first method creates three layers, and the unwanted .zip stays in the image because the layer that added it is never removed, which increases the image size. The second method creates a single layer and never stores the .zip, so it is preferred when minimizing layers and image size is the highest priority. Its drawback is that a change to any one of the commands causes all of them to run again, whereas separate instructions let the docker build cache skip the ones that have not changed. Choose the strategy that works best for your situation.

For more detailed information see Best practices for writing Dockerfiles.

However, executing all commands in one RUN block can make the script more opaque, especially when mixing && and || statements. An alternative syntax is to use line continuation as you normally would, but explicitly switch on the shell's "exit on error" mode.

RUN set -e ;\
    echo 'successful!' ;\
    echo 'but the next line will exit: ' ;\
    false ;\
    echo 'this line never runs'

# now you can use traditional shell flow of control without worry:
RUN set -e ;\
    echo 'next line will take evasive action' ;\
    if ! false; then \
      echo 'it seems that was false' >&2 ;\
    fi ;\
    echo 'and the script continues'

Use multi-stage builds

With multi-stage builds, you use multiple FROM statements in your Dockerfile. Each FROM instruction can use a different base, and each of them begins a new stage of the build. You can selectively copy artifacts from one stage to another, leaving behind everything you don’t want in the final image. To show how this works, let’s take a look at this Dockerfile:

FROM golang:1.7.3
WORKDIR /go/src/github.com/alexellis/href-counter/
RUN go get -d -v golang.org/x/net/html  
COPY app.go .
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o app .

FROM alpine:latest  
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=0 /go/src/github.com/alexellis/href-counter/app .
CMD ["./app"]  
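The same build can also be written with a named stage, which may be easier to maintain than the numeric index. This is a sketch of the Dockerfile above with only two changes: the first stage is given the name builder (an arbitrary label chosen here), and COPY --from refers to that name instead of 0.

```dockerfile
# Stage 1: compile the binary; "builder" is an arbitrary stage name
FROM golang:1.7.3 AS builder
WORKDIR /go/src/github.com/alexellis/href-counter/
RUN go get -d -v golang.org/x/net/html
COPY app.go .
RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o app .

# Stage 2: the final image contains only the compiled binary and CA certificates
FROM alpine:latest
RUN apk --no-cache add ca-certificates
WORKDIR /root/
COPY --from=builder /go/src/github.com/alexellis/href-counter/app .
CMD ["./app"]
```

Referring to the stage by name means the COPY line keeps working if stages are later reordered or another stage is inserted before it.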

Minimizing layer size

Some installations create data that isn’t needed. Try to remove this unnecessary data within layers:

RUN yum install -y epel-release && \
    rpmkeys --import file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7 && \
    yum install -y --setopt=tsflags=nodocs bind-utils gettext iproute\
    v8314 mongodb24-mongodb mongodb24 && \
    yum -y clean all
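The same cleanup pattern on a Debian-based image might look like the following sketch. The base image tag and the curl package are assumptions chosen only for illustration; the point is that the package index is deleted in the same RUN that downloaded it, so it never ends up in a layer.

```dockerfile
# Assumed base image for illustration
FROM debian:bookworm-slim

# Install without recommended extras, then remove the apt package lists
# in the same layer so they do not add to the image size
RUN apt-get update \
 && apt-get install -y --no-install-recommends curl \
 && rm -rf /var/lib/apt/lists/*
```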

Avoid npm and pm2 in CMD

When creating an image, you can bypass the package.json’s start command and bake the command directly into the image itself. First, this reduces the number of processes running inside your container. Second, it causes exit signals such as SIGTERM and SIGINT to be received by the Node.js process instead of being swallowed by npm.

CMD ["node","index.js"]
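In context, a Node.js image following this advice could look roughly like the sketch below. The base image tag, the /app directory, and the presence of a package-lock.json (which npm ci requires) are assumptions, and index.js is taken from the example above.

```dockerfile
# Assumed base image for illustration
FROM node:20-alpine
WORKDIR /app

# Install production dependencies first so this layer is cached
# until package.json or package-lock.json changes
COPY package*.json ./
RUN npm ci --omit=dev

COPY . .

# Instead of: CMD ["npm", "start"]
# Run node directly so it receives SIGTERM and SIGINT itself
CMD ["node", "index.js"]
```

The exec (JSON array) form matters here as well: it starts node without a wrapping shell, so no intermediate process sits between Docker and the application.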

RUN-only environment variables
If one needs an environment variable set during a RUN block, but it is either unnecessary, or potentially disruptive to downstream images, then one can set the variable in the RUN block instead of using ENV to declare it globally in the image:

RUN export DEBIAN_FRONTEND=noninteractive ;\
    apt-get update ;\
    echo and so forth
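A small sketch can make the scoping visible. The base image and the tzdata package are assumptions for illustration; the final RUN is only a check that the variable did not leak into later instructions.

```dockerfile
# Assumed base image for illustration
FROM debian:bookworm-slim

# The variable exists only in the shell that runs this one instruction
RUN export DEBIAN_FRONTEND=noninteractive \
 && apt-get update \
 && apt-get install -y --no-install-recommends tzdata \
 && rm -rf /var/lib/apt/lists/*

# A later RUN starts a fresh shell; this check passes because the variable is unset
RUN test -z "${DEBIAN_FRONTEND:-}"
```

Had the variable been declared with ENV instead, the last check would fail, and the value would also be present in containers and in images built on top of this one.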

Tagging

Use tags to reference specific versions of your image.

A tag identifies one specific Docker image. The tagging strategy should therefore include a unique identifier, such as the build ID from a CI server (e.g. Jenkins), so that the correct image can always be identified.

For more detailed information see The tag command.

Log Rotation

Use --log-opt to allow log rotation. This helps if the containers you are creating are too verbose and are created too often due to a continuous deployment process. For more detailed information, see the log driver options.


Originally published here: https://medium.com/devgorilla/docker-best-practices-images-98e9464cc173


Discussion


This isn't the first time I've seen the recommendation to use a non-root user, but I still haven't seen a real example where running as root could actually cause harm. Also, if you Google Dockerfile examples for production, it seems almost no one creates a new user to run the app. I'd be happy to finally find out which cases we need to worry about.

 

There's really no issue... so long as you run the container unprivileged, and totally isolated from the host. Because you can't guarantee how containers based on your image will be run, however, it's important to assume it will be run privileged, or not 100% isolated. If a process within a container is running as root, it will have unrestricted access to each of these paths out to the host and could give you problems if compromised.

And you may say that you'd have to be a total noob to run a container wrong. But it's actually pretty easy to run a container sorta privileged:

  • Mounting any volumes from the host? Anything with root access in the container will now be able to create, delete, or modify files with any UID or GID in that volume directory.

  • Binding a privileged port on the host to a port (even privileged) in the container? Any root process in the container can now listen on that privileged port on the host. This could allow folks to turn your host into an official-looking server by compromising the root processes in the container.

  • Mounting in the docker socket? Any root process in the container can now control the docker daemon on the host. This means a compromised process can start 100% privileged containers with the host's root directory mounted in as a volume.

  • Imagine a Linux kernel user namespace bug that could only be exploited if the namespace user is root. Hypervisor escapes are a real thing. Namespace escapes will be too. See for example:
    lkml.org/lkml/2013/3/4/70
    "It'll abuse the above request_module() call to load any module the user requests -- iregardless of being contained in a user ns or not."

...and the list goes on.

But it all comes down to someone running a container in a way that's not isolated from the host. As long as you can guarantee that that will never happen, and barring bugs in Linux Containers, you are good to run root processes in Docker containers.

More ref:
gist.github.com/FrankSpierings/5c7...

security.stackexchange.com/questio...

 

Great response, thanks a lot for that extensive explanation. Now I see that in my case I should not worry about it.