DEV Community

Kostas Kalafatis

Understanding CMD, ENTRYPOINT and RUN in Docker - An Interlude Post

Disclaimer: This post is heavily inspired by this post on the Docker Blog. It also seeks to answer questions raised by the community in previous posts in this series.

Docker is undeniably a powerful tool for containerization, but its versatility and extensive feature set can be overwhelming, especially for newcomers. The platform offers multiple ways to achieve similar goals, making it crucial for users to carefully weigh the advantages and disadvantages of each option to determine the best approach for their specific projects. This decision-making process requires a solid understanding of Docker's underlying mechanisms and the trade-offs involved in different strategies.

Docker can be a bit of a maze, especially when it comes to the nitty-gritty of instructions like RUN, CMD, and ENTRYPOINT. These three instructions often cause confusion, as they seem to have overlapping functions. But fear not! In this article, we'll unravel the mystery, clearly explaining the differences between them and highlighting when to use each one effectively. By the end, you'll be a Dockerfile ninja, wielding these instructions with confidence and precision!

The RUN Directive

In Docker, the RUN directive is a key instruction used in Dockerfiles to run commands while building your Docker image. It's your go-to tool for setting up the environment inside the image, allowing you to install packages, update dependencies, configure settings, and basically do anything you need to make your image ready to use. Think of it as the stage where you get everything in place before the curtain goes up on your running container.

Here is an example of the RUN directive in practice:

FROM ubuntu:20.04

# Update package lists and install necessary packages
RUN apt-get update && apt-get install -y \
    curl \
    wget \
    && rm -rf /var/lib/apt/lists/*

In this example, the RUN instruction updates the package lists inside the image and installs curl and wget. These are handy tools for downloading files and interacting with websites from the command line.

The && rm -rf /var/lib/apt/lists/* part is added for cleanup. It removes the cached package lists after the installation, making your final Docker image smaller. The cleanup has to happen in the same RUN instruction as the install: each RUN creates its own image layer, so files deleted by a later instruction would still take up space in the earlier layer. Since these lists aren't needed to run your application, it's good practice to get rid of them.
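To see why the steps are chained with &&, here is a small plain-shell sketch. The function names (update_ok, update_fails, install_pkgs, cleanup, build_step) are made-up stand-ins for the real apt-get commands, not anything Docker provides. With &&, each step runs only if the previous one succeeded, so a failed update stops the chain and, in a real Dockerfile, would fail the build.

```shell
#!/bin/sh
# Hypothetical stand-ins for the real steps in the RUN instruction.
update_ok()    { echo "update"; }
update_fails() { echo "update failed" >&2; return 1; }
install_pkgs() { echo "install"; }
cleanup()      { echo "cleanup"; }

# Mirrors `apt-get update && apt-get install ... && rm -rf ...`:
# the first argument picks which "update" stand-in to run.
build_step() {
    "$1" && install_pkgs && cleanup
}

build_step update_ok                        # runs all three steps in order
build_step update_fails || echo "stopped"   # install and cleanup are skipped
```

If the steps were separated with ; instead, the install and cleanup would run even after a failed update, which is rarely what you want in an image build.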

The CMD Directive

The CMD instruction in your Dockerfile sets the default command that will run when you start a container from your image. It's like a pre-set option for what your container should do when it "boots up." But here's the key: you can easily change this default behavior by providing different command-line arguments when you run the container using docker run.

CMD is perfect for situations where you want to provide a sensible default behavior for your container, but also give users the flexibility to customize it. Think of it as setting up a suggested starting point, but allowing users to take the wheel if they want to go in a different direction.

It's a common practice to use CMD in Docker images to define default parameters or configurations that can be easily overridden by the user when running the container.

For example, you might want the container to start a web server by default, but also allow users to override this and run a shell instead:

FROM node:alpine

[...dockerfile truncated...]

CMD ["node", "server.js"]

The ENTRYPOINT Directive

The ENTRYPOINT instruction sets the main executable for your container. It's similar to CMD but with a key difference: when you run a container with docker run, the command you provide doesn't replace the ENTRYPOINT command. Instead, your command gets added to the ENTRYPOINT command as arguments.

Think of it like this: ENTRYPOINT sets the core command your container is designed to run, while any arguments you provide with docker run become extra instructions for that command.

For example, the following Dockerfile will run the web server no matter what arguments users provide:

FROM node:alpine

[...dockerfile truncated...]

ENTRYPOINT ["node", "server.js"]

So CMD or ENTRYPOINT?

CMD and ENTRYPOINT are special instructions in a Dockerfile. While other instructions execute when you build the image, these two come into play when you actually run a container from that image.

Essentially, when a container starts, Docker needs to know what it should do – what program to run, and how to run it. That's where CMD and ENTRYPOINT step in. They tell Docker what the main process inside the container is and how to start it.

Now, the difference between them is a bit tricky, and many people don't fully grasp it. Luckily, most of the time, your container will work fine even if you don't use them perfectly. However, understanding the nuances can make things a lot smoother and less confusing.

To get a better handle on this, let's break down a typical Linux command:

ping -c 10 127.0.0.1

Here, ping is the command itself, and the rest (-c 10 127.0.0.1) are the parameters or arguments we're giving to that command.

Now, back to Docker:

  • ENTRYPOINT: This is where you define the command part of the expression. It's the core thing you want your container to do when it starts.
  • CMD: This is where you define the parameters for that command. They're the additional instructions you want to give it.

So, a Dockerfile that uses Alpine Linux as the base image and wants to run the ping command could look like this:

FROM alpine:latest

ENTRYPOINT ["ping"]
CMD ["-c", "10", "127.0.0.1"]

We can now build an image called pinger from the preceding Dockerfile, as follows:

docker image build -t pinger .

Now we can run a container from the pinger image we just created like this:

docker container run --rm -it pinger
PING 127.0.0.1 (127.0.0.1): 56 data bytes
64 bytes from 127.0.0.1: seq=0 ttl=64 time=0.047 ms
64 bytes from 127.0.0.1: seq=1 ttl=64 time=0.056 ms
64 bytes from 127.0.0.1: seq=2 ttl=64 time=0.038 ms
64 bytes from 127.0.0.1: seq=3 ttl=64 time=0.055 ms
64 bytes from 127.0.0.1: seq=4 ttl=64 time=0.037 ms
64 bytes from 127.0.0.1: seq=5 ttl=64 time=0.036 ms
64 bytes from 127.0.0.1: seq=6 ttl=64 time=0.053 ms
64 bytes from 127.0.0.1: seq=7 ttl=64 time=0.058 ms
64 bytes from 127.0.0.1: seq=8 ttl=64 time=0.048 ms
64 bytes from 127.0.0.1: seq=9 ttl=64 time=0.053 ms

--- 127.0.0.1 ping statistics ---
10 packets transmitted, 10 packets received, 0% packet loss
round-trip min/avg/max = 0.036/0.048/0.058 ms

The great thing about this setup is that you can easily override the default CMD parameters you've set in the Dockerfile. If you remember, we originally defined CMD ["-c", "10", "127.0.0.1"] to ping a specific address ten times.

But now, when you create a new container, you can simply add different values at the end of your docker run command to change the ping target or the number of pings. This gives you a lot of flexibility while still keeping a consistent base command.

docker container run --rm -it pinger -w 5 127.0.0.1
PING 127.0.0.1 (127.0.0.1): 56 data bytes
64 bytes from 127.0.0.1: seq=0 ttl=64 time=0.034 ms
64 bytes from 127.0.0.1: seq=1 ttl=64 time=0.053 ms
64 bytes from 127.0.0.1: seq=2 ttl=64 time=0.040 ms
64 bytes from 127.0.0.1: seq=3 ttl=64 time=0.038 ms
64 bytes from 127.0.0.1: seq=4 ttl=64 time=0.056 ms

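The arguments given after the image name replace the default CMD entirely, so -c 10 no longer applies. Instead, -w 5 sets a five-second deadline, and the container pings the loopback IP address (127.0.0.1) for 5 seconds.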
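The rule at work here can be sketched in a few lines of plain shell. The effective_command function below is a hypothetical helper written for illustration, not part of Docker, and it simplifies things by treating ENTRYPOINT and CMD as space-separated strings rather than JSON arrays. It only models the exec-form behaviour described above: arguments given after the image name replace CMD, and the result is appended to ENTRYPOINT.

```shell
#!/bin/sh
# Hypothetical helper: prints the command line a container would start with.
#   $1     the image's ENTRYPOINT (may be empty)
#   $2     the image's CMD (may be empty)
#   $3...  any arguments given after the image name in `docker run`
effective_command() {
    entrypoint=$1
    default_cmd=$2
    shift 2

    if [ "$#" -gt 0 ]; then
        args=$*            # run-time arguments replace CMD entirely
    else
        args=$default_cmd  # no arguments: CMD supplies the defaults
    fi

    # Join the two parts, skipping whichever one is empty.
    if [ -z "$entrypoint" ]; then
        echo "$args"
    elif [ -z "$args" ]; then
        echo "$entrypoint"
    else
        echo "$entrypoint $args"
    fi
}

effective_command "ping" "-c 10 127.0.0.1"                  # defaults from CMD
effective_command "ping" "-c 10 127.0.0.1" -w 5 127.0.0.1   # CMD overridden
```

The first call prints the default command line of the pinger image, and the second prints what runs when -w 5 127.0.0.1 is passed on the command line.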

Directive Description and Use Cases

The following table provides an overview of these commands and use cases.

| Directive | Description | Syntax Example | Use Cases |
|---|---|---|---|
| RUN | Executes commands during the image build process. It is used to install software packages, configure settings, or perform other setup tasks. The result is part of the image layers. | RUN apt-get update && apt-get install -y nginx | Installing software packages; configuring files or repositories; setting up environment variables or system settings |
| CMD | Specifies the default command to run when a container starts from the image. It can be overridden by providing a command in docker run. It is generally used to provide default arguments to ENTRYPOINT or to set a default command. | CMD ["nginx", "-g", "daemon off;"] | Setting a default command for the container; providing default arguments to ENTRYPOINT; running a service or application by default |
| ENTRYPOINT | Defines the main command that is always executed when a container starts. It is not overridden by arguments provided in docker run unless you use the --entrypoint flag. Useful for ensuring a specific command is always run. | ENTRYPOINT ["nginx", "-g", "daemon off;"] | Ensuring a specific command is always executed; running a primary process or service; used in combination with CMD to provide default arguments |

Unix Signaling and the PID 1

In the world of Unix-like systems, including Docker containers, PID 1 is a special process: the very first one that starts up. Every other process inside the system descends from it, creating a family tree of processes with PID 1 at the top.

In Docker, this PID 1 process is really important because it's responsible for managing everything else inside the container. One of its critical roles is handling signals (like SIGTERM) from the host system, which are essentially messages that tell the container to do something (like gracefully shut down).

Now, when you use the shell form for CMD or ENTRYPOINT, a shell process (usually /bin/sh -c) takes over as PID 1 and starts your application as its child. The problem is that this shell process isn't very good at handling signals. It might not pass them along to your actual application, which can lead to issues like containers not shutting down cleanly.

On the other hand, the exec form lets you run your command directly as PID 1, without any shell in between. This way, the command itself receives and handles signals directly, making the whole thing much more reliable.

So, if your container needs to react to signals promptly and gracefully, the exec form is the way to go. It's particularly crucial for applications that need to respond to events or interruptions, ensuring that everything shuts down properly and your data stays safe.
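The difference between "a shell in front of your process" and "your process itself" can be observed outside Docker with plain sh. The sketch below is only an analogy, and the function names are invented for illustration. Each function prints two PIDs: that of an outer shell, then that of the command it launches. Without exec the command is a child with a new PID, much like an application sitting behind /bin/sh -c. With exec the command replaces the shell and keeps its PID, which is the position an exec-form process is in.

```shell
#!/bin/sh
# Outer shell stays alive and launches the inner command as a child process.
# The trailing `:` keeps the inner command from being the last one in the list,
# so shells that optimise the final command cannot silently exec it.
pids_without_exec() {
    sh -c 'echo "$$"; sh -c "echo \$\$"; :'
}

# `exec` replaces the outer shell with the inner command: same process, same PID.
pids_with_exec() {
    sh -c 'echo "$$"; exec sh -c "echo \$\$"'
}

echo "without exec (two different PIDs):"
pids_without_exec
echo "with exec (the same PID twice):"
pids_with_exec
```

The actual PID values will differ on every run; what matters is whether the two lines in each pair match.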

Shell and exec forms

When you're working with Dockerfiles and setting commands for RUN, CMD, and ENTRYPOINT, you have two ways to do it: the shell form and the exec form. Each has its own strengths and weaknesses, and understanding the difference is key to effectively managing your Docker containers.

Shell Form

The shell form is like writing commands the old-school way, similar to what you'd type in a terminal. When Docker sees a command in shell form, it essentially runs it through a shell (usually /bin/sh -c for Unix-based images). This means the command gets interpreted by the shell, which unlocks some handy features like using environment variables and chaining commands together.

CMD echo "Hello, World!"

This way of writing commands, using the shell form, means that Docker executes the command echo "Hello, World!" through the default shell of your container. This can be handy for simple commands or when you need to use special shell features. However, it has a drawback: it doesn't handle signals (like those used for managing processes) the same way as the exec form.

In other words, if your command needs to respond properly to Unix signals for things like stopping or restarting processes gracefully, the shell form might not be the best choice. It's a bit like having a middleman who might not deliver your messages as reliably as you'd like.

Exec Form

Now, let's talk about the exec form. It's a more specific and reliable way to define commands in your Dockerfile. Instead of writing the whole command as a single string, you break it up into a JSON array, where each part of the command (the command itself and its arguments) is a separate element in the array.

CMD ["echo", "Hello, World!"]

Here, instead of using a shell as a middleman, Docker directly executes the command echo with the argument "Hello, World!". This direct execution method is more reliable because it doesn't depend on the shell's interpretation.

This has a few key advantages:

  • Better Signal Handling: The command itself can receive and respond to Unix signals directly. This is crucial for ensuring your application can gracefully shut down or restart when needed.

  • Precise Argument Passing: You can be confident that the arguments you provide are passed directly to your command without any potential modifications or interference from the shell.
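The argument-passing point can be illustrated with plain shell, again as an analogy rather than Docker's actual implementation. The GREETING variable and both function names are assumptions made up for this sketch. The first function hands a string to sh -c, as the shell form does, so the shell expands the variable before echo runs. The second passes the same text as a literal argument with no shell in between, as the exec form does, so nothing expands it.

```shell
#!/bin/sh
# Shell-form analogue: the string goes through `sh -c`, which expands $GREETING.
shell_form() {
    GREETING="Hello" sh -c 'echo $GREETING, World!'
}

# Exec-form analogue: `env` starts echo directly with a fixed argument list.
# No shell interprets the argument, so $GREETING stays as literal text.
exec_form() {
    env GREETING="Hello" echo '$GREETING, World!'
}

shell_form   # the variable is expanded
exec_form    # the variable name is printed as-is
```

This is why an exec-form instruction that needs shell features has to invoke a shell explicitly as its first array element.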

Key Differences between Shell and Exec

| | Shell Form | Exec Form |
|---|---|---|
| Form | Commands without square brackets ([]). Run by the container's shell | Commands with square brackets ([]). Run directly, not through a shell |
| Variable Substitution | Inherits environment variables from the shell, such as $HOME and $PATH | Does not inherit shell environment variables but behaves the same for ENV instruction variables |
| Shell Features | Supports sub-commands, piping output, chaining commands, I/O redirection, etc. | Does not support shell features |
| Signal Trapping & Forwarding | Most shells do not forward process signals to child processes | Directly traps and forwards signals like SIGINT |
| Usage with ENTRYPOINT | Can cause issues with signal forwarding | Recommended due to better signal handling |
| Usage as ENTRYPOINT Params | Not possible with the shell form | If the first item in the array is not a command, all items are used as parameters for the ENTRYPOINT |

The diagram below will help you decide if you need RUN, CMD or ENTRYPOINT in your Dockerfile.

[Diagram: choosing between RUN, CMD and ENTRYPOINT]

The diagram below will help you decide if you need shell or exec form for your commands.

[Diagram: choosing between shell form and exec form]

You can find high resolution diagrams here.

Running CMD and ENTRYPOINT

The following example will walk you through the high-level differences between CMD and ENTRYPOINT.

First, let's create our Dockerfile:

# Use the Ubuntu 20.04 image as the base image
FROM ubuntu:20.04

# Update the image and install traceroute
RUN apt-get update && apt-get upgrade -y
RUN apt-get install -y traceroute

# Set the default command
CMD traceroute

Then build your image using the docker build -t tracert . command (the trailing dot is the build context, here the current directory).

Run the container image with CMD traceroute

Without passing any arguments, we get the following output.

docker run tracert

Usage:
  traceroute [ -46dFITnreAUDV ] [ -f first_ttl ] [ -g gate,... ] [ -i device ] [ -m max_ttl ] [ -N squeries ] [ -p port ] [ -t tos ] [ -l flow_label ] [ -w MAX,HERE,NEAR ] [ -q nqueries ] [ -s src_addr ] [ -z sendwait ] [ --fwmark=num ] host [ packetlen ]
Options:
  -4                          Use IPv4
  -6                          Use IPv6
  -d  --debug                 Enable socket level debugging
  -F  --dont-fragment         Do not fragment packets
  -f first_ttl  --first=first_ttl
                              Start from the first_ttl hop (instead from 1)
  -g gate,...  --gateway=gate,...
                              Route packets through the specified gateway
                              (maximum 8 for IPv4 and 127 for IPv6)
  -I  --icmp                  Use ICMP ECHO for tracerouting
  -T  --tcp                   Use TCP SYN for tracerouting (default port is 80)
  -i device  --interface=device
                              Specify a network interface to operate with
  -m max_ttl  --max-hops=max_ttl
                              Set the max number of hops (max TTL to be
                              reached). Default is 30
[...output truncated...]

However, if we provide only a hostname to trace, we will get an error:

docker run tracert www.dev.to
docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec: "www.dev.to": executable file not found in $PATH: unknown.

The problem is that the string you're providing on the command line, www.dev.to, is completely replacing the CMD instruction in your Dockerfile. Since that hostname isn't a valid command, it's causing an error.

To fix this, you need to specify the actual command you want to run alongside the hostname.

docker run tracert traceroute www.dev.to
traceroute to www.dev.to (104.18.26.242), 30 hops max, 60 byte packets
[...output truncated...]

Run the container image with ENTRYPOINT traceroute

In this updated version, we'll make a change to the Dockerfile. We will remove the original CMD instruction and replace it with ENTRYPOINT ["traceroute"]. This means the traceroute command is now the main command that this container will run when it starts.

The ENTRYPOINT instruction works a bit differently than CMD. With ENTRYPOINT, you can't simply override the command by typing something different after docker run. Instead, any extra arguments you add after docker run are treated as arguments for the traceroute command itself.

# Use the Ubuntu 20.04 image as the base image
FROM ubuntu:20.04

# Update the image and install traceroute
RUN apt-get update && apt-get upgrade -y
RUN apt-get install -y traceroute

# Set the main executable
ENTRYPOINT ["traceroute"]

After rebuilding the image, let's see what happens when we pass only the hostname:

docker run tracert www.dev.to
traceroute to www.dev.to (104.18.26.242), 30 hops max, 60 byte packets
[...output truncated...]

The key point here is to use ENTRYPOINT when you want to ensure that a specific executable is always run when your container starts, regardless of any additional commands or arguments the user might provide. It gives you a way to create containers that behave like self-contained tools, with a clearly defined main purpose.

Summary

Deciding when to use RUN, CMD, or ENTRYPOINT, and whether to choose the shell or exec form, demonstrates the level of detail and flexibility Docker offers. Each of these instructions plays a distinct role, influencing how images are built and how containers behave and interact with their environment.

By carefully selecting the right instruction and form for each situation, developers can create Docker images that are more dependable and predictable. In particular, preferring the exec form for CMD and ENTRYPOINT lets the application receive signals directly and shut down cleanly, and pairing ENTRYPOINT with CMD gives users sensible defaults they can still override.
