This post originally appeared on the Scout blog.
These days Docker is everywhere! Since this popular, open-source container tool first launched in 2013 it has gone on to revolutionize how we think about deploying our applications. But if you missed the boat with containerization and are left feeling confused about what exactly Docker is and how it can benefit you, then we’ve put together this post to help clear up any confusion you might have.
Docker is a tool that lets you create, run and deploy your applications in containers. These lightweight containers can be deployed on a server without you having to worry about the specifics of the underlying system. This gives developers great power and flexibility and allows us to avoid “dependency hell” issues when deploying in different environments. Docker is an open source technology built on Linux kernel features such as namespaces and control groups, which makes containers much faster to start and more lightweight than VMs (Virtual Machines).
A container can best be thought of as a standard unit of software: a self-contained package, if you like. Everything that is needed to run this software, such as code, libraries, tools and dependencies, is packaged up together. This neat little package can then be replicated in different environments (staging, live system, experiments etc.) and it will run in the same way, no matter how the underlying host is set up.
Now, you might think that this sounds very similar to a VM, and you’re right, it does! But there are some key differences with containers that make them particularly useful for the deployment of software applications. The key distinction between a container and a VM is that containers virtualize the OS, whereas VMs virtualize the hardware. This allows containers to share elements with each other, such as the underlying operating system kernel of the machine where the container is running. This gives more efficient performance, much smaller file sizes and faster deployments.
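One way to see this kernel sharing for ourselves is with a tiny Python script. This is only a sketch, and the function name kernel_summary is our own choice for illustration. If we run it on a Linux host and then inside a container on that same host, we would expect both to report the same kernel release, whereas a VM would report the kernel of its own guest OS. Note that Docker Desktop on macOS and Windows runs containers inside a small Linux VM, so there the container reports that VM’s kernel rather than the host’s.

```python
import platform


def kernel_summary() -> str:
    """Return the OS name and kernel release that this process sees."""
    # kernel_summary is a hypothetical helper name, used only for this illustration.
    return f"{platform.system()} {platform.release()}"


if __name__ == "__main__":
    # Run this on the host and again inside a container, then compare the output.
    print(kernel_summary())
```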
So how do we create one of these containers? First we start with an image, then we customize it to our needs, and then we run it. That leads us to the next question: what exactly is a Docker image?
An image is a shareable chunk of functionality such as a web server, a database engine, or a Linux distribution. You can think of an image as a blueprint, a starting point for your containers. Each image is an untouched, fresh install of a complete environment. A container, then, is a running instance of an image after it has been customized and set up.
These images can be hosted on Docker Hub, or in your own private repositories, and downloaded for reuse. To run a container from an image, you need the Docker Engine. This running container is the complete package that we talked about earlier.
Docker Engine is the name of the runtime that makes this whole system work. When people refer to Docker, they are usually talking about the Docker Engine. Containers run inside this Docker Engine. The most common way to run and manage containers is by using the Docker CLI (Command Line Interface) application. The Docker CLI communicates with the Docker Engine.
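To make the split between the CLI and the Engine a little more concrete, here is a minimal Python sketch that asks the CLI to report the Engine’s version. It assumes the docker command may or may not be installed, and the helper name engine_version is our own invention. The CLI is only a client, so if it cannot reach a running Engine the command fails and the helper returns None.

```python
import shutil
import subprocess
from typing import Optional


def engine_version() -> Optional[str]:
    """Ask the Docker CLI for the Engine (server) version, or return None."""
    # engine_version is a hypothetical helper written for this post.
    if shutil.which("docker") is None:
        return None  # the CLI is not installed on this machine
    try:
        result = subprocess.run(
            ["docker", "version", "--format", "{{.Server.Version}}"],
            capture_output=True,
            text=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0:
        return None  # the CLI ran but could not talk to an Engine
    return result.stdout.strip() or None


if __name__ == "__main__":
    version = engine_version()
    print(version if version else "No Docker Engine reachable from this machine")
```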
So let’s take a look at how we can create, run and delete containers using the Docker CLI.
Running this command will list all the images that you have on your machine:
$ docker images
We can download an image from Docker Hub with the ‘pull’ command:
$ docker pull image-name
We can then create a container from an image and run it with the ‘run’ command:
$ docker run image-name
If we want to see what containers exist (running or stopped) we can use ‘ps’ with the ‘-a’ flag for all:
$ docker ps -a
If you don’t have any images yet you can start with a simple hello world example from Docker Hub. The ‘run’ command will start a new container using the image that you specify. If that image doesn’t exist locally, Docker will try to pull it from Docker Hub first:
$ docker run hello-world
To clean up afterwards you can find the container’s ID by running ‘docker ps -a’ and then use the ‘docker rm’ command to delete it. Alternatively, the command below deletes all stopped containers:
$ docker container prune
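Before pruning, it can be reassuring to see which containers have actually exited. The sketch below is one possible way to do that from Python by calling the same CLI. The helper names parse_ps_lines and stopped_containers are hypothetical, and the script simply returns an empty list when Docker is not installed or not reachable. Treat it as a rough preview rather than an exact list of what prune will remove.

```python
import shutil
import subprocess
from typing import List, Tuple


def parse_ps_lines(output: str) -> List[Tuple[str, str, str]]:
    """Split tab-separated 'ID, image, status' lines into tuples."""
    rows = []
    for line in output.splitlines():
        parts = line.split("\t")
        if len(parts) == 3:  # skip blank or unexpected lines
            rows.append((parts[0], parts[1], parts[2]))
    return rows


def stopped_containers() -> List[Tuple[str, str, str]]:
    """List exited containers via 'docker ps -a', or [] if Docker is unavailable."""
    if shutil.which("docker") is None:
        return []
    try:
        result = subprocess.run(
            ["docker", "ps", "-a", "--filter", "status=exited",
             "--format", "{{.ID}}\t{{.Image}}\t{{.Status}}"],
            capture_output=True,
            text=True,
            timeout=10,
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        return []
    return parse_ps_lines(result.stdout)


if __name__ == "__main__":
    for container_id, image, status in stopped_containers():
        print(container_id, image, status)
```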
We’ve seen how to create and run containers from the command line, but in reality creating and running a container with many dependencies and start-up requirements can quickly become quite complex to do in a single command. Therefore, we can instead use a special file called a Dockerfile, which we can place inside our project’s directory and share in source control. Anybody with this Dockerfile can then run the same container in exactly the same way.
A Dockerfile is a file in which we can describe our Docker container. Here we can define things such as the name of an image to start from, where our application source code is located, and commands that need to be run to start our application. Using this Dockerfile, Docker has all the information that it needs to create and run our application inside a container. For a simple container, this one file is all we need to completely manage our container. So let’s take a look at what a sample Dockerfile for a Python project might look like.
# Start from the official Python image
FROM python:3

# Make /code the working directory
WORKDIR /code

# Copy everything in the working directory to the /code directory inside the container
COPY . /code

# Use the Python installer to install packages defined in requirements.txt
RUN pip install -r requirements.txt

# When the container starts, run the file app.py
CMD ["python", "app.py"]
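The Dockerfile above expects an app.py file in the project directory, but it doesn’t tell us what is in it. As a stand-in, here is a minimal, hypothetical app.py that uses only the standard library. The function name describe_environment is our own, and we are assuming a requirements.txt file sits alongside it, as the Dockerfile expects. When run inside the container, we would expect the working directory it prints to be /code, because of the WORKDIR instruction.

```python
# app.py (hypothetical): the file that CMD ["python", "app.py"] would run
import os
import platform


def describe_environment() -> str:
    """Describe the Python version and working directory of this process."""
    return (
        f"Hello from Python {platform.python_version()}, "
        f"working directory {os.getcwd()}"
    )


if __name__ == "__main__":
    print(describe_environment())
```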
Once we have our Dockerfile, we can create an image from it in the current directory like this (replace image-name with your own name):
$ docker build -t image-name .
And then we can run it as a container like this:
$ docker run image-name
In the real world, our applications span multiple processes. For example, perhaps we have a web application, which sits on top of a database engine and interfaces with a REST API. How can we make that system work with containers?
Well, the idea is that each container should do just one task, and so each of these separate parts of our system should be in its own container. This means we need a multi-container environment. So now we need to manage how these separate containers can work together and communicate with each other, and this is where Docker Compose comes in.
To use Docker Compose we need to create a docker-compose.yml file as well as a Dockerfile. The docker-compose.yml file ties together multiple containers which are referred to as services. In this example file, there is a “db” service and a “web” service. The “db” service uses the official PostgreSQL image and the “web” service uses an image that has been built in this current directory using a Dockerfile.
version: '3'
services:
  db:
    image: postgres
  web:
    build: .
    command: python manage.py runserver 0.0.0.0:8000
    volumes:
      - .:/code
    ports:
      - "8000:8000"
    depends_on:
      - db
There is also a separate CLI application for Docker Compose, which builds on top of the standard Docker CLI. Once we have a Dockerfile and a docker-compose.yml file set up we can run all our connected containers as one like this:
$ docker-compose up
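One detail worth knowing about the “db” and “web” services in the docker-compose.yml example: Compose puts them on a shared network where each service can be reached by its service name, and the depends_on entry only controls start order. It does not wait for the database to be ready. A common workaround is for the web application to retry until the database port accepts connections. The sketch below shows one possible version of that idea. The helper name wait_for_port is hypothetical, and 5432 is PostgreSQL’s default port. An open port is only a rough signal, since the database may still need a moment before it accepts queries.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0,
                  interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # Try to open (and immediately close) a TCP connection.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not reachable yet (refused, timed out, or name not resolvable).
            time.sleep(interval)
    return False


if __name__ == "__main__":
    # "db" is the service name from docker-compose.yml, used here as a hostname.
    if wait_for_port("db", 5432):
        print("db is accepting connections")
    else:
        print("db did not become reachable in time")
```

Inside the “web” container, this could be called at start-up before the application opens its first database connection.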
Docker Hub is Docker’s official public repository of container images. Here you can find official images for Linux distributions, database engines and more, which you can use as starting points for your own containers. For example, if your application uses a PostgreSQL database engine, then you can reference the official PostgreSQL image from Docker Hub in your Dockerfile or docker-compose.yml file to use it straight away.