Akhilesh Mishra

Posted on • Originally published at akhileshmishra.substack.com

Stop Learning Docker for Dummies. Learn It Like a DevOps Engineer

Most Docker tutorials teach you commands.

Not how Docker actually works in real systems.

That's why people can run docker run but freeze when asked to debug a broken container in production.

This post walks you through the concepts that close that gap, in order. Layer caching. Multi-stage builds. Networking. Volumes. ENTRYPOINT vs CMD. docker inspect. The concepts most DevOps engineers fake their way through until they get burned.

Let's go.

The problem Docker solved

The Docker story began with the problems people had 20-30 years ago.

You had bare hardware. You installed an OS on top. You compiled your app and sorted every dependency by hand.

Need to run another application? Buy another server. Spend days setting it up from scratch.

Then virtualization came. Hypervisors let you spin 10–20 VMs on the same hardware.

Better. But the dependency problem stayed.

You still had to install and configure everything on every single VM. Apps worked on one machine. Failed on another.

"Works on my machine" became the most dreaded phrase in software engineering.

Docker killed that phrase.

Why Docker images exist

You install software directly on the machine.

Works fine. Until your app needs Python 3.8 and another app on the same machine needs Python 3.11.

They conflict. One of them breaks.

You start managing dependencies manually. It becomes a full-time job.

So Docker introduced the image.

A Docker image packages everything your app needs. Code. Runtime. Libraries. Environment variables. Config files. All of it, in one artifact.

You build it once. Run it anywhere.

No more "works on my machine." The machine is inside the image.

But an image sitting on disk does nothing. You need to run it.

What is a Docker container

A container is a running process. A live instance of your image.

One image can spin up dozens of containers at the same time. On any machine. On any cloud.

```bash
docker run -d -t --name Thor alpine
docker run -d -t busybox
```

This spins up two containers. Both are minimalist Linux images pulled from Docker Hub.

  • -d runs the container in the background
  • -t allocates a pseudo-terminal; without it, a shell-based container like alpine exits immediately
  • --name gives it a name. Skip it and Docker invents a random one for you

Images sitting idle are useless. Containers made them run.

Inspecting and managing running containers

You have containers running. You don't know which ones.

```bash
docker ps        # running containers only
docker ps -a     # all containers, including stopped ones
docker image ls  # see images on your machine
```

Notice the size. Alpine is around 7MB. A full Ubuntu VM is gigabytes.

That's why you can run 50 containers where you might fit only 5 VMs.

To interact with a running container, exec into it:

```bash
# Run a single command inside the container
docker exec -t Thor ls
docker exec -t Thor ps

# Open an interactive shell
docker exec -it Thor sh
```

-it opens an interactive terminal session. From here you can inspect the filesystem, check processes, read logs, debug your app live. Type exit to come back out.

Container lifecycle commands you'll use every day:

```bash
docker stop Thor        # gracefully stop a running container
docker start Thor       # start a stopped container
docker rm Thor          # remove a stopped container
docker rm -f Thor       # force remove a running container
```

Containers are ephemeral by design. Stop them. Delete them. Spin new ones from the same image. Repeat. The image never changes. Only the containers do.

Building your own Docker image with a Dockerfile

Running other people's images only takes you so far. You need to package your own app.

That's what the Dockerfile does. A blueprint. A plain text file of instructions.

```dockerfile
FROM python:3.11
WORKDIR /app
COPY requirements.txt /app
RUN pip install --no-cache-dir -r requirements.txt
COPY app.py /app
EXPOSE 5000
CMD ["python", "app.py"]
```

What each instruction does:

  • FROM — sets the base image
  • WORKDIR — sets the working directory inside the container
  • COPY — copies files from your machine into the image
  • RUN — executes commands during the build
  • EXPOSE — documents which port your app listens on
  • CMD — the command that runs when the container starts

Build it:

```bash
docker build -t flask-image .
```

The . tells Docker to look for the Dockerfile in the current directory.
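That directory is also the build context: everything in it gets sent to the Docker daemon, and everything in it can end up inside a `COPY . /app` layer. A `.dockerignore` keeps junk out of both. The entries below are typical for a Python project, not taken from the example above; adjust to what your repo actually contains.

```text
# .dockerignore — excluded from the build context
.git
__pycache__/
*.pyc
.venv/
.env
```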

But your builds keep taking 3 minutes even when you changed one line of code.

That's a layer caching problem.

Docker layer caching: why your builds are slow

Docker builds images in layers. Every instruction in your Dockerfile is a layer.

When you rebuild, Docker checks each layer top to bottom.

If nothing changed in that layer, Docker reuses the cached version and moves on. If something changed, Docker rebuilds that layer and every layer after it.

This is why order matters in your Dockerfile.

Wrong order — cache breaks every time:

```dockerfile
COPY . /app                          # copies everything including app code
RUN pip install -r requirements.txt  # installs dependencies
```

You change one line in app.py. Docker sees the COPY changed. It invalidates the cache. It reinstalls all your dependencies from scratch. Every single time.

Right order — cache works for you:

```dockerfile
COPY requirements.txt /app           # copy only the dependency file first
RUN pip install -r requirements.txt  # cached until requirements change
COPY . /app                          # copy app code last
```

You change app.py. Docker sees requirements.txt hasn't changed. It uses the cached pip install. Only the final COPY reruns.

Build goes from 3 minutes to 10 seconds.

The rule: copy what changes least, first. Copy what changes most, last.

Cache was there all along. You just had to stop fighting it.

Multi-stage Docker builds: shrink your image size

Your image is 1.2GB. That's a problem.

A large image means slow pulls across environments. More storage cost. More packages installed means more CVEs to patch. A bigger attack surface for anyone who gets inside.

The culprit is usually your base image.

python:3.11 is convenient. It is also built on Debian and ships with compilers, build tools, and packages your running app never needs.

Switch to python:3.11-slim and the base drops to roughly 130MB. Switch to python:3.11-alpine and it drops to around 50MB, though Alpine's musl libc can force slow source builds for some Python packages.

But sometimes you need the full build environment to compile dependencies. You just don't need it at runtime.

That's what multi-stage builds solve. Build in one image. Run in another.

```dockerfile
# Stage 1: Build
FROM python:3.11 AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Stage 2: Run
FROM python:3.11-slim
WORKDIR /app
COPY --from=builder /usr/local/lib/python3.11/site-packages /usr/local/lib/python3.11/site-packages
COPY app.py .
EXPOSE 5000
CMD ["python", "app.py"]
```

The first stage uses the full image to install everything.

The second stage starts clean from a slim base. It copies only what it needs from the builder — the installed packages and your app code. Nothing else.

Your final image has no compiler. No build tools. No leftover cache. Just your app and what it needs to run.

Image goes from 1.2GB to under 150MB. Fewer packages. Fewer CVEs. Faster pulls. Smaller attack surface.

Your build environment and your runtime environment are finally separate.

CMD vs ENTRYPOINT in Docker (the real difference)

Most tutorials treat them as the same thing. They are not.

CMD sets the default command. It can be completely overridden.

```bash
docker run flask-image python debug.py
```

That python debug.py replaces your CMD entirely. The container runs your debug script instead.

ENTRYPOINT sets the command that always runs. Arguments you pass to docker run don't replace it; they get appended to it. (Only the --entrypoint flag swaps it out.)

```dockerfile
ENTRYPOINT ["python"]
CMD ["app.py"]
```

Now python always runs. app.py is just the default argument.

```bash
docker run flask-image               # runs: python app.py
docker run flask-image debug.py      # runs: python debug.py
docker run flask-image --version     # runs: python --version
```

The pattern in production: use ENTRYPOINT for the executable that never changes. Use CMD for the default arguments that might.

Treat them as the same and you hit confusing behavior when passing arguments to containers. Now you know why.
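One common production shape looks like this. The gunicorn server and the app:app module path are illustrative stand-ins, not part of the Flask example above:

```dockerfile
# The executable never changes; the arguments are overridable defaults.
ENTRYPOINT ["gunicorn"]
CMD ["--bind", "0.0.0.0:5000", "app:app"]
```

Running `docker run image --bind 0.0.0.0:8000 app:app` replaces only the arguments; gunicorn still runs.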

Pushing images to Docker Hub

Docker Hub is the public registry for images. Like GitHub, but for Docker images.

Tag your image first. The tag tells Docker where to push it.

```bash
docker tag flask-image yourusername/flask-demo:1.0
docker login
docker push yourusername/flask-demo:1.0
```

Now anyone, on any machine, anywhere in the world, can pull and run your app:

```bash
docker pull yourusername/flask-demo:1.0
docker run -td -p 8080:5000 yourusername/flask-demo:1.0
```

The image is portable. The environment comes with it. Same behavior everywhere.

But now you hit a new problem. Your container is running. You can't reach it.

Docker networking explained: bridge networks and port forwarding

By default, Docker uses a bridge network.

The container gets its own IP. Your host machine is on a different network. They cannot talk directly.

You try to curl your Nginx container from your laptop. It fails. The container is running. But it is isolated.

Port forwarding solves this:

```bash
docker run -t -d -p 5000:80 --name nginx-container nginx:latest
```

This maps port 5000 on your host to port 80 inside the container. The format is host:container. Now localhost:5000 reaches your container.
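One detail -p cannot fix for you: the process inside the container has to listen on 0.0.0.0, not 127.0.0.1, or the forwarded traffic never reaches it. Here is a dependency-free sketch of such a server, a hypothetical stand-in for the Nginx and Flask examples in this post. It binds an ephemeral local port just to demonstrate a request; the comment marks what you would do inside a container.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"hello from inside the container\n"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep demo output quiet
        pass

# In a container you would bind ("0.0.0.0", 5000) and serve_forever(),
# so that -p 5000:5000 can forward host traffic in. For a local demo we
# bind an ephemeral port and issue a single request against it.
server = HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
with urllib.request.urlopen(f"http://127.0.0.1:{server.server_port}/") as r:
    print(r.read().decode(), end="")  # hello from inside the container
server.shutdown()
```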

But here's the other annoying thing about the default bridge network — containers on it cannot reach each other by name. Only by IP. And IPs change every time a container restarts. You cannot hardcode them.

User-defined Docker networks: how containers talk by name

```bash
docker network create my-network
```

A user-defined network gives you two things the default bridge doesn't:

  1. Isolation — your containers live in their own network, walled off from containers on other networks
  2. Name resolution — containers can reach each other by name, not by IP

```bash
docker run -itd --network my-network --name web-app nginx
docker run -itd --network my-network --name api-app busybox
```

Now web-app can ping api-app by name. Restart either one with a new IP. Name resolution still works.

In production, your app container talks to your database container by name. This is how.

You can also inspect networks to see who's connected:

```bash
docker network ls
docker network inspect my-network
```

Containers could not find each other reliably. User-defined networks fixed that.

Docker volumes: persistent storage for containers

You stop a container. Start it again. The data inside is gone. Logs. Database writes. Uploaded files. All wiped.

Containers are ephemeral. That is by design. But data cannot be.

Volumes solve this. A volume is storage managed by Docker that lives outside the lifecycle of any container.

```bash
docker volume create mydata
docker run -d --mount source=mydata,target=/app nginx:latest
```

The container writes to /app. Docker stores that data in the volume.

Delete the container. Create a new one. Mount the same volume. The data is still there.

You can also mount the same volume into multiple containers:

```bash
docker run -td --mount source=mydata,target=/app/log --name container-1 busybox
docker run -td --mount source=mydata,target=/app/log --name container-2 busybox
```

Anything container-1 writes to /app/log, container-2 can read. And vice versa.

Your log aggregator reads what your app writes. Your backup container reads what your database writes.
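The mental model is worth making concrete: two containers sharing a volume never talk to each other directly, only through the filesystem. A small local analogy, with a temp directory standing in for the mydata volume and two plain functions standing in for the two containers (all names here are illustrative):

```python
import tempfile
from pathlib import Path

# Stands in for the mydata volume: one directory, outside both "containers".
shared = Path(tempfile.mkdtemp())

def container_1_write(line: str) -> None:
    """E.g. your app appending to its log file under the mount point."""
    with open(shared / "app.log", "a") as f:
        f.write(line + "\n")

def container_2_read() -> list[str]:
    """E.g. your log aggregator reading the same file via its own mount."""
    return (shared / "app.log").read_text().splitlines()

container_1_write("request handled in 12ms")
print(container_2_read())  # ['request handled in 12ms']
```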

Volume commands you'll use regularly:

```bash
docker volume create mydata     # create a volume
docker volume ls                # list volumes
docker volume inspect mydata    # see details including mount path on host
docker volume rm mydata         # delete a volume
```

Containers were ephemeral. Volumes made data survive them.

Debugging Docker containers in production with docker inspect

This is where most engineers get stuck. They can start containers. They cannot debug them.

docker inspect is the command that changes that.

```bash
docker inspect flask-container
```

It returns everything Docker knows about a running container. In JSON.

The things you actually use it for in production:

```bash
# What network is this container on? What IP did it get?
docker inspect flask-container | grep -A 20 "Networks"

# What volumes are mounted? Where do they point on the host?
docker inspect flask-container | grep -A 10 "Mounts"

# What environment variables were actually injected at runtime?
docker inspect flask-container | grep -A 20 "Env"

# Is the health check passing or failing?
docker inspect flask-container | grep -A 10 "Health"
```

The difference between what you think is running and what is actually running often lives in this output.

Container behaving differently in staging than local? Check the environment variables that were actually injected. Volume not persisting? Check where it is actually mounted. Network connectivity failing? Check which network the container actually joined.

docker inspect shows you reality. Everything else is assumption.
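Because the output is JSON, you can do better than grep when you need specific fields. A sketch in Python, using a heavily abbreviated and hypothetical sample of inspect output (real output has far more fields):

```python
import json

# Abbreviated, made-up sample of `docker inspect flask-container` output.
sample = """[{
  "Name": "/flask-container",
  "Config": {"Env": ["PYTHON_VERSION=3.11", "APP_ENV=staging"]},
  "NetworkSettings": {"Networks": {"my-network": {"IPAddress": "172.18.0.2"}}}
}]"""

data = json.loads(sample)[0]  # inspect returns a JSON array of containers

# Env arrives as "KEY=VALUE" strings; split on the first "=" only.
env = dict(e.split("=", 1) for e in data["Config"]["Env"])

# Which networks did the container actually join, and with what IP?
networks = {name: net["IPAddress"]
            for name, net in data["NetworkSettings"]["Networks"].items()}

print(env["APP_ENV"])  # staging
print(networks)        # {'my-network': '172.18.0.2'}
```

In practice you can also let Docker do the filtering with its Go-template flag, e.g. `docker inspect --format '{{json .Config.Env}}' flask-container`.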

Wrapping up

These are the advanced Docker concepts that separate engineers who run containers from engineers who ship them to production:

  • Layer caching — copy what changes least, first
  • Multi-stage builds — separate build environment from runtime
  • CMD vs ENTRYPOINT — defaults vs always-runs
  • User-defined networks — name resolution between containers
  • Volumes — make data survive containers
  • docker inspect — see what's actually running, not what you think is running

Every one of these exists because something broke in production. Now you know the story behind each.

If you are serious about production-grade DevOps — not tutorial DevOps — I run a 25-week live bootcamp covering AWS, Kubernetes, MLOps, and AIOps. Real projects. Real troubleshooting. The kind of skills that actually show up in interviews.

[25-Week AWS DevOps + MLOps + AIOps Bootcamp →]
