Nishant Gaurav

Posted on Jun 8

You Don't Have a "Works on My Machine" Problem. You Have a Dependency Problem.

#webdev #devops #backend #docker

Every developer has said it. You write code, it runs perfectly on your laptop, you push it to a server or hand it to a teammate, and something breaks. Wrong Python version. A library installed globally that you forgot about. An environment variable that was set in your shell profile but nowhere else.

Docker doesn't fix your code. It fixes the gap between the environment where you wrote your code and the environment where it runs. That's the whole job. And once you see that clearly, everything else about Docker follows logically.

The Problem: Environments Are Invisible Dependencies

Here's a scenario that plays out constantly on engineering teams.

A developer builds a data processing script. It works on their machine. They hand it to a teammate. The teammate runs it and it crashes because a library is the wrong version, or isn't installed at all. So they spend an hour aligning their setups. Two weeks later, it's deployed to production. It crashes again because the server is running a different OS version, and a native dependency that compiled fine locally doesn't compile there.

None of this is a code problem. The code is fine. The problem is that every machine carries hidden state: installed packages, runtime versions, system libraries, OS quirks. And your application depends on all of it, invisibly.
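To make that hidden state visible, here is a minimal sketch that prints what a Python script is implicitly relying on. The function names `environment_fingerprint` and `diff_packages` are made up for illustration; the calls underneath are standard library only.

```python
import platform
from importlib import metadata


def environment_fingerprint() -> dict:
    """Collect the state a script depends on without ever declaring it."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries with broken metadata
            packages[name.lower()] = dist.version
    return {
        "python": platform.python_version(),
        "os": platform.system(),
        "os_release": platform.release(),
        "machine": platform.machine(),
        "packages": dict(sorted(packages.items())),
    }


def diff_packages(mine: dict, theirs: dict) -> dict:
    """Return packages whose versions differ or that exist on only one side."""
    names = set(mine) | set(theirs)
    return {
        name: (mine.get(name), theirs.get(name))
        for name in sorted(names)
        if mine.get(name) != theirs.get(name)
    }
```

Run `environment_fingerprint()` on two machines and pass the two `packages` dicts to `diff_packages`. The result is usually longer than anyone expects, and it still says nothing about system libraries, which is the part a virtual environment can't capture.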

The traditional solutions are painful. You write a requirements.txt or a package.json and hope everyone runs it. You write a setup script. You document the expected environment in a README that gets out of date. You use a virtual environment, which helps for language-level dependencies but doesn't capture system-level ones. You use a VM, which captures everything but is heavy, slow to start, and awkward to share.

Docker solves this by packaging the application and its entire environment (OS libraries, runtime, config, dependencies) into a single portable unit. That unit runs the same way everywhere Docker is installed.

How Docker Actually Works

The core idea: containers vs virtual machines

The most common misconception about Docker is that it's a lightweight virtual machine. It isn't, and the difference matters.

A virtual machine emulates hardware. When you run a VM, your computer pretends to be a different computer, running a full operating system on top of fake hardware. That's expensive. A VM typically takes seconds to minutes to start and consumes gigabytes of RAM just for the OS layer.

Docker uses a different mechanism called containerisation. Containers don't emulate hardware. They run as isolated processes directly on the host's Linux kernel (on macOS and Windows, Docker Desktop supplies that kernel through a small Linux VM), using two Linux features to create the illusion of separation.

Namespaces give each container its own view of the system. A container has its own process list, its own network interfaces, its own filesystem tree. From inside the container, it looks like it's the only thing running.

cgroups (control groups) limit how much CPU, memory, and I/O a container can consume. This is how Docker prevents one container from starving others.
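You can inspect both features from inside a process. The sketch below is Linux-only and assumes cgroup v2, where a container's memory limit usually appears at `/sys/fs/cgroup/memory.max`; that path and the function names are assumptions for illustration, not something Docker guarantees on every host.

```python
import os
from pathlib import Path
from typing import Optional


def namespace_ids(pid: str = "self") -> dict:
    """Map namespace name to its kernel identifier, e.g. 'pid' -> 'pid:[<inode>]'.

    Returns an empty dict where /proc/<pid>/ns is unavailable (non-Linux hosts).
    """
    ns_dir = Path("/proc") / pid / "ns"
    result = {}
    try:
        for entry in sorted(ns_dir.iterdir()):
            result[entry.name] = os.readlink(entry)
    except OSError:
        return {}
    return result


def parse_memory_max(raw: str) -> Optional[int]:
    """Parse the content of a cgroup v2 memory.max file; 'max' means no limit."""
    value = raw.strip()
    return None if value == "max" else int(value)


def memory_limit_bytes(path: str = "/sys/fs/cgroup/memory.max") -> Optional[int]:
    """Read this process's memory limit; None if unlimited or not readable."""
    try:
        return parse_memory_max(Path(path).read_text())
    except (OSError, ValueError):
        return None
```

Call `namespace_ids()` on the host and again inside a container: the identifiers differ, which is the "own view of the system" in practice. Start the container with a memory limit and `memory_limit_bytes()` should report it.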

Because containers share the host kernel rather than running their own, they typically start in well under a second and use a fraction of the memory a VM would. A container running a Python web app might consume 50MB of RAM. A VM running the same app would consume 50MB for the app plus roughly 500MB for the guest OS.
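The gap widens as you run more instances. A quick back-of-the-envelope calculation using the rough figures above, and ignoring the small per-container overhead:

```python
APP_MB = 50        # per-instance app memory, from the figures above
GUEST_OS_MB = 500  # rough per-VM guest OS overhead, from the figures above


def total_memory_mb(instances: int, per_instance_overhead_mb: int) -> int:
    """Total memory for N instances, each paying the app cost plus an overhead."""
    return instances * (APP_MB + per_instance_overhead_mb)


containers = total_memory_mb(10, 0)      # containers share the host kernel
vms = total_memory_mb(10, GUEST_OS_MB)   # each VM carries its own guest OS
print(f"10 containers: {containers} MB, 10 VMs: {vms} MB")
```

Ten copies of the app cost about 500MB as containers and about 5,500MB as VMs. The numbers are illustrative, but the shape of the difference is real.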

Images and containers

Docker has two concepts you need to keep straight: images and containers.

An image is a read-only template. It describes an environment: which OS layer to use, which packages to install, which files to include, which command to run on startup. An image is not running anything. It's a blueprint, like a class definition in code.

A container is a running instance of an image. When you tell Docker to run an image, it creates a container: a live, isolated process using that image as its starting state. You can run ten containers from the same image simultaneously and each will be isolated from the others. This is the class-vs-object analogy made literal.
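If the analogy helps, here it is as actual code. This is a toy model, not how Docker is implemented: `Image` and `Container` are hypothetical classes that only capture the "read-only template, isolated instances" idea.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Image:
    """Read-only template: never modified after it is built."""
    name: str
    files: tuple    # (path, content) pairs baked into the image
    command: tuple


@dataclass
class Container:
    """Running instance: starts from the image, owns its own writable state."""
    image: Image
    writable: dict = field(default_factory=dict)

    def write(self, path: str, content: str) -> None:
        self.writable[path] = content  # never touches the image

    def read(self, path: str):
        if path in self.writable:
            return self.writable[path]
        return dict(self.image.files).get(path)


image = Image("demo:1.0", (("/app/app.py", "print('hi')"),), ("python", "app.py"))
a, b = Container(image), Container(image)
a.write("/tmp/state", "only in a")
```

After this runs, `a.read("/tmp/state")` returns the written value while `b.read("/tmp/state")` returns `None`. Both containers still see the same `/app/app.py` from the shared image.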

Images are built in layers. Instructions that change the filesystem (such as RUN, COPY, and ADD) each add a layer on top of the previous one: start with a base OS layer, add a Python layer, add your application code. Docker caches each layer separately. If you rebuild an image after changing only your application code, Docker reuses the OS and Python layers from cache and rebuilds only from the first changed instruction onward, which in a well-ordered Dockerfile is just the final layer. This makes rebuilds fast.

The Dockerfile

You define an image with a Dockerfile, a plain text file that lists the build instructions.

# Start from an official Python base image (OS + Python bundled together)
FROM python:3.11-slim

# Set the working directory inside the container
WORKDIR /app

# Copy and install dependencies first so this layer gets cached
# as long as requirements.txt doesn't change
COPY requirements.txt .
RUN pip install -r requirements.txt

# Now copy your application code (this layer rebuilds on every code change)
COPY . .

# Tell Docker what to run when the container starts
CMD ["python", "app.py"]

The ordering here is deliberate. requirements.txt is copied and installed before the application code. Because Docker caches layers, if you change your code but not your dependencies, Docker will reuse the cached dependency layer and only redo the final COPY. If you put COPY . . first, every code change would invalidate the dependency cache and reinstall everything, turning a two-second rebuild into a two-minute one.
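To see why ordering matters, here is a deliberately simplified model of the layer cache. It is not Docker's real algorithm; it only assumes that a layer's cache key depends on its parent layer, the instruction text, and the contents of any files the instruction copies. The file contents and the `example-package` requirement are placeholders.

```python
import hashlib


def layer_keys(instructions, file_contents):
    """Return one cache key per instruction.

    Each key hashes the parent key, the instruction text, and (for COPY)
    the contents of the files it copies. A changed key means a cache miss.
    """
    keys, parent = [], ""
    for text, copied_files in instructions:
        h = hashlib.sha256()
        h.update(parent.encode())
        h.update(text.encode())
        for path in copied_files:
            h.update(path.encode())
            h.update(file_contents[path].encode())
        parent = h.hexdigest()
        keys.append(parent)
    return keys


def rebuilt(instructions, before, after):
    """List the instructions whose layers must be rebuilt after a file change."""
    old = layer_keys(instructions, before)
    new = layer_keys(instructions, after)
    return [text for (text, _), o, n in zip(instructions, old, new) if o != n]


ALL_FILES = ["requirements.txt", "app.py"]
deps_first = [
    ("FROM python:3.11-slim", []),
    ("COPY requirements.txt .", ["requirements.txt"]),
    ("RUN pip install -r requirements.txt", []),
    ("COPY . .", ALL_FILES),
]
code_first = [
    ("FROM python:3.11-slim", []),
    ("COPY . .", ALL_FILES),
    ("RUN pip install -r requirements.txt", []),
]

before = {"requirements.txt": "example-package==1.0", "app.py": "print('v1')"}
after = {"requirements.txt": "example-package==1.0", "app.py": "print('v2')"}

print(rebuilt(deps_first, before, after))  # ['COPY . .']
print(rebuilt(code_first, before, after))  # ['COPY . .', 'RUN pip install -r requirements.txt']
```

With dependencies copied first, a code change only invalidates the final COPY. With the code copied first, the same change also invalidates the pip install layer, because every layer after a cache miss has a new parent.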

The Docker Hub and base images

You don't build images from scratch. You start FROM an existing image, usually an official one from Docker Hub, which is a public registry of images. python:3.11-slim is an image that includes Debian Linux and Python 3.11, maintained as a Docker Official Image by the Docker community rather than by the Python core team. There are official base images for nearly every language runtime, database, and common tool.

The slim suffix is a convention for smaller variants that strip out documentation and tools you don't need at runtime. For production use, smaller images mean faster deploys and a smaller attack surface.

Docker Compose

A real application is rarely just one container. You have your app, a database, maybe a caching layer. Docker Compose lets you define and manage multiple containers as a single unit.

# docker-compose.yml defines a web app and its database together
services:
  web:
    build: .                    # build from the Dockerfile in this directory
    ports:
      - "8000:8000"             # map port 8000 on your machine to port 8000 in the container
    depends_on:
      - db                      # start db before web (does not wait for Postgres to be ready)
    environment:
      DATABASE_URL: postgres://user:pass@db:5432/mydb

  db:
    image: postgres:15          # use the official Postgres image directly
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: pass
      POSTGRES_DB: mydb

With this file in your project, docker compose up starts both containers with the right configuration. A new developer on your team clones the repo and runs one command. No database installation required.
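One detail worth knowing: the short form of depends_on used here only controls start order. It does not wait for Postgres to be ready to accept connections. A common workaround is to have the app retry until the database port opens. Here is a minimal sketch; `wait_for_port` is a hypothetical helper, and an open TCP port is only a rough proxy for "the database is ready".

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection succeeds, False if timeout elapses first."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

In the web container you would call `wait_for_port("db", 5432)` before opening the first database connection. Compose can also do the waiting for you if you define a healthcheck on db and use the long form of depends_on with `condition: service_healthy`.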

Where It Gets Complicated

Persistence: containers are ephemeral by default

A container's filesystem is disposable. When you delete a container, anything written inside it is gone. For a stateless web app that's fine because you rebuild and redeploy. For a database, it's a disaster.

Docker handles this with volumes: directories on your host machine (or a managed storage backend) that get mounted into containers. Anything the container writes to the volume path persists even after the container is removed.

Beginners consistently skip volumes when running databases locally, then wonder why their data disappears every time the container is removed and recreated (for example, after docker compose down). Add this to your Compose file for the database:

services:
  db:
    image: postgres:15
    volumes:
      - postgres_data:/var/lib/postgresql/data   # named volume for persistence

volumes:
  postgres_data:   # declare the named volume
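If you want to watch persistence happen without involving a database, a tiny counter script does the job. This is a sketch: `bump_counter` and the `/data` path mentioned afterwards are made up for the demonstration.

```python
from pathlib import Path


def bump_counter(data_dir: str) -> int:
    """Increment a counter stored in data_dir and return the new value."""
    path = Path(data_dir) / "runs.txt"
    count = int(path.read_text()) if path.exists() else 0
    count += 1
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(str(count))
    return count
```

Have a container call `bump_counter("/data")` on startup and print the result. Without a volume, every fresh container prints 1, because each one starts from the image's filesystem. Mount a named volume at /data (for example with `docker run -v counter_data:/data ...`) and the number keeps climbing across containers.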

Image size creep

The single most common mistake when writing Dockerfiles is producing images that are far larger than they need to be. A Python image built naively can easily reach 1-2GB. The production version of the same app can often be a fraction of that size with a slim base image and multi-stage builds.

Multi-stage builds let you use one image to build your application (with compilers, test tools, all of it) and a separate, leaner image to run it. The final image only contains what the runtime needs.

# Stage 1: build stage with full tooling
FROM python:3.11 AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --prefix=/install -r requirements.txt

# Stage 2: runtime stage, slim and minimal
FROM python:3.11-slim
WORKDIR /app
# Copy only the installed packages (Dockerfile comments must sit on their own line)
COPY --from=builder /install /usr/local
COPY . .
CMD ["python", "app.py"]

Secrets and environment variables

Don't put secrets in your Dockerfile. If you write ENV API_KEY=abc123 in a Dockerfile and push the image to a registry, that key is stored in the image's configuration and build history, where anyone who pulls the image can read it with docker inspect or docker history.

Pass secrets at runtime using environment variables, Docker secrets, or a tool like Vault. Your Compose file can reference variables from a .env file that you never commit to source control.
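On the application side, that means reading secrets from the environment at startup and failing loudly when one is missing. A minimal sketch, where `require_env` is a hypothetical helper and API_KEY is the example variable name from above:

```python
import os


def require_env(name: str) -> str:
    """Return the value of a required environment variable or fail fast."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"missing required environment variable: {name}")
    return value
```

Calling `require_env("API_KEY")` at the top of app.py gives you a clear error at startup instead of a confusing failure on the first request. The image itself stays free of secrets, so the same image can run in development and production with different values.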

Networking between containers

Containers on the same Docker network can reach each other by service name. In the Compose example above, the web app connects to db:5432. The name db is not a hostname you configured anywhere. Docker's internal DNS resolves it to the database container automatically. This trips up beginners who try to use localhost inside a container and can't reach a sibling container. Inside a container, localhost refers to that container, not the host machine or anything running alongside it.
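A small check at startup can catch the localhost mistake early. This sketch parses the DATABASE_URL value from the Compose example with the standard library; `database_target` and `points_at_self` are hypothetical helper names.

```python
from urllib.parse import urlsplit

LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def database_target(url: str) -> tuple:
    """Extract (host, port) from a connection URL."""
    parts = urlsplit(url)
    return parts.hostname, parts.port


def points_at_self(url: str) -> bool:
    """True if the URL's host would resolve to the container itself."""
    host, _ = database_target(url)
    return host in LOOPBACK_HOSTS


compose_url = "postgres://user:pass@db:5432/mydb"  # value from the Compose example
host, port = database_target(compose_url)
print(host, port)
```

For the Compose URL the host is `db`, the service name, which Docker's DNS resolves on the shared network. If `points_at_self` returns True inside a container, the app is about to look for Postgres in its own container and not find it.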

A Concrete Scenario: Onboarding Without the Setup Tax

Here's where the value becomes tangible. You're working on a team building a backend API in Python. The app needs Python 3.11, PostgreSQL 15, and Redis. Without Docker, onboarding a new engineer means installing Python 3.11 (maybe using pyenv, maybe not), installing and configuring PostgreSQL, installing Redis, setting a dozen environment variables, and running migrations. This takes anywhere from 30 minutes to half a day, and something always breaks on someone's machine.

With Docker, the repository contains a docker-compose.yml and a .env.example. The onboarding instruction is two commands:

cp .env.example .env
docker compose up

The first run downloads base images, which takes a few minutes once. Subsequent runs typically start in seconds. The entire environment (exact Python version, exact Postgres version, exact Redis version, correct config) is encoded in files that live in version control alongside the application code.

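When you upgrade from PostgreSQL 15 to 16, you change one line in docker-compose.yml and commit it. Every developer on the team gets the upgrade the next time they pull and run docker compose up. No announcement in Slack. No "make sure you've upgraded your local Postgres" in the README. (One caveat: Postgres 16 won't start on a data directory created by 15, so a local data volume from the old major version has to be recreated or migrated.)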

This is Docker's most underrated value for teams. The environment becomes a first-class artifact of the project, versioned, reproducible, and reviewable just like the code.

What You Now Understand

The phrase "it works on my machine" describes a configuration management failure, not a code problem. Docker's answer is to make the environment part of the artifact, not documented alongside it but included inside it.

That's the shift. Not "write better setup instructions." Bundle the setup into something that runs.

Your concrete next step: take an existing project, a simple web app or a script, and write a Dockerfile for it. Don't worry about making it production-perfect. Get it building and running locally. The feedback loop of seeing what breaks and why is how this becomes intuition rather than theory.

Once you're comfortable with that, the natural next steps are multi-stage builds for smaller production images, volumes for stateful workloads, and eventually Kubernetes for running containers across multiple machines. But those are layers built on top of this foundation. Get the foundation first.
