New dev joins the team. Day 1: can't run the app. Day 2: still can't run the app. Day 3: senior dev drops everything to help. Sound familiar? I fixed this with Docker Compose and I'm never going back. Here's the full breakdown.
How I stopped losing half my day to environment issues and gave our team a production-like local stack that just works.
The Setup
I work on a property management system: the kind of software that hotels, hostels, and rental operators use to manage bookings, guests, room inventory, and payments. On the surface it sounds like a straightforward CRUD app. But in practice, a PMS is a web of interdependent services: a reservation API, a worker that processes booking events, a cache layer, a notification system, and a relational database holding years of critical guest and room data.
For a long time, running this system locally meant one thing: pain.
Part 1 β Why We Needed to Change
The "Works on My Machine" Graveyard
Every team reaches a point where the phrase "it works on my machine" stops being a joke and starts being a root cause in post-mortems. We reached that point.
Our team had three engineers, three different operating systems, and no agreed-upon way to run the stack locally. One person was on macOS with Node 18, another on Ubuntu with Node 20, and a third on Windows with WSL2. Dependency versions drifted silently. Environment variables were passed around in Slack DMs. `npm install` on a fresh clone would fail with cryptic native module errors that took an hour to debug, not because the code was wrong, but because the environment was wrong.
This is not a team problem. It is a systems problem. And it compounds fast.
The True Cost of Environment Inconsistency
Let's be honest about what we were actually losing. Every new engineer spent the better part of a day, sometimes two, just getting the app to run locally. Senior engineers would pair with them, pulling them away from actual feature work. The entire overhead was invisible in sprint planning because nobody tracked it as a cost.
Beyond onboarding, there was the subtler daily tax: environment-related bugs. A query that worked locally against a developer's personal Postgres instance would behave differently against staging because the database versions didn't match. Redis cache behavior differed between developers because some had it installed globally, some via Homebrew, and one was mocking it entirely. We were not testing the application; we were testing our individual setups.
Why a PMS Specifically Needs This Solved
A property management system is not a single service. It is at minimum:
- A reservation API handling room availability, booking creation, and guest management
- A worker service processing booking events: sending confirmation emails, updating inventory, triggering payment captures
- A PostgreSQL database storing rooms, bookings, guests, pricing rules, and audit logs
- A Redis instance acting as a job queue and session cache
- A reverse proxy routing external requests to the right internal service
These components have hard dependencies on each other. The API cannot start without the database being ready. The worker cannot function without Redis. The proxy cannot route without both application services being up. Running these independently, manually, in the correct order, every single time a developer sits down to work, is a coordination problem that scales terribly.
The solution is not discipline. The solution is removing the coordination problem entirely.
Part 2 β The Problem Statement
What We Actually Needed
Before reaching for a tool, it is worth being precise about the problem.
We needed local environments that were:
- Reproducible: same behavior across macOS, Linux, and Windows
- Isolated: no dependency on globally installed software
- Ordered: services that start in the correct sequence, automatically
- Realistic: close enough to production that bugs caught locally stay caught
- Fast to start: from a fresh clone to a running stack in under two minutes
We did not have any of this. What we had was a collection of README sections titled "Prerequisites" that slowly fell out of date, a global Postgres installation that lived only on one developer's machine, and a habit of keeping services running in background terminal tabs and praying nobody rebooted.
The Specific Failure Modes We Kept Hitting
Race conditions on startup. The API would try to connect to Postgres before Postgres had finished initializing. The fix was waiting manually, which meant either a `sleep 5` in a shell script or just re-running the API and hoping. Neither is acceptable.
Seed data drift. Each developer had a slightly different local database state. Testing the same feature against different data produced different results, making it impossible to reproduce bugs reliably. "Send me your database dump" became a recurring request.
No queue visibility. The worker service processed jobs from Redis. Locally, most developers skipped Redis entirely and mocked the queue. This meant an entire class of bugs (failed retries, duplicate processing, stale jobs) was completely invisible until staging.
Onboarding friction as a product problem. When a new engineer cannot contribute for two days because of environment setup, that is not just a developer experience issue. It is a product velocity issue. It delays features, drains senior engineering time, and signals to new team members that the codebase is harder to work with than it needs to be.
Part 3 β The Solution and Migration Approach
Why Docker Compose
Docker Compose solves exactly the class of problem described above. It lets you define your entire application stack (every service, every dependency, every network connection, every volume) in a single declarative file. Running `docker compose up` becomes the one command that does everything.
More importantly, it makes the environment part of the codebase. The docker-compose.yml file lives in version control alongside the application code. When the Postgres version changes, that change is tracked, reviewed, and deployed consistently to every developer the next time they pull.
The Target Architecture
Before writing a single line of configuration, it helps to have a clear picture of what you are building.
The stack has five core services and two optional observability services:
- Nginx sits at the edge, receiving traffic on port 80 and routing it to the appropriate internal service
- Next.js API handles all reservation, room, and guest operations
- Booking Worker is a Node.js process that consumes jobs from a Redis queue and handles side effects: emails, inventory updates, payment triggers
- PostgreSQL is the source of truth for all relational data, seeded with realistic room and booking fixtures
- Redis serves two roles: session cache for the API and job queue for the worker
- Adminer is a lightweight database UI exposed only in development
- Prometheus + Grafana form an optional observability layer for developers who want to understand service behavior under load
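With those pieces in mind, the edge routing is the simplest part to picture. Here is a hypothetical sketch of the `nginx/nginx.conf` mentioned in the repo structure; the upstream name matches the Compose service name (`api`), but the port, rate limits, and headers are illustrative assumptions, not the project's actual config:

```nginx
events {}

http {
  # Basic rate limiting, as hinted at in the repo structure (values are examples)
  limit_req_zone $binary_remote_addr zone=api_limit:10m rate=20r/s;

  server {
    listen 80;

    location / {
      limit_req zone=api_limit burst=40;
      # Compose's internal DNS resolves the service name "api"
      proxy_pass http://api:3000;
      proxy_set_header Host $host;
      proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
  }
}
```

Because Nginx runs on the same Compose network as the API, the service name is a routable hostname; no IP addresses or host ports are involved.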
The Repository Structure
Keeping the project navigable matters. Here is the structure that works well:
```
pms-local-dev/
├── docker-compose.yml        # development stack
├── docker-compose.prod.yml   # production overrides
├── .env.example              # all variables documented, no secrets committed
├── Makefile                  # human-friendly command aliases
├── nginx/
│   └── nginx.conf            # routing rules, rate limiting
├── services/
│   ├── api/
│   │   ├── Dockerfile        # multi-stage: builder -> runner
│   │   └── ...               # Next.js application
│   └── worker/
│       ├── Dockerfile
│       └── ...               # event processor
└── db/
    ├── seed.sql              # rooms, booking types, guest schema
    └── migrations/           # version-controlled schema changes
```
Step 1 β Write Multi-Stage Dockerfiles
The single biggest mistake in containerizing a Node.js application is treating the development image and the production image as the same thing. They are not.
A development image needs devDependencies, source maps, and a file watcher for hot-reload. A production image needs none of that. Shipping a production image with everything included is how you end up with a 900MB container that takes three minutes to pull in CI.
Multi-stage builds solve this cleanly:
```dockerfile
# Stage 1: install all dependencies and build
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Stage 2: production image, only what runs
FROM node:20-alpine AS runner
WORKDIR /app
ENV NODE_ENV=production
COPY --from=builder /app/.next ./.next
COPY --from=builder /app/public ./public
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./package.json
EXPOSE 3000
CMD ["npm", "start"]
```
In development, you override the command via Compose to run `npm run dev` and mount the source directory as a bind volume. The container runs, the watcher picks up file changes, and hot-reload works exactly as it would outside Docker, without the environment fragmentation.
Step 2 β Write the Compose File with Health Checks
This is the core of the solution. The key insight is the `depends_on` condition. By attaching a health check to each dependency service and using `condition: service_healthy` in the dependent service, you eliminate the startup race condition entirely. Postgres will be accepting connections before the API attempts its first query.
```yaml
services:
  postgres:
    image: postgres:16-alpine
    environment:
      POSTGRES_DB: ${POSTGRES_DB}
      POSTGRES_USER: ${POSTGRES_USER}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    volumes:
      - pg_data:/var/lib/postgresql/data
      - ./db/seed.sql:/docker-entrypoint-initdb.d/seed.sql
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER}"]
      interval: 5s
      timeout: 5s
      retries: 5

  redis:
    image: redis:7-alpine
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 3s
      retries: 5

  api:
    build:
      context: ./services/api
      target: builder            # use builder stage in dev for hot-reload
    volumes:
      - ./services/api:/app      # bind mount for live code changes
      - /app/node_modules        # anonymous volume prevents host modules overwriting container
    environment:
      DATABASE_URL: postgres://${POSTGRES_USER}:${POSTGRES_PASSWORD}@postgres:5432/${POSTGRES_DB}
      REDIS_URL: redis://redis:6379
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
    command: npm run dev

  worker:
    build:
      context: ./services/worker
    environment:
      DATABASE_URL: postgres://${POSTGRES_USER}:${POSTGRES_PASSWORD}@postgres:5432/${POSTGRES_DB}
      REDIS_URL: redis://redis:6379
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy

  nginx:
    image: nginx:alpine
    ports:
      - "80:80"
    volumes:
      - ./nginx/nginx.conf:/etc/nginx/nginx.conf:ro
    depends_on:
      - api

  adminer:
    image: adminer
    ports:
      - "8080:8080"
    depends_on:
      - postgres

volumes:
  pg_data:
```
A few things worth noting here. The `pg_data` named volume means your database persists across `docker compose down` and `docker compose up` cycles; you do not lose seed data every time you restart. The anonymous volume for `node_modules` is a subtlety that trips people up: without it, the host's `node_modules` (or lack thereof) would overwrite the container's, breaking native modules compiled inside the container.
Step 3 β Handle the Dev vs Production Split
You do not want Adminer, bind mounts, and the builder stage target in production. The cleanest way to handle this is a Compose override file.
`docker-compose.prod.yml` extends the base file and applies production-specific values:
```yaml
services:
  api:
    build:
      target: runner         # use the lean production stage
    # Compose merges lists from both files rather than replacing them, so a
    # plain "volumes: []" would NOT remove the dev bind mounts. The !reset
    # tag (supported in recent Compose versions) clears them explicitly.
    volumes: !reset []
    command: npm start
    restart: unless-stopped

  worker:
    restart: unless-stopped

  adminer:
    profiles:
      - dev                  # only runs if explicitly activated
```
In production or CI, you run:
```shell
docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d
```
In development, you just run `docker compose up`. The override pattern keeps both worlds maintainable in the same repository without duplication.
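To sanity-check what an override actually changes, `docker compose config` prints the fully merged file. Under files like the ones above, the api service resolves to roughly the following (abridged and approximate; exact output varies by Compose version, and interpolated values come from your `.env`):

```yaml
# Approximate abridged output of:
#   docker compose -f docker-compose.yml -f docker-compose.prod.yml config
services:
  api:
    build:
      context: ./services/api
      target: runner           # the prod override wins over the dev "builder" target
    command: npm start
    restart: unless-stopped
    environment:
      DATABASE_URL: postgres://...   # interpolated from .env
      REDIS_URL: redis://redis:6379
```

Running this once after editing either file is a cheap way to catch merge surprises before they reach a container.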
Step 4 β Add a Makefile for Developer Ergonomics
Nobody should have to remember `docker compose -f docker-compose.yml -f docker-compose.prod.yml up --build -d`. A Makefile wraps the commands you actually use:
```makefile
.PHONY: up down logs seed reset shell-api shell-db

up:
	docker compose up --build

down:
	docker compose down

logs:
	docker compose logs -f

# Run psql via sh -c so POSTGRES_* expand inside the container,
# where Compose has already set them (they may not exist on the host).
seed:
	docker compose exec postgres sh -c 'psql -U $$POSTGRES_USER -d $$POSTGRES_DB -f /docker-entrypoint-initdb.d/seed.sql'

reset:
	docker compose down -v && docker compose up --build

shell-api:
	docker compose exec api sh

shell-db:
	docker compose exec postgres sh -c 'psql -U $$POSTGRES_USER -d $$POSTGRES_DB'
```
Now onboarding is:
```shell
git clone <repo>
cp .env.example .env
make up
```
That is it. Three commands. No prerequisite installations. No version mismatches. No Slack DMs about which version of Postgres to use.
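The `.env.example` in that second command only needs the variables the Compose file interpolates. A minimal sketch, assuming just the Postgres variables shown earlier (the values are placeholders, not real defaults from the project):

```dotenv
# Consumed by docker-compose.yml via ${...} interpolation.
# Copy to .env and adjust locally; never commit the real .env.
POSTGRES_DB=pms_dev
POSTGRES_USER=pms
POSTGRES_PASSWORD=change-me-locally
```

Keeping every variable documented here, with safe dummy values, is what lets `cp .env.example .env` be the entire configuration step.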
Step 5 β The Migration Path for Existing Teams
If you are retrofitting an existing PMS codebase rather than starting fresh, the migration does not need to happen all at once.
Week 1: containerize the database only. Get Postgres and Redis running in Docker while keeping the application running natively. This alone solves the most common source of environment drift and is the lowest-risk change. Developers connect to `localhost:5432` as before; nothing about the application changes.
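A week-1 stack can be as small as this sketch: only the data stores in Docker, with ports published so the natively running app still reaches them on localhost. Service and variable names are assumed to match the full Compose file shown earlier:

```yaml
# Minimal week-1 docker-compose.yml: data stores only, app still runs natively.
services:
  postgres:
    image: postgres:16-alpine
    environment:
      POSTGRES_DB: ${POSTGRES_DB}
      POSTGRES_USER: ${POSTGRES_USER}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    ports:
      - "5432:5432"   # the natively running API keeps connecting to localhost:5432
    volumes:
      - pg_data:/var/lib/postgresql/data

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"

volumes:
  pg_data:
```

Note the published ports: they exist only because the application is still outside Docker. Once the API and worker move into the stack in later weeks, they can talk to `postgres` and `redis` over the Compose network and the host port mappings can go away.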
Week 2: containerize the worker. The worker is usually stateless and has no hot-reload requirements, making it the easiest service to containerize fully. Move it into Docker and verify it processes jobs from the Redis container correctly.
Week 3: containerize the API. This is the most involved step. Set up the multi-stage Dockerfile, configure bind mounts for hot-reload, and verify the development experience matches what developers had before. Test on all three operating systems your team uses.
Week 4: add Nginx and the Makefile. Complete the stack with the reverse proxy, verify the full startup sequence with health checks, document the commands in the Makefile, and update the README.
This incremental approach means you can stop at any step if something is blocking, and you get value from each step independently. The full stack is the goal, but you do not need to pause feature work to get there.