Great Stack to Doesn't Work — Bonus
10 Docker Production Traps
Your Dockerfile works on your machine. Here's why it breaks everywhere else.
1. Your image is 2 GB because you're not using multi-stage builds.
Every RUN command creates a layer. If you install build tools, compile your app, and leave the build tools in the final image, you're shipping a toolbox alongside your application.
# Build stage
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
# Drop devDependencies so only runtime packages reach the final image
RUN npm prune --omit=dev
# Production stage
FROM node:20-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
CMD ["node", "dist/index.js"]
The build tools stay in the builder stage. The final image only has what it needs to run.
2. Your layer cache invalidates every build because COPY order is wrong.
Docker caches layers. If a layer hasn't changed, Docker reuses it. But layers are sequential — if layer 3 changes, layers 4, 5, 6 all rebuild.
# BAD: code changes invalidate npm install
COPY . .
RUN npm ci
# GOOD: dependencies cached separately from code
COPY package*.json ./
RUN npm ci
COPY . .
Your code changes every build. Your package.json changes occasionally. Copy the dependency manifest first, install, then copy the code. Now dependency installation is cached unless the manifest actually changes.
3. You're running as root.
By default, Docker containers run as root. If an attacker exploits your application, they have root access inside the container. With certain volume mounts or misconfigurations, that can mean root on the host.
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser
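Two lines. Massive security improvement. The -S syntax shown here is the Alpine (BusyBox) form; Debian-based images typically use groupadd and useradd instead.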
4. You don't have a .dockerignore file.
Without .dockerignore, docker build sends your entire build context to the Docker daemon, and COPY . . then copies all of it into the image: node_modules, .git, .env files, test fixtures, IDE configs. Slower builds, larger context, and potential secrets leaked into the image.
node_modules
.git
.env
*.md
test/
coverage/
.DS_Store
5. You're not using health checks.
Docker doesn't know whether your application is healthy. It only knows whether the process is running. A Node.js server stuck in an infinite loop? The process is running, so Docker reports the container as up.
HEALTHCHECK --interval=30s --timeout=3s --retries=3 \
CMD wget --no-verbose --tries=1 --spider http://localhost:3000/health || exit 1
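Now Docker marks the container unhealthy after three consecutive failed checks, and orchestrators such as Docker Swarm can replace it. Kubernetes ignores HEALTHCHECK and relies on its own liveness and readiness probes, so define those separately there.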
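The probe is only as good as the endpoint behind it. As a rough sketch, assuming you route GET /health to something like the function below, the handler can run a few cheap checks and answer 503 when any of them fails. The names HealthCheck, evaluateHealth, and the two sample checks are hypothetical, not part of any library.

```typescript
// Hypothetical shape for a single dependency check; not from any library.
interface HealthCheck {
  name: string;
  ok: () => boolean; // return false (or throw) when the dependency is unusable
}

interface HealthResponse {
  statusCode: number; // 200 when every check passes, 503 otherwise
  body: string; // JSON payload for whoever is debugging the container
}

// Runs every check and maps the outcome to an HTTP status.
// wget --spider exits non-zero on a 503, so the HEALTHCHECK fails.
function evaluateHealth(checks: HealthCheck[]): HealthResponse {
  const failed: string[] = [];
  for (const check of checks) {
    let passed = false;
    try {
      passed = check.ok();
    } catch {
      passed = false; // a throwing check counts as a failure
    }
    if (!passed) failed.push(check.name);
  }
  const healthy = failed.length === 0;
  return {
    statusCode: healthy ? 200 : 503,
    body: JSON.stringify({ healthy, failed }),
  };
}

// Example with made-up checks.
const response = evaluateHealth([
  { name: "config-loaded", ok: () => true },
  { name: "db-pool", ok: () => false },
]);
console.log(response.statusCode, response.body);
// prints: 503 {"healthy":false,"failed":["db-pool"]}
```

Because the handler runs on the event loop, a server stuck in an infinite loop never answers, and the 3-second timeout in the HEALTHCHECK turns that into a failed check.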
6. Your logs disappear because you're writing to a file.
Docker captures stdout and stderr. If your application writes logs to /var/log/app.log, Docker's logging driver never sees them. docker logs returns nothing. Your centralized logging system collects nothing.
Log to stdout. Let Docker handle routing. Use a logging driver (json-file, fluentd, gelf) to send logs wherever they need to go.
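On the application side, logging to stdout can be as small as the sketch below. It is a minimal illustration rather than a replacement for a real logging library; formatLogLine, log, and the field names are assumptions made up for this example.

```typescript
type Level = "debug" | "info" | "warn" | "error";

// Formats one log entry as a single JSON line. One line per event keeps
// the output easy for logging drivers and collectors to parse.
function formatLogLine(
  level: Level,
  message: string,
  fields: Record<string, unknown> = {},
  now: Date = new Date(),
): string {
  return JSON.stringify({ time: now.toISOString(), level, message, ...fields });
}

// In Node.js, console.log writes to stdout and console.error to stderr.
// Docker captures both streams, so no log file is involved.
function log(level: Level, message: string, fields: Record<string, unknown> = {}): void {
  const line = formatLogLine(level, message, fields);
  if (level === "error" || level === "warn") {
    console.error(line);
  } else {
    console.log(line);
  }
}

log("info", "server started", { port: 3000 });
log("error", "upstream timeout", { upstream: "billing", timeoutMs: 3000 });
```

With this in place, docker logs shows both lines, and whichever logging driver you configure receives the same stream.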
7. You're using the latest tag in production.
FROM node:latest can resolve to a different image every time someone builds. What worked last week might break today because latest moved to a new version. Pin your versions: FROM node:20.11-alpine.
Same for your own images. Never deploy myapp:latest to production. Use commit hashes or semantic versions: myapp:1.4.2 or myapp:abc123.
8. Your volume permissions break when switching between Linux and Mac.
Files created inside a container are often owned by root. On a Mac, Docker Desktop's file sharing usually hides the problem. On Linux, bind mounts share the host's user IDs directly, so your host user ends up unable to modify or delete the root-owned files.
Set ownership explicitly in your Dockerfile, or use the --user flag to match your host user ID, for example docker run --user "$(id -u):$(id -g)".
9. You're not setting memory limits.
Without memory limits, one container can consume all host memory and trigger the OOM killer, taking down other containers with it.
docker run --memory=512m --memory-swap=512m myapp
In Kubernetes, this maps to resources.limits.memory in the container spec. Set it. Always.
10. You're rebuilding when you should be restarting.
Not every configuration change requires a new image. Environment variables, mounted config files, and feature flags can change at runtime. If you're rebuilding and redeploying because someone changed a log level, your deployment pipeline is doing too much work.
Separate build-time decisions (code, dependencies, base image) from run-time decisions (config, secrets, feature flags). Build less. Deploy smarter.
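As a sketch of what a run-time decision looks like in code, the loader below reads settings from environment variables at startup instead of baking them into the image. The variable names LOG_LEVEL, PORT, and FEATURE_NEW_CHECKOUT, and the AppConfig shape, are assumptions invented for this example.

```typescript
// Hypothetical settings for illustration; the variable names LOG_LEVEL,
// PORT, and FEATURE_NEW_CHECKOUT are assumptions, not from any library.
interface AppConfig {
  logLevel: "debug" | "info" | "warn" | "error";
  port: number;
  newCheckoutEnabled: boolean;
}

type Env = Record<string, string | undefined>;

const LOG_LEVELS = ["debug", "info", "warn", "error"] as const;

// Reads run-time decisions from the environment, applies defaults, and
// fails fast on invalid values so a bad setting surfaces at startup.
function loadConfig(env: Env): AppConfig {
  const rawLevel = env["LOG_LEVEL"] ?? "info";
  const logLevel = LOG_LEVELS.find((level) => level === rawLevel);
  if (logLevel === undefined) {
    throw new Error(`LOG_LEVEL must be one of ${LOG_LEVELS.join(", ")}`);
  }
  const port = Number(env["PORT"] ?? "3000");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error("PORT must be an integer between 1 and 65535");
  }
  return {
    logLevel,
    port,
    newCheckoutEnabled: env["FEATURE_NEW_CHECKOUT"] === "true",
  };
}

// In the application entry point this would be loadConfig(process.env).
const config = loadConfig({ LOG_LEVEL: "debug" });
console.log(config);
```

Changing the log level then becomes docker run -e LOG_LEVEL=debug myapp against the same image, with no rebuild.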
Over to You
Which Docker trap cost you the most debugging time? Any production Docker disaster stories you can share?
If you enjoyed this, I write about production engineering, AI systems, and the messy reality of building software at scale.
This is part of the Great Stack to Doesn't Work series, a survival guide for when everything goes wrong in production. Follow the series to catch every episode.