Teguh Coding

Stop Writing Bad Dockerfiles: Production-Ready Best Practices That Actually Work

Most Dockerfiles are a disaster waiting to happen.

I inherited a project last year with a Dockerfile that took 12 minutes to build. Twelve. Minutes. For a simple Node.js API. The final image was 1.8GB. There were five different :latest tags scattered throughout. And it ran as root.

That project taught me everything about what NOT to do. Here are the Dockerfile best practices that will save you from those mistakes.

The Problem with Typical Dockerfiles

Most developers write Dockerfiles like this:

FROM node:latest
COPY . .
RUN npm install
CMD ["node", "index.js"]

This looks fine until you realize it:

  • Rebuilds everything on every code change
  • Copies your local node_modules (and everything else in the build context) into the image
  • Uses an unpredictable Node version
  • Has no health check
  • Runs as root

Let's fix this.

1. Always Pin Specific Versions

Never use :latest. When you build next month, you might get a completely different base image with breaking changes.

# Bad
FROM node:latest

# Good - specific version
FROM node:20.11.0-alpine3.19

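The alpine variant is much smaller: the Alpine base itself is only about 5MB, so the Node image comes in at roughly a couple hundred MB or less instead of around 1GB for the default Debian-based image. For production, that's a big deal.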
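
If you want to catch unpinned images before they reach a build, a small check script can help. The sketch below is my own illustration, not an official tool: `findUnpinnedImages` is a hypothetical helper that scans Dockerfile text for FROM lines with no tag or with :latest. It skips build-stage references, digests, and build-arg placeholders, and it makes no attempt to handle every Dockerfile syntax corner.

```javascript
// Hypothetical helper: report FROM lines whose image has no tag or uses "latest".
function findUnpinnedImages(dockerfileText) {
  const stageNames = new Set();
  const problems = [];

  dockerfileText.split(/\r?\n/).forEach((line, index) => {
    // Matches: FROM [--flag=value ...] image [AS stage]
    const match = /^\s*FROM\s+(?:--\S+\s+)*(\S+)(?:\s+AS\s+(\S+))?/i.exec(line);
    if (!match) return;

    const [, image, alias] = match;
    const isStageRef = stageNames.has(image.toLowerCase());
    if (alias) stageNames.add(alias.toLowerCase());

    // Nothing to pin: scratch, earlier build stages, digests, unresolved build args.
    if (image === 'scratch' || isStageRef) return;
    if (image.includes('@') || image.includes('$')) return;

    // Look for the tag after the last "/" so a registry port is not mistaken for one.
    const lastSegment = image.slice(image.lastIndexOf('/') + 1);
    const colon = lastSegment.indexOf(':');
    const tag = colon === -1 ? null : lastSegment.slice(colon + 1);

    if (tag === null || tag === 'latest') {
      problems.push({ line: index + 1, image });
    }
  });

  return problems;
}

// Example: flags the first line, accepts the second.
console.log(findUnpinnedImages('FROM node:latest\nFROM node:20.11.0-alpine3.19\n'));
```

Run something like this in CI against your Dockerfile and fail the job when the returned array is not empty.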

2. Use .dockerignore (Seriously)

You'd be amazed what's in your image that shouldn't be.

# .dockerignore
node_modules
npm-debug.log
.git
.gitignore
README.md
.env
.env.*
*.md
dist
coverage
.vscode
.idea
docker-compose*.yml
Dockerfile

That .env file with your production secrets? If you COPY . . without .dockerignore, it's in your image. I've seen this cause real security breaches.
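
One cheap safeguard is a pre-build check that fails when the entries you care about are missing. This is a rough sketch under my own assumptions: `missingIgnoreEntries` is a hypothetical helper, the default list of required patterns is just an example, and it compares lines literally rather than evaluating globs or `!` negations the way Docker does.

```javascript
// Hypothetical pre-build check: which required patterns are absent from .dockerignore?
// Lines are compared literally; globs and "!" negations are not evaluated.
function missingIgnoreEntries(dockerignoreText, required = ['.env', '.git', 'node_modules']) {
  const present = new Set(
    dockerignoreText
      .split(/\r?\n/)
      .map((line) => line.trim())
      .filter((line) => line !== '' && !line.startsWith('#'))
  );
  return required.filter((pattern) => !present.has(pattern));
}

// Example: .env is missing from this file, so it is reported.
console.log(missingIgnoreEntries('# .dockerignore\nnode_modules\n.git\n'));
```

In a real project you would read the file with `fs.readFileSync('.dockerignore', 'utf8')` and exit non-zero when anything is returned.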

3. Order Matters for Layer Caching

Docker builds layers. Each instruction creates a new layer. Docker caches unchanged layers, which makes rebuilds fast. Structure your Dockerfile so frequently changing steps come last:

# Better ordering - dependencies change less often
FROM node:20.11.0-alpine3.19

WORKDIR /app

# Copy package files FIRST
COPY package*.json ./

# Install production dependencies only (cached unless package*.json changes)
RUN npm ci --omit=dev

# Copy source code LAST (changes every build)
COPY . .

# Expose and run
EXPOSE 3000
CMD ["node", "index.js"]

This way, changing index.js doesn't trigger a new npm install.
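
To make the caching rule concrete, here is a toy model of it. It is deliberately simplified and is not how Docker is implemented: each step lists the files it reads (the `inputs` field is my own invention), and the first step whose inputs changed forces that step and everything after it to rebuild.

```javascript
// Toy model of the build cache: the first step with changed inputs
// invalidates itself and every step that follows it.
function stepsToRebuild(steps, changedFiles) {
  const changed = new Set(changedFiles);
  const firstMiss = steps.findIndex((step) =>
    (step.inputs || []).some((file) => changed.has(file))
  );
  return firstMiss === -1 ? [] : steps.slice(firstMiss).map((step) => step.instruction);
}

const naive = [
  { instruction: 'COPY . .', inputs: ['package.json', 'index.js'] },
  { instruction: 'RUN npm install' },
];

const ordered = [
  { instruction: 'COPY package*.json ./', inputs: ['package.json'] },
  { instruction: 'RUN npm ci' },
  { instruction: 'COPY . .', inputs: ['package.json', 'index.js'] },
];

console.log(stepsToRebuild(naive, ['index.js']));   // [ 'COPY . .', 'RUN npm install' ]
console.log(stepsToRebuild(ordered, ['index.js'])); // [ 'COPY . .' ]
```

With the naive ordering, editing index.js reruns the install. With the better ordering, only the final copy is redone.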

4. Use Multi-Stage Builds for Smaller Images

Your dev dependencies don't belong in production.

# Build stage
FROM node:20.11.0-alpine3.19 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
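RUN npm run build
# Remove devDependencies so only production modules get copied to the next stage
RUN npm prune --omit=dev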

# Production stage
FROM node:20.11.0-alpine3.19 AS production
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY package*.json ./

USER node
EXPOSE 3000
CMD ["node", "dist/index.js"]

This produces a lean production image without TypeScript, test frameworks, or build tools.

5. Never Run as Root

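This is a security must. Running as root means an attacker who compromises your process is root inside the container, and any container escape or mounted host path can then hand them root-level access on the host as well.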

# Create a non-root user and group
RUN addgroup -g 1001 -S nodejs && \
    adduser -S nodejs -u 1001 -G nodejs

# Change ownership of files
RUN chown -R nodejs:nodejs /app

# Switch to non-root user
USER nodejs

Then in your Compose file:

services:
  app:
    user: "1001:1001"
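
As a belt-and-braces measure, the app itself can refuse to start as root, which catches the case where someone overrides the user at runtime. This is only a sketch: `isRunningAsRoot` and `assertNotRoot` are hypothetical helper names, and the `proc` parameter exists purely so the check can be exercised without actually being root.

```javascript
// Hypothetical startup guard. process.getuid is not available on Windows,
// so the check simply reports false there.
function isRunningAsRoot(proc = process) {
  return typeof proc.getuid === 'function' && proc.getuid() === 0;
}

function assertNotRoot(proc = process) {
  if (isRunningAsRoot(proc)) {
    throw new Error('Refusing to start as root (UID 0). Set USER in the Dockerfile.');
  }
}

// Usage, at the very top of your entry file:
// assertNotRoot();
```

Whether you want a hard failure or just a loud warning is a judgment call. I prefer failing, because a warning in container logs is easy to miss.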

6. Use HEALTHCHECK

Your orchestrator needs to know if your container is actually healthy.

FROM node:20.11.0-alpine3.19

WORKDIR /app

# Install curl for the health check before copying the app, so this layer stays cached
RUN apk add --no-cache curl

COPY package*.json ./
RUN npm ci --omit=dev
COPY . .

EXPOSE 3000

HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD curl -f http://localhost:3000/health || exit 1

CMD ["node", "index.js"]

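Now Docker and Docker Compose can tell when your app is actually ready to serve traffic. Kubernetes ignores the Dockerfile HEALTHCHECK, so define liveness and readiness probes in the pod spec there instead.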
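
The health check only works if the app answers on /health. Here is a minimal sketch of such a route for Node's built-in http server. The names are my own assumptions: `handleHealth` is a hypothetical helper and `state.ready` is a flag your app would flip once its dependencies (database, cache, and so on) are up.

```javascript
// Minimal /health route. Returns true when it handled the request,
// false when the request should go to the rest of the app.
function handleHealth(req, res, state) {
  if (req.method !== 'GET' || req.url !== '/health') return false;

  // 503 makes `curl -f` fail, which marks the container unhealthy.
  const status = state.ready ? 200 : 503;
  res.writeHead(status, { 'Content-Type': 'application/json' });
  res.end(JSON.stringify({ status: state.ready ? 'ok' : 'unavailable' }));
  return true;
}

// Usage with the built-in http module (app is your normal request handler):
// const state = { ready: false };
// const server = http.createServer((req, res) => {
//   if (!handleHealth(req, res, state)) app(req, res);
// });
```

Keep the handler cheap. It runs every 30 seconds with the settings above, so it should not do expensive work on each call.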

7. Understand the Difference Between RUN, CMD, and ENTRYPOINT

This confuses everyone:

  • RUN: Executes during build (creates layer)
  • CMD: Default command when container starts (can be overridden)
  • ENTRYPOINT: Defines the actual executable (arguments are appended)

# CMD - can be overridden at runtime
CMD ["node", "index.js"]
# docker run myapp node custom.js replaces the whole CMD and runs node custom.js

# ENTRYPOINT - treats arguments as parameters
ENTRYPOINT ["node"]
CMD ["index.js"]
# docker run myapp debug.js runs node debug.js

For most Node.js apps, CMD is what you want.
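
If the interaction still feels fuzzy, it can help to model it as a tiny function. This is a simplified sketch of the exec-form rules only: `resolveCommand` is a hypothetical name, and the model ignores shell form and the `--entrypoint` flag.

```javascript
// Simplified model: arguments passed to `docker run` replace CMD entirely,
// and whatever remains is appended to ENTRYPOINT.
function resolveCommand({ entrypoint = [], cmd = [] }, runArgs = []) {
  const args = runArgs.length > 0 ? runArgs : cmd;
  return [...entrypoint, ...args];
}

// CMD only
console.log(resolveCommand({ cmd: ['node', 'index.js'] }));
// [ 'node', 'index.js' ]
console.log(resolveCommand({ cmd: ['node', 'index.js'] }, ['node', 'custom.js']));
// [ 'node', 'custom.js' ]
console.log(resolveCommand({ cmd: ['node', 'index.js'] }, ['custom.js']));
// [ 'custom.js' ]  (fails in practice: custom.js is not an executable)

// ENTRYPOINT plus CMD
console.log(resolveCommand({ entrypoint: ['node'], cmd: ['index.js'] }, ['debug.js']));
// [ 'node', 'debug.js' ]
```

The third case is the one that bites people: with CMD alone, you have to pass the full command, not just the script name.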

8. Handle Signals Properly

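When you run docker stop, Docker sends SIGTERM first, then SIGKILL after a 10-second grace period by default. Your app needs to handle SIGTERM and shut down gracefully within that window. The exec-form CMD used throughout this post matters here too: it runs node as PID 1, so the signal reaches your process directly instead of a wrapping shell that may not forward it.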

// In your Node.js app
const http = require('http');

// `app` is your request handler, e.g. an Express app
const server = http.createServer(app);

const shutdown = () => {
  console.log('Received shutdown signal, closing server...');
  server.close(() => {
    console.log('Server closed');
    process.exit(0);
  });

  // Force exit before Docker's 10-second grace period runs out
  setTimeout(() => {
    console.error('Forced shutdown');
    process.exit(1);
  }, 8000).unref();
};

process.on('SIGTERM', shutdown);
process.on('SIGINT', shutdown);
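
One refinement worth considering: signals can arrive more than once (an impatient Ctrl+C, or SIGINT followed by SIGTERM), and closing an already-closing server is an error. The variant below is a sketch under my own assumptions: `createShutdown` is a hypothetical factory, the 8-second default is simply a value below Docker's default grace period, and `exit` is injectable so the logic can be tested without killing the process.

```javascript
// Builds a shutdown handler that runs at most once, however many signals arrive.
function createShutdown(server, { timeoutMs = 8000, exit = (code) => process.exit(code) } = {}) {
  let shuttingDown = false;

  return function shutdown() {
    if (shuttingDown) return; // ignore repeated signals
    shuttingDown = true;

    // Force exit if open connections keep the server from closing in time.
    // unref() stops this timer from keeping the process alive by itself.
    const timer = setTimeout(() => exit(1), timeoutMs);
    timer.unref();

    server.close(() => {
      clearTimeout(timer);
      exit(0);
    });
  };
}

// Usage:
// const shutdown = createShutdown(server);
// process.on('SIGTERM', shutdown);
// process.on('SIGINT', shutdown);
```

If you raise the grace period (for example with `docker stop -t`), raise `timeoutMs` to match.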

Putting It All Together

Here's a production-ready Dockerfile:

# Build stage
FROM node:20.11.0-alpine3.19 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
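RUN npm run build
# Remove devDependencies so only production modules get copied to the next stage
RUN npm prune --omit=dev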

# Production stage
FROM node:20.11.0-alpine3.19 AS production
WORKDIR /app

# Security: create non-root user
RUN addgroup -g 1001 -S nodejs && \
    adduser -S nodejs -u 1001 -G nodejs

# Copy built artifacts
COPY --from=builder --chown=nodejs:nodejs /app/dist ./dist
COPY --from=builder --chown=nodejs:nodejs /app/node_modules ./node_modules
COPY --from=builder --chown=nodejs:nodejs /app/package.json ./

# Switch to non-root user
USER nodejs

EXPOSE 3000

# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
  CMD node -e "require('http').get('http://localhost:3000/health', (r) => process.exit(r.statusCode === 200 ? 0 : 1))"

CMD ["node", "dist/index.js"]

The Bottom Line

A good Dockerfile isn't about using all the tricks. It's about:

  1. Security: Non-root user, no secrets in image
  2. Speed: Proper layer caching, multi-stage builds
  3. Reliability: Health checks, graceful shutdown
  4. Maintainability: Specific versions, clear structure

Start with these practices. Your builds will be faster, your images smaller, and your production incidents fewer.

What Dockerfile nightmares have you encountered? Drop them in the comments.
