Your Docker image is probably too big.
A default Node.js image with npm install can easily reach 1.2GB. A Python image with pip dependencies hits 900MB. A Java image with Maven can exceed 1.5GB.
These bloated images mean:
- Slower deployments — pulling 1.2GB vs 48MB across your cluster
- More CVEs — every extra package is an attack surface
- Higher costs — storage, bandwidth, and registry fees add up
- Longer CI pipelines — building, pushing, and scanning large images
Multi-stage builds solve all of this. One Dockerfile. Multiple stages. The final image contains only what your application needs to run — nothing else.
How Multi-Stage Builds Work
A multi-stage Dockerfile has multiple FROM statements. Each FROM creates a new stage. You can copy artifacts from one stage to another, leaving build dependencies behind.
# Stage 1: BUILD (contains compilers, dev tools, source code)
FROM node:20 AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build
# Drop devDependencies so the node_modules copied to the runtime stage is lean
RUN npm prune --omit=dev
# Stage 2: RUNTIME (contains only compiled app + runtime)
FROM node:20-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY package*.json ./
EXPOSE 3000
CMD ["node", "dist/index.js"]
What happens:
- Stage 1 installs all dependencies (including devDependencies), compiles TypeScript, and builds the app
- Stage 2 starts from a clean Alpine image and copies ONLY the compiled output
- Build tools, source code, devDependencies — none of it exists in the final image
Real Examples: Before vs After
Node.js (Express API)
# ❌ BEFORE: Single stage (1.1GB)
FROM node:20
WORKDIR /app
COPY . .
RUN npm install
EXPOSE 3000
CMD ["node", "src/index.js"]
# ✅ AFTER: Multi-stage (148MB)
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production
COPY . .
FROM node:20-alpine
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
WORKDIR /app
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/src ./src
COPY --from=builder /app/package.json ./
USER appuser
EXPOSE 3000
CMD ["node", "src/index.js"]
Key optimizations:
- `node:20-alpine` instead of `node:20` — the Alpine Linux base is ~5MB vs ~150MB
- `npm ci --only=production` — no devDependencies in the final image
- Non-root user — security best practice
- Only `src/`, `node_modules/`, and `package.json` copied — no `.git`, tests, or docs
Python (FastAPI)
# ❌ BEFORE: Single stage (920MB)
FROM python:3.12
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0"]
# ✅ AFTER: Multi-stage (85MB)
# Stage 1: Install dependencies into a virtualenv
FROM python:3.12-slim AS builder
WORKDIR /app
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Stage 2: Runtime with only the virtualenv
FROM python:3.12-slim
RUN groupadd -r appgroup && useradd -r -g appgroup appuser
WORKDIR /app
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
COPY app/ ./app/
USER appuser
EXPOSE 8000
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
Why virtualenv in Docker? It creates a clean, isolated directory of all Python dependencies. You copy that single directory to the runtime stage — no pip, no build headers, no cache.
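One subtlety makes this copy safe: the venv lands at the same absolute path (`/opt/venv`) in both stages, because a venv's scripts hardcode the interpreter path in their shebang lines. A minimal local demonstration, using a hypothetical `/tmp/demo-venv` path:

```shell
# Create a venv and look at a console script's shebang line:
# it pins the interpreter to the venv's absolute path
python3 -m venv /tmp/demo-venv
head -n 1 /tmp/demo-venv/bin/pip
```

Because that path is baked in, the venv must be copied to the identical path in the runtime stage, which is exactly what the Dockerfile above does.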
Go (API Server)
Go produces static binaries. This is where multi-stage builds really shine:
# ✅ Go multi-stage (12MB final image!)
# Stage 1: Build the binary
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -ldflags="-s -w" -o /app/server ./cmd/server
# Stage 2: Scratch image (literally empty)
FROM scratch
COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/
COPY --from=builder /app/server /server
EXPOSE 8080
ENTRYPOINT ["/server"]
12MB. The scratch image is completely empty — no shell, no OS, no package manager. Just your binary and TLS certificates. This is the smallest possible attack surface.
Java (Spring Boot)
# ✅ Java multi-stage (180MB, down from 700MB)
# Stage 1: Build with Maven
FROM maven:3.9-eclipse-temurin-21 AS builder
WORKDIR /app
COPY pom.xml .
RUN mvn dependency:go-offline # Cache dependencies
COPY src/ ./src/
RUN mvn package -DskipTests -q
# Stage 2: Extract Spring Boot layers
FROM eclipse-temurin:21-jre-alpine AS extractor
WORKDIR /app
COPY --from=builder /app/target/*.jar app.jar
RUN java -Djarmode=layertools -jar app.jar extract
# Stage 3: Runtime with layered JARs
FROM eclipse-temurin:21-jre-alpine
RUN addgroup -S app && adduser -S app -G app
WORKDIR /app
COPY --from=extractor /app/dependencies/ ./
COPY --from=extractor /app/spring-boot-loader/ ./
COPY --from=extractor /app/snapshot-dependencies/ ./
COPY --from=extractor /app/application/ ./
USER app
EXPOSE 8080
ENTRYPOINT ["java", "org.springframework.boot.loader.launch.JarLauncher"]
Three stages. The Spring Boot layer extraction (stage 2) separates dependencies from application code. Docker caches the dependency layer — so when only your code changes, the rebuild copies just the application layer (usually <1MB). This makes subsequent builds extremely fast.
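The caching effect is easy to observe by building twice. A sketch, assuming the Dockerfile above and a hypothetical source file path:

```shell
# First build: every layer executes
docker build -t myapp:1 .

# Change only application code (hypothetical path)
echo "// touch" >> src/main/java/com/example/App.java

# Second build: dependency layers report CACHED; only the
# application layer and the steps after it re-run
docker build -t myapp:2 .
```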
Optimization Techniques
1. Order COPY statements by change frequency
Docker caches each layer. When a layer changes, all subsequent layers are invalidated. Put rarely-changing files first:
# ✅ Dependencies change rarely → cached
COPY package*.json ./
RUN npm ci
# Source code changes often → rebuilt
COPY src/ ./src/
2. Use .dockerignore
`COPY . .` copies your entire build context — and docker build sends the whole context to the daemon before any instruction runs. Without `.dockerignore`, that means shipping `.git/`, `node_modules/`, test files, and docs on every build.
# .dockerignore
.git
.gitignore
node_modules
npm-debug.log
Dockerfile
docker-compose.yml
.env
*.md
tests/
coverage/
.vscode/
3. Pin exact versions
# ❌ Breaks randomly when base image updates
FROM node:latest
# ❌ Breaks when minor version changes
FROM node:20
# ✅ Predictable, reproducible builds
FROM node:20.11.1-alpine3.19
4. Minimize layers
Each RUN command creates a layer. Combine related commands:
# ❌ 3 layers
RUN apt-get update
RUN apt-get install -y curl
RUN rm -rf /var/lib/apt/lists/*
# ✅ 1 layer
RUN apt-get update && \
apt-get install -y --no-install-recommends curl && \
rm -rf /var/lib/apt/lists/*
5. Security: Never run as root
# Create a non-root user
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser
Every container should run as a non-root user. If an attacker exploits your application, they don't get root access to the container (or potentially the host).
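To confirm an image actually drops root, check at run time or via the image metadata. A sketch with a hypothetical `myapp:latest` tag:

```shell
# Run-time check: id should report a non-zero uid
docker run --rm myapp:latest id

# Static check: inspect the USER baked into the image config
docker image inspect --format '{{.Config.User}}' myapp:latest
```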
Scanning Your Images
A small image with known CVEs is still a vulnerable image. Scan after building:
# Trivy — free, fast, comprehensive
trivy image myapp:v1.0.0
# Docker Scout (built into Docker Desktop)
docker scout cves myapp:v1.0.0
# Grype by Anchore
grype myapp:v1.0.0
Integrate scanning into CI:
# GitHub Actions step
- name: Scan Docker image
  uses: aquasecurity/trivy-action@master
  with:
    image-ref: myapp:${{ github.sha }}
    format: 'sarif'
    output: 'trivy-results.sarif'
    severity: 'CRITICAL,HIGH'
    exit-code: '1'  # Fail the build if critical or high CVEs are found
Size Comparison Summary
| Language | Before | After | Reduction |
|----------|----------|--------|-----------|
| Node.js | 1,100 MB | 148 MB | 87% |
| Python | 920 MB | 85 MB | 91% |
| Go | 850 MB | 12 MB | 99% |
| Java | 700 MB | 180 MB | 74% |
The effort: restructuring one Dockerfile. The payoff: faster deployments, fewer vulnerabilities, lower costs — permanently.
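To reproduce these numbers on your own images, Docker reports sizes directly. A sketch, assuming locally built tags:

```shell
# Overall size per image
docker image ls --format 'table {{.Repository}}:{{.Tag}}\t{{.Size}}'

# Per-layer breakdown of a single image (hypothetical tag)
docker history myapp:latest
```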
When NOT to Use Multi-Stage Builds
- Development environments. In dev, you want hot-reload, debuggers, and full source code. Use a single-stage Dockerfile for development and multi-stage for production.
- Debugging production issues. Sometimes you need `curl`, `sh`, or `strace` in the container. Use a debug image (`alpine` instead of `scratch`) when troubleshooting.
Every Dockerfile in your production pipeline should be multi-stage. It's one of those rare optimizations that improves performance, security, and cost simultaneously.
What's the smallest Docker image you've built? Share your Dockerfile tricks in the comments.
Follow me for more container and DevOps content.