DEV Community

Suleiman Dibirov

Posted on • Edited on

Dockerfile Best Practices: How to Create Efficient Containers

Introduction

In the era of microservices and cloud computing, Docker has become an indispensable tool for application development and deployment. Containerization allows developers to package applications and their dependencies into a single, portable unit, ensuring predictability, scalability, and rapid deployment. However, the efficiency of your containers largely depends on how optimally your Dockerfile is written.

In this article, we'll explore best practices for creating Dockerfiles that help you build lightweight, fast, and secure containers.

Dockerfile Basics

What Is a Dockerfile?

A Dockerfile is a text document containing a set of instructions to assemble a Docker image. Each instruction performs a specific action, such as installing packages, copying files, or defining startup commands. Proper use of Dockerfile instructions is crucial for building efficient containers.

Key Dockerfile Instructions

  • FROM: Sets the base image for your new image.
  • RUN: Executes a command in a new layer on top of the current image and commits the result.
  • CMD: Specifies the default command to run when a container is started.
  • COPY: Copies files and directories from the build context into the container filesystem.
  • ADD: Similar to COPY but with additional features like extracting archives.
  • ENV: Sets environment variables.
  • EXPOSE: Informs Docker which ports the container listens on at runtime.
  • ENTRYPOINT: Configures a container to run as an executable.
  • VOLUME: Creates a mount point for external storage volumes.
  • WORKDIR: Sets the working directory for subsequent instructions.
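As a quick illustration of how these instructions combine, here is a minimal sketch for a Python application (the file names, base image tag, and port are assumptions, not requirements):

```dockerfile
# Base image for everything that follows
FROM python:3.12-slim

# Working directory for subsequent instructions
WORKDIR /app

# Environment variable available at build time and runtime
ENV APP_ENV=production

# Install dependencies in their own layer
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application source
COPY . .

# Document the listening port
EXPOSE 8000

# Default command when the container starts
CMD ["python", "app.py"]
```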

Best Practices for Writing Dockerfiles

Use Minimal Base Images

The base image serves as the foundation for your Docker image. Choosing a lightweight base image can significantly reduce the final image size and minimize the attack surface.

  • Alpine Linux: A popular minimal image around 5 MB in size.
  FROM alpine:latest

Pros: Small size, security, fast downloads.

Cons: May require additional configuration; some packages might be missing or behave differently due to using musl instead of glibc.

  • Scratch: An empty image ideal for languages that can compile static binaries (Go, Rust).
  FROM scratch
  COPY myapp /myapp
  CMD ["/myapp"]
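Because Alpine is built on musl and BusyBox, packages are installed with apk rather than apt-get. A quick sketch (the Alpine tag and package names are illustrative):

```dockerfile
FROM alpine:3.20

# --no-cache fetches the package index on the fly,
# so no index cache is left behind in the layer
RUN apk add --no-cache python3 py3-pip
```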

Reduce Layers

Each RUN, COPY, and ADD instruction adds a new layer to your image. Combining commands helps reduce the number of layers and the overall image size.

Inefficient:

RUN apt-get update
RUN apt-get install -y python
RUN apt-get install -y pip

Efficient:

RUN apt-get update && apt-get install -y \
    python \
    pip \
 && rm -rf /var/lib/apt/lists/*

Optimize Layer Caching

Docker uses layer caching to speed up builds. The order of instructions affects caching efficiency.

  • Copy Dependency Files First: Copy files that change less frequently (such as package.json or requirements.txt) before the rest of the source code, so the dependency-installation layer stays cached across builds.
  COPY package*.json ./
  RUN npm install
  COPY . .
  • Minimize Changes in Early Layers: Changes in early layers invalidate the cache for all subsequent layers.

Install Dependencies Wisely

Remove temporary files and caches after installing packages to reduce image size.

RUN pip install --no-cache-dir -r requirements.txt

Manage Secrets Carefully

Never include sensitive data (passwords, API keys) in your Dockerfile.

  • Use Environment Variables: Pass secrets at runtime using environment variables.
  • Leverage Docker Secrets: Use Docker Swarm or Kubernetes mechanisms for managing secrets.
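With BuildKit, a secret can also be mounted only for the duration of a single RUN instruction, so it never ends up in an image layer. A sketch, where the secret id `npm_token` and the token-consuming command are illustrative:

```dockerfile
# syntax=docker/dockerfile:1
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./

# The secret file exists only while this RUN instruction executes
RUN --mount=type=secret,id=npm_token \
    NPM_TOKEN="$(cat /run/secrets/npm_token)" npm ci
```

The secret is supplied at build time, e.g. `docker build --secret id=npm_token,src=./npm_token.txt .`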

Optimize Image Size

  • Delete Unnecessary Files: Clean up caches, logs, and temporary files within the same RUN command as the installation. This ensures that these temporary files do not persist in any intermediate layers, effectively reducing the final image size.
  RUN apt-get update && apt-get install -y --no-install-recommends package \
      && apt-get clean && rm -rf /var/lib/apt/lists/*
  • Minimize Installed Packages: Install only the packages you need by using flags like --no-install-recommends. This avoids pulling in unnecessary dependencies, further slimming down the image.
  RUN apt-get install -y --no-install-recommends package

Note: To maximize image size reduction, combine this installation with cleanup commands in the same RUN statement as shown above.

  • Use Optimization Tools: Utilize tools like Docker Slim which can automatically analyze and optimize your Docker images by removing unnecessary components and reducing their size without altering functionality.
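As an illustration, docker-slim is pointed at an existing image (exact flags may differ between versions; the image names here are placeholders):

```shell
# Analyze myapp:latest and emit a minified myapp:slim
docker-slim build --target myapp:latest --tag myapp:slim
```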

Utilize .dockerignore

A .dockerignore file lets you exclude files and directories from the build context, reducing the amount of data sent to the Docker daemon and protecting sensitive information.

Example .dockerignore:

.git
node_modules
Dockerfile
.dockerignore

Employ Multi-Stage Builds

Multi-stage builds let you compile your application in intermediate stages and copy only the necessary artifacts into the final image, leaving build tools and source code behind.

Example for a Go Application:

# Build Stage
FROM golang:1.16-alpine AS builder
WORKDIR /app
COPY . .
RUN go build -o myapp

# Final Image
FROM alpine:latest
WORKDIR /app
COPY --from=builder /app/myapp .
CMD ["./myapp"]

Run as Non-Root User

For enhanced security, avoid running applications as the root user.

RUN adduser -D appuser
USER appuser

Scan for Vulnerabilities

  • Use Scanning Tools: Tools like Trivy, Anchore, or Clair can help identify known vulnerabilities.
  • Regularly Update Images: Keep your base images and dependencies up to date.
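For example, Trivy can be run against a local image and made to fail a CI pipeline when serious issues are found (the image name is a placeholder):

```shell
# Exit non-zero if high/critical vulnerabilities are present
trivy image --severity HIGH,CRITICAL --exit-code 1 myapp:latest
```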

Logging and Monitoring

  • Direct Logs to STDOUT/STDERR: This allows for easier log collection and analysis.
  • Integrate with Monitoring Systems: Use tools like Prometheus or the ELK Stack to monitor container health.
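If an application insists on writing to log files, a common workaround (used, for instance, by the official nginx image) is to symlink those files to the container's standard streams:

```dockerfile
# Redirect file-based logs to stdout/stderr so `docker logs` sees them
RUN ln -sf /dev/stdout /var/log/nginx/access.log \
 && ln -sf /dev/stderr /var/log/nginx/error.log
```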

Examples and Recommendations

Optimized Dockerfile Example for a Node.js Application

# Use the official Node.js image based on Alpine Linux
FROM node:14-alpine

# Set the working directory
WORKDIR /app

# Copy package files and install dependencies
COPY package*.json ./
RUN npm ci --only=production

# Copy the rest of the application code
COPY . .

# Create a non-root user and switch to it
RUN addgroup appgroup && adduser -S appuser -G appgroup
USER appuser

# Expose the application port
EXPOSE 3000

# Define the command to run the app
CMD ["node", "app.js"]

Additional Recommendations

  • Pin Versions: Use specific versions of base images and packages to ensure build reproducibility.
  FROM node:14.17.0-alpine
  • Stay Updated: Regularly update dependencies and base images to include security patches.
  • Use Metadata: Add LABEL instructions to provide image metadata.
  LABEL maintainer="yourname@example.com"
  • Set Proper Permissions: Ensure files and directories have appropriate permissions.
  • Avoid Using Root: Always switch to a non-root user for running applications.
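For instance, COPY accepts a --chown flag so application files are owned by the runtime user rather than root (a sketch assuming an appuser/appgroup pair created earlier in the Dockerfile):

```dockerfile
COPY --chown=appuser:appgroup . .
USER appuser
```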

Conclusion

Creating efficient Docker images is both an art and a science. By following best practices when writing your Dockerfile, you can significantly improve the performance, security, and manageability of your containers. Continuously update your knowledge and stay informed about new tools and methodologies in the containerization ecosystem. Remember, optimization is an ongoing process, and there's always room for improvement.

Top comments (12)

Bence Szabo

It'd be worth noting that a cleanup command, e.g. the one in the Delete Unnecessary Files section, only reduces final image size when it's in the same RUN command as the install. That's one of the common pitfalls.

Suleiman Dibirov

Thank you, you're right, updated the article to highlight it

Bobby Iliev

Great post! Well done!

For anyone interested in learning more about Docker in general, I could suggest this free eBook here:

GitHub: bobbyiliev / introduction-to-docker-ebook — Free Introduction to Docker eBook

This is an open-source introduction to Docker guide that will help you learn the basics of Docker and how to start using containers for your SysOps, DevOps, and Dev projects. No matter if you are a DevOps/SysOps engineer, developer, or just a Linux enthusiast, you will most likely have to use Docker at some point in your career.

The guide is suitable for anyone working as a developer, system administrator, or DevOps engineer who wants to learn the basics of Docker.


Chiebidolu Chinaemerem

Great suggestion.
Haven't read through it thoroughly, but I love the layout, and the table of content is a beauty

Al Amrikasir

The timing is insane, I read this article right when I needed to optimize a Docker image.

Thanks a lot man, you just save me 🙏

Denys Bochko

very nice, thanks. there are a couple of things I did not know about, will be restructuring my dockerfiles

Rodrigo Régis Palmeira

Congratulations. Very good examples. Thanks for sharing.

Lovelin

A great post fr. Thanks man, much needed

Kentang Balado

Nice post! Thanks for sharing.

click2install

Use a linter like hadolint, simple.

Roberto Maurizzi • Edited

Write about caching intermediate files to speed up builds and saving on download time/bytes (nothing better than having to work on a 50kB/s connection in a developing country to learn that...)

Kirill Naumenko

You don't do npm ci in the Dockerfile, because you can't get env vars dynamically from your CI engine without scripting.
You build your app in the CI environment and then copy the artifacts into the image.