Sreekanth Kuruba

Docker Storage & Volumes Internals: Understanding Layers, OverlayFS & Persistence

Docker storage often feels like a black box: images appear, containers run, disk space disappears, and volumes seem to work by magic.

This post explains how Docker actually stores data on disk.


1. How Docker Images Are Stored (The Layered Filesystem)

Docker uses union filesystem concepts (implemented via OverlayFS) to build images efficiently.

Each Dockerfile instruction that changes the filesystem produces a new read-only layer; the remaining instructions only update image metadata.

  • Most RUN, COPY, and ADD instructions create filesystem layers
  • Instructions like ENV, LABEL, CMD, and EXPOSE only add metadata (no filesystem layer)
  • Every layer is identified by a SHA256 digest of its content
  • Layers are immutable, so images that reference the same layer store it on disk only once (deduplication)
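
To look at the layer stack of an image you already have, a sketch like the following prints the layer digests from the image metadata and then the build history. The image name nginx:latest is only an example and is assumed to be pulled locally; have_docker and show_layers are hypothetical helpers written for this post, so the snippet exits quietly on a machine without Docker.

```shell
#!/usr/bin/env bash
# Sketch: list the layers behind a local image.
# Assumption: IMAGE is an example name; substitute any image you have pulled.
IMAGE="${IMAGE:-nginx:latest}"

# Hypothetical helper: succeeds only if the docker CLI exists and the daemon answers.
have_docker() {
  command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1
}

# Hypothetical helper: print layer digests, then the build history.
show_layers() {
  # One sha256 digest per filesystem layer, base layer first.
  docker image inspect --format '{{range .RootFS.Layers}}{{println .}}{{end}}' "$1" || return 1
  # One row per build step; metadata-only steps are listed with a size of 0B.
  docker history "$1"
}

if have_docker; then
  show_layers "$IMAGE" || echo "could not inspect '$IMAGE' (is it pulled?)"
else
  echo "skipped: docker is not available"
fi
```

Running it against two images built from the same base should show the same leading digests in both lists, which is the sharing described above.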

Key Idea:
Docker images are stacks of immutable layers, not a single filesystem.


2. Storage Driver: overlay2 (Default on Linux)

Modern Docker uses the overlay2 storage driver, based on the Linux kernel’s OverlayFS (a union filesystem implementation).

How OverlayFS Works

OverlayFS combines multiple directories into a single unified view:

  • lowerdir → Read-only image layers
  • upperdir → Writable container layer
  • merged → Unified filesystem view inside the container
  • workdir → Empty scratch directory that OverlayFS uses internally; it must be on the same filesystem as upperdir
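
The four directories map directly onto the options of an overlay mount. The sketch below only builds and prints the option string, so it runs without root; the /tmp/ovl paths and the overlay_opts helper are hypothetical names chosen for illustration, not anything Docker creates.

```shell
#!/usr/bin/env bash
# Sketch: build the option string for a manual OverlayFS mount.
# All directory names are hypothetical examples.

# overlay_opts UPPER WORK LOWER...   (lower layers listed top-most first)
overlay_opts() {
  local upper="$1" work="$2"
  shift 2
  local IFS=:
  # "$*" joins the remaining arguments with the first character of IFS.
  printf 'lowerdir=%s,upperdir=%s,workdir=%s\n' "$*" "$upper" "$work"
}

opts="$(overlay_opts /tmp/ovl/upper /tmp/ovl/work /tmp/ovl/layer2 /tmp/ovl/layer1)"
echo "$opts"

# With root privileges on a Linux host, the same string can be passed to mount:
#   mkdir -p /tmp/ovl/layer1 /tmp/ovl/layer2 /tmp/ovl/upper /tmp/ovl/work /tmp/ovl/merged
#   mount -t overlay overlay -o "$opts" /tmp/ovl/merged
```

Multiple lower directories are separated by colons, with the left-most entry on top of the stack, which is how several image layers end up under one writable layer.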

Important Behavior

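The container never modifies image layers directly.
Instead, OverlayFS uses copy-on-write: the first time a container modifies a file that lives in an image layer, the whole file is copied up into the writable layer and the change is applied to that copy. Deleting such a file records a whiteout in the writable layer that hides the original.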
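
The copy-up step can be imitated with two ordinary directories. This is a simulation only, assuming nothing about Docker or the kernel: cow_read and cow_write are hypothetical helpers that do by hand what OverlayFS does transparently.

```shell
#!/usr/bin/env bash
# Simulation only: plain directories stand in for lowerdir and upperdir.
# cow_read and cow_write are hypothetical helpers written for this illustration.
demo="$(mktemp -d)"
mkdir -p "$demo/lower" "$demo/upper"
echo "from image" > "$demo/lower/app.conf"

# Reads prefer the writable layer and fall back to the image layer.
cow_read() {
  if [ -e "$demo/upper/$1" ]; then
    cat "$demo/upper/$1"
  else
    cat "$demo/lower/$1"
  fi
}

# Writes copy the file up on first modification, then change only the copy.
cow_write() {
  if [ ! -e "$demo/upper/$1" ] && [ -e "$demo/lower/$1" ]; then
    cp -p "$demo/lower/$1" "$demo/upper/$1"
  fi
  echo "$2" >> "$demo/upper/$1"
}

cow_read app.conf                    # served from the lower directory
cow_write app.conf "from container"  # triggers the copy-up, then appends
cow_read app.conf                    # served from the upper directory
cat "$demo/lower/app.conf"           # the image layer is unchanged
```

Because the whole file is copied on the first write, small edits to large files in an image layer are comparatively expensive in the real driver as well.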


3. What Happens When a Container Starts?

When you run a container:

  1. Docker mounts all image layers (read-only)
  2. It creates a thin writable layer on top
  3. The container sees a single unified filesystem (merged view)
  4. All filesystem changes outside volumes and bind mounts go into the writable layer

Container Lifecycle Note

When the container is removed (docker rm), not merely stopped:

  • Its writable layer is removed
  • Image layers remain unchanged and reusable

However:

  • Data stored in volumes or bind mounts persists independently
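
The lifecycle above can be observed with a throwaway container. This is a minimal sketch, assuming the alpine image can be pulled; the container name layer-demo and the helpers have_docker and lifecycle_demo are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: watch the writable layer appear and disappear.
# Assumptions: the alpine image and the container name "layer-demo" are examples.
NAME="layer-demo"

# Hypothetical helper: succeeds only if the docker CLI exists and the daemon answers.
have_docker() {
  command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1
}

lifecycle_demo() {
  # The file lands in the container's writable layer, not in the image.
  docker run --name "$NAME" alpine sh -c 'echo scratch > /tmp/note.txt' || return 1
  # docker diff lists changes relative to the image (A added, C changed, D deleted).
  docker diff "$NAME"
  # Removing the container discards its writable layer, including note.txt.
  docker rm "$NAME" >/dev/null
  # A fresh container from the same image starts without the file.
  docker run --rm alpine ls /tmp
}

if have_docker; then
  lifecycle_demo || echo "demo did not complete"
else
  echo "skipped: docker is not available"
fi
```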

4. Volumes vs Bind Mounts

| Feature | Volumes | Bind Mounts |
|---|---|---|
| Management | Docker-managed | User-managed |
| Location | /var/lib/docker/volumes | Any host path |
| Portability | High | Low |
| Performance | More consistent across platforms | Depends on host OS/filesystem |
| Best Use Case | Production, databases | Development, live code sync |

Recommendation:

  • Use Volumes for production and persistent data
  • Use Bind Mounts for development workflows
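
Both kinds of storage can be attached with the same --mount flag; only the type and source differ. The sketch below assumes the alpine image is available, and the volume name appdata, the target paths, and the helpers have_docker and mount_demo are hypothetical examples.

```shell
#!/usr/bin/env bash
# Sketch: the same --mount flag covers both storage types.
# Assumptions: volume name "appdata", the target paths, and alpine are examples.
VOLUME_MOUNT="type=volume,source=appdata,target=/data"
BIND_MOUNT="type=bind,source=$PWD,target=/src,readonly"

# Hypothetical helper: succeeds only if the docker CLI exists and the daemon answers.
have_docker() {
  command -v docker >/dev/null 2>&1 && docker info >/dev/null 2>&1
}

mount_demo() {
  # Named volume: Docker creates it on first use and it outlives the container.
  docker run --rm --mount "$VOLUME_MOUNT" alpine sh -c 'echo kept > /data/note.txt' || return 1
  # A second, unrelated container still finds the file.
  docker run --rm --mount "$VOLUME_MOUNT" alpine cat /data/note.txt
  # Bind mount: the container sees the current host directory, read-only here.
  docker run --rm --mount "$BIND_MOUNT" alpine ls /src
  # Clean up the example volume.
  docker volume rm appdata >/dev/null
}

if have_docker; then
  mount_demo || echo "demo did not complete"
else
  echo "skipped: docker is not available"
fi
```

The bind mount ties the command to whatever directory it is run from, which is the portability cost noted in the table.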

5. Why Docker Consumes So Much Disk Space

Docker disk usage grows mainly due to:

  • Unused images and dangling layers
  • Stopped containers with writable layers
  • Orphaned volumes
  • Build cache (including BuildKit cache layers, often the largest hidden consumer)

Useful Commands

# Show overall usage
docker system df

# Detailed breakdown
docker system df -v

# Clean unused data
docker system prune -a --volumes

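⚠️ Be careful: prune commands delete data permanently. docker system prune -a --volumes removes every stopped container, every image not used by a container, unused networks, the build cache, and unused volumes (on recent Docker releases, only anonymous ones).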
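
A gentler option is to clean one category at a time. The sketch below is a dry run by default and only prints what it would do; the APPLY variable and the run helper are hypothetical conveniences for this post, while the four prune subcommands are standard Docker CLI commands.

```shell
#!/usr/bin/env bash
# Sketch: clean one category at a time instead of pruning everything.
# Dry run by default; set APPLY=1 to execute. The "run" helper is hypothetical.
APPLY="${APPLY:-0}"

run() {
  if [ "$APPLY" = "1" ]; then
    "$@"
  else
    echo "would run: $*"
  fi
}

run docker container prune -f   # stopped containers and their writable layers
run docker image prune -f       # dangling images only (add -a for all unused images)
run docker volume prune -f      # unused volumes: this deletes data
run docker builder prune -f     # build cache
```

Checking docker system df before and after each step shows which category was actually holding the space.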


6. Best Practices for Storage

  • Use volumes for persistent and stateful data
  • Avoid storing important data inside container writable layers
  • Use multi-stage builds to reduce image size
  • Clean unused resources regularly
  • Monitor disk usage with docker system df
  • Minimize unnecessary files inside images (logs, caches, temp files)

Summary

Docker storage is built on three core principles:

  • Immutable image layers (shared and reusable)
  • Copy-on-write writable container layer
  • External persistence via volumes and bind mounts

Understanding this model helps you:

  • Reduce disk usage
  • Debug storage issues faster
  • Design efficient container architectures
  • Avoid accidental data loss

Next Topic

Docker Security Internals: Namespaces, cgroups, Capabilities, Rootless Docker & Best Practices

