Docker storage often feels like a black box — images appear, containers run, disk space disappears, and volumes behave magically.
This post explains how Docker actually stores data on disk.
1. How Docker Images Are Stored (The Layered Filesystem)
Docker uses union filesystem concepts (implemented via OverlayFS) to build images efficiently.
Each Dockerfile instruction that changes the filesystem creates a new read-only layer.
- Most `RUN`, `COPY`, and `ADD` instructions create filesystem layers
- Instructions like `ENV`, `LABEL`, `CMD`, and `EXPOSE` only add metadata (no filesystem layer)
- Every layer has a unique SHA256 ID
- Layers are immutable and shared across images (deduplication)
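As a rough illustration of the split between filesystem and metadata instructions, the sketch below counts the instructions in a Dockerfile that would typically add a filesystem layer. The function name `count_fs_layers` is made up for this post, and the parsing is deliberately naive: it ignores line continuations, multi-stage builds, and instructions such as `WORKDIR` that can sometimes create a layer.

```shell
# count_fs_layers: read a Dockerfile on stdin and count the instructions that
# usually produce a filesystem layer (RUN, COPY, ADD).
# Hypothetical helper for illustration; not part of Docker.
count_fs_layers() {
  awk 'toupper($1) ~ /^(RUN|COPY|ADD)$/ { n++ } END { print n + 0 }'
}

# Example: two of the five instructions below (RUN and COPY) change the filesystem.
count_fs_layers <<'EOF'
FROM alpine
RUN apk add --no-cache curl
ENV APP_ENV=production
COPY . /app
CMD ["/app/run.sh"]
EOF
```

On a real image, `docker image history <image>` shows the same split: metadata-only steps appear with a size of 0B.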
Key Idea:
Docker images are stacks of immutable layers, not a single filesystem.
2. Storage Driver: overlay2 (Default on Linux)
Modern Docker uses the overlay2 storage driver, based on the Linux kernel’s OverlayFS (a union filesystem implementation).
How OverlayFS Works
OverlayFS combines multiple directories into a single unified view:
- lowerdir → Read-only image layers
- upperdir → Writable container layer
- merged → Unified filesystem view inside the container
- workdir → Internal working directory required by OverlayFS
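To make the four roles concrete, here is a minimal sketch of how the same directories map onto an OverlayFS mount. `overlay_opts` is a helper invented for this example and the directory names are placeholders. The `mount -t overlay` line is shown only as a comment because it needs root on a Linux host.

```shell
# overlay_opts UPPER WORK LOWER [LOWER...]
# Build the option string for an OverlayFS mount. Lower directories are
# joined with ':'; the leftmost one is the top of the read-only stack.
# Hypothetical helper for illustration; Docker assembles this itself.
overlay_opts() {
  upper=$1
  work=$2
  shift 2
  lower=$(IFS=:; printf '%s' "$*")
  printf 'lowerdir=%s,upperdir=%s,workdir=%s\n' "$lower" "$upper" "$work"
}

# Placeholder paths: two image layers and one writable layer.
overlay_opts /tmp/demo/upper /tmp/demo/work /tmp/demo/layer2 /tmp/demo/layer1

# To mount for real (root required, all directories must already exist):
#   mount -t overlay overlay \
#     -o "$(overlay_opts /tmp/demo/upper /tmp/demo/work /tmp/demo/layer2 /tmp/demo/layer1)" \
#     /tmp/demo/merged
```

Note that OverlayFS expects the work directory to be on the same filesystem as the upper directory.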
Important Behavior
The container never modifies image layers directly.
Instead, it uses copy-on-write — when a file is modified, it is copied into the writable layer first.
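The copy-on-write rule can be imitated with two plain directories. This is a simplified model, not real OverlayFS: `cow_path` and `cow_append` are hypothetical helpers, `lower` stands in for the image layers, and `upper` stands in for the container's writable layer.

```shell
# cow_path LOWER UPPER FILE
# Resolve a read: the writable layer wins, otherwise fall back to the image layer.
cow_path() {
  if [ -e "$2/$3" ]; then
    printf '%s\n' "$2/$3"
  else
    printf '%s\n' "$1/$3"
  fi
}

# cow_append LOWER UPPER FILE TEXT
# Modify a file: copy it up into the writable layer first, then change the copy.
cow_append() {
  mkdir -p "$(dirname "$2/$3")"
  if [ ! -e "$2/$3" ] && [ -e "$1/$3" ]; then
    cp -p "$1/$3" "$2/$3"   # copy-up; the lower file is never touched
  fi
  printf '%s\n' "$4" >> "$2/$3"
}

# Demo in a throwaway directory.
demo=$(mktemp -d)
mkdir -p "$demo/lower" "$demo/upper"
echo "from image" > "$demo/lower/app.conf"

cow_append "$demo/lower" "$demo/upper" app.conf "from container"
cat "$(cow_path "$demo/lower" "$demo/upper" app.conf)"   # both lines, read from upper
cat "$demo/lower/app.conf"                               # still only "from image"
rm -rf "$demo"
```

The image-side file stays identical no matter how often the container writes, which is what lets many containers share one set of layers.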
3. What Happens When a Container Starts?
When you run a container:
- Docker mounts all image layers (read-only)
- It creates a thin writable layer on top
- The container sees a single unified filesystem (merged view)
- All changes go into the writable layer
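On a host using the overlay2 driver, these pieces can usually be seen with `docker inspect`, which reports the container's `LowerDir`, `UpperDir`, `MergedDir`, and `WorkDir` under `GraphDriver.Data`. The sketch below wraps that in two small helpers. The function names and the sample paths are made up, other storage backends may expose different fields, and the `docker` call only runs if you invoke `show_overlay_dirs` yourself.

```shell
# show_overlay_dirs CONTAINER
# Print the storage-driver data Docker reports for a container as JSON.
# Assumes the overlay2 storage driver.
show_overlay_dirs() {
  docker inspect --format '{{ json .GraphDriver.Data }}' "$1"
}

# count_lower_layers LOWERDIR
# Count the ':'-separated entries in a LowerDir value.
count_lower_layers() {
  printf '%s' "$1" | tr ':' '\n' | grep -c .
}

# Sample value with placeholder paths (real ones live under Docker's data root).
count_lower_layers "/layers/c/diff:/layers/b/diff:/layers/a/diff"
```

For a running container, `show_overlay_dirs <container>` prints the real paths, and the `LowerDir` value can be passed to `count_lower_layers`.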
Container Lifecycle Note
When the container is deleted:
- Its writable layer is removed
- Image layers remain unchanged and reusable
However:
- Data stored in volumes or bind mounts persists independently
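Before deleting a container, `docker diff <container>` lists what its writable layer holds: each line starts with `A` (added), `C` (changed), or `D` (deleted). Paths that live on a volume or bind mount do not show up there, because they are stored outside the writable layer. As a small sketch, the hypothetical `summarize_diff` helper below tallies that output. The sample lines are invented.

```shell
# summarize_diff: read `docker diff` output on stdin and count each change type.
# Hypothetical helper for illustration.
summarize_diff() {
  awk '$1 == "A" { a++ } $1 == "C" { c++ } $1 == "D" { d++ }
       END { printf "added=%d changed=%d deleted=%d\n", a, c, d }'
}

# Invented sample in the format `docker diff` prints.
summarize_diff <<'EOF'
C /etc
A /etc/app.conf
D /tmp/old.log
A /data.txt
EOF

# With a real container (placeholder name):
#   docker diff my-container | summarize_diff
```

Anything counted here disappears with `docker rm`, so it is a quick check for data that should have been on a volume.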
4. Volumes vs Bind Mounts
| Feature | Volumes | Bind Mounts |
|---|---|---|
| Management | Docker-managed | User-managed |
| Location | /var/lib/docker/volumes | Any host path |
| Portability | High | Low |
| Performance | More consistent across platforms | Depends on host OS/filesystem |
| Best Use Case | Production, databases | Development, live code sync |
Recommendation:
- Use Volumes for production and persistent data
- Use Bind Mounts for development workflows
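In command-line terms, the two options differ mainly in the `type` passed to `--mount`. The sketch below builds that argument. `mount_arg` is a helper made up for this post, and the volume name, host path, and image name are placeholders.

```shell
# mount_arg TYPE SOURCE TARGET
# Build the value for `docker run --mount`. TYPE must be "volume" or "bind".
# Hypothetical helper for illustration.
mount_arg() {
  case "$1" in
    volume|bind) printf 'type=%s,source=%s,target=%s\n' "$1" "$2" "$3" ;;
    *) echo "mount_arg: unsupported type: $1" >&2; return 1 ;;
  esac
}

mount_arg volume appdata /var/lib/app   # Docker-managed named volume
mount_arg bind "$PWD/src" /app/src      # host directory, handy for live code sync

# Usage with placeholder names:
#   docker run --rm --mount "$(mount_arg volume appdata /var/lib/app)" my-image
#   docker run --rm --mount "$(mount_arg bind "$PWD/src" /app/src)" my-image
```

The volume form names something Docker manages for you, while the bind form ties the container to a specific path on this particular host, which is why it is less portable.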
5. Why Docker Consumes So Much Disk Space
Docker disk usage grows mainly due to:
- Unused images and dangling layers
- Stopped containers with writable layers
- Orphaned volumes
- Build cache (including BuildKit cache layers, often the largest hidden consumer)
Useful Commands
# Show overall usage
docker system df
# Detailed breakdown
docker system df -v
# Clean unused data
docker system prune -a --volumes
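⚠️ Be careful: prune commands permanently delete unused data. With `-a`, all images not used by any container are removed, not just dangling ones, and `--volumes` also removes unused volumes (on recent Docker releases, only anonymous ones).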
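A more cautious alternative is to prune one resource type at a time and preview the commands first. The wrapper below is a sketch: `docker_cleanup` and the `DRY_RUN` variable are inventions of this example, while the `docker ... prune` subcommands are standard. Volumes are left out on purpose, since they usually hold the data you want to keep.

```shell
# docker_cleanup: prune stopped containers, dangling images, and build cache.
# With DRY_RUN=1 (the default here) it only prints the commands.
# Hypothetical wrapper for illustration.
docker_cleanup() {
  for target in container image builder; do
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "docker $target prune -f"
    else
      docker "$target" prune -f
    fi
  done
}

docker_cleanup               # preview only
# DRY_RUN=0 docker_cleanup   # actually delete
```

Running `docker system df` before and after shows how much each step reclaimed.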
6. Best Practices for Storage
- Use volumes for persistent and stateful data
- Avoid storing important data inside container writable layers
- Use multi-stage builds to reduce image size
- Clean unused resources regularly
- Monitor disk usage with `docker system df`
- Minimize unnecessary files inside images (logs, caches, temp files)
Summary
Docker storage is built on three core principles:
- Immutable image layers (shared and reusable)
- Copy-on-write writable container layer
- External persistence via volumes and bind mounts
Understanding this model helps you:
- Reduce disk usage
- Debug storage issues faster
- Design efficient container architectures
- Avoid accidental data loss
Next Topic
Docker Security Internals: Namespaces, cgroups, Capabilities, Rootless Docker & Best Practices