Mustafa ERBAY

Originally published at mustafaerbay.com.tr

Docker Disk Fire: Root Cause Analysis on My 7.6 GB VPS

While hosting the backend of one of my side projects on a 7.6 GB VPS, I suddenly found Docker consuming all of the disk space. This is a common problem, especially on servers with limited resources, and it requires careful management. For me, it was more than just a disk-full alert: it was a problem that threatened the system's stability and called for a proper root cause analysis.

In this post, I will explain in detail why Docker takes up so much space, how I identified it on my own VPS with concrete examples, and what steps I took for permanent solutions. My goal is to provide a practical roadmap for anyone facing similar issues.

Why Does Docker Take Up So Much Space?

Understanding Docker's disk usage is the first step in troubleshooting. There are several fundamental reasons that are often overlooked. These typically stem from the accumulation of components like images, containers, volumes, and build cache.

Accurately understanding the disk impact of each component is critical for cleaning up unnecessary space and preventing future problems. This management must be even more precise on small VPS instances.

Image Layers and Disk Usage

Docker images are built in layers. Each instruction in a Dockerfile (RUN, COPY, ADD) creates a new layer. These layers can be shared between different images that use the same base image, but unused or "dangling" (untagged) layers continue to occupy disk space.
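
A quick way to see which instruction produced which layer, and how big each one is, is docker history; here is a minimal example, assuming an image tagged myapp:latest (the same tag used in the compose example further down):

# Show every layer of an image together with its size
docker history myapp:latest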

In my case, I had accumulated old image layers due to frequent image updates and tests. While you can see them with the docker images command, the real problem lay hidden in dangling images and the layers they created.

# List all images, find dangling ones
docker images -a | grep "<none>"

# Summarize disk usage of images
docker system df

[Image: command outputs showing Docker disk usage in a Linux server terminal]
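
If only those dangling layers need to go, docker image prune is a narrower option than a full system prune; by default it removes untagged images only:

# Remove dangling (untagged) images only
docker image prune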

Container Logs and Volume Management

As containers run, they generate logs, and these logs are stored on disk by default. Especially for applications with high traffic or frequent errors, logs can quickly consume gigabytes of space. This was one of the biggest contributors to the disk filling up on my VPS.

In addition, Docker volumes also occupy significant space. Volumes are used for persistent container data and, by default, survive even after the container is deleted. Unused or forgotten volumes can quietly turn into a large pile of wasted disk space over time.

⚠️ Pay Attention to Log Sizes

Uncontrolled growth of log files not only consumes disk space but also negatively affects log reading and processing performance. I noticed how unnecessary logs increased disk I/O, especially when monitoring through journald.
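
Before changing any configuration, it helps to confirm which containers are actually responsible. Assuming the default json-file logging driver, the log files sit under /var/lib/docker/containers and can be sized up directly:

# List container log files by size, largest last
sudo du -sh /var/lib/docker/containers/*/*-json.log | sort -h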

To limit log sizes, we can use the logging configuration in the docker-compose.yml file. I usually solve this problem by setting max-size and max-file limits.

version: '3.8'
services:
  app:
    image: myapp:latest
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"
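
The compose-level setting above only covers that one service. The same limits can also be set host-wide in /etc/docker/daemon.json; here is a sketch, assuming there is no existing daemon.json to merge with, and keeping in mind that the new defaults only apply to containers created after the restart:

# Set default log limits for all newly created containers
sudo tee /etc/docker/daemon.json > /dev/null <<'EOF'
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
EOF

# Restart the daemon so the new defaults take effect
sudo systemctl restart docker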

For managing volumes, it's first necessary to understand which volumes are in use. The docker volume ls command lists all volumes. Then, we can inspect their details with docker volume inspect <volume_name>.

# Find unused (dangling) volumes
docker volume ls -f dangling=true

# Remove a specific volume (be careful, it can lead to data loss!)
docker volume rm <volume_name>
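
Deleting volumes one by one gets tedious; docker volume prune removes unused local volumes in a single step. It asks for confirmation, but the same data-loss warning applies, and on recent Docker versions named volumes are only included if you add --all:

# Remove unused local volumes (asks for confirmation)
docker volume prune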

Build Cache and Development Environments

If you are building Docker images on your VPS, the build cache can also be a significant disk consumer. Docker stores the build cache to reuse image layers. This shortens build times but can lead to the accumulation of unnecessary cache layers over time.

I frequently ran into this while running my CI/CD pipeline on self-hosted runners. Especially after frequent experiments and failed builds, I noticed the build cache taking up more space than expected.

The docker builder prune command is very effective here: it removes unused build cache and can free up a significant amount of disk space.

# Clean all unused build cache
docker builder prune
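
If you still want caching for recent builds, the prune can be narrowed instead of wiping everything; for example, by age or by a size budget (both flags are available in reasonably recent Docker versions):

# Remove only build cache older than 7 days
docker builder prune --filter until=168h

# Or keep at most roughly 2 GB of build cache
docker builder prune --keep-storage 2GB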

A Real Incident: The Situation on My Own VPS

On the morning of April 28th, the disk usage on my 7.6 GB VPS, which was hosting the backend of a side project, reached 95%. I received a disk full alert through my promtail agent monitoring journald logs. The first place I checked, of course, was the /var/lib/docker directory.

# Check how much space the Docker root directory is occupying
sudo du -sh /var/lib/docker

The output didn't surprise me: 6.5G /var/lib/docker. The remaining 1.1 GB was largely filled with the operating system and other small files. This indicated that I needed to intervene quickly. Fortunately, this wasn't my first experience. I had encountered similar situations before.
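
Before turning to Docker's own tooling, a one-level-deeper du gives a rough idea of where those 6.5 GB live; on a typical overlay2 setup, overlay2 holds image and container layers, containers holds the logs, and volumes holds named volumes:

# Break the Docker root directory down one level
sudo du -sh /var/lib/docker/* | sort -h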

ℹ️ Quick Overview: docker system df

The docker system df command provides a quick and understandable summary of Docker's disk usage. It shows how much space is used for images, containers, volumes, and build cache. This is a great starting point for understanding where the problem lies at first glance.

I was met with an output like this:

TYPE            TOTAL     ACTIVE    SIZE      RECLAIMABLE
Images          55        12        3.2GB     2.8GB (87%)
Containers      18        6         1.5GB     1.3GB (86%)
Local Volumes   25        8         1.8GB     1.5GB (83%)
Build Cache     12        0         0B        0B

According to this output, images, containers, and local volumes were the main sources of disk usage. The RECLAIMABLE column clearly showed how much space could be recovered. The Build Cache section showing 0B indicated that I wasn't performing build operations at that moment.
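
When the summary is not enough, the same command has a verbose mode that lists every image, container, and volume individually with its reclaimable space, which makes the big offenders easy to spot:

# Per-image, per-container, and per-volume breakdown
docker system df -v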

Step-by-Step Cleaning and Optimization

After analyzing the situation, I began the cleaning process step by step. My goal was to reclaim as much disk space as possible. First, I started with a general command that cleans all unused Docker resources.

# Clean all dangling images, stopped containers, unused networks, and build cache
docker system prune

# Clean all unused Docker resources, including non-dangling images
# WARNING: This command can also delete images that are not actively used but are tagged with a name.
# Therefore, it should be used with caution, and you should be sure of what is being deleted.
# I generally use it in test environments or during disk space crises.
docker system prune -a

With the docker system prune -a command, I recovered about 2.5 GB of space. However, this was not enough. Especially old logs and some volumes were still occupying space. To clean logs, I could manually clear the logs of running containers or adjust log rotation.

To reset the logs of a running container:


# Find the Container ID
docker ps

# Find the log file and clear it
# Note: This is a manual approach and might not be ideal for automated systems.
# For persistent solutions, see log rotation configurations.
echo "" | sudo tee /var/lib/docker/containers/<container_id>/<container_id>-json.log
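
Rather than piecing that path together by hand, docker inspect can print it directly, and truncate empties the file without the extra newline that echo leaves behind; the same caveat applies, this is a manual fix rather than a replacement for log rotation:

# Print the exact log file path for a container
docker inspect --format '{{.LogPath}}' <container_id>

# Empty the file in place
sudo truncate -s 0 "$(docker inspect --format '{{.LogPath}}' <container_id>)"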
