Haripriya Veluchamy

Posted on Dec 23, 2025 • Edited on Jan 11

Docker Volumes and Data Persistence: Managing State in Containers 💾

#docker #devops #containers #cloud

One of the most challenging aspects of working with Docker has been figuring out data persistence. Containers are ephemeral by nature, but most real-world applications need to store data. In this post, I'll share what I've learned about managing persistent data with Docker.

The Ephemeral Nature of Containers

First, let's understand the problem. Each Docker container gets its own writable filesystem layer, and that layer is discarded when the container is removed. Here's what happens:

# Start a container and create a file
docker run -it --name temp ubuntu bash
# (Inside container) touch /test.txt
# (Inside container) exit

# Start the container again - file still exists
docker start -i temp
# (Inside container) ls /test.txt
# Output: /test.txt

# Now remove and recreate the container
docker rm temp
docker run -it --name temp ubuntu bash
# (Inside container) ls /test.txt
# Output: ls: cannot access '/test.txt': No such file or directory

When the container is removed, all data inside is lost. This is a big problem for databases, user uploads, or any stateful application.

Docker Volumes

Docker volumes are the solution to this problem. They're storage locations that Docker manages outside of the container's filesystem, so data written to them persists after the container is removed.

Creating and Managing Volumes

# Create a volume
docker volume create my-data

# List volumes
docker volume ls

# Inspect a volume
docker volume inspect my-data

# Remove a volume
docker volume rm my-data

# Remove unused volumes
# (on Docker 23.0 and later this only removes anonymous volumes; add --all to include named ones)
docker volume prune

Volumes are stored in a location managed by Docker, typically /var/lib/docker/volumes/ on Linux systems.
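
If you want to confirm where a given volume actually lives, you can ask Docker instead of guessing the path. Here's a minimal sketch; volume_path is a helper name I made up for illustration, and it assumes the volume already exists:

```shell
# volume_path is a hypothetical helper; the docker command inside it is standard.
# Prints the host directory that backs the given volume.
volume_path() {
  docker volume inspect --format '{{ .Mountpoint }}' "$1"
}
```

Calling volume_path my-data should print a path under /var/lib/docker/volumes/ on Linux. On Docker Desktop that path lives inside Docker's VM, so you won't find it on your Mac or Windows filesystem.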

Using Volumes with Containers

# Run a container with a volume
docker run -v my-data:/app/data nginx

# Run a container with an anonymous volume
docker run -v /app/data nginx

In the first example, my-data is the volume name, and /app/data is the mount point inside the container. Any data written to /app/data will persist in the my-data volume.
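
A quick way to convince yourself that the data really outlives the container is to write through one container and read from another. This is only a sketch: check_persistence and note.txt are names I picked for illustration, and I'm using the small alpine image instead of nginx.

```shell
# check_persistence is a hypothetical helper: it writes a file through one
# container, then reads it back from a second, brand-new container.
check_persistence() {
  volume="$1"
  # First container writes into the volume and is removed on exit (--rm)
  docker run --rm -v "$volume":/app/data alpine \
    sh -c 'echo hello > /app/data/note.txt'
  # Second container mounts the same volume and reads the file back
  docker run --rm -v "$volume":/app/data alpine \
    cat /app/data/note.txt
}
```

Running check_persistence my-data should print hello, even though the container that wrote the file is already gone. If the volume doesn't exist yet, -v creates it on first use.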

Types of Docker Storage

There are three main ways to persist data with Docker:

1. Named Volumes

docker run -e MYSQL_ROOT_PASSWORD=example -v mysql-data:/var/lib/mysql mysql

How it works: Docker creates and manages this volume. The data is stored in /var/lib/docker/volumes/mysql-data/_data on the host, but you typically don't need to access it directly.

Use cases:

Production databases
Application data that needs to persist
Data that needs to be shared between containers

Advantages:

Managed by Docker
Easy to back up
Can be shared between containers
Works across platforms (Windows, Mac, Linux)

2. Bind Mounts

docker run -e MYSQL_ROOT_PASSWORD=example -v /home/host/data:/var/lib/mysql mysql

How it works: Maps a directory on your host machine directly into the container. Any files in /home/host/data will be available inside the container at /var/lib/mysql and vice versa.

Use cases:

Development environments (for real-time code changes)
Configuration files
Sharing files between host and containers

Advantages:

Direct access to files from host machine
No need to copy files into the container
Changes on the host immediately visible in container

3. Anonymous Volumes

docker run -e MYSQL_ROOT_PASSWORD=example -v /var/lib/mysql mysql

How it works: Similar to a named volume, but Docker gives it a randomly generated name. Docker creates the volume on the host; you only specify the path inside the container.

Use cases:

Temporary data that should outlive a specific container instance
When you don't need to reference the volume later
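
Because anonymous volumes have random names, they're easy to leave behind when a container is removed. The sketch below shows two commands I find useful for keeping them under control; the function names are hypothetical wrappers, while the docker flags themselves are standard.

```shell
# List volumes that no container references any more
list_dangling_volumes() {
  docker volume ls --filter dangling=true
}

# Remove a container together with the anonymous volumes attached to it
# (named volumes are left alone)
remove_with_anonymous_volumes() {
  docker rm -v "$1"
}
```

Containers started with docker run --rm clean up their anonymous volumes automatically when they exit, so this mostly matters for long-lived containers.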

Bind Mounts vs. Named Volumes: Choosing the Right Option

I spent a long time figuring out when to use which option. Here's what I learned:

| Feature | Bind Mounts | Named Volumes |
| --- | --- | --- |
| Location | Any directory on host | Managed by Docker in `/var/lib/docker/volumes` |
| Path Specification | Full host path required | Just the volume name |
| Portability | Less portable (host-dependent) | More portable |
| Host Modification | Can be modified directly on host | Best accessed through Docker commands or a container |
| Performance | Native on Linux; can be slower on Docker Desktop (macOS, Windows) | Generally good on all platforms |
| Usage | Development, config files | Production data |

I generally use:

Named volumes for production data
Bind mounts for development or when I need to edit files directly
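
With -v, Docker decides between a bind mount and a volume based on whether the left-hand side is a path or a name, which is easy to get wrong. The more verbose --mount flag spells out the type explicitly. A minimal sketch, with hypothetical helper names and the same /app/data target used earlier:

```shell
# Named volume: source is a volume name
run_with_volume() {
  docker run -d --mount "type=volume,source=$1,target=/app/data" nginx
}

# Bind mount: source is an absolute path on the host
run_with_bind() {
  docker run -d --mount "type=bind,source=$1,target=/app/data" nginx
}
```

For example, run_with_volume my-data and run_with_bind /home/host/data. As far as I know, --mount with type=bind refuses to start if the host path doesn't exist, whereas -v silently creates it, so --mount tends to catch typos earlier.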

Volume Drivers

Docker supports volume drivers that extend storage capabilities:

# Create a volume with a specific driver
docker volume create --driver=local my-volume

# Create a volume with driver options
docker volume create --driver=local \
  --opt type=nfs \
  --opt o=addr=192.168.1.1,rw \
  --opt device=:/path/to/dir \
  my-nfs-volume

Common volume drivers:

local: The default driver; it also handles NFS and other network mounts through --opt, as in the example above
Third-party plugin drivers: For external storage such as AWS EBS, Azure Disk, etc.

Data Management Strategies

Stateless vs. Stateful Containers

Stateless containers don't store persistent data:

Web servers
Application servers
Microservices
Worker processes

Stateful containers need to store data:

Databases
Caching services
File storage services
Message queues

I've found it's best to:

Make as many components stateless as possible
Use volumes only for truly stateful parts of the application
Consider using managed services for stateful components (e.g., RDS for databases)

Data Backup and Recovery

Backing up volume data is essential. Here's how I do it:

# Back up a volume to a tar file in the current directory
# (the volume is mounted read-only so the backup can't modify it)
docker run --rm -v my-volume:/source:ro -v "$(pwd)":/backup \
  alpine tar -czf /backup/my-volume-backup.tar.gz -C /source .

# Restore from a backup
docker run --rm -v my-volume:/target -v "$(pwd)":/backup \
  alpine tar -xzf /backup/my-volume-backup.tar.gz -C /target

For automated backups, I put this in a cron job or CI/CD pipeline.
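
For a scheduled job, it helps to wrap the backup in a function that timestamps each archive so runs don't overwrite each other. This is a sketch under my own assumptions: backup_volume is a hypothetical name, and the destination must be an absolute path because it is used as a bind mount.

```shell
# backup_volume is a hypothetical wrapper around the tar approach.
# Usage: backup_volume <volume-name> <absolute-backup-dir>
backup_volume() {
  volume="$1"
  dest="$2"
  stamp=$(date +%Y%m%d-%H%M%S)
  archive="${volume}-${stamp}.tar.gz"
  mkdir -p "$dest" || return 1
  # Mount the volume read-only so the backup cannot alter the data
  docker run --rm \
    -v "$volume":/source:ro \
    -v "$dest":/backup \
    alpine tar -czf "/backup/$archive" -C /source . || return 1
  # Print the archive path so a calling script can log or upload it
  echo "$dest/$archive"
}
```

A cron entry could then call something like backup_volume my-volume /var/backups/docker. For a live database, copying files this way can capture an inconsistent state, so I'd stop the container first or use the database's own dump tool.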

Sharing Data Between Containers

The simplest way to share data between containers is to mount the same volume in each of them:

# Create a shared volume
docker volume create shared-data

# Use it in multiple containers (container1 and container2 stand in for your image names)
docker run -v shared-data:/app/data container1
docker run -v shared-data:/app/data container2
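
When only one container should write, the others can mount the same volume read-only. A minimal sketch, assuming two placeholder images (producer-image and consumer-image are hypothetical names, as is the helper itself):

```shell
# start_shared_pair is a hypothetical helper; the image names are placeholders.
start_shared_pair() {
  docker volume create shared-data
  # The producer can write to the volume
  docker run -d --name producer -v shared-data:/app/data producer-image
  # The consumer mounts the same volume read-only (:ro)
  docker run -d --name consumer -v shared-data:/app/data:ro consumer-image
}
```

This keeps a single writer, which avoids two containers overwriting each other's files. Docker doesn't coordinate concurrent writes for you.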

Real-World Examples

Running a Database with Persistent Storage

# Create a volume for the database
docker volume create postgres-data

# Run PostgreSQL with the volume
# (image tag pinned so the data path matches this version)
docker run -d \
  --name postgres \
  -e POSTGRES_PASSWORD=mysecretpassword \
  -v postgres-data:/var/lib/postgresql/data \
  -p 5432:5432 \
  postgres:16

Now, even if the container is removed, the data will persist in the postgres-data volume.
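
To see this in action, you can throw the container away and start a new one on the same volume. The sketch below assumes the container name, volume, and mount path from the command above; recreate_postgres is a hypothetical helper, and you should pass the same image you started with.

```shell
# recreate_postgres is a hypothetical helper.
# Usage: recreate_postgres <image>
recreate_postgres() {
  # Stop and remove the container; the postgres-data volume is untouched
  docker rm -f postgres
  # A fresh container attached to the same volume sees the existing databases
  docker run -d \
    --name postgres \
    -e POSTGRES_PASSWORD=mysecretpassword \
    -v postgres-data:/var/lib/postgresql/data \
    -p 5432:5432 \
    "$1"
}
```

One detail worth knowing: POSTGRES_PASSWORD only takes effect when the data directory is empty, so the recreated container keeps whatever password was set the first time.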

Development Environment with Code Mounting

# Mount current directory for development
docker run -d \
  --name node-app \
  -v "$(pwd)":/app \
  -w /app \
  -p 3000:3000 \
  node:22 \
  npm start

This mounts your current directory into the container at /app. When you change code on your host, it's immediately reflected in the container.
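
One gotcha with this setup: the bind mount hides everything the image had at /app, including a node_modules directory installed during the image build. A common workaround is to add an anonymous volume on /app/node_modules so that path keeps the image's contents. The sketch below assumes a custom image that ran npm install in /app (my-node-app in the usage note is a hypothetical name), and run_dev_container is just an illustrative helper.

```shell
# run_dev_container is a hypothetical helper.
# Usage: run_dev_container <image-with-dependencies-installed-in-/app>
run_dev_container() {
  # The second -v creates an anonymous volume that shields
  # /app/node_modules from the bind mount on /app
  docker run -d \
    --name node-app \
    -v "$(pwd)":/app \
    -v /app/node_modules \
    -w /app \
    -p 3000:3000 \
    "$1" \
    npm start
}
```

For example, run_dev_container my-node-app. Your source files come from the host, while the dependencies stay the ones that were installed inside the image.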

Sharing Configuration Files

# Mount a specific config file
docker run -d \
  --name nginx \
  -v "$(pwd)/nginx.conf":/etc/nginx/nginx.conf:ro \
  -p 80:80 \
  nginx

The :ro suffix makes the mount read-only, preventing the container from modifying your config file.

Best Practices I've Learned

1. Use named volumes for important data
   - They're easier to manage and back up
2. Use bind mounts during development
   - For real-time code changes without rebuilding
3. Make containers as stateless as possible
   - Easier scaling and recovery
4. Be careful with permissions
   - Container users must have proper permissions on mounted volumes
5. Always back up volumes
   - Persistence isn't the same as backup
6. Consider volume labels for organization

   docker volume create --label project=myapp myapp-data

7. Clean up unused volumes regularly

   docker volume prune
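
Labels pay off once you filter on them, for example to list or clean up only one project's volumes. A minimal sketch, assuming the project=<name> label scheme from the example above; the function names are hypothetical, and --all needs Docker 23.0 or later.

```shell
# List the volumes that carry a given project label
list_project_volumes() {
  docker volume ls --filter "label=project=$1"
}

# Remove unused volumes for one project only.
# --all includes named volumes (Docker 23.0+); --force skips the prompt.
prune_project_volumes() {
  docker volume prune --all --force --filter "label=project=$1"
}
```

I'd run list_project_volumes myapp first to check what matches, since prune_project_volumes deletes data without asking.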

Managing Volume Permissions

One issue I frequently ran into was permission problems with volumes. Here's how I solved it:

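# Set permissions before mounting
# (777 makes the directory world-writable: fine for local experiments, too permissive for production)
docker run --rm -v my-volume:/data alpine chmod 777 /data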
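
A tighter alternative is to hand the volume to the specific user your application runs as, instead of opening it to everyone. This is a sketch with assumptions: fix_volume_owner is a hypothetical helper, and 1000:1000 in the usage note is only an example UID:GID that you'd replace with your container's actual user.

```shell
# fix_volume_owner is a hypothetical helper.
# Usage: fix_volume_owner <volume-name> <uid:gid>
fix_volume_owner() {
  # Runs as root in a throwaway container and changes ownership
  # of everything in the volume to the given user and group
  docker run --rm -v "$1":/data alpine chown -R "$2" /data
}
```

For example, fix_volume_owner my-volume 1000:1000, followed by starting the application with docker run --user 1000:1000 -v my-volume:/data and your own image, so the process can write to the volume without it being world-writable.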

Conclusion

Understanding Docker volumes has been essential for my containerized applications. The ephemeral nature of containers makes volumes necessary for any application that needs to store data.

To summarize:

Use named volumes for persistent data
Use bind mounts for development
Choose the right storage strategy for your application's needs
Remember to back up your volumes

In the next post, I'll cover Docker networking - how containers communicate with each other and the outside world.

Next up: "Docker Networking: Connecting Containers"
