I've put together a guide to understanding why my containers slow down. In this post, I'll explain step by step how I monitor the performance of Docker containers on my own VPS. Sometimes a simple configuration change, and other times insufficient underlying system resources, can significantly impact performance. Managing this process systematically is critical for detecting and resolving potential issues early.
Monitoring the performance of applications running on my own servers is particularly important to me. While Docker's isolation and portability are excellent, it's crucial to properly manage and monitor the underlying system resources these containers rely on. In this guide, I'll share how I manage this process and resolve typical issues I encounter.
Fundamentals of Container Performance Monitoring
There can be many reasons why a Docker container slows down. These include factors such as CPU usage, memory consumption, I/O bottlenecks, or network latencies. The first step is to systematically collect and analyze these core metrics. When performing this monitoring on my own VPS, I typically leverage tools like systemd, cgroup, and journald.
These tools provide a strong foundation for understanding resource utilization by containers and the underlying host system. Specifically, cgroups are used to limit and monitor how much CPU, memory, and other resources containers can use. journald, on the other hand, collects system and service logs in a central location, which greatly simplifies debugging.
Systemd Units and Container Management
Docker services are typically managed by systemd. This means that operations such as starting, stopping, and restarting your containers can be controlled via systemd. Instead of monitoring containers individually, it can be more efficient to track their main services and their resource usage.
By examining systemd unit files, you can see how services are started, what dependencies they have, and what resource limits are applied. This is a good starting point for finding the source of a problem, especially when your containers are not behaving as expected.
systemctl status docker
systemctl status containerd
These commands show the status of the Docker daemon and the container runtime. If these services are not running correctly, the containers running beneath them will also be affected.
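To go a step further than the status output, you can dump the full unit definition and query the resource-related properties systemd applies to the Docker service. A minimal sketch; the property names shown assume a reasonably recent systemd:

```shell
# Show the full unit file (including any drop-in overrides) for the Docker daemon
systemctl cat docker.service

# Query resource-related properties systemd tracks for the unit
systemctl show docker.service -p MemoryMax,TasksMax,Restart
```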
Understanding Resource Limits with Cgroup
One of the most important ways to understand container performance is by monitoring resource usage through cgroup (control groups). Docker uses cgroups to isolate containers and manage their resources. A separate cgroup is created for each container, allowing you to fine-tune resources such as CPU, memory, and disk I/O.
Directly inspecting cgroups can be a bit complex, but the docker stats command presents this information in a more understandable way. This command shows real-time CPU, memory, network I/O, and disk I/O statistics for your running containers.
docker stats
Regularly checking the output of this command helps you understand which container is consuming the most resources. If a container is consistently using a high percentage of the CPU or if you observe sudden spikes in memory usage, this could indicate a performance issue with that container.
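When you only need a snapshot rather than the live view, docker stats also supports --no-stream and a Go-template --format flag. The column selection below is just one possible layout:

```shell
# One-shot snapshot of the most relevant columns for every running container
docker stats --no-stream \
  --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.NetIO}}\t{{.BlockIO}}"
```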
ℹ️ Out-Of-Memory (OOM) and Cgroup
If a container exceeds its allocated memory limit, the Linux kernel's Out-Of-Memory (OOM) killer can terminate the process. This causes your container to stop abruptly.
cgroups are used to prevent this; settings like memory.high help softly limit memory usage.
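To check after the fact whether a container was OOM-killed, and to read its memory limit and usage straight from its cgroup, something like the following works. The myapp container name is a placeholder, and the cgroup path assumes a cgroup v2 host with the systemd driver:

```shell
# Was the container killed by the OOM killer?
docker inspect --format '{{.State.OOMKilled}}' myapp

# Read the memory limit and current usage from the container's cgroup
# (path layout assumes cgroup v2 with the systemd cgroup driver)
CID=$(docker inspect --format '{{.Id}}' myapp)
cat "/sys/fs/cgroup/system.slice/docker-${CID}.scope/memory.max"
cat "/sys/fs/cgroup/system.slice/docker-${CID}.scope/memory.current"
```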
Log Analysis with Journald
Logs are critically important for understanding the behavior of your containers or the services running beneath them. journald is a central service used in Linux systems to collect and manage logs. Docker containers can also generate their logs via journald. This allows you to examine not only the container's own logs but also the logs of related services on the host system from a single location.
The journalctl command is used to query and filter logs collected by journald. You can use this command to view logs for a specific container or to examine error messages within a particular time range.
journalctl -u docker.service -f
journalctl -u container-your-container-name.service -f
journalctl --since "2026-05-12 00:00:00" --until "2026-05-13 00:00:00"
The first command shows live logs for the Docker service, while the second does the same for a specific container service (if you manage the container as a systemd service). The third command retrieves all system logs within a specified date range. These logs can provide clues to understand the cause of slowdowns. For example, frequently recurring error messages or unexpected warnings can point to performance issues.
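If you route container logs through journald via Docker's journald log driver, you can also filter by the CONTAINER_NAME field instead of a unit name. A sketch, with myapp and the nginx image as placeholders:

```shell
# Send the container's logs to journald instead of the default json-file driver
docker run -d --log-driver=journald --name myapp nginx

# Follow exactly that container's logs through journalctl
journalctl CONTAINER_NAME=myapp -f
```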
Common Causes of Slowdowns and Their Solutions
So far, we've covered basic monitoring tools and methods. Now, let's focus on some common causes of slowdowns I've encountered on my own VPS and how I've resolved them. These scenarios aim to provide practical applications beyond a general guide.
CPU Bottlenecks
If you see a container consistently showing high CPU usage in the docker stats output, this could be due to several reasons. The application itself might require intensive processing power, or it might have entered an infinite loop. Sometimes, the CPU resources allocated to the container might simply be insufficient.
In this situation, the first step is to understand how much CPU the application running inside that container is using. If necessary, you can run tools like top or htop inside the container for a more detailed analysis. If the problem lies with the application itself, optimization might be needed. If the issue is insufficient resources, you might need to allocate more CPU resources to the container.
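A quick way to map high CPU usage to actual processes, sketched here with a hypothetical myapp container (the second command assumes top exists inside the image):

```shell
# List the processes running inside the container, as seen from the host
docker top myapp

# Or inspect interactively from inside the container itself
docker exec -it myapp top
```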
⚠️ CPU Limits and Performance
Setting CPU limits is important to prevent a container from consuming all server resources. However, applying excessive restrictions can also lead to performance degradation. You can set these limits using the --cpus parameter in the docker run command or the cpus setting in your docker-compose.yml file.
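In practice this looks like the following; the 1.5-CPU figure and the names are arbitrary examples, and docker update lets you adjust the limit on a running container without recreating it:

```shell
# Start a container limited to 1.5 CPUs (name and image are placeholders)
docker run -d --cpus="1.5" --name myapp nginx

# Raise the limit later without recreating the container
docker update --cpus="2" myapp
```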
Memory Leaks
Memory leaks occur when an application fails to release memory it no longer needs. Over time, this can completely fill the container's memory and degrade overall system performance. The docker stats command lets you spot a continuous increase in memory usage.
If you suspect a memory leak, you need to monitor the application's memory usage within the container to pinpoint the issue. Memory profiling tools are available for most programming languages. These tools help you understand which parts of the code are using the most memory and why the memory is not being released.
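Before reaching for a language-specific profiler, a crude trend log from docker stats can confirm whether usage only ever grows. A minimal sketch; the container name, file, and sampling interval are all assumptions:

```shell
# Append a timestamped memory sample for the container every 60 seconds
while true; do
  printf '%s ' "$(date -Is)" >> mem.log
  docker stats --no-stream --format '{{.Name}} {{.MemUsage}}' myapp >> mem.log
  sleep 60
done
```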
Disk I/O Bottlenecks
Intensive disk read/write operations can also cause containers to slow down. Operations such as database transactions, logging, or processing large files can increase disk I/O. The docker stats command also shows disk I/O metrics.
If disk I/O is high, you need to identify which processes are causing this intensity to find the source of the problem. This might involve optimizing database queries, adjusting logging levels, or switching to faster storage solutions. On my own VPS, I've observed increased disk I/O, especially during intensive database operations. In these cases, using tools like iotop to identify which processes were using the most disk was helpful.
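The combination I reach for looks roughly like this, offered as a sketch rather than a recipe:

```shell
# Show only processes currently doing I/O (requires root)
sudo iotop -o

# Compare cumulative block I/O across containers
docker stats --no-stream --format "table {{.Name}}\t{{.BlockIO}}"
```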
Network Latencies and Configuration Issues
Network communication between containers or between a container and the outside world can also lead to performance problems. DNS resolution issues, MTU (Maximum Transmission Unit) mismatches, or misconfigured bridge networks can cause latencies.
To diagnose network issues, you can use standard network tools like ping and traceroute. It's also important to check container network settings and IP address conflicts. Especially when multiple containers are running on the same network, it's crucial to ensure that the network configuration is correct.
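A few starting points for the checks above; the container and service names are placeholders, and docker0 assumes the default bridge setup:

```shell
# Inspect the default bridge network: subnet, connected containers, options
docker network inspect bridge

# Check the MTU of the Docker bridge interface on the host
ip link show docker0

# Test reachability from inside a container (ping must exist in the image)
docker exec myapp ping -c 3 other-service
```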
My Monitoring Strategies on My Own VPS
When managing the performance of containers running on my own VPS, I've adopted a few core strategies. These strategies allow me to take a proactive approach and detect issues early.
1. Regular Tracking of Core Metrics
Every day, I run the docker stats command to check the core metrics of all my running containers. Abnormal increases or decreases in CPU, memory, network I/O, and disk I/O values immediately catch my attention. This simple but effective method allows me to notice potential problems before they escalate.
Additionally, I regularly check the status of systemd services. The systemctl status <service-name> command shows whether services are active and if there are any error messages.
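These daily checks are easy to bundle into a small script; this sketch simply chains the commands already discussed:

```shell
#!/bin/sh
# Daily health snapshot: container metrics plus service state
docker stats --no-stream
systemctl status docker --no-pager
systemctl --failed --no-pager
```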
2. Logging and Debugging
I actively use journald. I have set up detailed logging for all my services and containers. Especially for critical services, I examine logs by searching for specific keywords or error codes using the journalctl command. This is often the fastest way to find the root cause of a problem.
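Two filters I find myself using constantly; the "timeout" keyword is just an example:

```shell
# Only error-priority (and worse) messages from the Docker service, last hour
journalctl -u docker.service -p err --since "1 hour ago"

# Grep for a specific keyword across today's logs
journalctl -u docker.service --since today | grep -i "timeout"
```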
💡 Journald Rate Limiting
If logs are being generated too rapidly, you can protect the system from overload by using journald's built-in rate limiting features. This is especially important when error messages are constantly repeating.
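The relevant knobs live in /etc/systemd/journald.conf; the values below are illustrative, not recommendations:

```shell
# In /etc/systemd/journald.conf, set for example:
#   RateLimitIntervalSec=30s
#   RateLimitBurst=10000

# Apply the change
sudo systemctl restart systemd-journald
```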
3. Setting Resource Limits
Setting appropriate resource limits for my containers is important both for protecting the performance of other containers and for ensuring the stability of the host server. I carefully adjust parameters like --cpus and --memory in the docker run command or in the docker-compose.yml file. When determining these limits, I consider the actual needs of the application.
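A combined example, with names and values that are placeholders rather than recommendations; docker inspect then confirms what was actually applied:

```shell
# Start with both CPU and memory caps (names and values are examples)
docker run -d --cpus="1.0" --memory="512m" --memory-reservation="256m" \
  --name web nginx

# Verify the applied limits (NanoCpus is CPUs * 1e9, Memory is in bytes)
docker inspect --format 'cpus={{.HostConfig.NanoCpus}} mem={{.HostConfig.Memory}}' web
```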
4. Automated Alerting Systems
For more complex systems, setting up automated alerting integrated with tools like Prometheus and Grafana can be beneficial. These systems alert me via email or other notification channels when certain metrics exceed specific thresholds. While I haven't yet set up such a system on my own VPS, it's definitely an approach I'd choose for larger-scale projects.
Conclusion and Next Steps
Monitoring the performance of Docker containers on my own VPS is a process that requires continuous learning and adjustment. The methods I've shared in this guide cover a wide range, from tracking core metrics to analyzing logs and setting resource limits.
By following these steps, you can more easily understand why your containers are slowing down and optimize their performance. Remember that every system is unique, and you may need to adapt these methods to your specific needs.
The next step is to use this monitoring data to further optimize your applications and proactively prevent performance issues.