Building a Holistic Monitoring Strategy
Introduction
Concentrating only on internal services is no longer adequate as the monitoring environment expands. A genuinely resilient system needs insight into public endpoints, external-facing apps, and advanced service checks in addition to the internal network. Day 4 moves from basic monitoring to a comprehensive approach that captures the external experience of users as well as the internal health of services. By using sophisticated monitoring methods and external checks, I hope to improve system dependability, proactively find weaknesses, and guarantee steady performance across all infrastructure tiers.
Objective
My objectives for Day 4 were to set up monitors for resources outside my lab environment and to put advanced HTTP checks into place.
Solving a Port Mapping & Docker Container Conflict
After successfully setting up my monitoring tools on Day 3, I was eager to dive into Day 4's activities. The first step was to access the Uptime Kuma dashboard I had deployed to keep an eye on my services. I opened my browser and pointed it to http://192.168.92.134:3002. Instead of the sleek monitoring interface, I was greeted by a spinning loader and, eventually, a timeout error.
The hunt was on.
Diagnosing the network connection was my initial instinct. A quick nc -zv 192.168.92.134 3002 confirmed the worst: Connection refused. The port was closed. This pointed to a problem with the application rather than the network or firewall.
Since I knew the Uptime Kuma container was supposed to be running, I needed to look inside Docker. Running sudo lsof -i:3002 revealed something intriguing: docker-pr, the Docker proxy process, was actually listening on the port. This meant that although Docker had mapped the port, the container behind it was not answering traffic.
The next obvious step was to check the container's logs for startup issues. The output of sudo docker logs uptime-kuma turned out to be a classic red herring: the logs were spotless. They showed a clean startup, a successful SQLite database connection, and no visible issues. As far as the application was concerned, everything was running smoothly.
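For reference, this is roughly the diagnostic sequence I ran, using the IP, port, and container name from my lab (yours will differ):

```bash
# Is anything answering on the published port from the network?
nc -zv 192.168.92.134 3002

# Is any process on the host actually listening on that port?
sudo lsof -i:3002

# What does the container itself report about its startup?
sudo docker logs uptime-kuma
```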
This disparity, with Docker claiming the port while the application remained unreachable, made me suspect a state issue. To clear out any problems in the networking or proxy layers, I restarted the entire Docker service with sudo systemctl restart docker.
Once Docker was back up and running, I listed the containers with sudo docker ps. To my surprise, my original uptime-kuma container was missing entirely. Then I remembered that I had previously attempted to redeploy this container.
I discovered the source of the entire issue when I went back over my history. I had previously tried using the following command to launch a new container:
sudo docker run -d --restart=always -p 8080:3001 -v uptime-kuma:/app/data --name uptime-kuma louislam/uptime-kuma:1
However, Docker kept returning the fatal error: Conflict. The container name "/uptime-kuma" is already in use. When I had first tried to get rid of the old container, I had only run docker stop and a malformed docker run (missing the crucial -d, -p, and -v options), never docker rm. Every time I stopped and restarted without completely cleaning up the previous instance, a ghost of the former container kept getting in the way.
The solution was simple but decisive:
- Remove the Conflicting Container: I ensured the old container was completely removed using docker rm.
- Clean Slate: The Docker service restart cleared any residual port conflicts.
- Precise Deployment: I carefully re-ran the correct docker run command with all the necessary flags (sketched below).
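Put together, the cleanup and redeploy sequence looked roughly like this, using the container name, port mapping, and volume from my setup:

```bash
# Stop and, crucially, remove the stale container so its name is freed
sudo docker stop uptime-kuma
sudo docker rm uptime-kuma

# Redeploy with all the required flags this time
sudo docker run -d --restart=always \
  -p 8080:3001 \
  -v uptime-kuma:/app/data \
  --name uptime-kuma \
  louislam/uptime-kuma:1

# Confirm the service is actually reachable from the network
nc -zv 192.168.92.134 8080
```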
This time, it worked flawlessly. A final nc -zv 192.168.92.134 8080 returned the beautiful words: Connection to 192.168.92.134 8080 port [tcp/http-alt] succeeded!
Lesson Learned:
This setback carries a very clear lesson. Docker container names are more than friendly labels; they are unique identifiers. docker stop alone is usually not enough; to fully resolve a name conflict, an explicit docker rm is often required. Additionally, always confirm that the service is truly reachable from your network rather than relying on a clean log file alone. This brief detour was the perfect reminder that systematic, step-by-step troubleshooting will nearly always uncover the issue hiding in plain sight.
Now that I could finally access Uptime Kuma, I was prepared to carry on with my monitoring adventure on Day 4.
Day 4 Activity
Detailed Procedure:
**1. Ping Monitor for Gateway:**
I added a Ping monitor to check the availability of my network's default gateway (192.168.92.2).
ip route | grep default
Add a new monitor
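Before pointing the monitor at it, a quick manual sanity check of the gateway from the lab host doesn't hurt (the gateway address comes from the route output above):

```bash
# Identify the default gateway to monitor
ip route | grep default

# Verify it responds to ICMP before adding the Ping monitor
ping -c 3 192.168.92.2
```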
**2. Basic HTTP Monitor:**
I added an HTTP(s) monitor for https://httpstat.us/200 (a reliable test site).
Add a new monitor
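The same endpoint can be spot-checked from the shell first, assuming curl is installed; anything other than 200 would suggest the monitor is about to fail:

```bash
# Print only the HTTP status code returned by the test endpoint
curl -s -o /dev/null -w "%{http_code}\n" https://httpstat.us/200
```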
**3. Keyword Monitor:**
I added an HTTP(s) Keyword monitor for a news site (e.g., https://www.bbc.com/news) and set it to search for the keyword "UK" in the response, verifying that the page content loads correctly.
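As a rough manual equivalent of the keyword check (results may vary if the site serves different content to non-browser clients):

```bash
# Count lines of the fetched page that contain the keyword;
# a non-zero count means the keyword monitor should pass
curl -sL https://www.bbc.com/news | grep -c "UK"
```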
**4. DNS Monitor:**
I added a DNS Record monitor to check that Google's DNS (8.8.8.8) correctly resolves google.com.
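This mirrors what the monitor does under the hood and can be reproduced by hand with dig (or nslookup):

```bash
# Query Google's public resolver directly for google.com's A record
dig @8.8.8.8 google.com A +short
```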
Dashboard Overview
My Uptime Kuma dashboard now shows a mix of internal and external monitors with different icons (globe, server, ping).
Conclusion & Success Goal Achieved
Day 4 marked a significant shift from internal-only service checks to a more comprehensive monitoring approach incorporating external endpoints, advanced HTTP validation, and DNS health checks. The initial hurdle of port conflicts and container mismanagement in the Docker deployment highlighted the importance of methodical troubleshooting, an understanding of the container lifecycle, and precision. Resolving that problem made Uptime Kuma fully accessible and operational, paving the way for more sophisticated monitoring.
Once Uptime Kuma was up and running, several external monitors were successfully set up:
- The Ping monitor verified that the gateway was reachable.
- The basic HTTP monitor verified the uptime of a known reliable endpoint.
- The Keyword monitor offered deeper insight by confirming that external content was loading properly.
- The DNS monitor confirmed that Google's DNS server was resolving names correctly.
The dashboard now shows a thorough perspective that encompasses external dependencies in addition to the lab's internal network, reflecting the actual user experience.
Success Goal Achieved:
The goal of introducing advanced checks and extending monitoring to external resources was successfully achieved. With visibility into external reliability factors as well as internal services, the environment now supports a more robust and proactive monitoring posture.