DAVID JORDAN ANAMPA PANCCA

Posted on Dec 4, 2025

Observability Practices: Implementing Real-World Monitoring With Python and Prometheus

#monitoring #devops #tutorial #python

Modern applications don’t just need to run — they need to be understood. When something goes wrong in production, teams must be able to detect issues, diagnose the root cause, and monitor the system’s behavior in real time.
This is where observability becomes essential.

In this article, I explain what observability is, why it matters, and how I implemented a real-world example using Python, Prometheus, and FastAPI. You can use this code to build your own monitoring pipeline.

What Is Observability?

Observability is the ability to understand the internal state of a system based on the data it produces.

It is built around three core pillars:

1. Metrics

Numeric values that reflect system state.
Examples: request latency, CPU usage, memory consumption.

2. Logs

Detailed event records generated by applications and systems.
Examples: authentication messages, errors, warnings.

3. Traces

End-to-end tracking of requests across services.
Useful in microservices and distributed systems.

Together, these help answer:

What is happening?
Why is it happening?
Where is it failing?
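To make the logs pillar a little more concrete, here is a minimal sketch of structured (JSON) logging using only the Python standard library. The `JsonFormatter` class, the `build_logger` helper, and the `endpoint` / `status_code` fields are illustrative names chosen for this example, not part of the API built later in the article.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object (hypothetical helper)."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Values passed through `extra=` become attributes on the record.
        for key in ("endpoint", "status_code"):
            if hasattr(record, key):
                payload[key] = getattr(record, key)
        return json.dumps(payload)


def build_logger(name: str = "api") -> logging.Logger:
    """Create a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid attaching duplicate handlers
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    return logger


logger = build_logger()
logger.info("request handled", extra={"endpoint": "/", "status_code": 200})
```

Because every event is one JSON object per line, a log pipeline can filter on fields such as `status_code` without fragile text parsing.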

Why Observability Matters

Observability helps teams:
Detect issues earlier
Reduce downtime
Improve performance
Understand user impact
Monitor applications at scale
Make data-driven decisions

Without observability, debugging becomes slow, reactive, and inconsistent.

Real-World Example: Observability With Python + Prometheus

For this example, I implemented observability on a small API using:

Python
FastAPI
Prometheus (metrics collection)
Grafana (optional dashboards)

This setup is commonly used in startups and cloud-native environments.

1. Install Dependencies

First, install the required packages:

pip install fastapi uvicorn prometheus-client

2. Python API With Prometheus Metrics

Below is a simple FastAPI application that exposes metrics at /metrics.
Prometheus will scrape this endpoint every few seconds.

from fastapi import FastAPI
from fastapi.responses import Response
from prometheus_client import CONTENT_TYPE_LATEST, Counter, Histogram, generate_latest
import random
import time

app = FastAPI()

# Counter: a value that only ever goes up; here, requests served by "/".
REQUEST_COUNT = Counter("api_requests_total", "Total number of API requests received")
# Histogram: sorts observed request durations into latency buckets.
REQUEST_LATENCY = Histogram("api_request_latency_seconds", "API request latency")

@app.get("/")
def home():
    REQUEST_COUNT.inc()
    with REQUEST_LATENCY.time():
        # Simulate 100-500 ms of work.
        time.sleep(random.uniform(0.1, 0.5))
    return {"message": "API is running successfully"}

@app.get("/metrics")
def metrics():
    # Serve every registered metric in the Prometheus text exposition format.
    return Response(generate_latest(), media_type=CONTENT_TYPE_LATEST)

What this code does:

| Metric | Description |
| --- | --- |
| api_requests_total | Counts requests handled by the / endpoint |
| api_request_latency_seconds | Measures how long each request to / takes |

These metrics help determine whether the API is fast, overloaded, or failing.
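To see what Prometheus actually receives when it scrapes, the sketch below records one request and one latency observation, then prints the text exposition output. It assumes `prometheus-client` is installed and uses a separate `CollectorRegistry` (an assumption made here so the example does not clash with metrics already registered by the API).

```python
from prometheus_client import CollectorRegistry, Counter, Histogram, generate_latest

# A private registry keeps this demo isolated from the app's default registry.
registry = CollectorRegistry()

requests_total = Counter(
    "api_requests_total",
    "Total number of API requests received",
    registry=registry,
)
request_latency = Histogram(
    "api_request_latency_seconds",
    "API request latency",
    registry=registry,
)

# Simulate a single request that took 250 ms.
requests_total.inc()
request_latency.observe(0.25)

# generate_latest returns bytes in the Prometheus text format.
exposition = generate_latest(registry).decode("utf-8")
print(exposition)
```

The printed text contains a sample line for the counter plus one `_bucket` line per histogram bucket, along with `_count` and `_sum` lines. Those are the series the queries later in this article operate on.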

3. Prometheus Configuration

Create a file named prometheus.yml:

global:
  scrape_interval: 5s

scrape_configs:
  - job_name: "python-api"
    static_configs:
      - targets: ["localhost:8000"]

Assuming the API is running on port 8000 (uvicorn's default), Prometheus will scrape the metrics endpoint at:

http://localhost:8000/metrics

4. Run Prometheus

Download Prometheus, then run it:

./prometheus --config.file=prometheus.yml

Open the Prometheus UI at:

http://localhost:9090

Query metrics like:

api_requests_total
rate(api_requests_total[1m])
api_request_latency_seconds_bucket

5. Optional: Grafana Dashboard

Grafana can visualize your Prometheus metrics with modern dashboards.

Typical graphs include:

Request rate
CPU and memory usage
Error percentage
Latency (p95, p99)

This is valuable when demonstrating observability to teams or stakeholders.

Observability Best Practices

To implement observability professionally:

✔ Instrument every major endpoint

Expose metrics for performance-critical APIs.
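One way to avoid repeating timing code in every handler is a small decorator. The sketch below is only an illustration: the `instrumented` decorator, the `api_handler_latency_seconds` metric, and the `list_orders` function are hypothetical names, and the wrapper covers synchronous functions only.

```python
import functools
import time

from prometheus_client import CollectorRegistry, Histogram

# Separate registry so the sketch can run on its own (an assumption).
registry = CollectorRegistry()

HANDLER_LATENCY = Histogram(
    "api_handler_latency_seconds",
    "Time spent inside each instrumented handler",
    labelnames=["handler"],
    registry=registry,
)


def instrumented(func):
    """Record how long each call to `func` takes, even if it raises."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            HANDLER_LATENCY.labels(handler=func.__name__).observe(elapsed)

    return wrapper


@instrumented
def list_orders():
    # Stand-in for real endpoint logic.
    return ["order-1", "order-2"]


orders = list_orders()
```

Applying the same decorator to each major endpoint yields one latency series per handler without touching the handler bodies.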

✔ Standardize metric names

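Avoid random or unstructured naming. Pick one scheme and keep to it, for example a shared prefix plus the unit as a suffix, as in api_request_latency_seconds.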

✔ Include labels (tags)

Labels such as status_code, endpoint, or method add context.
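As a rough illustration of labels, the sketch below declares a counter with `method`, `endpoint`, and `status_code` labels and increments it per request. The metric name `api_http_requests_total` and the `record_request` helper are assumptions made for this example. A separate registry is used so the snippet runs on its own.

```python
from prometheus_client import CollectorRegistry, Counter

registry = CollectorRegistry()

HTTP_REQUESTS = Counter(
    "api_http_requests_total",
    "Requests by method, endpoint, and status code",
    labelnames=["method", "endpoint", "status_code"],
    registry=registry,
)


def record_request(method: str, endpoint: str, status_code: int) -> None:
    """Increment the counter for one specific label combination."""
    HTTP_REQUESTS.labels(
        method=method,
        endpoint=endpoint,
        status_code=str(status_code),  # label values are strings
    ).inc()


record_request("GET", "/", 200)
record_request("GET", "/", 200)
record_request("GET", "/", 500)

ok_count = registry.get_sample_value(
    "api_http_requests_total",
    {"method": "GET", "endpoint": "/", "status_code": "200"},
)
```

Each distinct label combination becomes its own time series, so labels should take a small, bounded set of values. Route templates work well; raw user IDs or full URLs do not.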

✔ Use alerts

For example:
“95th percentile latency exceeds 500ms for 3 minutes.”
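To show where a number like "95th percentile latency" comes from, the sketch below estimates a quantile from cumulative histogram buckets using linear interpolation inside the matching bucket. This is roughly the approach PromQL's `histogram_quantile` takes. The `estimate_quantile` function and the bucket counts are made up for illustration, and the "for 3 minutes" part of the alert is not modelled here.

```python
import math
from typing import List, Tuple


def estimate_quantile(q: float, buckets: List[Tuple[float, float]]) -> float:
    """Estimate the q-quantile from (upper_bound, cumulative_count) pairs.

    `buckets` must be sorted by upper bound and end with the +Inf bucket.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations yet
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for upper, count in buckets:
        if count >= rank:
            if math.isinf(upper):
                # Quantile falls in the +Inf bucket: report the last finite bound.
                return prev_bound
            if count == prev_count:
                return upper
            # Interpolate linearly between the bucket's lower and upper bound.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (upper - prev_bound) * fraction
        prev_bound, prev_count = upper, count
    return prev_bound


# Hypothetical cumulative counts for api_request_latency_seconds buckets.
buckets = [(0.1, 10), (0.25, 60), (0.5, 90), (1.0, 100), (math.inf, 100)]
p95 = estimate_quantile(0.95, buckets)
should_alert = p95 > 0.5  # 500 ms threshold from the example alert
```

With these sample counts the estimate lands at about 0.75 seconds, so the 500 ms condition would be met. Because the result is interpolated, its accuracy depends on how finely the bucket boundaries are chosen around the threshold.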

✔ Visualize everything

Dashboards make patterns obvious.

✔ Combine logs, metrics, and traces

Observability works best when all three pillars are present.

Conclusion

Observability allows teams to deeply understand how their systems behave.
Using Prometheus + FastAPI, I demonstrated how to expose useful metrics that support:

Faster debugging
Better performance insights
Safer deployments
Scalable system monitoring

This example can be expanded with tracing (OpenTelemetry), log pipelines (ELK Stack), or full cloud observability platforms like AWS CloudWatch, Datadog, or Azure Monitor.

References

Prometheus Documentation – https://prometheus.io/docs
Grafana Documentation – https://grafana.com/docs
FastAPI – https://fastapi.tiangolo.com
OpenTelemetry – https://opentelemetry.io

Top comments (2)

AHMED HASAN AKHTAR OVIEDO • Dec 4 '25

Your article is clear and explains observability in a simple way. The example with Python and Prometheus is well done and works as a quick guide. You could make the ending a little shorter so it closes with more punch, but overall it came out practical and easy to follow.

JOAN CRISTIAN MEDINA QUISPE • Dec 4 '25

This article is a fantastic kickoff, giving us a super clear definition of Observability and its three musketeers: Metrics, Logs, and Traces! The author did a great job picking a relevant tech stack (Python/FastAPI and Prometheus) and the code is a great starter kit. However, it’s like throwing a pizza party and only serving one topping—we got tons of delicious Metrics, but the Logs and Traces are missing in action (MIA) in the actual code examples, which stops it from being truly "real-world" Observability. For the next iteration, I'd suggest the author stop being stingy with the labels! Adding crucial tags like endpoint to the metrics would make the code genuinely useful, and maybe sneak in a fun security tip, reminding us that the /metrics endpoint is a VIP-only party that shouldn't be publicly exposed.