Introduction
Quick recap of the previous article:
- Showed how to use OpenTelemetry & Jaeger (Docker Tutorial) to trace requests in a Spring Boot app.
- Focused on tracing to understand where requests spend time.
Motivation for this article:
- Tracing is powerful, but observability is not complete without metrics.
- We need dashboards & alerts to monitor the health of applications.
Goal: By the end, readers will have a Spring Boot app instrumented with Prometheus + Grafana running alongside Jaeger.
Why Metrics, If We Already Have Traces?
Difference between metrics and traces:
- Metrics = system health over time (e.g., request latency, error rate).
- Traces = detailed view of a single request journey.
A real-world example:
In the Jaeger UI, we see a slow DB query in a trace. That’s useful, but what if we want to know how often it happens and whether it’s getting worse over time?
Metrics + dashboards provide that context.
Together, Jaeger (traces) + Prometheus/Grafana (metrics) give a holistic observability setup.
Setting Up Jaeger
Jaeger is an open-source distributed tracing system (originally built at Uber, now a CNCF project). For setup details, please check the previous article, OpenTelemetry & Jaeger.
Setting Up Prometheus
Prometheus scrapes metrics from HTTP endpoints at regular intervals. You configure targets in prometheus.yml:
- Define scrape intervals (typically 15-30 seconds)
- Specify target endpoints where metrics are exposed
- Configure alerting rules for threshold violations
- Set up service discovery for dynamic environments
Applications expose metrics on /metrics endpoints using client libraries. Common metrics include request counts, response times, error rates, and custom business metrics.
Spring Boot exposes metrics out-of-the-box using Micrometer. By enabling the prometheus actuator endpoint, our app automatically provides a /actuator/prometheus endpoint. However, we will use OpenTelemetry instead of Micrometer for exposing Prometheus metrics. OpenTelemetry provides a more unified approach to observability (metrics, traces, and logs) and is becoming the industry standard.
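For reference, the Micrometer route is usually just a matter of adding the micrometer-registry-prometheus dependency and exposing the endpoint; a minimal sketch of the relevant configuration (assuming the standard Spring Boot Actuator setup) looks like this:
application.yml
management:
  endpoints:
    web:
      exposure:
        include: prometheus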
Prometheus Server Configuration (prometheus.yml)
global:
  scrape_interval: 5s

scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]

  - job_name: "spring-boot-app"
    static_configs:
      - targets: ["docker-demo:9464"]
You will see Prometheus in action at http://localhost:9090. With the following environment variables set on the application container:
- OTEL_EXPORTER_PROMETHEUS_PORT=9464
- OTEL_EXPORTER_PROMETHEUS_HOST=0.0.0.0
your Spring Boot app also exposes its own metrics on port 9464 at /metrics.
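You can quickly confirm the endpoint is up from your host machine (assuming the container is running and port 9464 is published):
curl -s http://localhost:9464/metrics | head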
Setting Up Grafana
Grafana is our visualization layer. It doesn’t store data itself — instead, it connects to Prometheus and queries metrics.
- Install Grafana server
- Add Prometheus data source with connection URL
- Import or create dashboards with panels showing metrics over time
- Set up alerting based on query results
- Configure user authentication and permissions
Steps:
- Run Grafana with Docker (a sample command follows this list).
- Open http://localhost:3000 (default login: admin/admin).
- Add Prometheus as a data source (http://prometheus:9090). Because Grafana and Prometheus are on the same Docker network, we use the container name, not localhost.
- Import a prebuilt dashboard or create your own panels for latency, throughput, and errors.
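For the first step, a minimal standalone Grafana container can be started like this (the image tag and port mapping are just one reasonable choice; the all-in-one Docker Compose file later in this post also starts Grafana for you):
docker run -d --name grafana -p 3000:3000 grafana/grafana-oss:latest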
Docker Compose: All-in-One Setup
When used together, these tools provide comprehensive observability:
Metrics + Traces: Grafana dashboards show high-level trends, while Jaeger provides detailed trace analysis when issues occur.
Alerting Workflow: Prometheus alerts trigger when metrics exceed thresholds; teams can then use Jaeger to investigate specific problematic requests.
Root Cause Analysis: Start with Grafana dashboards to identify when problems occurred, use Prometheus queries to narrow down affected services, then examine detailed traces in Jaeger.
Here’s a simple stack with a sample Spring Boot app + Prometheus + Grafana + Jaeger:
docker-compose.yml
version: '3.8'

services:
  # Jaeger - Tracing Backend
  jaeger:
    image: jaegertracing/all-in-one:1.51
    container_name: jaeger
    ports:
      - "16686:16686"   # Jaeger UI
      - "14250:14250"   # Jaeger gRPC
      - "4318:4318"     # OTLP HTTP
    environment:
      - COLLECTOR_OTLP_ENABLED=true
    networks:
      - app-network

  # Prometheus - Metrics Backend
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    networks:
      - app-network

  # Grafana - Visualization
  grafana:
    image: grafana/grafana-oss:latest
    container_name: grafana
    ports:
      - "3000:3000"
    environment:
      - GF_SECURITY_ADMIN_USER=admin
      - GF_SECURITY_ADMIN_PASSWORD=admin
    volumes:
      - grafana-data:/var/lib/grafana
    depends_on:
      - prometheus
    networks:
      - app-network

  # Your Spring Boot Application
  docker-demo:
    image: docker-demo:latest
    container_name: docker-demo
    ports:
      - "8080:8080"
      - "9464:9464"   # OTel Prometheus exporter
    environment:
      # OpenTelemetry Configuration
      - OTEL_SERVICE_NAME=docker-demo
      - OTEL_EXPORTER_OTLP_ENDPOINT=http://jaeger:4318
      - OTEL_TRACES_EXPORTER=otlp
      - OTEL_METRICS_EXPORTER=prometheus
      - OTEL_EXPORTER_PROMETHEUS_PORT=9464
      - OTEL_EXPORTER_PROMETHEUS_HOST=0.0.0.0
      - OTEL_LOGS_EXPORTER=none
      - OTEL_TRACES_SAMPLER=always_on   # Sample ALL traces
      - OTEL_INSTRUMENTATION_COMMON_DEFAULT_ENABLED=true
      - OTEL_INSTRUMENTATION_HTTP_ENABLED=true
      - OTEL_INSTRUMENTATION_SPRING_WEB_ENABLED=true
      - OTEL_LOG_LEVEL=DEBUG
      # Application Configuration
      - JAVA_OPTS=-javaagent:/app/opentelemetry-javaagent.jar
    networks:
      - app-network

volumes:
  grafana-data:

networks:
  app-network:
    driver: bridge
Start the stack
docker-compose up -d
Jaeger → http://localhost:16686
Grafana → http://localhost:3000 (login: admin / admin)
Prometheus → http://localhost:9090
Since you’re using the OTel Prometheus exporter, your app should expose metrics at http://localhost:9464/metrics.
Prometheus has a built-in UI to show scrape status.
Go to:
http://localhost:9090/targets
You should see a list of jobs (e.g., spring-boot-app) with:
- State = UP → Prometheus is successfully scraping your app.
- Last Scrape / Last Scrape Duration → confirms timing.
- An error message if scraping failed (bad host/port/path).
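If you prefer the command line, the same scrape-status information is available from the Prometheus HTTP API (the jq filter is optional and assumed to be installed):
curl -s http://localhost:9090/api/v1/targets | jq '.data.activeTargets[] | {job: .labels.job, health: .health}'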
Find Available Metrics
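A quick way to list every metric name Prometheus has scraped so far is its label-values API; the expression browser's autocomplete on the Graph page gives you the same information interactively:
curl -s http://localhost:9090/api/v1/label/__name__/values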
Add Jaeger as a data source in Grafana (URL: http://jaeger:16686, using the container name since Grafana and Jaeger share the same Docker network):
Explore Traces in Grafana
- Open Explore → Select Jaeger as data source
- Query traces by service name (e.g., docker-demo)
- You’ll see the same traces as in Jaeger UI, but inside Grafana.
Add Prometheus as a data source in Grafana:
Import a Prometheus dashboard from Grafana’s dashboard library (e.g., ID: 3662 for Prometheus 2.0 overview).
Now you’ve got a working Grafana + Prometheus local setup!
Create a Dashboard
There are two simple ways to create this dashboard in Grafana:
Method 1: Import via JSON (Recommended)
- Copy the dashboard JSON (provided later in this post)
- Open Grafana (http://localhost:3000)
- Go to Dashboards → Import
- Paste the JSON in the "Import via panel json" text box
- Click "Load"
- Configure data source: Select your Prometheus data source
- Click "Import"
Method 2: Manual Creation (Step-by-Step)
- Create New Dashboard
- Go to Dashboards → New Dashboard
- Click "Add visualization"
- Add panels accordingly
In this post, I’ll be using Method 1: Import via JSON (recommended) to set up the Grafana dashboard.
Generate some traffic:
# Make requests to your app
for i in {1..100}; do
curl http://localhost:8080/api/customers
done
You will see the request count in Prometheus by querying the http_server_request_duration_seconds_count metric. A few useful queries:
- http_server_request_duration_seconds_count → monotonically increasing counter (total requests since start).
- rate(http_server_request_duration_seconds_count[1m]) → request rate per second (much better for dashboards).
- sum(rate(http_server_request_duration_seconds_count[5m])) by (http_route) → requests per second, grouped by http_route.
- Average Load Time (Latency):
rate(http_server_request_duration_seconds_sum[5m]) / rate(http_server_request_duration_seconds_count[5m])
Important: The dashboard uses common OpenTelemetry metric names. If your metrics have different names, update the queries:
Find Your Actual Metrics
- Go to Prometheus (http://localhost:9090)
- Run this query: {job="spring-boot-app"}
- Note the actual metric names
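Alternatively, grep the raw exporter output directly; the http_server prefix below is just the common OpenTelemetry HTTP server metric family, so adjust it to whatever names you actually see:
curl -s http://localhost:9464/metrics | grep '^http_server'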
Let’s create a simple Grafana dashboard JSON for your Spring Boot app that includes:
- API Request Count (Throughput)
- Average Latency (Load Time)
- Error Rate
- P95 Latency (optional but very useful)
Steps to use:
- Copy the JSON into a file (e.g. otel-dashboard.json).
- In Grafana → Dashboards → Import → Upload JSON file.
- Select your Prometheus data source when asked.
{
"__inputs": [
{
"name": "DS_PROMETHEUS",
"label": "Prometheus",
"type": "datasource",
"pluginId": "prometheus",
"pluginName": "Prometheus"
}
],
"id": null,
"title": "OpenTelemetry API Monitoring Dashboard",
"tags": ["api", "opentelemetry", "prometheus"],
"timezone": "browser",
"schemaVersion": 36,
"version": 1,
"panels": [
{
"type": "timeseries",
"title": "API Request Count",
"targets": [
{
"expr": "sum(rate(http_server_request_duration_seconds_count[5m])) by (http_route)",
"legendFormat": "{{http_route}}",
"datasource": { "type": "prometheus", "uid": "${DS_PROMETHEUS}" }
}
],
"gridPos": { "x": 0, "y": 0, "w": 12, "h": 8 }
},
{
"type": "timeseries",
"title": "Average Response Time (Latency)",
"targets": [
{
"expr": "rate(http_server_request_duration_seconds_sum[5m]) / rate(http_server_request_duration_seconds_count[5m])",
"legendFormat": "avg_latency",
"datasource": { "type": "prometheus", "uid": "${DS_PROMETHEUS}" }
}
],
"fieldConfig": {
"defaults": {
"unit": "s",
"thresholds": {
"mode": "absolute",
"steps": [
{ "color": "green", "value": null },
{ "color": "orange", "value": 1 },
{ "color": "red", "value": 2 }
]
}
}
},
"gridPos": { "x": 12, "y": 0, "w": 12, "h": 8 }
},
{
"type": "timeseries",
"title": "Error Rate",
"targets": [
{
"expr": "sum(rate(http_server_request_duration_seconds_count{http_response_status_code!~\"2..\"}[5m])) / sum(rate(http_server_request_duration_seconds_count[5m]))",
"legendFormat": "error_rate",
"datasource": { "type": "prometheus", "uid": "${DS_PROMETHEUS}" }
}
],
"fieldConfig": {
"defaults": {
"unit": "percentunit",
"thresholds": {
"mode": "absolute",
"steps": [
{ "color": "green", "value": null },
{ "color": "orange", "value": 0.05 },
{ "color": "red", "value": 0.1 }
]
}
}
},
"gridPos": { "x": 0, "y": 8, "w": 12, "h": 8 }
},
{
"type": "timeseries",
"title": "95th Percentile Latency (p95)",
"targets": [
{
"expr": "histogram_quantile(0.95, sum(rate(http_server_request_duration_seconds_bucket[5m])) by (le))",
"legendFormat": "p95_latency",
"datasource": { "type": "prometheus", "uid": "${DS_PROMETHEUS}" }
}
],
"fieldConfig": {
"defaults": { "unit": "s" }
},
"gridPos": { "x": 12, "y": 8, "w": 12, "h": 8 }
}
],
"templating": { "list": [] },
"annotations": { "list": [] },
"time": { "from": "now-30m", "to": "now" }
}
🎉 Congratulations! You’ve successfully built an end-to-end monitoring workflow by integrating Spring Boot with Prometheus and Grafana. Next, link Prometheus → Jaeger for more powerful trace links.
Tempo (Grafana’s tracing backend) integrates natively with Prometheus. You can also explore alerting with Prometheus Alertmanager and Grafana to get notified about errors, high latency, or unusual traffic patterns.
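As a taste of what that looks like, here is a minimal, illustrative Prometheus alerting rule for high p95 latency (the file name, threshold, and labels are assumptions; you would also need to reference the rule file from prometheus.yml and point Prometheus at an Alertmanager to actually deliver notifications):
alert-rules.yml
groups:
  - name: latency-alerts
    rules:
      - alert: HighP95Latency
        # p95 latency across all routes over the last 5 minutes
        expr: histogram_quantile(0.95, sum(rate(http_server_request_duration_seconds_bucket[5m])) by (le)) > 2
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "p95 latency has been above 2s for 5 minutes"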
Cleanup
Stop all containers
docker stop $(docker ps -q)
Remove all containers
docker rm $(docker ps -aq)
Remove only the stopped containers
docker rm $(docker ps -q -f status=exited)
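Since the whole stack was started with Docker Compose, you can also tear it down in one step; the -v flag additionally removes the named grafana-data volume:
docker-compose down -v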
References & Credits
AI tools were used to assist in research and writing but final content was reviewed and verified by the author.