Day 5 -- Prometheus metrics collection.
The most impactful thing you can learn about Prometheus is not how to write queries. It is how to instrument your code so the queries are possible and the server stays alive.
Quick Node.js Instrumentation
const client = require('prom-client');
const express = require('express');

const app = express();

// Dedicated registry so only the metrics registered here are exposed.
const register = new client.Registry();
client.collectDefaultMetrics({ register });

// Latency histogram; every label takes its values from a small, fixed set.
const httpDuration = new client.Histogram({
  name: 'http_request_duration_seconds',
  help: 'Request duration by route',
  labelNames: ['route', 'method', 'status_class'],
  buckets: [0.01, 0.05, 0.1, 0.2, 0.3, 0.5, 1.0],
  registers: [register],
});

// Time every request; labels are filled in once the response has been sent.
app.use((req, res, next) => {
  const end = httpDuration.startTimer();
  res.on('finish', () => {
    end({
      // Use the matched route pattern, not the raw URL, so the label stays bounded.
      route: req.route?.path || 'unknown',
      method: req.method,
      status_class: `${Math.floor(res.statusCode / 100)}xx`,
    });
  });
  next();
});

// Scrape endpoint for Prometheus.
app.get('/metrics', async (req, res) => {
  res.set('Content-Type', register.contentType);
  res.end(await register.metrics());
});

app.listen(3000);
The Cardinality Rule
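Every unique combination of label values is a separate time series held in memory, so the series count grows with the product of each label's distinct values. Use only labels whose values come from a small, bounded set. Never use user IDs, IP addresses, or request IDs as label values.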
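To make the multiplication concrete, here is a rough sketch of a worst-case series estimate for a single histogram. The function name and the label counts (20 routes, 5 methods, 5 status classes, 10,000 users) are made up for illustration; the only fixed fact is that each label combination of a histogram exposes one series per finite bucket, plus the +Inf bucket, plus _sum and _count.

```javascript
// Rough worst-case series estimate for one histogram.
// Each label combination produces one series per finite bucket,
// plus the +Inf bucket, plus _sum and _count (hence bucketCount + 3).
function estimateHistogramSeries(labelCardinalities, bucketCount) {
  const combinations = Object.values(labelCardinalities)
    .reduce((product, n) => product * n, 1);
  const seriesPerCombination = bucketCount + 3;
  return combinations * seriesPerCombination;
}

// Hypothetical service: 20 routes, 5 methods, 5 status classes, 7 buckets.
const bounded = estimateHistogramSeries(
  { route: 20, method: 5, status_class: 5 },
  7,
);

// Same histogram with a hypothetical user_id label of 10,000 distinct values.
const unbounded = estimateHistogramSeries(
  { route: 20, method: 5, status_class: 5, user_id: 10000 },
  7,
);

console.log(bounded);   // 5000
console.log(unbounded); // 50000000
```

Real traffic rarely hits every combination, so treat this as an upper bound. The point is the shape of the growth: one unbounded label turns thousands of series into tens of millions.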
Monitor your own cardinality:
- prometheus_tsdb_head_series -- total active series
- scrape_series_added -- new series per scrape
Alert when either spikes after a deploy.
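If you want a quick check in a deploy script before wiring up a real alert, something along these lines might work. It is only a sketch: it assumes Prometheus scrapes itself (otherwise prometheus_tsdb_head_series will not be queryable), that the base URL you pass in is reachable, and that Node.js 18+ is available for the global fetch. The helper names and the 20% growth threshold are arbitrary choices, not recommendations.

```javascript
// Extract the numeric sample from an instant-query response shaped like
// { status: 'success', data: { resultType: 'vector',
//   result: [{ metric: {...}, value: [timestamp, '123'] }] } }.
// Only the first sample is used; with several Prometheus servers scraped,
// you would want to pick one by its labels instead.
function firstSampleValue(response) {
  if (response.status !== 'success') {
    throw new Error('Prometheus query failed');
  }
  const [sample] = response.data.result;
  if (!sample) return null;
  return Number(sample.value[1]);
}

// Flag a spike when the series count grew by more than maxGrowth (0.2 = 20%).
function isCardinalitySpike(before, after, maxGrowth = 0.2) {
  if (before <= 0) return after > 0;
  return (after - before) / before > maxGrowth;
}

// Query the current head series count through the HTTP API.
// baseUrl is an assumption, e.g. 'http://localhost:9090'.
async function fetchHeadSeries(baseUrl) {
  const url = new URL('/api/v1/query', baseUrl);
  url.searchParams.set('query', 'prometheus_tsdb_head_series');
  const res = await fetch(url);
  return firstSampleValue(await res.json());
}
```

Call fetchHeadSeries once before the deploy and once a few scrape intervals after it, then pass both numbers to isCardinalitySpike.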
Bucket Selection
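Align histogram buckets to your SLO boundaries. The default client-library buckets jump from 100ms to 250ms, which is useless for a 200ms SLO target: Prometheus interpolates within a bucket, so it cannot tell a 190ms request from a 240ms one. Dense coverage near your threshold gives more accurate percentile estimates.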
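The effect is easy to see with a simplified version of the interpolation. The sketch below is not the actual histogram_quantile implementation (edge cases such as the +Inf bucket are left out); it only applies linear interpolation over cumulative bucket counts. The input data is invented: 100 requests that all took 190ms, recorded once with a 100ms/250ms bucket pair and once with buckets aligned to a 200ms SLO.

```javascript
// Simplified linear interpolation over cumulative bucket counts,
// in the spirit of PromQL's histogram_quantile (edge cases omitted).
// buckets: [{ le: upperBound, count: cumulativeCount }, ...] sorted by le.
function estimateQuantile(q, buckets) {
  const total = buckets[buckets.length - 1].count;
  if (total === 0) return NaN;
  const rank = q * total;
  let lowerBound = 0;
  let lowerCount = 0;
  for (const { le, count } of buckets) {
    // The quantile falls in the first non-empty bucket that reaches the rank.
    if (count >= rank && count > lowerCount) {
      const inBucket = count - lowerCount;
      return lowerBound + (le - lowerBound) * ((rank - lowerCount) / inBucket);
    }
    lowerBound = le;
    lowerCount = count;
  }
  return lowerBound;
}

// Invented data: 100 requests, every one of them took 190ms.
const defaultStyle = [
  { le: 0.1, count: 0 },
  { le: 0.25, count: 100 },
  { le: 0.5, count: 100 },
];
const sloAligned = [
  { le: 0.1, count: 0 },
  { le: 0.2, count: 100 },
  { le: 0.3, count: 100 },
];

const p95Default = estimateQuantile(0.95, defaultStyle);
const p95Aligned = estimateQuantile(0.95, sloAligned);

console.log(p95Default.toFixed(4)); // about 0.2425, looks like an SLO breach
console.log(p95Aligned.toFixed(4)); // about 0.1950, within the 200ms SLO
```

Same requests, two different answers: the wide bucket reports a p95 above 200ms even though no request was that slow, while the aligned buckets keep the estimate on the correct side of the threshold.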