This post is about a handy tool for analyzing backend call time. The code that makes the backend calls and the monitoring setup are described in the previous post.
A Grafana panel can do more than plot line graphs. It can also:
- show the last reading of a metric
- show a table of metric values
- show bar plots
- show heatmaps (a histogram over time)
A heatmap helps to quickly understand the distribution of backend response time: it may be that most requests complete in under 50 msec, but some requests are slow and take more than 500 msec. The average request time doesn't show this. The previous examples plotted only the average.
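To see why the average alone is misleading, here is a minimal sketch with made-up numbers (the response times below are hypothetical, not taken from the monitored backends). It compares the average with a simple count of fast and slow requests.

```python
from statistics import mean

# Hypothetical response times in msec: 95 fast requests and 5 slow ones.
response_times_ms = [40] * 95 + [600] * 5

average_ms = mean(response_times_ms)

# Count requests on both sides of the distribution.
fast = sum(1 for t in response_times_ms if t < 50)
slow = sum(1 for t in response_times_ms if t > 500)

print(f"average: {average_ms} msec")
print(f"under 50 msec: {fast}, over 500 msec: {slow}")
```

The average lands between the two groups, at a value that no single request actually took. The two counts are what a heatmap makes visible.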
To build one, add a new panel, pick the measurement details, and select "Heatmap" in the "Visualization" collapsible section in the right column.
Every 10 seconds, a new set of bricks appears on the panel. Brick color represents how many measurements fall into that bucket (e.g. 5 fall in the 10 msec - 20 msec range, hence that brick is pink). Set a fixed bucket size, fix the number of buckets, or let the default values do their magic.
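The idea behind a single column of bricks can be sketched in a few lines of Python. This is only an illustration of fixed-size bucketing, not how Grafana implements it; the function name `bucket_counts` and the sample values are made up for the example.

```python
from collections import Counter


def bucket_counts(values_ms, bucket_size_ms):
    """Count how many values fall into each fixed-size bucket [lower, upper)."""
    counts = Counter()
    for v in values_ms:
        lower = int(v // bucket_size_ms) * bucket_size_ms
        counts[(lower, lower + bucket_size_ms)] += 1
    return dict(counts)


# Hypothetical measurements collected during one 10-second interval.
interval_values_ms = [12, 15, 11, 18, 19, 25, 47]

column = bucket_counts(interval_values_ms, bucket_size_ms=10)
for (lower, upper), count in sorted(column.items()):
    print(f"{lower}-{upper} msec: {count}")
```

Each key is one brick and the count decides its color. Here the 10 msec - 20 msec brick holds 5 measurements.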
If Telegraf sends all metrics data to InfluxDB, that's a real heatmap. However, Telegraf is often configured to send only aggregated values (min, avg, max) calculated over a short period of time (10 sec) in order to reduce metrics reporting traffic. A heatmap based on such aggregated values is not a real heatmap. Telegraf's histogram aggregator can compute the bucket counts on its side instead:
[[aggregators.histogram]]
  period = "30s"
  drop_original = false
  reset = true
  cumulative = false
  [[aggregators.histogram.config]]
    buckets = [1.0, 10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 30.0, 40.0]
    measurement_name = "aiohttp-request-exec-time"
    fields = ["value"]
Note reset = true and cumulative = false, which cause bucket values to be calculated anew for each 30-second period. The value ranges (buckets) have to be set manually, and the correct measurement_name has to be specified. If fields is not specified, histogram buckets are computed for all fields of the measurement. Here's how bucket values appear in InfluxDB:
The number of request execution times that fall in a bucket is saved under the "value_bucket" field name. "gt" ("greater than") and "le" ("less than or equal to") are the bucket edge values, and they appear as tags.
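To make the meaning of "gt", "le" and "value_bucket" concrete, here is a rough Python sketch of non-cumulative bucketing over the same bucket edges as in the config. It is not Telegraf's code: the function name `histogram_rows` and the sample values are invented, and the open-ended first and last buckets are an assumption of this sketch.

```python
import math

# Same bucket edges as in the aggregator config above.
BUCKETS = [1.0, 10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 30.0, 40.0]


def histogram_rows(values, buckets=BUCKETS):
    """Return one row per bucket, counting values with gt < value <= le."""
    # Assumption: values outside the configured edges go to open-ended buckets.
    edges = [-math.inf] + list(buckets) + [math.inf]
    rows = []
    for gt, le in zip(edges, edges[1:]):
        count = sum(1 for v in values if gt < v <= le)
        rows.append({"gt": gt, "le": le, "value_bucket": count})
    return rows


# Hypothetical request execution times collected during one 30-second period.
rows = histogram_rows([9.5, 11.0, 11.5, 13.2, 25.0, 55.0])
for row in rows:
    print(row)
```

Every value is counted in exactly one bucket, which is what non-cumulative means here: the counts of all rows add up to the number of requests in the period.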
Let's create 2 separate panels, one for python.org stats and one for mozilla.org (add 'where domain = python.org' in the query editor).
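For reference, the per-domain filter could look roughly like this when written as raw InfluxQL. This is a sketch under assumptions: summing "value_bucket" and grouping by the "le" tag is one possible way to feed pre-computed buckets to a panel, not necessarily the exact query used for the panels here. The helper name `bucket_query` is made up, and `$timeFilter` is the Grafana macro for the dashboard time range.

```python
# Assumed query shape: one series per "le" bucket, filtered by the "domain" tag.
QUERY_TEMPLATE = (
    'SELECT sum("value_bucket") FROM "aiohttp-request-exec-time" '
    "WHERE \"domain\" = '{domain}' AND $timeFilter "
    'GROUP BY time(30s), "le"'
)


def bucket_query(domain):
    """Build the InfluxQL text for the histogram buckets of one domain."""
    return QUERY_TEMPLATE.format(domain=domain)


# One query per panel.
for domain in ("python.org", "mozilla.org"):
    print(bucket_query(domain))
```

Note that InfluxQL expects tag values in single quotes and identifiers in double quotes, so the domain name has to be quoted as 'python.org'.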