Grzegorz Piechnik
Why standard deviation is important in performance tests

Many performance testers focus on metrics such as average response time, median, and percentiles to diagnose potential performance problems. As a result, "side" metrics such as standard deviation are often overlooked. That is a mistake, because the devil is in the details. So what is standard deviation?

What is standard deviation

The mathematical formula for standard deviation is as follows:

s = √(s^2) = √(Σ(x - X̄)^2 / n)

Blah blah blah... Let's skip the math and get to the specifics.

Why does standard deviation affect performance analysis?

A small standard deviation means that the data is highly concentrated around the mean, while a large standard deviation indicates a greater dispersion of values around the mean. This is useful when comparing data distributions. For us, as performance testers, a large standard deviation means that there is a sizable discrepancy between the results obtained.

For example, if the standard deviation of response times is high, individual response times differ widely from the average: some requests are served quickly while others take much longer. This can indicate potential performance problems in the application.
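To make this concrete, here is a minimal sketch using Python's standard `statistics` module. The two series are hypothetical response times invented for illustration, not data from a real test. They share the same average, but only the standard deviation shows that one of them contains a slow outlier.

```python
from statistics import mean, pstdev

# Hypothetical response times in seconds (invented for illustration only).
steady = [1, 2, 2, 2, 3]
spiky = [1, 1, 1, 1, 6]

for name, times in (("steady", steady), ("spiky", spiky)):
    # pstdev divides by n (population standard deviation).
    print(f"{name}: average={mean(times):.1f}s, std dev={pstdev(times):.2f}s")
```

Both series average 2 seconds, yet the standard deviation of the second one is roughly three times larger.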

How to calculate the standard deviation

Now that we know why it matters for performance analysis, let's take a plain, step-by-step look at how to calculate the standard deviation.

  1. Calculate the average value of the data by summing all the values and dividing by the number of elements in the data set (n).
  2. For each data value, calculate the difference between that value and the mean.
  3. Square each difference.
  4. Calculate the sum of the squared differences.
  5. Divide the sum of the squared differences by the number of elements in the data set (n), as in the formula above. This is the variance. (Dividing by n-1 instead gives the sample variance, which produces slightly larger values.)
  6. Calculate the square root of the variance. This is the standard deviation.
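The steps above can be sketched in a few lines of Python. This is an illustrative helper, not part of any testing tool: the function name `standard_deviation` and its `sample` flag are assumptions made for this example. By default it divides by n, as in the formula given earlier; passing `sample=True` divides by n-1 instead.

```python
import math


def standard_deviation(values, sample=False):
    """Hypothetical helper: standard deviation computed step by step."""
    n = len(values)
    if n == 0 or (sample and n < 2):
        raise ValueError("not enough values")
    # Step 1: average of the data set.
    average = sum(values) / n
    # Steps 2 and 3: difference from the average, squared, for each value.
    squared_diffs = [(x - average) ** 2 for x in values]
    # Step 4: sum of the squared differences.
    sum_of_squares = sum(squared_diffs)
    # Step 5: variance (divide by n, or by n - 1 for the sample variant).
    variance = sum_of_squares / (n - 1 if sample else n)
    # Step 6: square root of the variance.
    return math.sqrt(variance)


register_times = [2, 5, 1, 3, 29]  # seconds
print(round(standard_deviation(register_times), 2))               # 10.58
print(round(standard_deviation(register_times, sample=True), 2))  # 11.83
```

With only five measurements, the choice of divisor changes the result noticeably, so it is worth checking which variant your reporting tool uses.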

To simplify the calculation, let's assume that our test results look as follows:

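| Step | RT 1 | RT 2 | RT 3 | RT 4 | RT 5 | Average | 90th percentile |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Login | 3 | 4 | 1 | 3 | 2 | 2.6 | 4 |
| Register | 2 | 5 | 1 | 3 | 29 | 8 | 29 |
| Blog | 1 | 4 | 3 | 2 | 1 | 2.2 | 4 |
| Check post | 3 | 2 | 3 | 3 | 3 | 2.8 | 3 |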

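RT means response time, in seconds. With the standard deviation calculated for each step, the table looks as follows:

| Step | RT 1 | RT 2 | RT 3 | RT 4 | RT 5 | Average | 90th percentile | Standard deviation |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Login | 3 | 4 | 1 | 3 | 2 | 2.6 | 4 | 1.02 |
| Register | 2 | 5 | 1 | 3 | 29 | 8 | 29 | 10.58 |
| Blog | 1 | 4 | 3 | 2 | 1 | 2.2 | 4 | 1.17 |
| Check post | 3 | 2 | 3 | 3 | 3 | 2.8 | 3 | 0.4 |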

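Looking at the standard deviations in the table above, the Register step stands out: its value of 10.58 seconds is around nine times the next largest value, driven by a single 29-second response. We can conclude that potential performance problems occur at the Register step.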
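The same analysis can be automated. The sketch below uses `statistics.pstdev` from the Python standard library, which divides by n, and ranks the steps by spread. The `summarize` function and the `response_times` dictionary are names chosen for this example only.

```python
from statistics import mean, pstdev

# Response times in seconds, copied from the table above.
response_times = {
    "Login": [3, 4, 1, 3, 2],
    "Register": [2, 5, 1, 3, 29],
    "Blog": [1, 4, 3, 2, 1],
    "Check post": [3, 2, 3, 3, 3],
}


def summarize(results):
    """Return (step, average, standard deviation) rows, widest spread first."""
    rows = [(step, mean(times), pstdev(times)) for step, times in results.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


for step, average, deviation in summarize(response_times):
    print(f"{step:<12} average={average:.1f}s  std dev={deviation:.2f}s")
```

Sorting by standard deviation puts Register at the top of the list, which makes it the first step to investigate.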

Summary

Standard deviation is a mathematical measure that can quickly signal potential performance problems in an application. We can apply it to response times as well as to other metrics.
