Vitaly Bicov

Posted on Jul 27

CPU in Linux. Load Average

#devops

Load Average is a key Linux metric for assessing overall system load. It represents the average number of processes that are running or waiting in the run queue, reported over 1-, 5-, and 15-minute intervals. Compared with CPU utilization alone, Load Average gives system administrators a deeper understanding of the current load.
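On Linux the three values are exposed through /proc/loadavg, alongside a runnable/total count of scheduling entities and the most recently assigned PID. The sketch below parses one line in that format; the names `LoadAvg`, `parse_loadavg`, and `read_loadavg` are made up for illustration, and the sample line is invented.

```python
from typing import NamedTuple


class LoadAvg(NamedTuple):
    one: float      # 1-minute load average
    five: float     # 5-minute load average
    fifteen: float  # 15-minute load average
    runnable: int   # currently runnable scheduling entities
    total: int      # scheduling entities that currently exist
    last_pid: int   # PID of the most recently created process


def parse_loadavg(text: str) -> LoadAvg:
    """Parse one line in the /proc/loadavg format."""
    one, five, fifteen, entities, last_pid = text.split()
    runnable, total = entities.split("/")
    return LoadAvg(float(one), float(five), float(fifteen),
                   int(runnable), int(total), int(last_pid))


def read_loadavg(path: str = "/proc/loadavg") -> LoadAvg:
    """Read the live values (Linux only; not called in this sketch)."""
    with open(path) as f:
        return parse_loadavg(f.read())


# Invented sample line, used so the example runs on any platform.
sample = parse_loadavg("0.20 0.18 0.12 1/80 11206")
print(sample.one, sample.five, sample.fifteen)
```

In a script, Python's standard `os.getloadavg()` returns the same three averages as a tuple without touching /proc directly.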

The Evolution of Load Average

This measurement was not always so versatile. Before 1993, it reflected only CPU load (as on other Unix systems of the time) and did not account for other resource demands. That changed with a patch dated Friday, October 29, 1993, whose author wrote:

“The kernel only counts ‘runnable’ processes when computing the load average. I don’t like that; the problem is that processes which are swapping or waiting on ‘fast,’ i.e. noninterruptible, I/O, also consume resources. It seems somewhat nonintuitive that the load average goes down when you replace your fast swap disk with a slow swap disk… Anyway, the following patch seems to make the load average much more consistent WRT the subjective speed of the system. And, most important, the load is still zero when nobody is doing anything.”

The Big Takeaway

While the exact evolution of the code after that patch hasn’t been fully explored here, the crucial point remains: from that moment on, people began thinking of Load Average not merely as CPU load but as an indicator of overall system load.

How Load Average is Calculated

From the quotation above, or from descriptions on various websites, “average” (in the context of Load Average) might appear to be a simple arithmetic mean of values over a given period. In reality, Linux calculates Load Average as an exponential moving average (EMA).

This approach gives recent changes in system load greater weight than older data. As a result, the value is both responsive and stable, which makes it especially useful for monitoring system performance.

The Load Average Formula

$$LA_{new} = LA_{old} \cdot e^{-\Delta t / \tau} + n \cdot \left(1 - e^{-\Delta t / \tau}\right)$$

where:

• $LA_{new}$ is the new Load Average value.

• $LA_{old}$ is the previous Load Average value.

• $n$ is the current number of processes in the run queue (running + waiting).

• $\Delta t$ is the time elapsed since the last update.

• $\tau$ (tau) is a time constant (different for the 1-, 5-, and 15-minute averages):

• 1 minute: $\tau = 60$ seconds

• 5 minutes: $\tau = 300$ seconds

• 15 minutes: $\tau = 900$ seconds

Roughly every 5 seconds (the kernel's LOAD_FREQ interval), the Linux kernel updates the Load Average with a smoothing formula. Each of the three metrics (1, 5, and 15 minutes) has its own decay coefficient.
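A minimal floating-point sketch of one update step, assuming the usual exponentially damped form: new = previous × decay + n × (1 − decay), with decay = exp(−Δt/τ). The function names, the dictionary keys, and the 5-second interval in the usage line are illustrative assumptions; the kernel itself works in fixed-point arithmetic.

```python
import math

# Time constants in seconds for the three averages.
TAU_SECONDS = {"1min": 60.0, "5min": 300.0, "15min": 900.0}


def update_load(prev: float, n: float, dt: float, tau: float) -> float:
    """One smoothing step: blend the previous value with the current count n."""
    decay = math.exp(-dt / tau)
    return prev * decay + n * (1.0 - decay)


def update_all(prev: dict, n: float, dt: float) -> dict:
    """Apply the same step to all three averages, each with its own tau."""
    return {name: update_load(prev[name], n, dt, tau)
            for name, tau in TAU_SECONDS.items()}


# Idle system, then 4 runnable processes observed at one update
# (dt = 5 s is an assumed update interval).
loads = {"1min": 0.0, "5min": 0.0, "15min": 0.0}
loads = update_all(loads, n=4, dt=5.0)
print(loads)
```

After a single step the 1-minute value has moved furthest toward 4 and the 15-minute value the least, which is why the three numbers react at different speeds.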

Decay Factor for Updates

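$$\text{decay} = e^{-\Delta t / \tau}$$

where $\Delta t$ is the time since the last update and $\tau$ is the time constant of the average being updated.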

This formula makes sudden changes in the system less sharp. For instance, when you start a heavy workload, the Load Average does not jump right up to the highest point but rises slowly instead.
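To see what the decay coefficients look like in practice, the sketch below evaluates exp(−Δt/τ) for the three time constants, assuming a 5-second update interval. It also scales them to 11-bit fixed point, which should reproduce the constants EXP_1 = 1884, EXP_5 = 2014, and EXP_15 = 2037 (with FIXED_1 = 2048) defined in the kernel's include/linux/sched/loadavg.h. The helper names `decay_factor` and `fixed_point` are made up for this example.

```python
import math

FSHIFT = 11                # bits of fractional precision
FIXED_1 = 1 << FSHIFT      # 2048 represents 1.0 in fixed point
UPDATE_INTERVAL = 5.0      # assumed seconds between updates


def decay_factor(tau: float, dt: float = UPDATE_INTERVAL) -> float:
    """Fraction of the previous value kept after one update."""
    return math.exp(-dt / tau)


def fixed_point(value: float) -> int:
    """Scale a value in [0, 1] to the 11-bit fixed-point representation."""
    return round(value * FIXED_1)


for label, tau in (("1 min", 60.0), ("5 min", 300.0), ("15 min", 900.0)):
    d = decay_factor(tau)
    print(f"{label:>6}: decay={d:.4f}  fixed-point={fixed_point(d)}")
```

The closer the coefficient is to 1, the more of the old value survives each update, so the 15-minute average changes the slowest.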

Example Calculation

Let’s assume a known current 1-minute LA and that, at some moment, four processes appear in the queue.

The new Load Average is calculated by applying the formula at each update: the result of one step becomes the previous value for the next step, and so on.
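The repeated calculation can be sketched as a short simulation. The starting value of 0, the 5-second update interval, and the 12 steps are assumptions chosen purely for illustration, and `simulate` is a hypothetical helper name.

```python
import math


def simulate(start: float, n: float, tau: float, dt: float, steps: int) -> list:
    """Apply the smoothing step repeatedly with a constant queue length n."""
    decay = math.exp(-dt / tau)
    values = []
    la = start
    for _ in range(steps):
        la = la * decay + n * (1.0 - decay)  # previous result feeds the next step
        values.append(la)
    return values


# 1-minute average (tau = 60 s), idle start, 4 processes in the queue,
# 12 updates of 5 s each = one minute of sustained load.
trace = simulate(start=0.0, n=4, tau=60.0, dt=5.0, steps=12)
for i, value in enumerate(trace, start=1):
    print(f"t={i * 5:>3}s  LA={value:.2f}")
```

Under these assumptions the first update lands at about 0.32 and the value after one minute is about 2.53, still well short of 4.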

The new Load Average does not go directly to 4; it rises slowly. This lets the metric reflect the current system state without overreacting to short spikes. Mathematically, each of the three values (1, 5, and 15 minutes) is a weighted average of the entire load history since boot, with weights that decay exponentially at different speeds. As a result, the 1-minute average takes about 63% of its value from the last minute (since $1 - e^{-1} \approx 0.63$) and about 37% from everything before it. The same 63%/37% split holds for the 5- and 15-minute averages over their respective windows. So it is not strictly true that the 1-minute average covers just the last 60 seconds of activity, because 37% of it comes from the more distant past. It is accurate to say that it mainly reflects the last minute.
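The 63%/37% split can be derived in a few lines. This is a sketch under two simplifying assumptions: the average starts at 0, and the queue length $n$ stays constant for a time $T$. Applying the update $k$ times with interval $\Delta t$ (so $T = k\,\Delta t$) multiplies the old contribution by $e^{-\Delta t/\tau}$ each time:

$$LA(T) = n\left(1 - \left(e^{-\Delta t/\tau}\right)^{k}\right) = n\left(1 - e^{-T/\tau}\right)$$

Setting $T = \tau$ (one full window) gives

$$LA(\tau) = n\left(1 - e^{-1}\right) \approx 0.632\,n$$

so load from the most recent window accounts for about 63% of the value, and the remaining $e^{-1} \approx 0.368$ is the weight still carried by older history.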


Practical Use

If Load Average < Number of cores, the system has spare capacity and is running normally.
If Load Average ≈ Number of cores, the CPU is close to 100% utilization (assuming the load is CPU-bound rather than waiting on I/O).
If Load Average > Number of cores, processes are waiting for CPU time, indicating potential performance problems.

For example, on an 8-core server:

• LA(5) = 4 → CPU is about 50% utilized.

• LA(5) = 12 → CPU is overloaded; processes are waiting.
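The rule of thumb above can be sketched as a small helper that divides the load by the core count. The function names, the status labels, and the 10% tolerance band around 1.0 are arbitrary choices for illustration, not standard thresholds.

```python
import os


def load_per_core(load: float, cores: int) -> float:
    """Normalize a load average by the number of logical cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return load / cores


def classify(load: float, cores: int, tolerance: float = 0.1) -> str:
    """Map a load average to a rough status (tolerance is a made-up band)."""
    ratio = load_per_core(load, cores)
    if ratio < 1.0 - tolerance:
        return "headroom"
    if ratio <= 1.0 + tolerance:
        return "saturated"
    return "overloaded"


def current_status() -> str:
    """Classify the live 5-minute average (Unix only; not called here)."""
    _, five, _ = os.getloadavg()
    return classify(five, os.cpu_count() or 1)


# The 8-core examples from the text.
print(classify(4, 8), classify(12, 8))
```

Because Load Average on Linux also counts tasks blocked on I/O, a ratio above 1 is a prompt to investigate, not proof that the CPU itself is the bottleneck.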

Why Is Load Average Better Than CPU Utilization Percentage?

1. Processes in the Queue. Load Average includes both processes active on the CPU and those waiting in the queue. CPU utilization percentage shows only current CPU activity and ignores queued processes, which can lead to underestimating the real load.
2. Overall Load Measure. Load Average is a single system-wide number covering all CPU cores. On systems with multiple processors or hyperthreading, it represents total demand better than CPU utilization, which can overlook queued processes.
3. I/O Factors. Load Average takes into account processes in “uninterruptible sleep” (for example, waiting for I/O). These processes can greatly affect performance even when they use no CPU time, which you would miss by looking at CPU utilization alone.
4. Analyzing Trends. Load Average provides data for three timeframes (1, 5, and 15 minutes), making it simpler to observe changes over time and spot possible problems. CPU utilization percentage, by contrast, is an instantaneous snapshot with no built-in trend.
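To illustrate points 1 and 3, the sketch below counts tasks in the R (running or runnable) and D (uninterruptible sleep) states from lines in the /proc/[pid]/stat format, where the state letter follows the parenthesized command name. The sample lines are shortened and invented, the helper names are hypothetical, and the count only approximates what the kernel samples (the kernel tracks individual tasks, including threads).

```python
from collections import Counter


def parse_state(stat_line: str) -> str:
    """Return the state letter from a /proc/[pid]/stat style line."""
    # The command name may contain spaces or parentheses,
    # so split after the LAST closing parenthesis.
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def instantaneous_load(stat_lines) -> int:
    """Count tasks that are runnable (R) or in uninterruptible sleep (D)."""
    states = Counter(parse_state(line) for line in stat_lines)
    return states["R"] + states["D"]


# Invented, truncated sample lines (real lines carry many more fields).
samples = [
    "101 (nginx) S 1 101 101 0 -1",
    "202 (stress) R 1 202 202 0 -1",
    "303 (dd) D 1 303 303 0 -1",
    "404 (my (odd) name) R 1 404 404 0 -1",
]
print(instantaneous_load(samples))
```

Here the sleeping nginx task is ignored, while the task blocked on I/O counts toward the load even though it consumes no CPU time.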


About Hyperthreading

With hyperthreading, each physical core can handle two instruction streams, and Linux treats each stream as a separate logical core. A system with 4 physical cores and hyperthreading therefore appears to have 8 logical cores. A Load Average of 8.0 on such a system means all logical cores are fully used, but it does not mean the system is twice as fast as it would be with 4 cores and no hyperthreading. As Load Average rises, performance degrades sooner on hyperthreaded systems than on systems without hyperthreading. Our next article will cover how hyperthreading works, along with its benefits, drawbacks, and overall effects.

Conclusion

Load Average gives richer insight into system performance than CPU utilization alone. It covers the current CPU workload, processes waiting to run, and additional factors such as tasks blocked on I/O. This makes it an essential monitoring and management tool for Linux systems, especially under heavy load or on servers with multiple cores and hyperthreading.

A comprehensive history of this metric can be found in Brendan Gregg’s article and in its translation previously published on Habr. If you want to understand exactly how the metric works, you are welcome to explore the kernel code directly!