The introductory article defined the performance envelope, and this one looks at how changing the performance envelope for a CPU-heavy application affects its performance.
- The endpoint under test
- Running tests
- Results
- Can a CPU-heavy application perform better with more CPU resources?
The endpoint under test
Our mock application is built in Flask and has several REST API endpoints, one of which is /cpu_intensive, which simulates a CPU-intensive task.
When this endpoint is invoked, the application calculates the square root of 64 * 64 * 64 * 64 * 64 * 64 ** 64
and returns the result.
$ http http://ALB.eu-central-1.elb.amazonaws.com/cpu_intensive
HTTP/1.1 200 OK
Connection: close
Content-Length: 36
Content-Type: application/json
Date: Sun, 26 Nov 2023 15:52:46 GMT
Server: Werkzeug/2.3.7 Python/3.9.6
{
"result": "2.0568806966515076e+62"
}
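The returned value is easy to reproduce locally; here is a quick sanity check of the endpoint's calculation in plain Python (no Flask required):

```python
import math

# The calculation performed by /cpu_intensive:
# the square root of 64 * 64 * 64 * 64 * 64 * 64 ** 64
result = math.sqrt(64 * 64 * 64 * 64 * 64 * 64 ** 64)

print(result)  # 2.0568806966515076e+62, matching the API response
```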
Running tests
It is better to be roughly right than precisely wrong.
— Alan Greenspan
To load-test this application, I used hey to invoke the endpoint at 5 requests per second for 30 minutes: hey -z 30m -q 1 -c 5 $URL/cpu_intensive (5 concurrent workers, each limited to 1 request per second).
To be able to compare results, I ran the same application in three containers, each with different hardware constraints:
| | CPUs | Memory (GB) |
| --- | --- | --- |
| Container 1 | 0.25 | 0.5 |
| Container 2 | 0.5 | 1.0 |
| Container 3 | 1.0 | 2.0 |
Results
Container 1 (0.25CPU)
As expected, Container 1 performed the worst, averaging 3.13 requests per second. Containers 2 and 3 were both able to serve 4.99 requests per second.
One of the graphs from Nathan's article shows a CPU load peaking and staying at 100% for the duration of the load test. I was able to achieve the same results with container 1 in my test.
Container 1 clearly on its knees with average CPU utilization at 100% for the duration of the test:
In this graph you can see CPU and memory utilization over time as the load test ramps up. The CPU metric is much higher than the memory metric, and it flattens out around 100%.
This means that the application ran out of CPU resource first. The workload is primarily CPU bound. This is quite normal, as most workloads run out of CPU before they run out of memory. As the application runs out of CPU, the quality of the service suffers before it actually runs out of memory.
This tells us one micro-optimization we might be able to make: modify the performance envelope to add a bit more CPU and a bit less memory. Source
Container 2 (0.5CPU)
Container 2 has double the amount of CPU and delivers the expected performance of 5 requests per second with an average CPU utilization around 90%:
Container 3 (1CPU)
Doubling the amount of CPU again, container 3 delivers the expected performance with average CPU utilization around 35%:
We could even say that container 3, with 1CPU and 2GB of memory, is over-provisioned. In dollar amounts, it would cost $41 per month to run. On the other hand, container 2 would cost $20 while delivering the same baseline performance of 5 requests per second.
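As a rough sanity check on those dollar figures, they line up with on-demand Fargate-style pricing of roughly $0.04656 per vCPU-hour and $0.00511 per GB-hour (the rates and the 730-hour month below are my assumptions, not from the article):

```python
# Assumed per-hour rates (roughly AWS Fargate, eu-central-1); not from the article
VCPU_HOUR = 0.04656
GB_HOUR = 0.00511
HOURS_PER_MONTH = 730

def monthly_cost(vcpus, memory_gb):
    """Approximate monthly cost for a given performance envelope."""
    return (vcpus * VCPU_HOUR + memory_gb * GB_HOUR) * HOURS_PER_MONTH

print(round(monthly_cost(0.5, 1.0)))  # ~21, close to the article's $20
print(round(monthly_cost(1.0, 2.0)))  # ~41, the article's $41
```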
Can a CPU-heavy application perform better with more CPU resources?
As expected, yes. Increasing the amount of CPU from 0.25 to 0.5 allows the application container to deliver the expected performance of 5 requests per second while doing a CPU-heavy calculation.
Going from 0.5CPU to 1CPU doesn't add any measurable benefit at 5 requests per second, but it would allow the application to respond more quickly and scale to more requests per second.
Looking at hey's output in more detail, we can see that container 3 had response times almost 3 times faster than those from container 2.
| | CPUs | Memory (GB) | Requests/sec | Avg. response time (sec) |
| --- | --- | --- | --- | --- |
| Container 1 | 0.25 | 0.5 | 3.1384 | 1.5909 |
| Container 2 | 0.5 | 1.0 | 4.9974 | 0.8514 |
| Container 3 | 1.0 | 2.0 | 4.9990 | 0.3217 |
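The "almost 3 times faster" claim follows directly from those average response times:

```python
# Average response times measured by hey (seconds)
container_2 = 0.8514
container_3 = 0.3217

speedup = container_2 / container_3
print(f"Container 3 responded {speedup:.2f}x faster than container 2")  # ~2.65x
```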
The end goal of all this load testing and metric analysis is to define an expected performance envelope that fits your application needs. Ideally it should also provide a little bit of extra space for occasional bursts of activity. Source
Container 2, with 0.5CPU and 1GB of memory, provides just that. Vertically scaling a CPU-heavy application results in increased performance.
Next up: Let's look at how vertically scaling an application with a memory leak goes. ☠️