Introduction
In the previous parts of this series, we measured the cold starts of the Lambda function with Java 21 runtime without SnapStart enabled, with SnapStart enabled, and also applied DynamoDB invocation priming optimization with different Lambda memory settings, Lambda deployment artifact sizes, and Java compilation options. In the last part of the series, we measured cold and warm starts with Java 21 using different synchronous HTTP clients.
In this article, we'll measure the cold and warm starts with asynchronous HTTP Clients.
Measuring cold and warm starts with Java 21 using asynchronous HTTP clients
In our experiment, we'll reuse the application introduced in part 9 and rewrite it to use an asynchronous HTTP client. You can find the application code here. It consists of 2 Lambda functions, both of which respond to API Gateway requests and retrieve a product from DynamoDB by the id received from API Gateway. One Lambda function, GetProductByIdWithPureJava21AsyncLambda, can be used with and without SnapStart; the second one, GetProductByIdWithPureJava21AsyncLambdaAndPriming, uses SnapStart and DynamoDB request invocation priming. We give both Lambda functions 1024 MB of memory.
There are 2 asynchronous HTTP Client implementations available in the AWS SDK for Java.
- NettyNioAsync (Default)
- AWS CRT (asynchronous)
This is the order for the lookup of the asynchronous HTTP Client in the classpath.
Let's figure out how to configure such an asynchronous HTTP Client. There are 2 places to do it: pom.xml and DynamoProductDao.
Let's consider 2 scenarios:
Scenario 1) NettyNioAsync HTTP Client. Its configuration looks like this:
In pom.xml, the only enabled HTTP Client dependency has to be:
    <dependency>
        <groupId>software.amazon.awssdk</groupId>
        <artifactId>netty-nio-client</artifactId>
    </dependency>
In DynamoProductDao, the DynamoDbAsyncClient should be created like this:
    DynamoDbAsyncClient.builder()
        .credentialsProvider(DefaultCredentialsProvider.create())
        .region(Region.EU_CENTRAL_1)
        .httpClient(NettyNioAsyncHttpClient.create())
        .overrideConfiguration(ClientOverrideConfiguration.builder()
            .build())
        .build();
Scenario 2) AWS CRT asynchronous HTTP Client. Its configuration looks like this:
In pom.xml, the only enabled HTTP Client dependency has to be:
    <dependency>
        <groupId>software.amazon.awssdk</groupId>
        <artifactId>aws-crt-client</artifactId>
    </dependency>
In DynamoProductDao, the DynamoDbAsyncClient should be created like this:
    DynamoDbAsyncClient.builder()
        .credentialsProvider(DefaultCredentialsProvider.create())
        .region(Region.EU_CENTRAL_1)
        .httpClient(AwsCrtAsyncHttpClient.create())
        .overrideConfiguration(ClientOverrideConfiguration.builder()
            .build())
        .build();
For the sake of simplicity, we create all asynchronous HTTP Clients with their default settings. Of course, there is potential for further optimization by tuning these settings.
Using the asynchronous DynamoDbAsyncClient means that we use the asynchronous programming model: the invocation of getItem returns a CompletableFuture. This is the code to retrieve the item itself (for the complete code see)
    // getItem returns immediately with a future for the response
    CompletableFuture<GetItemResponse> getItemResponseAsync =
        dynamoDbClient.getItem(GetItemRequest.builder()
            .key(Map.of("PK", AttributeValue.builder().s(id).build()))
            .tableName(PRODUCT_TABLE_NAME)
            .build());
    // join() blocks the calling thread until the response arrives
    GetItemResponse getItemResponse = getItemResponseAsync.join();
    if (getItemResponse.hasItem()) {
        return Optional.of(ProductMapper.productFromDynamoDB(getItemResponse.item()));
    } else {
        return Optional.empty();
    }
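The snippet above blocks with join() right after the call. As a rough illustration of the asynchronous programming model itself, the following minimal sketch maps the result without blocking and only joins at the very edge. It is not the code from the application: AsyncLookupSketch, getItemAsync, findProductName and the in-memory TABLE are hypothetical stand-ins for the DynamoDbAsyncClient call, chosen so that the example runs with the JDK alone, and the 2-second timeout is an arbitrary assumption.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Sketch only: an in-memory map stands in for the DynamoDB table so that the
// example runs without the AWS SDK on the classpath.
final class AsyncLookupSketch {

    // Hypothetical data; a real DAO would call dynamoDbClient.getItem(...) instead.
    static final Map<String, String> TABLE = Map.of("1", "Print 10x13", "2", "Photobook A4");

    // Returns a future right away, similar in shape to an asynchronous SDK call.
    static CompletableFuture<Map<String, String>> getItemAsync(String id) {
        return CompletableFuture.supplyAsync(() -> {
            String name = TABLE.get(id);
            return name == null ? Map.<String, String>of() : Map.of("PK", id, "name", name);
        });
    }

    // Maps the response without blocking; the caller decides when to wait.
    static CompletableFuture<Optional<String>> findProductName(String id) {
        return getItemAsync(id)
                .thenApply(item -> Optional.ofNullable(item.get("name")))
                .orTimeout(2, TimeUnit.SECONDS); // assumed timeout, fails the future if exceeded
    }

    public static void main(String[] args) {
        // A handler still has to produce a result before it returns, so it joins here.
        System.out.println(findProductName("1").join());
        System.out.println(findProductName("42").join());
    }
}
```

In a Lambda handler that serves a single request per invocation, joining at the end is usually unavoidable, so the benefit of composing futures shows mainly when several independent calls can run concurrently.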
The results of the experiment below are based on reproducing more than 100 cold and approximately 100,000 warm starts with an experiment that ran for approximately 1 hour. For it (and the experiments from my previous article), I used the load test tool hey, but you can use whatever tool you want, like Serverless-artillery or Postman. I ran all these experiments for both scenarios, each with 2 different compilation options in template.yaml:
- No options (tiered compilation will take place)
- JAVA_TOOL_OPTIONS: "-XX:+TieredCompilation -XX:TieredStopAtLevel=1" (client compilation without profiling)
We found out in the article [Measuring cold and warm starts with Java 21 using different compilation options](https://dev.to/aws-builders/aws-snapstart-part-14-measuring-cold-and-warm-starts-with-java-21-using-different-compilation-options-el4) that these two options gave us the lowest cold start times.
Let's look into the results of our measurements. In the tables, "c" stands for cold start and "w" for warm start.
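How the percentile columns were derived from the raw durations is not shown in this part. Purely as an illustration, a nearest-rank percentile over a list of measured durations could be computed as in the sketch below. The class name PercentileSketch and the sample durations are made up for the example and are not taken from the measurements; other tools may use interpolation and give slightly different values.

```java
import java.util.Arrays;

// Sketch only: nearest-rank percentiles over made-up durations in milliseconds.
final class PercentileSketch {

    // Returns the smallest sample such that at least p percent of all samples
    // are less than or equal to it (nearest-rank definition).
    static double percentile(double[] samples, double p) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        // Hypothetical warm start durations, not values from the tables.
        double[] durations = {6.1, 5.8, 7.4, 6.6, 23.0, 5.9, 6.2, 9.3, 6.0, 70.2};
        System.out.println("p50 = " + percentile(durations, 50));
        System.out.println("p90 = " + percentile(durations, 90));
        System.out.println("max = " + percentile(durations, 100));
    }
}
```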
Cold and warm start time with compilation option "tiered compilation" without SnapStart enabled in ms:
| Scenario | c p50 | c p75 | c p90 | c p99 | c p99.9 | c max | w p50 | w p75 | w p90 | w p99 | w p99.9 | w max |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| NettyNioAsync | 3966.44 | 4063.52 | 4247.67 | 4868.51 | 5012.12 | 5194.04 | 6.51 | 7.63 | 9.34 | 23.54 | 70.39 | 2562.72 |
| AWS CRT | 2476.69 | 2587.15 | 2822.55 | 3372.14 | 3507.19 | 3660.32 | 5.55 | 6.30 | 7.51 | 20.41 | 68.19 | 1001.94 |
Cold and warm start time with compilation option "-XX:+TieredCompilation -XX:TieredStopAtLevel=1" (client compilation without profiling) without SnapStart enabled in ms:
| Scenario | c p50 | c p75 | c p90 | c p99 | c p99.9 | c max | w p50 | w p75 | w p90 | w p99 | w p99.9 | w max |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| NettyNioAsync | 3986.46 | 4112.98 | 4437.38 | 5494.57 | 5684.08 | 5824.35 | 6.41 | 7.51 | 9.23 | 23.54 | 70.39 | 2948.61 |
| AWS CRT | 2482.68 | 2529.01 | 2622.87 | 3199.55 | 3331.06 | 3478.24 | 5.92 | 6.72 | 8.26 | 22.09 | 75.00 | 949.13 |
Cold and warm start time with compilation option "tiered compilation" with SnapStart enabled without Priming in ms:
| Scenario | c p50 | c p75 | c p90 | c p99 | c p99.9 | c max | w p50 | w p75 | w p90 | w p99 | w p99.9 | w max |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| NettyNioAsync | 2404.53 | 2487.64 | 2804.64 | 3001.89 | 3108.76 | 3110.82 | 6.61 | 7.63 | 9.53 | 25.09 | 179.57 | 2165.97 |
| AWS CRT | 1230.83 | 1306.90 | 1778.03 | 1968.86 | 1984.66 | 1986.02 | 5.64 | 6.41 | 7.87 | 21.40 | 670.53 | 1408.33 |
Cold and warm start time with compilation option "-XX:+TieredCompilation -XX:TieredStopAtLevel=1" (client compilation without profiling) with SnapStart enabled without Priming in ms:
| Scenario | c p50 | c p75 | c p90 | c p99 | c p99.9 | c max | w p50 | w p75 | w p90 | w p99 | w p99.9 | w max |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| NettyNioAsync | 2411.75 | 2495.11 | 2867.00 | 3047.23 | 3127.46 | 3130.41 | 6.72 | 7.87 | 9.99 | 25.89 | 1765.78 | 2256.77 |
| AWS CRT | 1204.06 | 1263.23 | 1801.28 | 2008.61 | 2014.64 | 2015.81 | 5.82 | 6.61 | 8.13 | 21.75 | 670.53 | 1404.49 |
Cold and warm start time with compilation option "tiered compilation" with SnapStart enabled and with DynamoDB invocation Priming in ms:
| Scenario | c p50 | c p75 | c p90 | c p99 | c p99.9 | c max | w p50 | w p75 | w p90 | w p99 | w p99.9 | w max |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| NettyNioAsync | 724.66 | 774.08 | 1052.08 | 1219.81 | 1341.31 | 1341.76 | 6.61 | 7.63 | 9.53 | 23.92 | 99.81 | 265.97 |
| AWS CRT | 703.26 | 746.72 | 1148.80 | 1277.20 | 1321.35 | 1322.57 | 5.73 | 6.51 | 8.13 | 21.40 | 163.26 | 895.82 |
Cold and warm start time with compilation option "-XX:+TieredCompilation -XX:TieredStopAtLevel=1" (client compilation without profiling) with SnapStart enabled and with DynamoDB invocation Priming in ms:
| Scenario | c p50 | c p75 | c p90 | c p99 | c p99.9 | c max | w p50 | w p75 | w p90 | w p99 | w p99.9 | w max |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| NettyNioAsync | 757.24 | 821.92 | 1064.77 | 1217.37 | 1225.92 | 1227.08 | 6.51 | 7.63 | 9.38 | 24.30 | 146.09 | 883.59 |
| AWS CRT | 518.47 | 630.67 | 899.28 | 1141.93 | 1168.49 | 1168.88 | 5.66 | 6.59 | 8.11 | 21.67 | 645.34 | 1200.68 |
Let's visualize our measurements for p90 and draw conclusions. "SLA" is the abbreviation for the compilation option "-XX:+TieredCompilation -XX:TieredStopAtLevel=1", and "tiered" stands for "tiered compilation".
Conclusion
Our measurements revealed that the cold and warm start times with "tiered compilation" and with "-XX:+TieredCompilation -XX:TieredStopAtLevel=1" (client compilation without profiling) are close to each other.
In terms of the HTTP Client choice, the AWS CRT async HTTP Client clearly outperformed the NettyNio async HTTP Client for cold start times and was also somewhat faster for warm starts up to p99. The only exception was SnapStart enabled with priming and tiered compilation, where the measurement results of both clients were quite close.
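To put a number on this comparison, the relative reduction can be computed directly from the p90 cold start values in the tables above. The sketch below does this for the three "tiered compilation" tables; the class and method names are illustrative, and only the millisecond values are taken from the measurements.

```java
import java.util.Locale;

// Sketch only: relative p90 cold start reduction of AWS CRT compared to NettyNioAsync.
final class ColdStartComparison {

    // Percentage by which candidateMs is lower than baselineMs
    // (a negative result means the candidate was slower).
    static double reductionPercent(double baselineMs, double candidateMs) {
        return (baselineMs - candidateMs) / baselineMs * 100.0;
    }

    public static void main(String[] args) {
        // p90 cold start times in ms with tiered compilation: {NettyNioAsync, AWS CRT}
        double[] withoutSnapStart = {4247.67, 2822.55};
        double[] snapStartNoPriming = {2804.64, 1778.03};
        double[] snapStartWithPriming = {1052.08, 1148.80};

        print("without SnapStart", withoutSnapStart);
        print("SnapStart without priming", snapStartNoPriming);
        print("SnapStart with priming", snapStartWithPriming);
    }

    private static void print(String label, double[] nettyAndCrt) {
        System.out.println(String.format(Locale.ROOT, "%s: %.1f %%",
                label, reductionPercent(nettyAndCrt[0], nettyAndCrt[1])));
    }
}
```

For the first two configurations the reduction is roughly a third, while with priming the p90 value of AWS CRT is slightly higher than that of NettyNioAsync, so the result is negative.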
Can we reduce the cold start a bit further? In the previous article, Measuring cold and warm starts with Java 21 using synchronous HTTP clients, in the "Conclusion" section, we described how to reduce the deployment artifact size and therefore the cold start time for the AWS CRT synchronous HTTP Client. The same can also be applied to the asynchronous use case.
The choice of HTTP Client is not only about minimizing cold and warm start times. The decision is much more complex and also depends on the functionality of the HTTP Client implementation and its settings, such as whether it supports HTTP/2. AWS published a decision tree that helps to choose an HTTP client depending on these criteria.
In the next article of the series, we'll explore the impact of the Firecracker microVM snapshot tiered caching over time on the reduction of the cold start.
Update on 06.06.2024. For the CRT client, we can set the classifier (e.g., linux-x86_64) in our POM file to pick only the relevant binary for our platform. See here. Big thanks to Maximilian Schellhorn for the hint!

