AWS SnapStart - Part 16 Measuring cold and warm starts with Java 21 using different asynchronous HTTP clients

#java #aws #serverless #coldstart

Introduction

In the previous parts of our series we measured the cold starts of a Lambda function on the Java 21 runtime without SnapStart, with SnapStart enabled, and with the DynamoDB invocation priming optimization applied, using different Lambda memory settings, deployment artifact sizes and Java compilation options. In the last part of the series we measured cold and warm starts with Java 21 using different synchronous HTTP clients.

In this article we'll measure the cold and warm starts with asynchronous HTTP clients.

Measuring cold and warm starts with Java 21 using asynchronous HTTP clients

In our experiment we'll re-use the application introduced in part 9 and rewrite it to use an asynchronous HTTP client. You can find the application code here. There are basically 2 Lambda functions which both respond to API Gateway requests and retrieve a product from DynamoDB by the id received from API Gateway. One Lambda function, GetProductByIdWithPureJava21AsyncLambda, can be used with and without SnapStart, and the second one, GetProductByIdWithPureJava21AsyncLambdaAndPriming, uses SnapStart and DynamoDB request invocation priming. We give both Lambda functions 1024 MB of memory.

There are 2 asynchronous HTTP client implementations available in the AWS SDK for Java:

NettyNioAsync (Default)
AWS CRT (asynchronous)

This is also the order in which the SDK looks up and selects an asynchronous HTTP client on the classpath.

Let's figure out how to configure such an asynchronous HTTP client. There are 2 places to do it: pom.xml and DynamoProductDao.

Let's consider 2 scenarios:

Scenario 1) NettyNioAsync HTTP Client. Its configuration looks like this:
In pom.xml the only enabled HTTP Client dependency has to be:

     <dependency>
        <groupId>software.amazon.awssdk</groupId>
        <artifactId>netty-nio-client</artifactId>
     </dependency>

In DynamoProductDao the DynamoDbAsyncClient should be created like this:

DynamoDbAsyncClient.builder()
    .credentialsProvider(DefaultCredentialsProvider.create())
    .region(Region.EU_CENTRAL_1)
    .httpClient(NettyNioAsyncHttpClient.create())
    .overrideConfiguration(ClientOverrideConfiguration.builder()
        .build())
    .build();

Scenario 2) AWS CRT asynchronous HTTP Client. Its configuration looks like this:
In pom.xml the only enabled HTTP Client dependency has to be:

     <dependency>
        <groupId>software.amazon.awssdk</groupId>
        <artifactId>aws-crt-client</artifactId>
     </dependency>

In DynamoProductDao the DynamoDbAsyncClient should be created like this:

DynamoDbAsyncClient.builder()
    .credentialsProvider(DefaultCredentialsProvider.create())
    .region(Region.EU_CENTRAL_1)
    .httpClient(AwsCrtAsyncHttpClient.create())
    .overrideConfiguration(ClientOverrideConfiguration.builder()
        .build())
    .build();

For the sake of simplicity we create all asynchronous HTTP clients with their default settings. Of course, there is potential for further optimization by tuning these settings.

Using the asynchronous DynamoDbAsyncClient means that we'll be using the asynchronous programming model, so the invocation of getItem returns a CompletableFuture. This is the code to retrieve the item itself (for the complete code see the application repository):

CompletableFuture<GetItemResponse> getItemResponseAsync = dynamoDbClient.getItem(
    GetItemRequest.builder()
        .key(Map.of("PK", AttributeValue.builder().s(id).build()))
        .tableName(PRODUCT_TABLE_NAME)
        .build());
// join() blocks until the asynchronous call completes
GetItemResponse getItemResponse = getItemResponseAsync.join();
if (getItemResponse.hasItem()) {
    return Optional.of(ProductMapper.productFromDynamoDB(getItemResponse.item()));
} else {
    return Optional.empty();
}
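The DAO code above blocks on join() right after issuing the request, which is a reasonable choice when the handler has to return the product synchronously anyway. As a minimal sketch of the alternative, the result can also be mapped without blocking by composing on the CompletableFuture. The example below uses only the JDK: AsyncLookupSketch, fetchItem and productName are hypothetical names, and the in-memory map is a stand-in for DynamoDB, not the application's actual code.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-in for the DAO; no AWS SDK types are used here.
class AsyncLookupSketch {

    // In-memory replacement for the DynamoDB table (assumption for illustration).
    private static final Map<String, Map<String, String>> TABLE =
            Map.of("1", Map.of("PK", "1", "name", "Print"));

    // Simulates an asynchronous lookup that completes with the item, or null if absent.
    static CompletableFuture<Map<String, String>> fetchItem(String id) {
        return CompletableFuture.supplyAsync(() -> TABLE.get(id));
    }

    // Non-blocking variant: the mapping runs once the future completes.
    static CompletableFuture<Optional<String>> productName(String id) {
        return fetchItem(id)
                .thenApply(item -> Optional.ofNullable(item).map(i -> i.get("name")));
    }

    public static void main(String[] args) {
        // The caller decides where to block; here join() is used only at the very end.
        System.out.println(productName("1").join());
        System.out.println(productName("42").join());
    }
}
```

With this shape, several lookups could be started before any of them is joined, which is where the asynchronous model starts to pay off.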

The results of the experiment below are based on reproducing more than 100 cold and approximately 100,000 warm starts, with each experiment running for approximately 1 hour. For it (and for the experiments from my previous article) I used the load test tool hey, but you can use whatever tool you want, like Serverless-artillery or Postman. I ran these experiments for both scenarios, each with 2 different compilation options in template.yaml:

no options (tiered compilation will take place)
JAVA_TOOL_OPTIONS: "-XX:+TieredCompilation -XX:TieredStopAtLevel=1" (client compilation without profiling)

We found out in the article [Measuring cold and warm starts with Java 21 using different compilation options](https://dev.to/aws-builders/aws-snapstart-part-14-measuring-cold-and-warm-starts-with-java-21-using-different-compilation-options-el4) that these options gave us the lowest cold start times.
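The results are reported as percentiles (p50, p75, p90, p99, p99.9) plus the maximum. As a rough sketch of how such a value can be derived from raw durations, the following uses the nearest-rank method. This is an assumption for illustration: the article does not state which percentile definition the measurement tooling applies, and the sample durations are made up.

```java
import java.util.Arrays;

class PercentileSketch {

    // Nearest-rank percentile: the smallest sample such that at least
    // p percent of all samples are less than or equal to it.
    static double percentile(double[] samples, double p) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        // Made-up warm start durations in ms, not taken from the measurements.
        double[] durationsMs = {5.5, 6.3, 7.5, 20.4, 68.2, 5.9, 6.7, 8.3, 22.1, 75.0};
        for (double p : new double[] {50, 90, 99}) {
            System.out.printf("p%.0f = %.1f ms%n", p, percentile(durationsMs, p));
        }
    }
}
```

A single slow invocation barely moves p50 but dominates p99.9 and the maximum, which is why the high percentiles in the tables vary much more between runs than the medians.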

Let's look into the results of our measurements. In the tables, the "c" columns contain cold start times and the "w" columns warm start times.

Cold and warm start time with compilation option "tiered compilation" without SnapStart enabled in ms:

Scenario	c p50	c p75	c p90	c p99	c p99.9	c max	w p50	w p75	w p90	w p99	w p99.9	w max
NettyNioAsync	3966.44	4063.52	4247.67	4868.51	5012.12	5194.04	6.51	7.63	9.34	23.54	70.39	2562.72
AWS CRT	2476.69	2587.15	2822.55	3372.14	3507.19	3660.32	5.55	6.30	7.51	20.41	68.19	1001.94

Cold and warm start time with compilation option "-XX:+TieredCompilation -XX:TieredStopAtLevel=1" (client compilation without profiling) without SnapStart enabled in ms:

Scenario	c p50	c p75	c p90	c p99	c p99.9	c max	w p50	w p75	w p90	w p99	w p99.9	w max
NettyNioAsync	3986.46	4112.98	4437.38	5494.57	5684.08	5824.35	6.41	7.51	9.23	23.54	70.39	2948.61
AWS CRT	2482.68	2529.01	2622.87	3199.55	3331.06	3478.24	5.92	6.72	8.26	22.09	75.00	949.13

Cold and warm start time with compilation option "tiered compilation" with SnapStart enabled without Priming in ms:

Scenario	c p50	c p75	c p90	c p99	c p99.9	c max	w p50	w p75	w p90	w p99	w p99.9	w max
NettyNioAsync	2404.53	2487.64	2804.64	3001.89	3108.76	3110.82	6.61	7.63	9.53	25.09	179.57	2165.97
AWS CRT	1230.83	1306.90	1778.03	1968.86	1984.66	1986.02	5.64	6.41	7.87	21.40	670.53	1408.33

Cold and warm start time with compilation option "-XX:+TieredCompilation -XX:TieredStopAtLevel=1" (client compilation without profiling) with SnapStart enabled without Priming in ms:

Scenario	c p50	c p75	c p90	c p99	c p99.9	c max	w p50	w p75	w p90	w p99	w p99.9	w max
NettyNioAsync	2411.75	2495.11	2867.00	3047.23	3127.46	3130.41	6.72	7.87	9.99	25.89	1765.78	2256.77
AWS CRT	1204.06	1263.23	1801.28	2008.61	2014.64	2015.81	5.82	6.61	8.13	21.75	670.53	1404.49

Cold and warm start time with compilation option "tiered compilation" with SnapStart enabled and with DynamoDB invocation Priming in ms:

Scenario	c p50	c p75	c p90	c p99	c p99.9	c max	w p50	w p75	w p90	w p99	w p99.9	w max
NettyNioAsync	724.66	774.08	1052.08	1219.81	1341.31	1341.76	6.61	7.63	9.53	23.92	99.81	265.97
AWS CRT	703.26	746.72	1148.80	1277.20	1321.35	1322.57	5.73	6.51	8.13	21.40	163.26	895.82

Cold and warm start time with compilation option "-XX:+TieredCompilation -XX:TieredStopAtLevel=1" (client compilation without profiling) with SnapStart enabled and with DynamoDB invocation Priming in ms:

Scenario	c p50	c p75	c p90	c p99	c p99.9	c max	w p50	w p75	w p90	w p99	w p99.9	w max
NettyNioAsync	757.24	821.92	1064.77	1217.37	1225.92	1227.08	6.51	7.63	9.38	24.30	146.09	883.59
AWS CRT	518.47	630.67	899.28	1141.93	1168.49	1168.88	5.66	6.59	8.11	21.67	645.34	1200.68

Let's visualize our measurements for p90 and draw the conclusions. "SLA" is the abbreviation for the compilation option "-XX:+TieredCompilation -XX:TieredStopAtLevel=1" and "tiered" stands for "tiered compilation".

Conclusion

Our measurements revealed that the results for "tiered compilation" and "-XX:+TieredCompilation -XX:TieredStopAtLevel=1" (client compilation without profiling) are close to each other.

In terms of the HTTP client choice, the AWS CRT async HTTP client clearly outperformed the NettyNio async HTTP client for cold start times and was also slightly faster for warm starts up to p99. The only exception for cold starts was SnapStart enabled with priming and tiered compilation, where both results were quite close.
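To put a number on this difference, here is a small sketch that computes the relative cold start reduction of AWS CRT versus NettyNioAsync from the p50 values in the "tiered compilation" tables above. The class and method names are illustrative only.

```java
class ColdStartComparison {

    // Relative reduction of the candidate versus the baseline, in percent.
    static double reductionPercent(double baselineMs, double candidateMs) {
        return (baselineMs - candidateMs) / baselineMs * 100.0;
    }

    public static void main(String[] args) {
        // Cold start p50 in ms with tiered compilation: {NettyNioAsync, AWS CRT}
        double[][] p50 = {
            {3966.44, 2476.69}, // without SnapStart
            {2404.53, 1230.83}, // SnapStart without priming
            {724.66, 703.26}    // SnapStart with DynamoDB invocation priming
        };
        String[] labels = {"no SnapStart", "SnapStart", "SnapStart + priming"};
        for (int i = 0; i < p50.length; i++) {
            System.out.printf("%s: %.1f%% lower with AWS CRT%n",
                    labels[i], reductionPercent(p50[i][0], p50[i][1]));
        }
    }
}
```

For these three rows the reduction comes out at roughly 37.6%, 48.8% and 3.0%, which matches the observation that the gap nearly disappears once priming is applied with tiered compilation.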

Can we reduce the cold start a bit further? In the "Conclusion" section of the previous article, Measuring cold and warm starts with Java 21 using synchronous HTTP clients, we described how to reduce the deployment artifact size and therefore the cold start time for the AWS CRT synchronous HTTP Client. The same can also be applied to the asynchronous use case.

The choice of HTTP client is not only about minimizing cold and warm starts. The decision is much more complex and also depends on the functionality of the HTTP client implementation and its settings, like whether it supports HTTP/2. AWS published a decision tree that shows which HTTP client to choose depending on the criteria.

In the next article of the series we'll explore the impact of the Firecracker microVM snapshot tiered caching over time on the reduction of the cold start.

Update on 06.06.2024: for the CRT client we can set a classifier (e.g. linux-x86_64) in our POM file to pick only the binary relevant for our platform. See here. Big thanks to Maximilian Schellhorn for the hint!
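A minimal sketch of what such a POM change could look like, assuming the native library is pulled in transitively through the aws-crt artifact: the multi-platform jar is excluded and then added back with the platform classifier. The aws.crt.version property is a placeholder introduced for this example and is not taken from the article, so check it against the version your SDK release expects.

```xml
<!-- Exclude the multi-platform CRT jar that aws-crt-client brings in -->
<dependency>
    <groupId>software.amazon.awssdk</groupId>
    <artifactId>aws-crt-client</artifactId>
    <exclusions>
        <exclusion>
            <groupId>software.amazon.awssdk.crt</groupId>
            <artifactId>aws-crt</artifactId>
        </exclusion>
    </exclusions>
</dependency>
<!-- Add it back with only the binary for the target platform -->
<dependency>
    <groupId>software.amazon.awssdk.crt</groupId>
    <artifactId>aws-crt</artifactId>
    <!-- aws.crt.version is a hypothetical property name -->
    <version>${aws.crt.version}</version>
    <classifier>linux-x86_64</classifier>
</dependency>
```

The classifier has to match the architecture configured for the Lambda function, otherwise the native library cannot be loaded at runtime.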