
Vadym Kazulkin for AWS Heroes

Posted on • Originally published at vkazulkin.com

Serverless applications on AWS with Lambda using Java 25, API Gateway and Aurora DSQL - Part 3 Introducing Lambda SnapStart

Introduction

In part 1, we introduced our sample application. In part 2, we measured the performance (cold and warm start times) of the Lambda function without any optimizations. We observed quite large cold start times, especially when using the Hibernate ORM framework, which also significantly increases the artifact size. In this article, we'll introduce AWS Lambda SnapStart as one of the approaches to reducing the cold start times of the Lambda function. We'll also provide the cold and warm start measurements of the sample application with SnapStart enabled for the Lambda function.

AWS Lambda SnapStart

As we saw in part 2, without any optimizations, Lambda performance measurements showed quite high values, especially for the cold start times. The article Understanding the Lambda execution environment lifecycle provides a good overview of this topic. Lambda SnapStart is one of the optimization approaches to reduce the cold start times.

Lambda SnapStart can reduce the startup time of a Lambda function to under one second. SnapStart simplifies the development of responsive and scalable applications without requiring provisioned resources or complex performance optimizations.

The largest portion of startup latency (often referred to as cold start time) is the time Lambda spends initializing the function. This includes loading the function code, starting the runtime, and running the function's initialization code. With SnapStart, Lambda initializes our function when we publish a function version. Lambda takes a Firecracker microVM snapshot of the memory and disk state of the initialized execution environment, then encrypts the snapshot and intelligently caches it to optimize retrieval latency.

To ensure reliability, Lambda manages multiple copies of each snapshot. Lambda automatically patches snapshots and their copies with the latest runtime and security updates. When we invoke the function version, Lambda restores a new execution environment from the cached snapshot instead of initializing it from scratch, which improves startup latency. More information can be found in the articles Reducing Java cold starts on AWS Lambda functions with SnapStart and Under the hood: how AWS Lambda SnapStart optimizes function startup latency. I have also published a whole series about Lambda SnapStart for Java applications.

Measurements of cold and warm start times of the Lambda function of the sample application with JDBC and Hikari connection pool

We'll reuse the sample application from part 1 and do exactly the same performance measurement as we described in part 2. We'll measure the performance of the GetProductByIdJava25WithDSQL Lambda function mapped to the GetProductByIdHandler. We will trigger it by invoking curl -H "X-API-Key: a6ZbcDefQW12BN56WEDQ25" https://{$API_GATEWAY_URL}/prod/products/1.

One important aspect is that we instantiate the Jackson ObjectMapper and ProductDao directly in static field initializers of the GetProductByIdHandler Lambda function:

public class GetProductByIdHandler
        implements RequestHandler<APIGatewayProxyRequestEvent, APIGatewayProxyResponseEvent> {

    private static final ObjectMapper objectMapper = new ObjectMapper();
    private static final ProductDao productDao = new ProductDao();
    ...

When you create an ObjectMapper for the first time, it initializes many other classes and, as part of this process, instantiates a lot of singletons. Depending on the hardware, this takes more than a hundred milliseconds. If you create a second ObjectMapper in the same Java process, it takes only about 1 millisecond, because all the singletons are already there. By moving the ObjectMapper instantiation into the static field initializer of the Lambda function, we decrease the cold start time, because the initialized object becomes part of the SnapStart snapshot.
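This effect can be sketched with a stand-in for the expensive object; the class and counter below are illustrative, not part of the sample application:

```java
// Sketch: eager static initialization runs the expensive work exactly once,
// at class-load time. With SnapStart, that work happens during deployment
// and the result becomes part of the snapshot.
import java.util.concurrent.atomic.AtomicInteger;

public class EagerInitDemo {
    static final AtomicInteger initCount = new AtomicInteger();

    // Stand-in for a heavy object such as Jackson's ObjectMapper.
    static class HeavyMapper {
        HeavyMapper() {
            initCount.incrementAndGet(); // loads classes, builds singletons, etc.
        }
    }

    // Eager: constructed exactly once when the class is first loaded.
    static final HeavyMapper MAPPER = new HeavyMapper();

    public static void main(String[] args) {
        HeavyMapper first = MAPPER;
        HeavyMapper second = MAPPER; // reused, no second initialization
        System.out.println(initCount.get()); // prints 1
    }
}
```

Every invocation after the snapshot restore then reuses the already-initialized instance, just like the second access above.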

The same is true for ProductDao, especially taking into account that we directly preinitialize DsqlDataSourceConfig there:

public class ProductDao {

    private static final DsqlDataSourceConfig dsqlDataSourceConfig = new DsqlDataSourceConfig();

    ...

This, in turn, loads a lot of classes and creates the Hikari data source and the Hikari connection pool. Part of this process is searching for an available JDBC driver; in our case, the PostgreSQL database driver is found and loaded. Then a database connection to Aurora DSQL is initialized and added to the connection pool. We configured the pool size to be 1, because this is enough for the single-threaded Lambda function, so exactly one database connection is created and is ready to be reused. All of this becomes part of the SnapStart snapshot:

public class DsqlDataSourceConfig {

    private static final String AURORA_DSQL_CLUSTER_ENDPOINT = System.getenv("AURORA_DSQL_CLUSTER_ENDPOINT");

    private static final String JDBC_URL = "jdbc:aws-dsql:postgresql://"
        + AURORA_DSQL_CLUSTER_ENDPOINT
        + ":5432/postgres?sslmode=verify-full&sslfactory=org.postgresql.ssl.DefaultJavaSSLFactory"
        + "&token-duration-secs=900";

    private static HikariDataSource hds;

    static {
        var config = new HikariConfig();
        config.setUsername("admin");
        config.setJdbcUrl(JDBC_URL);
        config.setMaxLifetime(1500 * 1000); // connection max lifetime in milliseconds (25 min; default is 30 min)
        config.setMaximumPoolSize(1); // default is 10
        hds = new HikariDataSource(config);
    }
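The URL assembly in DsqlDataSourceConfig can be checked in isolation; the endpoint value below is a hypothetical stand-in for the AURORA_DSQL_CLUSTER_ENDPOINT environment variable:

```java
// Sketch: assembling the Aurora DSQL JDBC URL as in DsqlDataSourceConfig,
// with an illustrative endpoint instead of the real environment variable.
public class JdbcUrlDemo {
    static String buildUrl(String endpoint) {
        return "jdbc:aws-dsql:postgresql://"
            + endpoint
            + ":5432/postgres?sslmode=verify-full&sslfactory=org.postgresql.ssl.DefaultJavaSSLFactory"
            + "&token-duration-secs=900";
    }

    public static void main(String[] args) {
        // Hypothetical cluster endpoint for illustration only.
        System.out.println(buildUrl("example-cluster.dsql.us-east-1.on.aws"));
    }
}
```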

We also need to make sure that Lambda SnapStart is enabled in template.yaml, as shown below:

Globals:
  Function:
    Handler: 
    SnapStart:
      ApplyOn: PublishedVersions 
    ....
    Environment:
      Variables:
        JAVA_TOOL_OPTIONS: "-XX:+TieredCompilation -XX:TieredStopAtLevel=1"

Please note that I measured only the performance of the Lambda function itself. On top of that comes the latency of the trigger - in our case, the API Gateway REST API.

Please also note the effect of the Lambda SnapStart snapshot tiered cache. With SnapStart activated, we get the largest cold starts during the first measurements; due to the tiered cache, subsequent cold starts will have lower values. For more details about the technical implementation of AWS SnapStart and its tiered cache, I refer you to the presentation by Mike Danilov: "AWS Lambda Under the Hood". Please also read the already mentioned article Under the hood: how AWS Lambda SnapStart optimizes function startup latency. Therefore, I will present the Lambda performance measurements with SnapStart activated for two cases:

  • For all approximately 100 cold start times (labelled as all in the table)
  • For the last approximately 70 (labelled as last 70 in the table). This makes the effect of the snapshot tiered cache visible. Depending on how often the respective Lambda function is updated (which invalidates some layers of the cache), a Lambda function can experience thousands or tens of thousands of cold starts during its life cycle, so the first, longer-lasting cold starts no longer carry much weight.
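Percentile statistics over such a trailing window can be sketched as follows; the durations below are synthetic stand-ins, not the actual measurements or the real measurement harness:

```java
// Sketch: nearest-rank percentile over the last N cold starts, mirroring
// the "last 70" evaluation. The sample durations are illustrative only.
import java.util.Arrays;
import java.util.List;

public class ColdStartStats {
    // Nearest-rank percentile on a sorted copy of the values.
    static double percentile(List<Double> values, double p) {
        double[] sorted = values.stream().mapToDouble(Double::doubleValue).sorted().toArray();
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(0, rank - 1)];
    }

    // Keep only the last n measurements to exclude the early,
    // cache-cold snapshot restores.
    static List<Double> lastN(List<Double> values, int n) {
        return values.subList(Math.max(0, values.size() - n), values.size());
    }

    public static void main(String[] args) {
        List<Double> coldStarts = Arrays.asList(1700.0, 1650.0, 980.0, 960.0, 940.0, 920.0);
        // p50 of the last 4 measurements, ignoring the two early outliers.
        System.out.println(percentile(lastN(coldStarts, 4), 50.0));
    }
}
```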

To show the impact of the SnapStart, we'll also present the Lambda performance measurements without SnapStart being activated from part 2.

I did the measurements with the java:25.v19 Amazon Corretto runtime version, and the deployed artifact size of this application was 42.333 KB.

Cold (c) and warm (w) start time with -XX:+TieredCompilation -XX:TieredStopAtLevel=1 compilation in ms:

Approach                                           c p50  c p75  c p90  c p99  c p99.9  c max  w p50  w p75  w p90  w p99  w p99.9  w max
No SnapStart enabled                                2336   2453   2827   3026     3131   3132   4.84   5.29   5.73   8.88   195.38    531
SnapStart enabled but no priming applied, all        970   1058   1705   1726     1734   1735   4.92   5.33   5.86   9.84   198.52   1134
SnapStart enabled but no priming applied, last 70    901    960   1061   1212     1212   1212   4.84   5.29   5.77   9.54   196.94    719

Measurements of cold and warm start times of the Lambda function of the sample application with Hibernate and Hikari connection pool

The same as mentioned above holds true for the sample application from part 1, using Hibernate instead of JDBC. We will measure the performance of our GetProductByIdJava25WithHibernateAndDSQL Lambda function mapped to the GetProductByIdHandler. We will trigger it by invoking curl -H "X-API-Key: a6ZbcDefQW12BN56WEHADQ25" https://{$API_GATEWAY_URL}/prod/products/1.

The most important difference is that in the ProductDao, we create the Hibernate SessionFactory:

public class ProductDao {

    private static final SessionFactory sessionFactory = HibernateUtils.getSessionFactory();
    ...

In HibernateUtils, we set the same Hikari connection pool properties as in the example above. We then pass those properties to the Hibernate configuration along with the classes annotated as entities. The final part is to build a Hibernate session factory.

public final class HibernateUtils {

    private static final String AURORA_DSQL_CLUSTER_ENDPOINT = System.getenv("AURORA_DSQL_CLUSTER_ENDPOINT");

    private static final String JDBC_URL = "jdbc:aws-dsql:postgresql://"
        + AURORA_DSQL_CLUSTER_ENDPOINT
        + ":5432/postgres?sslmode=verify-full&sslfactory=org.postgresql.ssl.DefaultJavaSSLFactory"
        + "&token-duration-secs=900";

    private static SessionFactory sessionFactory = getHibernateSessionFactory();

    private HibernateUtils() {}

    private static SessionFactory getHibernateSessionFactory() {
        var settings = new Properties();
        settings.put("jakarta.persistence.jdbc.user", "admin");
        settings.put("jakarta.persistence.jdbc.url", JDBC_URL);
        settings.put("hibernate.connection.pool_size", 1);
        settings.put("hibernate.hikari.maxLifetime", 1500 * 1000);
        return new Configuration()
            .setProperties(settings)
            .addAnnotatedClass(Product.class)
            .buildSessionFactory();
    }

    public static SessionFactory getSessionFactory() {
        return sessionFactory;
    }
    ...


All these steps involve a lot of class loading and preinitialization: the Hikari data source, the Hibernate Configuration, and the SessionFactory. They also create the Hikari connection pool. Part of this process is searching for an available JDBC driver; in our case, the PostgreSQL database driver is found and loaded. Then a database connection to Aurora DSQL is initialized and added to the connection pool. We configured the pool size to be 1, because this is enough for the single-threaded Lambda function, so exactly one database connection is created and is ready to be reused. All of this becomes part of the SnapStart snapshot.
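The JDBC driver discovery step mentioned above can be observed with the standard java.sql.DriverManager API; on a bare JVM without the PostgreSQL driver on the classpath, the list may simply be empty:

```java
// Sketch: listing the JDBC drivers visible to DriverManager. On the
// Lambda classpath this would include the PostgreSQL driver; here it
// only demonstrates the discovery mechanism itself.
import java.sql.DriverManager;

public class DriverDiscovery {
    public static void main(String[] args) {
        DriverManager.drivers()
            .forEach(d -> System.out.println(d.getClass().getName()));
    }
}
```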

I did the measurements with the java:25.v19 Amazon Corretto runtime version, and the deployed artifact size of this application was 42.333 KB.

Cold (c) and warm (w) start time with -XX:+TieredCompilation -XX:TieredStopAtLevel=1 compilation in ms:

Approach                                           c p50  c p75  c p90  c p99  c p99.9  c max  w p50  w p75  w p90  w p99  w p99.9  w max
No SnapStart enabled                                6243   6625   7056   8480     8651   8658   5.46   5.96   6.50   9.77   200.10    707
SnapStart enabled but no priming applied, all       1277   1360   3050   3103     3200   3201   5.50   6.01   6.45  10.16   196.94   2349
SnapStart enabled but no priming applied, last 70   1258   1320   1437   1634     1634   1634   5.42   5.91   6.40  10.08   195.94   1093

Conclusion

In this article of the series, we introduced AWS Lambda SnapStart as one of the approaches to reduce the cold start times of a Lambda function. We observed that enabling SnapStart lowers the cold start time significantly for both sample applications. This is especially noticeable in the "last 70" measurements, where the snapshot tiered cache takes effect. The biggest impact of simply enabling SnapStart is on the application using Hibernate. Still, the cold starts remain quite high. In the next article, we'll explore the first Lambda SnapStart priming technique, which I call database (in our case, Aurora DSQL) request priming. The goal of priming is to preload and preinitialize as much as possible into the SnapStart snapshot during the deployment phase, so that all of this is already available directly after the snapshot is restored.

Please also watch out for another series, where I use the serverless NoSQL database Amazon DynamoDB instead of Aurora DSQL to do the same Lambda performance measurements.

If you like my content, please follow me on GitHub and give my repositories a star!

Please also check out my website for more technical content and upcoming public speaking activities.
