---
title: "Cold Start Elimination in Serverless Kotlin: SnapStart, CRaC, and the AppCDS Pipeline"
published: true
description: "A hands-on guide to getting your Kotlin Lambda under 200ms cold starts using AWS SnapStart, CRaC, and GraalVM native image — with the Kotlin-specific pitfalls that will save you hours."
tags: kotlin, serverless, cloud, architecture
canonical_url: https://blog.mvpfactory.co/cold-start-elimination-serverless-kotlin
---
## What We Are Building
By the end of this tutorial, you will understand three competing approaches to eliminating JVM cold starts in Kotlin Lambdas — AWS SnapStart, CRaC, and GraalVM native image — and you will have a working AppCDS pipeline that consistently lands your cold starts under 200ms. Let me show you what works, what breaks, and the pattern I use in every production serverless project.
## Prerequisites
- A Kotlin-based AWS Lambda function (JVM runtime)
- Familiarity with Gradle or Maven builds
- Basic understanding of coroutines and `lazy` delegates
- AWS CLI configured for Lambda deployments
## Step 1: Understand Where the Time Goes
A vanilla Kotlin Lambda on a standard JVM runtime routinely hits 3-6 seconds on cold start. Here is the minimal breakdown to internalize:
| Phase | Typical duration |
|---|---|
| Container init + JVM bootstrap | ~800-1500ms |
| Class loading | ~1000-2500ms |
| Dependency injection / framework init | ~500-2000ms |
| Handler first invocation | ~100-300ms |
Class loading and framework initialization dominate. Every approach below attacks those two phases differently.
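To make "dominate" concrete, here is a small sketch that sums the ranges from the table and computes the share taken by class loading plus framework init. The `Phase` type and the `sharePercent` helper are names made up for this illustration, and the numbers are simply the table's own ranges, not new measurements.

```kotlin
// Phase ranges copied from the table above, in milliseconds.
data class Phase(val name: String, val minMs: Int, val maxMs: Int)

val phases = listOf(
    Phase("Container init + JVM bootstrap", 800, 1500),
    Phase("Class loading", 1000, 2500),
    Phase("Dependency injection / framework init", 500, 2000),
    Phase("Handler first invocation", 100, 300),
)

// Share of the total taken by the named phases, as a whole percentage (rounded down).
fun sharePercent(names: Set<String>, selector: (Phase) -> Int): Int {
    val total = phases.sumOf(selector)
    val part = phases.filter { it.name in names }.sumOf(selector)
    return part * 100 / total
}

fun main() {
    val hot = setOf("Class loading", "Dependency injection / framework init")
    val totalMin = phases.sumOf { it.minMs }
    val totalMax = phases.sumOf { it.maxMs }
    println("total: $totalMin-$totalMax ms")
    println("class loading + framework init: " +
        "${sharePercent(hot) { it.minMs }}-${sharePercent(hot) { it.maxMs }}%")
}
```

With the table's figures, the total comes out at 2400-6300 ms and the two middle phases account for roughly 62-71% of it, which is why the rest of this guide focuses on them.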
## Step 2: Pick Your Approach
| Factor | SnapStart | CRaC | GraalVM native |
|---|---|---|---|
| Cold start (typical) | 200-400ms | 150-350ms | 50-150ms |
| Build complexity | Low | Medium | High |
| Kotlin coroutines support | Partial (gotchas) | Partial (needs hooks) | Limited (reflection config) |
| Framework compatibility | Broad | Moderate | Narrow |
| Memory footprint | Standard JVM | Standard JVM | 50-70% reduction |
| AWS Lambda support | Native | Custom runtime | Custom runtime |
**SnapStart** takes a Firecracker microVM snapshot after your Lambda's `init` phase completes — classes loaded, singletons initialized, connection pools warmed. Cold starts drop to the 200-400ms range.
**CRaC** gives your application explicit lifecycle hooks (`beforeCheckpoint` / `afterRestore`). You get more control, but you own the orchestration. It requires a compatible JDK build (Azul Zulu with CRaC support, or the upstream OpenJDK CRaC branch). Expect 150-350ms cold starts.
**GraalVM native image** eliminates the JVM entirely with ahead-of-time compilation. Sub-100ms cold starts are achievable, but reflection-heavy frameworks need extensive configuration and build times can exceed 5 minutes. I'd only reach for this when the other two options genuinely aren't fast enough.
## Step 3: Wire Up the AppCDS Pipeline
Here is the minimal setup to get this working. Application Class Data Sharing generates a shared archive of pre-parsed class metadata. Combined with SnapStart, it eliminates the class loading phase almost entirely.
```bash
# Step 1: Run with tracing to capture loaded classes
java -XX:DumpLoadedClassList=classes.lst -jar app.jar

# Step 2: Generate the CDS archive
java -Xshare:dump -XX:SharedClassListFile=classes.lst \
  -XX:SharedArchiveFile=app-cds.jsa -jar app.jar

# Step 3: Run with the archive
java -Xshare:on -XX:SharedArchiveFile=app-cds.jsa -jar app.jar
```
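The docs do not mention this, but the pragmatic CI move is to generate the archive in a dedicated stage that runs only when dependencies or the JDK build change, not on every commit. The JVM validates the archive against the JDK and classpath it was dumped with, so a stale archive is rejected at startup, and with `-Xshare:on` the JVM refuses to start at all. Cache the `.jsa` file as a build artifact.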
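One way to decide when that stage should run is to derive a cache key from the inputs that invalidate the archive. The sketch below is an assumption about how you might do it, not a feature of any CI system: `cdsCacheKey` is a hypothetical helper that hashes the JDK version together with whatever dependency files you pass in (lockfiles, build scripts).

```kotlin
import java.io.File
import java.security.MessageDigest

// Builds a hex SHA-256 key from the JDK version plus the contents of the dependency files.
// If the key matches the previous build, the cached .jsa can be reused.
fun cdsCacheKey(jdkVersion: String, dependencyFiles: List<File>): String {
    val digest = MessageDigest.getInstance("SHA-256")
    digest.update(jdkVersion.toByteArray())
    // Sort by path so the key does not depend on argument order.
    for (file in dependencyFiles.sortedBy { it.path }) {
        digest.update(file.name.toByteArray())
        digest.update(file.readBytes())
    }
    return digest.digest().joinToString("") { "%02x".format(it) }
}

fun main(args: Array<String>) {
    // Hypothetical usage: pass lockfile or build script paths as arguments.
    val files = args.map(::File)
    println(cdsCacheKey(System.getProperty("java.version"), files))
}
```

Feed the printed key into your CI cache configuration; when it changes, rerun the dump step and store the new `.jsa` under the new key.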
## Gotchas
Here is the section that will save you hours. Most teams underestimate how Kotlin's idioms interact with checkpoint/restore.
### 1. `lazy` Delegates Restore Stale State
```kotlin
val config: Config by lazy { loadFromSSM() } // Loaded at init, frozen in snapshot
```
After a SnapStart or CRaC restore, that `lazy` value is already initialized — with credentials or config that may have rotated since the snapshot was taken. Use a `ResettableLazy` wrapper or, for CRaC, implement `Resource` to invalidate lazy holders in `afterRestore`.
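`ResettableLazy` is not part of the Kotlin standard library, so here is a minimal sketch of what such a wrapper could look like. `Config` and `loadFromSSM()` are stubbed stand-ins for the snippet above, and the `reset()` call in `main` marks the spot where a restore hook (for example a CRaC `Resource.afterRestore` implementation) would invalidate the value.

```kotlin
import kotlin.properties.ReadOnlyProperty
import kotlin.reflect.KProperty

// A lazy-like delegate whose cached value can be dropped and recomputed.
class ResettableLazy<T : Any>(private val initializer: () -> T) : ReadOnlyProperty<Any?, T> {
    @Volatile private var cached: T? = null

    // Double-checked locking: run the initializer at most once per reset.
    override fun getValue(thisRef: Any?, property: KProperty<*>): T =
        cached ?: synchronized(this) {
            cached ?: initializer().also { cached = it }
        }

    // Call from a restore hook so the next read re-runs the initializer.
    fun reset() = synchronized(this) { cached = null }
}

// Hypothetical stand-ins for the article's Config and loadFromSSM().
data class Config(val version: Int)
var loads = 0
fun loadFromSSM(): Config = Config(++loads)

// Keep a reference to the holder so a restore hook can reach reset().
val configHolder = ResettableLazy { loadFromSSM() }
val config: Config by configHolder

fun main() {
    println(config)        // first read runs the initializer
    println(config)        // served from the cached value
    configHolder.reset()   // what an afterRestore hook would do
    println(config)        // initializer runs again
}
```

The trade-off against plain `lazy` is that you have to keep the holder reachable, so every resettable value needs to be registered somewhere your restore hook can find it.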
### 2. Coroutine Dispatcher Pools Die on Restore
`Dispatchers.Default` and `Dispatchers.IO` maintain thread pools that don't survive a checkpoint cleanly. After restore, threads may be in an undefined state and coroutines silently hang.
```kotlin
// Before checkpoint: warm pool of 64 IO threads
// After restore: pool references dead threads
val result = withContext(Dispatchers.IO) {
    // May hang indefinitely
    fetchData()
}
```
Reinitialize dispatchers post-restore, or use a custom `CoroutineDispatcher` backed by a fresh executor created in `afterRestore`. Neither option is pretty, but a silently hanging Lambda is worse.
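The fresh-executor variant can be sketched with nothing but the JDK. `RestorableExecutor` is a hypothetical name, and the block deliberately leaves out kotlinx.coroutines so it stays self-contained; in a real handler you would wrap the instance once with `asCoroutineDispatcher()` and call `afterRestore()` from your restore hook.

```kotlin
import java.util.concurrent.CountDownLatch
import java.util.concurrent.Executor
import java.util.concurrent.ExecutorService
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit
import java.util.concurrent.atomic.AtomicReference

// An Executor with a stable identity whose backing pool can be replaced after restore.
class RestorableExecutor(private val factory: () -> ExecutorService) : Executor {
    private val current = AtomicReference(factory())

    // Always delegates to whichever pool is current at call time.
    override fun execute(command: Runnable) = current.get().execute(command)

    // Swap in a fresh pool and shut down the one captured in the snapshot.
    fun afterRestore() {
        current.getAndSet(factory()).shutdownNow()
    }

    // Orderly shutdown of the current pool, e.g. at process exit.
    fun shutdown() {
        current.get().shutdown()
    }
}

fun main() {
    val executor = RestorableExecutor { Executors.newFixedThreadPool(2) }
    executor.afterRestore() // simulate the restore hook firing

    val done = CountDownLatch(1)
    executor.execute {
        println("ran on ${Thread.currentThread().name}")
        done.countDown()
    }
    done.await(5, TimeUnit.SECONDS)
    executor.shutdown()
}
```

Because callers only ever hold the wrapper, nothing outside it keeps a reference to the pre-checkpoint threads, which is the property the built-in dispatchers cannot give you.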
### 3. Kotlin Serialization and Reflection Caches
`kotlinx.serialization` builds internal caches of serializer lookups. GraalVM native image needs these registered at build time. Miss one, and you get a runtime failure, typically a `SerializationException` for a serializer that cannot be found or a `ClassNotFoundException`, that only appears in production under specific payload shapes: it passes every test, then explodes on the first unusual request.
## Conclusion
Start with **SnapStart + AppCDS** if you're on AWS Lambda. It's the lowest-effort path to sub-300ms cold starts with zero custom runtime work. But audit every `lazy` delegate and singleton for stale state — this is the part people skip and regret.
Move to **CRaC** if your Kotlin service manages connection pools, coroutine dispatchers, or cached credentials. The explicit lifecycle hooks prevent the class of bugs SnapStart introduces silently.
Reserve **GraalVM native image** for cold-start-critical, framework-light functions where you absolutely need sub-100ms and can commit to maintaining reflection configuration.
The JVM cold start problem isn't unsolvable — it's an engineering tradeoff. Pick the approach that matches your team's operational maturity, not the one with the most impressive benchmark slide.