Siddhant Khare
Turbocharging AWS Lambda: How to eliminate cold starts forever

Imagine rushing to grab your morning coffee, only to find the barista needs to boot up the espresso machine first. That's essentially what happens during a Lambda cold start – and just like your coffee delay, it can be frustrating. But fear not! Today we'll dive deep into how provisioned concurrency can help you serve up those functions piping hot.

Understanding the cold start problem

When a Lambda function hasn't been used recently, AWS needs to spin up a new execution environment before running your code. This initialization process includes:

  1. Downloading your code
  2. Bootstrapping the runtime
  3. Loading your dependencies
  4. Running initialization code
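Of these steps, the last two are largely under your control. A minimal Python sketch (illustrative only; the sleep stands in for heavy imports or connection setup) shows why they matter: module-scope code runs once per execution environment, while the handler body runs on every invocation:

```python
import time
import json

# Module scope runs once per execution environment. This is the work a
# cold start has to pay for before the first invocation can be served.
_init_start = time.perf_counter()
CONFIG = {"table": "orders"}  # stand-in for loading config / opening connections
time.sleep(0.05)              # stand-in for heavy dependency imports
INIT_MS = (time.perf_counter() - _init_start) * 1000

def handler(event, context=None):
    # Per-invocation work only; INIT_MS was paid once, at environment start
    return {"init_ms": INIT_MS, "echo": json.dumps(event)}
```

Every subsequent invocation on the same environment reuses `CONFIG` and skips the initialization cost entirely, which is exactly the behavior provisioned concurrency exploits.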

Here's a simple Node.js function that demonstrates the cold start impact:

const mongoose = require('mongoose');

// This connection happens during cold start
let conn = null;

const connectToDb = async () => {
    if (conn == null) {
        conn = await mongoose.connect(process.env.MONGODB_URI, {
            serverSelectionTimeoutMS: 5000
        });
    }
    return conn;
};

exports.handler = async (event) => {
    // Connection time will impact cold start duration
    await connectToDb();

    // Rest of your handler code...
};

In my testing, this simple function with a database connection could take 800ms-2s to cold start, compared to 10-50ms for warm starts.

Enter provisioned concurrency

Provisioned concurrency is like having a barista who keeps the espresso machine running even during quiet periods. It maintains a pool of pre-initialized execution environments ready to respond instantly to incoming requests.

To enable it using AWS CDK:

const fn = new lambda.Function(this, 'MyFunction', {
    runtime: lambda.Runtime.NODEJS_18_X,
    handler: 'index.handler',
    code: lambda.Code.fromAsset('lambda'),
    // Other configuration...
});

// Provisioned concurrency is configured on a version or alias,
// not on the function itself
const alias = new lambda.Alias(this, 'LiveAlias', {
    aliasName: 'live',
    version: fn.currentVersion,
    provisionedConcurrentExecutions: 5 // Keep 5 instances warm
});

The magic behind the scenes

When you enable provisioned concurrency, AWS does something clever:

  1. Creates the specified number of execution environments
  2. Runs your initialization code
  3. Freezes the environments in a ready state
  4. Maintains this pool, replacing any that become unhealthy

This means your function starts executing almost immediately when triggered, as the heavy lifting has already been done.
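As a toy model of that pool (assumed, illustrative timings, not measured Lambda numbers), consider a single traffic burst hitting a function with some number of pre-initialized environments:

```python
# Assumed, illustrative timings -- not measured Lambda numbers
INIT_MS = 900   # cold-start initialization cost
EXEC_MS = 20    # handler execution time once initialized

def burst_latencies(concurrent_requests, provisioned):
    """Model one traffic burst: the first `provisioned` requests land on
    pre-initialized environments and skip init; the rest pay a cold start."""
    return [
        EXEC_MS if i < provisioned else INIT_MS + EXEC_MS
        for i in range(concurrent_requests)
    ]
```

With 10 concurrent requests and 5 provisioned environments, half the burst responds in 20 ms and half in 920 ms; provisioning 10 removes cold starts from the burst entirely. This is why sizing the pool to your real concurrency matters.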

Best practices and optimization tips

1. Smart initialization code

Move as much initialization as possible into global scope:

// Good: Done once during provisioned concurrency initialization
const AWS = require('aws-sdk');
const client = new AWS.DynamoDB.DocumentClient();
const tableName = process.env.TABLE_NAME;

// Bad: Would run on every invocation
exports.handler = async (event) => {
    const client = new AWS.DynamoDB.DocumentClient();
    // ...
};

2. Precise concurrency levels

Monitor your function's concurrent executions using CloudWatch metrics like ConcurrentExecutions and adjust provisioned concurrency accordingly. Over-provisioning wastes money, while under-provisioning leads to cold starts.
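As a rough sizing sketch (Little's law, not an official AWS formula): steady-state concurrency is approximately the request rate multiplied by the average duration, padded with some headroom for bursts:

```python
import math

def estimate_provisioned_concurrency(requests_per_second, avg_duration_s,
                                     headroom=1.2):
    """Rough sizing via Little's law: concurrency ~= arrival rate x duration.
    Headroom pads for bursts; validate the result against the
    ConcurrentExecutions metric before committing to a number."""
    return math.ceil(requests_per_second * avg_duration_s * headroom)
```

For example, 10 requests/second at 1 s average duration with 50% headroom suggests 15 pre-warmed environments. Treat this as a starting point and let CloudWatch data refine it.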

3. Using application auto-scaling

Set up auto-scaling to automatically adjust provisioned concurrency based on utilization:

// addAutoScaling is exposed on the alias that has provisioned
// concurrency configured
const target = alias.addAutoScaling({
    minCapacity: 2,
    maxCapacity: 10
});

target.scaleOnUtilization({
    utilizationTarget: 0.75,
    scaleInCooldown: Duration.seconds(60),
    scaleOutCooldown: Duration.seconds(60)
});

Cost considerations

Provisioned concurrency isn't free – you pay for:

  • The time your provisioned instances are available
  • The compute time used during function execution
  • Any additional instances that spin up beyond your provisioned amount

A practical approach is to:

  1. Identify functions that are latency-sensitive
  2. Monitor their actual usage patterns
  3. Apply provisioned concurrency selectively
  4. Use scheduled provisioned concurrency for predictable load patterns
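To make the trade-off concrete, here's a back-of-the-envelope estimator. The rates below are illustrative placeholders based on published us-east-1 GB-second pricing at the time of writing; always check the current Lambda pricing page for your region:

```python
# Illustrative GB-second rates, NOT guaranteed current; verify against
# the AWS Lambda pricing page for your region before relying on them
PROVISIONED_RATE = 0.0000041667  # charged while capacity is provisioned
COMPUTE_RATE = 0.0000097222      # charged for execution on that capacity

def monthly_provisioned_cost(instances, memory_gb, hours=730,
                             exec_gb_seconds=0):
    """Rough monthly cost: availability charge plus execution charge."""
    availability = instances * memory_gb * hours * 3600 * PROVISIONED_RATE
    execution = exec_gb_seconds * COMPUTE_RATE
    return round(availability + execution, 2)
```

Five always-on 1 GB environments come to roughly $55/month at these rates before any execution time, which is exactly why applying provisioned concurrency selectively, rather than fleet-wide, pays off.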

Measuring the impact

I built a simple testing framework to measure the difference:

import concurrent.futures
import requests
import time
import statistics

def invoke_function(url):
    start = time.time()
    response = requests.post(url)
    return time.time() - start

def run_load_test(url, concurrent_requests):
    with concurrent.futures.ThreadPoolExecutor(max_workers=concurrent_requests) as executor:
        futures = [executor.submit(invoke_function, url) for _ in range(concurrent_requests)]
        times = [f.result() for f in concurrent.futures.as_completed(futures)]

    return {
        'avg': statistics.mean(times),
        'p95': statistics.quantiles(times, n=20)[-1],  # 95th percentile
        'max': max(times)
    }

lambda_url = "https://<your-function-url>"  # function URL or API Gateway endpoint

# Results with provisioned concurrency disabled
print("Without PC:", run_load_test(lambda_url, 100))

# Results with provisioned concurrency enabled
print("With PC:", run_load_test(lambda_url, 100))

Real-world results

In production environments, I've seen:

  • Cold starts reduced from 1-2s to under 100ms
  • P95 latency improved by 80%
  • More consistent performance during traffic spikes

Beyond provisioned concurrency

While provisioned concurrency is powerful, consider these complementary strategies:

  • Using smaller dependencies to reduce initialization time
  • Implementing connection pooling for databases
  • Leveraging Lambda SnapStart for Java functions
  • Using external caching services for frequently accessed data

Conclusion

Provisioned concurrency is a powerful tool for reducing Lambda cold starts, but it requires careful planning and monitoring to be used effectively.

Remember: like that perfect cup of coffee, the key is finding the right balance – between performance, cost, and complexity – that works for your specific use case.


For more tips and insights, follow me on Twitter @Siddhant_K_code and stay updated with the latest & detailed tech content like this.
