Renato Byrro

Posted on Aug 20, 2019

6 Keys to Managing Database Load in AWS Lambda

#serverless #lambda #database #scalability

Database connections can sometimes be overlooked and done wrong. For serverless developers, we just wish there was a way as simple as Lambda itself for DB connection management.

First things first

Since a serverless architecture is quite different from what we were used to, there are some important considerations to take.

The first thing we need to consider is scalability. By default, AWS Lambda can scale to 1,000 concurrent requests in a matter of milliseconds. And it’s possible to lift this limit by asking AWS. This has important impacts on DB connection management, for example.

Stretch or jam

Is our database infrastructure capable of coping with this demand? What happens if DB becomes a bottleneck, how will our application behave? Is it going to lose data, or does it have some sort of "emergency plan" coded in its logic? Does it respond to end-user requests gracefully in case of DB failure?

Think of Lambda as a wide highway capable of absorbing huge amounts of traffic, from cars to heavy trucks to buses. We shouldn’t redirect all this traffic into a narrow neighborhood road, it will cause major traffic jams.

It can be so bad that it falls into gridlock state and no one can get home, like in the intersection below.

This could happen in the event of DB errors triggering Lambda auto-retries, which can escalate in a bad way.

It is important to know how much load our DB can handle and monitor Lambda closely. AWS CloudWatch metrics can be very helpful. For professional apps running in production, third party monitoring services can come in handy for more in-depth metrics and tight control, such as Dashbird, DataDog, and New Relic.

One size fits… only one!

There's no one-size-fits-all here. The way we approach the alignment of scalability between Lambda and databases will depend on the application and the storage system in place.

Consider an app that has a large number of users generating small chunks of data occasionally and a fewer number of big customers that are eagerly generating a sheer amount of data every second.

We set up our database infrastructure for distributed read/write requests and seamless horizontal scaling. Nice and sweet.

The database may be split across multiple servers (call it partitions), but we may not be able to distribute the load evenly across all of them. Let’s say we distribute load across partitions based on the customer. Customers 1 to 10 will end up in Partition-1, Customers 11 to 20 in Partition-2, etc (oversimplification for illustration purposes).

When a big customer generates data to be processed, everything will be funneled into a single partition, which may get overburdened.

While designing our processing and storing logic, we must take into account that the distribution of data per customer is skewed. The easiest way would be to allocate capacity per customer in advance.

For instance, if a given customer manifests it needs up to X TB of data per second, we can calculate in advance how many servers will be needed to accommodate the demand and set up dedicated partitions.

Eternity is not in DB connection vocabulary

Every DB server will have some sort of limit into how many connections it can have opened at any single point in time. Having idle connections opened for eternity is obviously a bad idea since connection slots are scarce.

In a traditional infrastructure, where we have one or multiple servers, each handling a considerable portion of the load, we can distribute a certain number of connections in each server, which are shared to serve multiple requesters.

Since our servers have a long lifespan and we are in full control, we can more easily manage the DB connections in a performant way.

In Lambda, things are quite different. We have very limited control over container management, not to mention the underlying servers running them. It's even impossible to control whether two Lambda instances are running on the same server.

If we're following the Lambda distributed model appropriately, each Lambda instance will be taking care of a single request. This means: inside a given Lambda container, we can’t have a pool of DB connections to share among a multitude of requesters.

One might say: "I can open a connection outside my Lambda function handler and leave it open for the next invocations".

It sounds interesting. We reduce overall latency by reusing connections. But that is a bad idea. Two questions: if we open connections outside the Lambda function handler, how do we control...

...when and how connections are closed?
...how many connections are opened and where they’re being used?

The answer is: we don't. And that's not a good way of managing DB connections.

A Lambda container might remain idle for several minutes, up to a few hours after serving an invocation. When an invocation comes in and a DB connection is opened outside the handler, it will remain open with the container for a variable period of time, and we can’t enforce a performant way of reusing that connection.

The best practice for Lambda is to open connections inside the function handler and close them as soon as the job is done, before terminating the invocation. Opening a connection at every invocation adds up latency, but there’s no better way of managing it.

When timeout does not timeout

One of the Lambda limitations is the execution time. Say a function is limited to 10 seconds. You open a DB connection, but for some reason, the job is taking a lot longer than normal to process. Ten seconds pass and the execution is halted due to the Lambda limit, and the DB connection may remain open.

The Lambda timeout will not immediately and automatically timeout the DB connection!

Now consider the issue persists and the timeout occurs frequently for a given period of time. The database server may get quickly overburdened with multiple idle connections, making it difficult or even impossible for other Lambda invocations to access it until those connections are closed.

To avoid this issue, we could use the Lambda context information to monitor the number of milliseconds left before the execution times out:

For each Lambda invocation, launch a separate thread running concurrently with our code;
Check the remaining time every few seconds;
When it is too close to zero (meaning, the function is about to timeout), it will preventively close the DB connection before the invocation is halted;

Illustrating the concept in simplified terms:

import logging
import time
import threading

# Dummy database objects
from database import (
    DatabaseConnection,
    DatabaseException,
    db_host,
)
import handler_executor


def handler(event, context):
    try:
        # Open a connection
        db_conn = DatabaseConn(**db_host)

        # Start a background thread to monitor risk of timeout
        timeout_thread = threading.Thread(
            target=monitor_timeout,
            args=(context, db_conn),
            daemon=True
        )

        response = handler_executor(event, context, db_conn)

    except DatabaseException as error:
        logging.exception(error)

        response = {'status': 500, 'error': type(error).__name__}

    finally:
        # Make sure the connection is finally closed, if still open
        if 'db_conn' in locals() and isinstance(db_conn, DatabaseConn) and \
                db_conn.is_open():
            db_conn.close()

        return response


def monitor_timeout(context, db_conn):
    # Check remaining time every two seconds
    while context.get_remaining_time_in_millis() >= 2500:
        time.sleep(2)

    # System will exit the loop if the remaining time < 2500 ms,
    # in which case we close the connection preventively
    if db_conn.is_open():
        db_conn.close()

Connection-less Databases

Perhaps mixing a serverless compute engine, such as Lambda, with server-based storage systems is not the best approach. It will always be difficult to have scalability well aligned.

To our rescue, there are good serverless, API-based storage systems nowadays. We don't need to manage servers, nor establish a connection before firing data queries. API Databases will accept queries right upfront through HTTP endpoints, for example.

AWS has been doing a great job in this area, having serverless databases in Graph, SQL and No-SQL flavors.

DynamoDB is the oldest of the options: a No-SQL, key-value store database. It had some scalability issues in the past that draw negative attention to it, but they are a thing of the past. Currently, one can easily scale Dynamo to 40,000 concurrent requests, which should be enough for most Lambda applications. With some clever DB architecture designing, DynamoDB can scale to virtual infinity. I’ve personally used it in many projects with great success.

For fans of SQL, Aurora Serverless now supports API queries through the Data API. Unfortunately, only MySQL is supported at the time of this writing, but the Aurora team is certainly working to release Postgres support in the near future.

In the Graph space, we have Neptune and Cloud Directory. Graphs aren't exactly the purpose of the last one but it can be used for this purpose.

All these API-based databases align well with the Lambda model: they are fully managed, scale rapidly to accommodate shifting demand, don’t require connection overhead.

Top comments (1)

Sharifuzzaman Nakib • Feb 28 '21

I am new to the serverless world and I might be wrong but I think opening and closing connections for each invocation is not always good because of the latency of creating a new connection every time. It will be a nightmare for a high number of concurrent requests.

The question you raised can be solved in a better way.

Your first question was.
...when and how connections are closed?

Since we are reusing the connection then they don't need to be closed. Right? Because closing means we need to open another connection later. But we want to reuse the opened connection as many times as possible, that's why we don't close the connection on the container.
But on the database, we can set a max-idle-time after which idle connections will be closed automatically or on some databases, we may create a corn job to do this. This way there will be no idle connection for too much time and the connections will be reused as more as possible.

Your second question was,
...how many connections are opened and where they’re being used?

It depends on the number of serverless functions running at that moment which requires a database connection.
However, we don't need to know exactly how many connections are opened. We just need to manage how many concurrent serverless functions may run at a time and set the limit of database connections accordingly.
We don't need to know where the connections are used. Each serverless function will reuse them as more as possible but if any connections are idle for too much time then the Database will close them.

This will be a huge performance optimization for situations where the number of concurrent requests which triggers serverless function is high since for each connection we don't need to create a new database connection.

Please explain if I am wrong.