How to Vertically Scale Your Amazon RDS Instance

#aws #postgres #autoscaling #lambda

Vertical scaling, often referred to as scaling up, involves increasing the capacity of a single database instance by adding more resources such as CPU, memory, and storage. This approach contrasts with horizontal scaling (or scaling out), which adds more instances to distribute the load.

AWS provides built-in horizontal auto scaling for Aurora, but there is no direct way to auto scale the instance class of RDS for PostgreSQL. Vertical scaling can still be achieved using CloudWatch alarms and a Lambda function.

Vertical scaling can be cost-effective in certain scenarios, particularly when compared to the alternative of horizontal scaling or maintaining an oversized infrastructure.

CloudWatch alarms

First, we need to create a CloudWatch alarm on the metric we want to scale on. In our case we used the DBLoadCPU metric.

aws cloudwatch put-metric-alarm \
    --alarm-name db-rds-postgres-high-alarm \
    --metric-name DBLoadCPU \
    --namespace AWS/RDS \
    --statistic Maximum \
    --period 180 \
    --threshold 3 \
    --comparison-operator GreaterThanThreshold \
    --dimensions Name=DBInstanceIdentifier,Value=db-identifier-name \
    --evaluation-periods 1 \
    --alarm-actions arn:aws:lambda:***:*****:function:lambda-arn
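If you prefer to create the alarm from Python, the same request can be sketched with boto3's `put_metric_alarm`. This is a minimal sketch, not the method used in the post: the helper names `build_high_alarm_params` and `create_alarm` are hypothetical, and the Lambda ARN is passed in because it is masked in the command above.

```python
def build_high_alarm_params(db_instance_id, lambda_arn):
    """Build put_metric_alarm keyword arguments mirroring the CLI command above."""
    return {
        "AlarmName": "db-rds-postgres-high-alarm",
        "MetricName": "DBLoadCPU",
        "Namespace": "AWS/RDS",
        "Statistic": "Maximum",
        "Period": 180,
        "Threshold": 3.0,
        "ComparisonOperator": "GreaterThanThreshold",
        "Dimensions": [
            {"Name": "DBInstanceIdentifier", "Value": db_instance_id},
        ],
        "EvaluationPeriods": 1,
        "AlarmActions": [lambda_arn],
    }


def create_alarm(cloudwatch_client, params):
    """Send the request; cloudwatch_client is expected to be boto3.client('cloudwatch')."""
    cloudwatch_client.put_metric_alarm(**params)


# Usage (needs boto3 and AWS credentials, so it is left as a comment):
#   import boto3
#   params = build_high_alarm_params("db-identifier-name", "<your Lambda function ARN>")
#   create_alarm(boto3.client("cloudwatch"), params)
```

Keeping the parameters in a plain dictionary makes it easy to review or log the request before it is sent.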

We can create multiple alarms for the same DB instance, for example a second alarm that triggers a scale-down when the metric data points fall below a lower threshold.

aws cloudwatch put-metric-alarm \
    --alarm-name db-rds-postgres-low-alarm \
    ...
    --period 180 \
    --threshold 1 \
    --comparison-operator LessThanThreshold \
    ...
    --alarm-actions arn:aws:lambda:***:*****:function:lambda-arn
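The two thresholds leave a band between 1 and 3 in which neither alarm fires, which helps avoid flapping between instance sizes. The decision the two alarms encode together can be sketched as a small pure function; the name `scaling_action` and the returned labels are hypothetical, and only the thresholds come from the commands above.

```python
HIGH_THRESHOLD = 3.0  # from db-rds-postgres-high-alarm (GreaterThanThreshold)
LOW_THRESHOLD = 1.0   # from db-rds-postgres-low-alarm (LessThanThreshold)


def scaling_action(db_load_cpu):
    """Return which action the alarm pair would request for one datapoint."""
    if db_load_cpu > HIGH_THRESHOLD:
        return "scale_up"
    if db_load_cpu < LOW_THRESHOLD:
        return "scale_down"
    # Values from 1 to 3 inclusive trigger neither alarm
    return "hold"
```

A wider gap between the two thresholds makes repeated resizes less likely, at the cost of staying on the larger instance for longer.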

Lambda function

We can use a Lambda function to update the database instance. The auto scaling script can be written in JavaScript or Python; the example below uses Python.

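import json

import boto3

# Initialize the RDS client once so warm invocations can reuse it
rds_client = boto3.client('rds')

def lambda_handler(event, context):

    db_instance_id = 'db-identifier-name'
    try:
        response = rds_client.describe_db_instances(DBInstanceIdentifier=db_instance_id)

        db_instance = response['DBInstances'][0]
        db_status = db_instance['DBInstanceStatus']
        current_instance_class = db_instance['DBInstanceClass']

        # Skip if the instance is already being modified, backed up, or rebooted
        if db_status != 'available':
            return { 'message': 'database is not in available state.' }

        # reasonData is a JSON string; recentDatapoints holds the values that triggered the alarm
        reason_data = json.loads(event['alarmData']['state']['reasonData'])
        cpu_load = reason_data['recentDatapoints'][0]

        if cpu_load > 3:
            new_instance_class = 'db.t4g.large'
        else:
            # Below the threshold: keep the current class.
            # To scale down, set this to the smaller class you want instead.
            new_instance_class = current_instance_class

        # Nothing to do if the instance is already on the target class
        if new_instance_class == current_instance_class:
            return { 'message': 'instance class unchanged.' }

        rds_client.modify_db_instance(
            DBInstanceIdentifier=db_instance_id,
            DBInstanceClass=new_instance_class,
            ApplyImmediately=True
        )

        return { 'message': 'instance modification started.' }

    except Exception as e:
        return { 'message': f"Failed to scale RDS instance: {str(e)}" }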
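Because both alarms invoke the same function, the handler can also decide the direction from the alarm that fired instead of a hard-coded threshold. The following is a rough sketch under stated assumptions: `INSTANCE_LADDER` is a hypothetical list of classes (the post only names db.t4g.large), the alarm-name suffixes follow the names used in this post, and the helper names are made up for illustration.

```python
import json

# Assumption: an ordered list of instance classes, smallest first.
# Replace it with the classes that make sense for your workload.
INSTANCE_LADDER = ["db.t4g.medium", "db.t4g.large", "db.t4g.xlarge"]


def parse_alarm_event(event):
    """Extract the fields the scaling decision needs from a CloudWatch alarm event."""
    alarm_data = event["alarmData"]
    state = alarm_data["state"]
    reason_data = json.loads(state["reasonData"])  # reasonData is a JSON string
    return {
        "alarm_name": alarm_data["alarmName"],
        "state": state["value"],
        "latest_datapoint": reason_data["recentDatapoints"][0],
    }


def next_instance_class(current_class, alarm_name, state):
    """Move one step up or down the ladder depending on which alarm fired."""
    if state != "ALARM" or current_class not in INSTANCE_LADDER:
        return current_class
    index = INSTANCE_LADDER.index(current_class)
    if alarm_name.endswith("-high-alarm"):
        index = min(index + 1, len(INSTANCE_LADDER) - 1)
    elif alarm_name.endswith("-low-alarm"):
        index = max(index - 1, 0)
    return INSTANCE_LADDER[index]
```

Stepping one class at a time keeps each change small; the function returns the current class unchanged when the instance is already at either end of the ladder.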

Resource-based Policy statement

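We also have to add a resource-based policy statement to the Lambda function so that the CloudWatch alarm is allowed to invoke it.
We can use the AWS CLI to attach this statement. The command below covers the high alarm; repeat it with a different statement ID and source ARN for each additional alarm.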

aws lambda add-permission \
--function-name prod-postgres-rds-asg \
--statement-id prod-postgres-rds-high-policy \
--action 'lambda:InvokeFunction' \
--principal lambda.alarms.cloudwatch.amazonaws.com \
--source-account ******** \
--source-arn arn:aws:cloudwatch:***:******:alarm:db-rds-postgres-high-alarm
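Since each alarm needs its own statement, the request can be generated per alarm. Here is a minimal sketch using the parameter names of boto3's Lambda `add_permission` call; the helper name, the statement ID format, and the region and account values in the usage comment are assumptions for illustration.

```python
def build_invoke_permission(function_name, alarm_name, region, account_id):
    """Build add_permission keyword arguments that let one alarm invoke the function."""
    return {
        "FunctionName": function_name,
        "StatementId": f"{alarm_name}-invoke",
        "Action": "lambda:InvokeFunction",
        "Principal": "lambda.alarms.cloudwatch.amazonaws.com",
        "SourceAccount": account_id,
        "SourceArn": f"arn:aws:cloudwatch:{region}:{account_id}:alarm:{alarm_name}",
    }


# Usage (needs boto3 and AWS credentials, so it is left as a comment):
#   import boto3
#   lambda_client = boto3.client("lambda")
#   for name in ("db-rds-postgres-high-alarm", "db-rds-postgres-low-alarm"):
#       lambda_client.add_permission(
#           **build_invoke_permission("prod-postgres-rds-asg", name, "<region>", "<account-id>")
#       )
```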

Lambda Role and Permissions

The Lambda function also needs permission to read the details of the DB instance and to modify it. To grant this, edit the role assigned to the Lambda function and add the JSON statement below to the role's policy. The Resource should be the ARN of the instance you are scaling.

 {
   "Effect": "Allow",
   "Action": ["rds:DescribeDBInstances", "rds:ModifyDBInstance"],
   "Resource": "arn:aws:rds:**:*****:db:db-prod-replica-1"
 }
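The statement above is only one entry of a policy; IAM expects it wrapped in a policy document with a `Version` and a `Statement` list. A small sketch of that wrapper in Python follows, with a hypothetical helper name and with role and policy names in the usage comment that are placeholders, not values from this post.

```python
import json


def build_rds_scaling_policy(db_instance_arn):
    """Wrap the RDS statement in a complete IAM policy document."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["rds:DescribeDBInstances", "rds:ModifyDBInstance"],
                "Resource": db_instance_arn,
            }
        ],
    }


def policy_as_json(db_instance_arn):
    """Serialize the policy document to the JSON string IAM expects."""
    return json.dumps(build_rds_scaling_policy(db_instance_arn), indent=2)


# Usage (needs boto3 and AWS credentials, so it is left as a comment):
#   import boto3
#   boto3.client("iam").put_role_policy(
#       RoleName="<lambda-role-name>",
#       PolicyName="<policy-name>",
#       PolicyDocument=policy_as_json("<db instance ARN>"),
#   )
```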

That's it: we have implemented vertical scaling for an RDS PostgreSQL database.
To verify the implementation, you can put load on the database by running many SQL queries. In our case we called the same API repeatedly using the script below.

const axios = require('axios');

const apiEndpoints = [
  'Replace with your APP API URL'
];

const token = 'secret-token';
// Define the function to call all APIs
async function callApis() {
  try {
    const requests = apiEndpoints.map(url => axios.get(url, { headers: {"Authorization" : `Bearer ${token}`} }));
    const responses = await Promise.all(requests);

    responses.forEach((response, index) => {
      console.log(`Response from ${apiEndpoints[index]}`);
    });
  } catch (error) {
    console.error('Error calling APIs:', error);
  }
}
const intervalId = setInterval(callApis, 500)
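While the load script runs, it helps to watch the instance until the resize has finished. The sketch below polls `describe_db_instances` and returns how long the wait took. The function name, the poll interval, and the timeout are assumptions. The check on `PendingModifiedValues` is there because the status may still read available for a short time after the modification is requested.

```python
import time


def wait_until_available(rds_client, db_instance_id, poll_seconds=30,
                         timeout_seconds=1800, sleep=time.sleep):
    """Poll the instance until it is available with no pending changes.

    rds_client is expected to be boto3.client('rds').
    Returns the elapsed time in seconds.
    """
    started = time.monotonic()
    while True:
        response = rds_client.describe_db_instances(DBInstanceIdentifier=db_instance_id)
        instance = response["DBInstances"][0]
        status = instance["DBInstanceStatus"]
        pending = instance.get("PendingModifiedValues") or {}
        elapsed = time.monotonic() - started
        print(f"{elapsed:7.1f}s status={status} class={instance.get('DBInstanceClass')}")
        if status == "available" and not pending:
            return elapsed
        if elapsed >= timeout_seconds:
            raise TimeoutError(f"{db_instance_id} not available after {timeout_seconds}s")
        sleep(poll_seconds)
```

Run it right after the alarm fires to see how long the resize takes in your environment.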

Summary

In this post, we explored the implementation of Vertical Auto Scaling for an RDS PostgreSQL database. While this approach offers some advantages, it also has notable drawbacks, especially when dealing with sudden and unpredictable spikes in load. Vertical scaling can take approximately 10 minutes to update the instance size and become fully operational, which may not be fast enough to handle rapid fluctuations in demand.

Additionally, it's important that the database is configured as a Multi-AZ (Availability Zone) deployment. Changing the instance class requires a restart, so a Single-AZ instance is unavailable while it is being resized. With a Multi-AZ deployment, the interruption is typically limited to the time it takes to fail over to the standby, rather than the full duration of the resize.
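One way to enforce this is a pre-check on the instance description before calling `modify_db_instance`. This is a sketch only: the function name and messages are hypothetical, and it assumes the dictionary comes from `describe_db_instances`, where `MultiAZ` is a boolean field.

```python
def resize_precheck(instance):
    """Return a list of reasons why an automatic resize should be skipped.

    instance is one entry of response['DBInstances'] from describe_db_instances.
    An empty list means no blocking condition was found.
    """
    problems = []
    if instance.get("DBInstanceStatus") != "available":
        problems.append("instance is not in the available state")
    if not instance.get("MultiAZ", False):
        problems.append("instance is not Multi-AZ, so a resize causes downtime")
    return problems
```

The handler could return early and log these reasons instead of resizing a Single-AZ instance during peak traffic.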
