<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Manoj Chaurasiya</title>
    <description>The latest articles on DEV Community by Manoj Chaurasiya (@manojchrsya).</description>
    <link>https://dev.to/manojchrsya</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F304403%2Feaaba3de-177f-4829-a5ac-0b7c2c2e802c.jpg</url>
      <title>DEV Community: Manoj Chaurasiya</title>
      <link>https://dev.to/manojchrsya</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/manojchrsya"/>
    <language>en</language>
    <item>
      <title>How to Horizontally Scale Amazon ElastiCache Instance</title>
      <dc:creator>Manoj Chaurasiya</dc:creator>
      <pubDate>Sat, 14 Sep 2024 07:16:09 +0000</pubDate>
      <link>https://dev.to/manojchrsya/how-to-horizontally-scale-amazon-elasticache-instance-eg7</link>
      <guid>https://dev.to/manojchrsya/how-to-horizontally-scale-amazon-elasticache-instance-eg7</guid>
      <description>&lt;p&gt;Amazon ElastiCache provides in-memory data storage using Redis or Memcached, and scaling involves either vertical (instance size) or horizontal scaling (adding nodes or replicas). Horizontal scaling focuses on distributing the data and increasing capacity by adding more nodes to your ElastiCache cluster.&lt;/p&gt;

&lt;p&gt;Redis Cluster Mode Enabled supports sharding out of the box, and this is the recommended way to horizontally scale Redis.&lt;/p&gt;

&lt;p&gt;Let's start with horizontal scaling for an ElastiCache instance. We can scale either by adding shards or by adding nodes; here we will scale with nodes. The AWS console can create dynamic scaling for an ElastiCache cluster, but it supports only the &lt;code&gt;CPU Utilization&lt;/code&gt; metric.&lt;/p&gt;

&lt;p&gt;If we want dynamic scaling based on a different metric, the scaling policy has to be created through the AWS CLI.&lt;/p&gt;

&lt;h4&gt;
  
  
  Verify Metric Statistics
&lt;/h4&gt;

&lt;p&gt;Before going further, let's cross-check that we are getting proper statistics for the primary node; in our case we use the primary node's metrics to set the auto scaling threshold. Choose the metric name that matches how your Redis cluster is used.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws cloudwatch get-metric-statistics \
  --namespace AWS/ElastiCache \
  --metric-name NetworkBytesOut \
  --dimensions Name=CacheClusterId,Value=primary-node-0001-001 \
  --start-time 2024-09-13T00:00:00Z \
  --end-time 2024-09-13T23:59:59Z \
  --period 300 \
  --statistics Average
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
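&lt;p&gt;The response contains a &lt;code&gt;Datapoints&lt;/code&gt; list. As a quick sanity check before picking a threshold, a small helper like the one below (a sketch; field names follow the CloudWatch response shape) can surface the busiest period:&lt;/p&gt;

```python
def peak_average(datapoints):
    """Return the highest 'Average' value from a get-metric-statistics
    response, useful for sanity-checking a scaling threshold against
    real traffic."""
    return max(point["Average"] for point in datapoints)
```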



&lt;h4&gt;
  
  
  Register Scalable Target
&lt;/h4&gt;

&lt;p&gt;By registering the cluster as a scalable target, you set lower and upper bounds on the replica count; here we allow between 1 and 5 replicas (the right range depends on your cluster's load). Based on the load and the defined scaling policies (like &lt;code&gt;NetworkBytesOut&lt;/code&gt;), Application Auto Scaling will automatically increase or decrease the number of replicas within those limits.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  aws application-autoscaling register-scalable-target \
    --service-namespace elasticache \
    --resource-id replication-group/cluster-name-1 \
    --scalable-dimension elasticache:replication-group:Replicas \
    --min-capacity 1 \
    --max-capacity 5
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Auto Scaling Config File
&lt;/h4&gt;

&lt;p&gt;Here's an example configuration file for ElastiCache auto scaling with Application Auto Scaling. It defines the CloudWatch metric used for scaling (&lt;code&gt;NetworkBytesOut&lt;/code&gt;) and the dimensions identifying the primary node. The &lt;code&gt;TargetValue&lt;/code&gt; is expressed in the metric's own unit, here bytes (7000000000 is roughly 7 GB). The auto scaling policy uses CloudWatch data points to adjust the number of replicas based on usage patterns. Let's store this configuration in &lt;code&gt;config.json&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "TargetValue": 7000000000,
  "CustomizedMetricSpecification":
  {
    "MetricName": "NetworkBytesOut",
    "Namespace": "AWS/ElastiCache",
    "Dimensions": [
      {"Name": "CacheClusterId","Value":"primary-node-0001-001"},
      {"Name": "CacheNodeId","Value": "0001"}
    ],
    "Statistic": "Average"
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
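&lt;p&gt;Conceptually, target tracking moves capacity in proportion to how far the observed metric is from &lt;code&gt;TargetValue&lt;/code&gt;. A rough, illustrative sketch of that calculation (not the exact AWS algorithm):&lt;/p&gt;

```python
import math

TARGET_VALUE = 7_000_000_000  # bytes, same as TargetValue in config.json

def desired_replicas(current_replicas, avg_network_bytes_out,
                     min_capacity=1, max_capacity=5):
    """Approximate the replica count target tracking converges toward:
    scale the current capacity by the ratio of observed metric to target,
    then clamp to the registered min/max bounds."""
    ratio = avg_network_bytes_out / TARGET_VALUE
    desired = math.ceil(current_replicas * ratio)
    return max(min_capacity, min(desired, max_capacity))
```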



&lt;h4&gt;
  
  
  Scaling Policy
&lt;/h4&gt;

&lt;p&gt;The scaling policy defines the conditions under which the cluster should scale in (remove nodes) or scale out (add nodes). These conditions are based on metrics collected by Amazon CloudWatch (like CPU or network usage); in our case it is &lt;code&gt;NetworkBytesOut&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws application-autoscaling put-scaling-policy \
    --policy-name elastic-cache-asg \
    --policy-type TargetTrackingScaling \
    --resource-id replication-group/cluster-name-1 \
    --service-namespace elasticache \
    --scalable-dimension elasticache:replication-group:Replicas \
    --target-tracking-scaling-policy-configuration file://config.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After you configure the scaling policy, two CloudWatch alarms will automatically be created: HighAlarm and LowAlarm. These alarms monitor the selected metric (in our case &lt;code&gt;NetworkBytesOut&lt;/code&gt;) and trigger scaling actions when thresholds are crossed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;HighAlarm&lt;/strong&gt;: This alarm triggers when the metric exceeds the upper threshold, resulting in a scale-out action to add more nodes to the cluster, improving performance.&lt;br&gt;
   &lt;strong&gt;LowAlarm&lt;/strong&gt;: This alarm triggers when the metric drops below the lower threshold, resulting in a scale-in action to remove unnecessary nodes, optimizing resource usage.&lt;/p&gt;

&lt;p&gt;The alarms ensure that the number of nodes in the ElastiCache cluster is adjusted dynamically, maintaining both performance and cost-efficiency. You can monitor these alarms in CloudWatch to confirm that scaling actions are taking place as expected.&lt;/p&gt;
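&lt;p&gt;In simplified form, the two alarms turn recent datapoints into scaling decisions. The sketch below is illustrative only; the real alarms use the evaluation periods and thresholds that Application Auto Scaling generates for you:&lt;/p&gt;

```python
def scaling_action(datapoints, high_threshold, low_threshold,
                   evaluation_periods=3):
    """Mimic, in simplified form, how the generated alarms drive scaling:
    scale out when the last N datapoints all breach the high threshold,
    scale in when they all stay under the low threshold."""
    recent = datapoints[-evaluation_periods:]
    if len(recent) != evaluation_periods:  # not enough data yet
        return None
    if all(point > high_threshold for point in recent):
        return "scale-out"
    if all(low_threshold > point for point in recent):
        return "scale-in"
    return None
```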

&lt;h4&gt;
  
  
  Summary
&lt;/h4&gt;

&lt;p&gt;In this post, we explored the implementation of horizontal auto scaling for an ElastiCache Redis cluster. One important point to note is that ElastiCache supports auto scaling only on larger node types: sizes large and above in the M and R instance families (for example, m7g.large). If you're using smaller node types, such as t3 or m6g.medium, auto scaling will not be available, so make sure your cluster runs on an eligible node type before setting this up.&lt;/p&gt;
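&lt;p&gt;If you want to check eligibility programmatically, a small helper can encode that restriction. The eligible families and sizes below are an assumption based on the note above; consult the AWS documentation for the authoritative list:&lt;/p&gt;

```python
# Assumed eligibility rule: M- or R-family nodes, size large or above.
ELIGIBLE_FAMILY_PREFIXES = ("m", "r")

def supports_autoscaling(node_type):
    """Return True if a node type like 'cache.m7g.large' looks eligible
    for ElastiCache auto scaling under the rule stated above."""
    parts = node_type.split(".")
    if len(parts) != 3 or parts[0] != "cache":
        return False
    family, size = parts[1], parts[2]
    # 'large', 'xlarge', '2xlarge', ... all end with 'large'
    return family.startswith(ELIGIBLE_FAMILY_PREFIXES) and size.endswith("large")
```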

</description>
      <category>aws</category>
      <category>elasticache</category>
      <category>autoscaling</category>
    </item>
    <item>
      <title>How to Vertically Scale Your Amazon RDS Instance</title>
      <dc:creator>Manoj Chaurasiya</dc:creator>
      <pubDate>Sat, 07 Sep 2024 10:25:34 +0000</pubDate>
      <link>https://dev.to/manojchrsya/how-to-vertically-scale-your-amazon-rds-instance-3f2e</link>
      <guid>https://dev.to/manojchrsya/how-to-vertically-scale-your-amazon-rds-instance-3f2e</guid>
      <description>&lt;p&gt;Vertical scaling, often referred to as scaling up, involves increasing the capacity of a single database instance by adding more resources such as CPU, memory, and storage. This approach contrasts with horizontal scaling (or scaling out), which adds more instances to distribute the load.&lt;/p&gt;

&lt;p&gt;AWS offers built-in horizontal scaling for Aurora, but there is no direct auto scaling feature for RDS PostgreSQL. Vertical scaling can, however, be achieved using CloudWatch and Lambda.&lt;/p&gt;

&lt;p&gt;Vertical scaling can be cost-effective in certain scenarios, particularly when compared to the alternative of horizontal scaling or maintaining an oversized infrastructure.&lt;/p&gt;

&lt;h3&gt;
  
  
CloudWatch Alarms
&lt;/h3&gt;

&lt;p&gt;First we need to create a CloudWatch alarm on the required metric; in our case we have used the &lt;code&gt;DBLoadCPU&lt;/code&gt; metric.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws cloudwatch put-metric-alarm \
    --alarm-name db-rds-postgres-high-alarm \
    --metric-name DBLoadCPU \
    --namespace AWS/RDS \
    --statistic Maximum \
    --period 180 \
    --threshold 3 \
    --comparison-operator GreaterThanThreshold \
    --dimensions Name=DBInstanceIdentifier,Value=db-identifier-name \
    --evaluation-periods 1 \
    --alarm-actions arn:aws:lambda:***:*****:function:lambda-arn
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We can create multiple alarms for the same DB instance, for example a low alarm that scales the instance back down when the metric's data points stay below a lower threshold.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws cloudwatch put-metric-alarm \
    --alarm-name db-rds-postgres-low-alarm \
    ...
    --period 180 \
    --threshold 1 \
    --comparison-operator LessThanThreshold \
    ...
    --alarm-actions arn:aws:lambda:***:*****:function:lambda-arn
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Lambda function
&lt;/h3&gt;

&lt;p&gt;We can use a Lambda function to modify the database instance; the auto scaling script can be written in JavaScript or Python. Here we use Python with &lt;code&gt;boto3&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import boto3
import json
# Initialize the RDS client
rds_client = boto3.client('rds')

def lambda_handler(event, context):

    db_instance_id = 'db-identifier-name'
    try:
        response = rds_client.describe_db_instances(DBInstanceIdentifier=db_instance_id)            

        db_status = response['DBInstances'][0]['DBInstanceStatus']

        if db_status != 'available':
            return { 'message': 'database is not in available state.' }

        reason_data = json.loads(event['alarmData']['state']['reasonData'])
        cpu_load = reason_data['recentDatapoints'][0]

        current_class = response['DBInstances'][0]['DBInstanceClass']

        if cpu_load &amp;gt; 3:
            new_instance_class = 'db.t4g.large'   # scale up on the high alarm
        else:
            # assumed baseline size for scale-in; adjust to your workload
            new_instance_class = 'db.t4g.medium'

        if new_instance_class == current_class:
            return { 'message': 'instance is already at the desired size.' }

        rds_client.modify_db_instance(
            DBInstanceIdentifier=db_instance_id,
            DBInstanceClass=new_instance_class,
            ApplyImmediately=True
        )

        return { 'message': 'instance modification started.' }

    except Exception as e:
        return { 'message': f"Failed to scale RDS instance {str(e)}" }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Resource-based Policy statement
&lt;/h3&gt;

&lt;p&gt;We also have to add a resource-based policy statement to the Lambda function so that CloudWatch alarms are allowed to invoke it.&lt;br&gt;
We can use the AWS CLI to attach this policy.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws lambda add-permission \
--function-name prod-postgres-rds-asg \
--statement-id prod-postgres-rds-high-policy \
--action 'lambda:InvokeFunction' \
--principal lambda.alarms.cloudwatch.amazonaws.com \
--source-account ******** \
--source-arn arn:aws:cloudwatch:***:******:alarm:db-rds-postgres-high-alarm
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Lambda Role and Permissions
&lt;/h3&gt;

&lt;p&gt;We also have to grant the Lambda function a few more permissions so it can describe and modify the DB instance. Edit the IAM role assigned to the Lambda function and add the statement below to its policy.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; {
   "Effect": "Allow",
   "Action": ["rds:DescribeDBInstances", "rds:ModifyDBInstance"],
   "Resource": "arn:aws:rds:**:*****:db:db-prod-replica-1"
 }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it, we have implemented vertical scaling for an RDS PostgreSQL database.&lt;br&gt;
To verify the implementation, you can generate load with multiple SQL queries; in our case we called the same API repeatedly using the script below.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const apiEndpoints = [
  'Replace with your APP API URL'
];

const token = `secret-token`;
// Define the function to call all APIs
async function callApis() {
  try {
    const requests = apiEndpoints.map(url =&amp;gt; axios.get(url, { headers: {"Authorization" : `Bearer ${token}`} }));
    const responses = await Promise.all(requests);

    responses.forEach((response, index) =&amp;gt; {
      console.log(`Response from ${apiEndpoints[index]}`);
    });
  } catch (error) {
    console.error('Error calling APIs:', error);
  }
}
const intervalId = setInterval(callApis, 500)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Summary
&lt;/h3&gt;

&lt;p&gt;In this post, we explored the implementation of Vertical Auto Scaling for an RDS PostgreSQL database. While this approach offers some advantages, it also has notable drawbacks, especially when dealing with sudden and unpredictable spikes in load. Vertical scaling can take approximately 10 minutes to update the instance size and become fully operational, which may not be fast enough to handle rapid fluctuations in demand.&lt;/p&gt;

&lt;p&gt;Additionally, it's crucial that the database is configured for Multi-AZ (Availability Zone) support. Without this, there could be downtime during the instance resizing process, as it requires a reboot. Multi-AZ deployment helps mitigate this risk by ensuring that your database remains available during such changes.&lt;/p&gt;
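&lt;p&gt;Inside the Lambda function, a small guard can enforce this before calling &lt;code&gt;modify_db_instance&lt;/code&gt;. The sketch below assumes the response shape of boto3's &lt;code&gt;describe_db_instances&lt;/code&gt;:&lt;/p&gt;

```python
def safe_to_resize(db_instance):
    """Given one entry from describe_db_instances()['DBInstances'],
    only allow an immediate resize when the instance is available and
    Multi-AZ, so the standby absorbs the reboot without downtime."""
    return (db_instance.get("DBInstanceStatus") == "available"
            and db_instance.get("MultiAZ", False))
```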

</description>
      <category>aws</category>
      <category>postgres</category>
      <category>autoscaling</category>
      <category>lambda</category>
    </item>
  </channel>
</rss>
