John Preston for AWS Community Builders

Reduce costs & Improve peak performance with AWS Application Autoscaling scheduled actions

TL;DR

Using AWS Application Autoscaling Scheduled Actions, you can prepare your resources' compute capacity ahead of predictable demand, in combination with your normal scaling rules.

Introduction

The cloud: a mystical place where you pay for what you need and no more. One of my favorite principles with AWS is that they give you all the tools, in the form of APIs and services, to be smart about spending your money.

You have probably already seen customer demos showing how they handle peak times by scaling out fleets of machines when they are needed and scaling in to a minimum when usage is at its lowest.

Over the years, the features enabling customers to do this have continuously gotten better, and many customers take great advantage of features such as SpotFleet, as the majority of use cases still require running EC2 instances.

But what about what's not running on EC2?

Use-case

In the environments I work on, applications are deployed using AWS ECS on top of AWS Fargate. No more EC2 to manage, great, right? What about scaling DynamoDB? ElastiCache? Aurora?

Many of these (expensive) services have the capability to define scaling rules on one or more dimensions. For example, with AWS ECS, it is the service's DesiredCount that changes (the number of containers). For DynamoDB tables, it is the table's Read & Write capacity units, and that can also apply to each of the table's indexes.

Each combination of a resource and a dimension constitutes a Scaling Target.
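To make that concrete, here is a minimal sketch with boto3 (the cluster and service names are made up for illustration) that registers an ECS service's DesiredCount as a Scaling Target:

```python
import boto3

client = boto3.client("application-autoscaling")

# A Scaling Target = resource (the ECS service) + dimension (its DesiredCount).
# Cluster and service names here are hypothetical.
client.register_scalable_target(
    ServiceNamespace="ecs",
    ResourceId="service/my-cluster/my-service",
    ScalableDimension="ecs:service:DesiredCount",
    MinCapacity=10,
    MaxCapacity=100,
)
```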

How does it all work? It relies on a service that is a lot less well known: AWS Application Autoscaling. This service is responsible for monitoring alarms & metrics and evaluating their values against the different rules you have put in place.

Usually, the rules that are created scale the dimension(s) in/out (or up/down, depending on the resource) based on current usage. Think of the usual average CPU across your ECS containers, for example: you want the average maintained below 70%, and you give a range (min and max) of containers the service can have.

And on a daily basis, these rules are applied very accurately by the ever-watching Application Autoscaling service.
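For reference, such a rule looks roughly like the sketch below (boto3 again, assuming the hypothetical ECS scaling target registered earlier, and using the predefined average CPU metric kept around 70%):

```python
import boto3

client = boto3.client("application-autoscaling")

# Target-tracking policy: Application Autoscaling adds or removes containers
# to keep the service's average CPU around the 70% target.
client.put_scaling_policy(
    PolicyName="keep-cpu-around-70",
    ServiceNamespace="ecs",
    ResourceId="service/my-cluster/my-service",
    ScalableDimension="ecs:service:DesiredCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 70.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ECSServiceAverageCPUUtilization"
        },
        "ScaleOutCooldown": 60,
        "ScaleInCooldown": 300,
    },
)
```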

But what about predictable workloads?

Scheduled Actions - the cron scheduler of autoscaling

All of the resources' dimensions allow you to define a Min and a Max capacity. For example, min=1, max=10 Write Capacity Units (WCU) and min=10, max=20 Read Capacity Units (RCU) for a DynamoDB table.
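In API terms, those ranges are set when registering the scalable targets, one per dimension. A sketch with boto3 (the table name is hypothetical):

```python
import boto3

client = boto3.client("application-autoscaling")

# min=1, max=10 WCU for a hypothetical DynamoDB table.
client.register_scalable_target(
    ServiceNamespace="dynamodb",
    ResourceId="table/my-table",
    ScalableDimension="dynamodb:table:WriteCapacityUnits",
    MinCapacity=1,
    MaxCapacity=10,
)

# min=10, max=20 RCU for the same table.
client.register_scalable_target(
    ServiceNamespace="dynamodb",
    ResourceId="table/my-table",
    ScalableDimension="dynamodb:table:ReadCapacityUnits",
    MinCapacity=10,
    MaxCapacity=20,
)
```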

What scheduled actions allow you to do is define a schedule (your usual cron expression, a rate, e.g. every 2 hours, or a specific point in time, but more on that one later) at which the min/max capacity of your resource's dimension gets changed.

Yes, you read that right: it does not actually change the current value of, say, the number of ECS containers. It changes the min and max range for that Scaling Target.

Now of course, if you raise the minimum (say our RCU min = 15) while the current value is at 10, then the RCU will now be 15, as the current value has to stay within the range.

It works the other way around with the maximum, of course. If you had 700 RCUs and the scheduled action changes the range to max=100, the current value will be brought down to 100.
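As a sketch (boto3 again, with a hypothetical table and schedule), a scheduled action simply names the scaling target, the schedule and the new range; Application Autoscaling then clamps the current value into that range if needed:

```python
import boto3

client = boto3.client("application-autoscaling")

# The Schedule accepts cron(...), rate(...) (e.g. "rate(2 hours)"), or
# at(...) for a one-off point in time (e.g. "at(2023-06-01T06:00:00)").
client.put_scheduled_action(
    ServiceNamespace="dynamodb",
    ScheduledActionName="raise-rcu-floor",
    ResourceId="table/my-table",
    ScalableDimension="dynamodb:table:ReadCapacityUnits",
    Schedule="cron(0 6 * * ? *)",
    # Raise the floor to 15 RCU; if the table sits at 10 RCU, it is brought up to 15.
    ScalableTargetAction={"MinCapacity": 15},
)
```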

If you have a website (news, shopping etc.) and you have put in place the analytics to understand when your user base connects to your site, you can then create scheduled actions which can predictably set your infrastructure up when you need it.

Where Application Autoscaling is very smart

Your resource already has a scaling rule based on site usage, say average latency in milliseconds. The way to keep latency low is to add more ECS containers.

What's going to happen in the real world with scheduled rules?

Let's take the example of a news website that, every day at 6AM, starts to see its traffic go up (visible through the metric above). Say that metric reports 50ms at 5:45AM. There is a rule that says: above 100ms, double the capacity. That rule works regardless of the number of containers, and we have defined that the service can have between 10 and 100 containers.

Without scheduled actions, we would have to wait for the latency to go above 100ms to trigger the scale-out rule. The latency then goes down for a while; we now have 20 containers. Let's assume it goes up again: we now have 40 containers.

This goes on for as long as the latency keeps going up. When the latency goes down, we remove containers, of course.

But customer satisfaction these days relies in large part on having quick and responsive websites. So it is worth it for the business, and they decide that, at a minimum, there should be 30 containers from 5:45AM to 7AM.

We create a first scheduled action to change min = 30 at 5:45AM.
Then a second one, restoring min = 10 at 7AM.
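Here is what those two actions could look like with boto3 (the cluster, service and action names are made up for illustration; cron schedules are evaluated in UTC unless you specify a timezone):

```python
import boto3

client = boto3.client("application-autoscaling")

# 05:45 every day: raise the floor to 30 containers ahead of the morning peak.
# Only MinCapacity changes; the maximum stays as registered.
client.put_scheduled_action(
    ServiceNamespace="ecs",
    ScheduledActionName="morning-peak-floor",
    ResourceId="service/my-cluster/news-site",
    ScalableDimension="ecs:service:DesiredCount",
    Schedule="cron(45 5 * * ? *)",
    ScalableTargetAction={"MinCapacity": 30},
)

# 07:00 every day: restore the floor to 10. The current container count
# is left untouched; only the allowed range changes.
client.put_scheduled_action(
    ServiceNamespace="ecs",
    ScheduledActionName="restore-baseline-floor",
    ResourceId="service/my-cluster/news-site",
    ScalableDimension="ecs:service:DesiredCount",
    Schedule="cron(0 7 * * ? *)",
    ScalableTargetAction={"MinCapacity": 10},
)
```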

From 5:45AM to 7AM we are guaranteed that at least 30 containers are running. However, that doesn't mean the latency won't go up.

Let's assume it does at 6:45AM, and so we now have 60 containers running.

Come 7AM, the scheduled action only changes the minimum, not the current count. So at 7AM we still have 60 containers, and possibly more afterwards if the latency continues to increase.

What this guarantees us is that, whenever the latency goes down after 7AM (say below 30ms), we no longer keep containers we don't need. Before 7AM, we would have no fewer than 30 containers, no matter the value of the latency. After 7AM, if the latency allows for it, autoscaling will bring the count back down towards the minimum, eventually 10.

Summary

For many services, the console offers to define scaling rules based on a given input metric. However, there is no UI to configure scheduled actions, which makes this feature not very well known, yet one that is great to take advantage of. Equally, until recently (see my previous post), there was no way to create these rules via AWS CloudFormation.

But the feature is available, and you should definitely use it as soon as you can identify patterns that allow you to save money ("I don't need these overnight"), improve peak performance (repetitive/predictable behaviour), or both!

Head over to the API reference for more information.
