TL;DR style notes from articles I read today.
- Scaling speed is limited. If the ratio of the actual metric value to the target metric value is low, the maximum size of a scale-out event is significantly limited.
- Short cooldown periods can cause over-scaling or under-scaling because a scaling event may trigger before a previous scaling event has concluded.
- Target tracking autoscaling works best when at least one ECS service or CloudWatch metric is directly affected by the running task count, and when the metrics are bounded or relatively stable and predictable.
- The best way to find the right autoscaling strategy is to test it in your specific environment and against your specific load patterns.
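The first two notes above can be made concrete with a little arithmetic. The sketch below assumes a simple proportional model (desired count = current count × actual / target), which is an illustration and not the documented internals of ECS target tracking; the function names `desired_task_count` and `cooldown_elapsed` are hypothetical.

```python
import math


def desired_task_count(current_tasks: int, actual_metric: float, target_metric: float) -> int:
    """Return the task count a proportional scaler would aim for.

    Assumption: capacity is scaled by the ratio actual/target. Real target
    tracking policies may apply additional limits and smoothing.
    """
    if current_tasks <= 0 or target_metric <= 0:
        raise ValueError("current_tasks and target_metric must be positive")
    # Multiply before dividing so that exact ratios stay exact in floating point.
    return max(1, math.ceil(current_tasks * actual_metric / target_metric))


def cooldown_elapsed(last_scale_time: float, now: float, cooldown_seconds: float) -> bool:
    """Return True once the previous scaling event has had time to settle."""
    return now - last_scale_time >= cooldown_seconds


if __name__ == "__main__":
    # Metric slightly above target: the ratio is low, so the step is small.
    print(desired_task_count(10, actual_metric=55.0, target_metric=50.0))
    # Metric at twice the target: the ratio is high, so the step is large.
    print(desired_task_count(10, actual_metric=100.0, target_metric=50.0))
    # Only 30 s into a 60 s cooldown: a new scaling event should wait.
    print(cooldown_elapsed(last_scale_time=0.0, now=30.0, cooldown_seconds=60.0))
```

Under this model, a metric that sits only 10% over target can add at most about 10% more tasks per scaling event, which is why a low ratio caps how fast the service can grow.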
Full post here, 8 mins read
- Auto-scaling is not easy. Maintaining the templates and scripts that the auto-scaling process needs to work well takes a significant investment of time.
- It is a myth that elastic scaling is more common than fixed-size auto-scaling. The most useful aspects of auto-scaling focus on high availability and redundancy rather than elastic load scaling.
- A common misconception is that load-based auto-scaling is appropriate in every environment. Some cloud deployments are better off without auto-scaling, or with auto-scaling used only on a limited basis.
- There's a delicate balance between building fully baked base images and running lengthy configuration steps during an auto-scale event. The right balance depends on how fast the instance needs to be spun up, how often auto-scaling events happen, the average life of an instance, etc.
Full post here, 6 mins read
- Transactional flows are an ideal use case for auto-scaling because of unused compute capacity during non-peak hours.
- When you need to detect scaling-worthy events, AWS components like Step Functions metrics and CloudWatch Alarms come in handy.
- Support a scale-down cool-off time to prevent two consecutive scale-down actions within a certain amount of time.
- Guard your system against any malicious, delayed, or duplicated scaling notifications by validating incoming scaling signals.
- Review historical statistics when tuning scale-down alarms so that they’re less susceptible to spurious triggers and never fire during peak hours.
- For a safe rollout, proceed in incremental steps until you gradually reach the ideal minimum instance count.
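The cool-off and signal-validation notes above could be sketched roughly as follows. This is a minimal illustration, not the article's implementation: the `ScaleDownGuard` class, its field names, and the default thresholds are all hypothetical, and it only covers duplicated, delayed, and too-frequent signals (authenticating the sender, for example with a signature check, is out of scope here).

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class ScaleDownGuard:
    """Decides whether an incoming scale-down signal should be acted on."""

    cooloff_seconds: float = 300.0        # hypothetical default
    max_signal_age_seconds: float = 60.0  # hypothetical default
    last_scale_down_at: Optional[float] = None
    seen_ids: Set[str] = field(default_factory=set)

    def accept(self, signal_id: str, sent_at: float, now: float) -> bool:
        # Reject duplicates: the same notification may be delivered twice.
        if signal_id in self.seen_ids:
            return False
        self.seen_ids.add(signal_id)

        # Reject delayed signals and signals stamped in the future.
        if sent_at > now or now - sent_at > self.max_signal_age_seconds:
            return False

        # Enforce the cool-off window between consecutive scale-downs.
        if (
            self.last_scale_down_at is not None
            and now - self.last_scale_down_at < self.cooloff_seconds
        ):
            return False

        self.last_scale_down_at = now
        return True


if __name__ == "__main__":
    guard = ScaleDownGuard()
    print(guard.accept("evt-1", sent_at=1000.0, now=1010.0))  # fresh, first signal
    print(guard.accept("evt-1", sent_at=1000.0, now=1011.0))  # duplicate id
    print(guard.accept("evt-2", sent_at=1100.0, now=1110.0))  # inside cool-off
```

Timestamps are passed in explicitly rather than read from the clock, which keeps the guard easy to test against recorded alarm history.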
Full post here, 5 mins read