Amazon Web Services

Lessons Learned About AWS Autoscaling

March 8, 2023

Amazon Web Services (AWS) is a compelling system that allows organizations to take advantage of a wide range of cloud services. AWS provides integrated visibility into infrastructure costs and lets users explore how their applications use resources. Autoscaling, the automated process of increasing or decreasing your AWS managed resources, provides one way to adjust a cloud system to better match load and save money.

However, AWS Auto Scaling can be just as harmful as beneficial. It is a complex process that requires the right combination of configuration, testing, and monitoring to work correctly. Diligent application monitoring and frequent tuning of your auto scaling plan will help you reap the rewards that are possible. Here are some lessons we have learned about improving AWS autoscaling, so you don't have to learn them the hard way.

Design Custom Amazon Machine Images (AMIs)

When considering a scaling strategy, how quickly your system scales matters. You will save yourself time by designing your own AMIs that incorporate the requisite libraries and software components for your specific server instance. This reduces deployment time by eliminating the wait for libraries and dependencies to download on each newly provisioned server.
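One way to bake such an AMI is to configure an instance by hand, then snapshot it. The sketch below builds the parameters for the EC2 `create_image` call with boto3; the instance ID, image name, and subnet details are hypothetical placeholders, and the actual API call (which needs AWS credentials) is shown commented out.

```python
# Sketch: bake a custom AMI from an already-configured EC2 instance.
# All identifiers below are hypothetical placeholders.

def build_create_image_request(instance_id, name, description):
    """Build the parameters for ec2.create_image()."""
    return {
        "InstanceId": instance_id,
        "Name": name,
        "Description": description,
        "NoReboot": False,  # allow a reboot for a consistent filesystem snapshot
    }

params = build_create_image_request(
    "i-0123456789abcdef0",  # hypothetical instance ID
    "webapp-base-2023-03",
    "Base image with app libraries and dependencies pre-installed",
)

# With credentials configured, the actual call would be:
# import boto3
# response = boto3.client("ec2").create_image(**params)
print(params["Name"])
```

Instances launched from this image skip the dependency-download step, which is what shortens scale-out time.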

Define the metrics that affect your application performance

By understanding the metrics that drive resource consumption, you can manage resources for better performance. This is important because certain metrics, such as CPU utilization, directly impact your performance. These values will enable you to determine how you need to scale your resources in relation to your workload.

While there are many open source and commercial options for monitoring cloud systems, AWS CloudWatch is the default choice for AWS and is seamlessly integrated into the AWS ecosystem. It only takes a few clicks in the UI or a single CLI command to turn monitoring on or off. CloudWatch provides metrics about the behavior of the entire Auto Scaling group as well as the performance of individual instances. You can track your metrics consistently and use that analysis to determine when to scale your Auto Scaling groups up or down.
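As a concrete sketch, the query below builds the parameters for CloudWatch's `get_metric_statistics` call to pull the last hour of average CPU utilization for an Auto Scaling group. The group name is a hypothetical placeholder, and the boto3 call itself (which requires credentials) is commented out.

```python
# Sketch: query average CPU utilization for an Auto Scaling group
# over the last hour, in 5-minute datapoints.
from datetime import datetime, timedelta, timezone

def build_cpu_query(asg_name, minutes=60):
    """Build the parameters for cloudwatch.get_metric_statistics()."""
    now = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "AutoScalingGroupName", "Value": asg_name}],
        "StartTime": now - timedelta(minutes=minutes),
        "EndTime": now,
        "Period": 300,             # 5-minute datapoints
        "Statistics": ["Average"],
    }

query = build_cpu_query("my-web-asg")  # hypothetical group name

# With credentials configured:
# import boto3
# stats = boto3.client("cloudwatch").get_metric_statistics(**query)
```

Reviewing these datapoints over time is how you learn where your real scaling thresholds should sit.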

Use AWS Auto Scaling

Given the topic of this article, this seems like a given, yet many AWS users have the misperception that AWS Auto Scaling is too difficult to use. AWS has done a great job of making AWS Auto Scaling easy to get started with. And the small effort to turn on Auto Scaling will reward you with a more resilient system and reduced cloud costs.

Why do you need AWS Auto Scaling? Let's break it down. Auto Scaling works by defining an Auto Scaling group that manages instances behind a load balancer. Capacity increases when load increases (to assure performance) and decreases when load decreases (to save on costs), so performance remains consistent. AWS autoscaling can be used for any application, whether stateful or stateless.

Learn how Auto Scaling Groups function with Dynamic Auto Scaling

In order to configure your resources, you have to specify them in the AWS Auto Scaling Groups feature. Auto Scaling Groups define a resource maximum and minimum that determine when resources will be launched or terminated dynamically. AWS lets you attach Auto Scaling groups to Elastic Load Balancers (ELBs) to make sure that newly created resources are seamlessly discovered and utilized.
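The sketch below builds the parameters for the `create_auto_scaling_group` call, showing the minimum/maximum bounds and the load balancer attachment in one place. The group name, launch template ID, subnet IDs, and target group ARN are all hypothetical placeholders; the boto3 call is commented out because it requires credentials.

```python
# Sketch: define an Auto Scaling group with size bounds and a
# load balancer (target group) attachment. All IDs are placeholders.

def build_asg_request(name, launch_template_id, min_size, max_size, target_group_arn):
    """Build the parameters for autoscaling.create_auto_scaling_group()."""
    return {
        "AutoScalingGroupName": name,
        "LaunchTemplate": {
            "LaunchTemplateId": launch_template_id,
            "Version": "$Latest",
        },
        "MinSize": min_size,            # never scale below this
        "MaxSize": max_size,            # never scale above this
        "DesiredCapacity": min_size,    # start at the floor
        "TargetGroupARNs": [target_group_arn],  # load balancer attachment
        "VPCZoneIdentifier": "subnet-aaaa,subnet-bbbb",  # placeholder subnets
    }

request = build_asg_request(
    "web-asg",     # hypothetical group name
    "lt-0abc",     # hypothetical launch template
    2, 10,
    "arn:aws:elasticloadbalancing:us-east-1:111111111111:targetgroup/web/abc",
)

# With credentials configured:
# import boto3
# boto3.client("autoscaling").create_auto_scaling_group(**request)
```

Because the group is attached to the target group, instances launched by a scale-out are registered with the load balancer automatically.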

Create Custom metrics to tune your Auto Scaling

Unless you are running a very generic application, a WordPress server for example, you should consider looking beyond the default metrics provided by CloudWatch. By taking the time to code a custom metric, you can fine tune app performance based on what specifically matters to you. Even if you define specific metrics for your application, you can still use the default metrics as well. Custom metrics can be published with any AWS SDK, for example Python with the Boto3 library, or the AWS CLI, and then used in alarms and scaling policies just like the built-in metrics.
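As a sketch of what publishing a custom metric looks like, the function below builds the payload for CloudWatch's `put_metric_data` call. The namespace, metric name, and dimension are hypothetical examples; the actual boto3 call is commented out since it needs credentials.

```python
# Sketch: publish one datapoint of a custom application metric.
# Namespace, metric name, and dimensions below are hypothetical.
from datetime import datetime, timezone

def build_custom_metric(namespace, name, value, unit="Count", dimensions=None):
    """Build the parameters for cloudwatch.put_metric_data()."""
    return {
        "Namespace": namespace,  # custom namespaces must not begin with "AWS/"
        "MetricData": [{
            "MetricName": name,
            "Timestamp": datetime.now(timezone.utc),
            "Value": float(value),
            "Unit": unit,
            "Dimensions": dimensions or [],
        }],
    }

payload = build_custom_metric(
    "MyApp/Workers",   # hypothetical namespace
    "JobsPending",     # hypothetical metric name
    42,
    dimensions=[{"Name": "Queue", "Value": "render"}],
)

# With credentials configured:
# import boto3
# boto3.client("cloudwatch").put_metric_data(**payload)
```

Once published, a metric like `JobsPending` can drive a target tracking or step scaling policy the same way CPU utilization does.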

To accurately set your auto scaling policies, you need to define them in relation to your Availability Zones (AZs). Preplanning a percent-based scaling policy that takes into consideration the varied costs between different Regions and AZs can result in optimal performance and reduced costs.

Integrate Simple Queue Services (SQS)

Don’t want to code? You can pair an SQS queue with a CloudWatch alarm on queue length. CloudWatch can trigger a scaling event when the queue exceeds a predefined length.
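Even this no-code setup can be captured as configuration. The sketch below builds a `put_metric_alarm` payload on the queue's `ApproximateNumberOfMessagesVisible` metric; the queue name, alarm name, and scaling policy ARN are hypothetical placeholders, and the boto3 call is commented out.

```python
# Sketch: alarm when an SQS queue backs up past 100 visible messages
# for two consecutive 5-minute periods. Names and ARN are placeholders.

def build_queue_depth_alarm(alarm_name, queue_name, threshold, scaling_policy_arn):
    """Build the parameters for cloudwatch.put_metric_alarm()."""
    return {
        "AlarmName": alarm_name,
        "Namespace": "AWS/SQS",
        "MetricName": "ApproximateNumberOfMessagesVisible",
        "Dimensions": [{"Name": "QueueName", "Value": queue_name}],
        "Statistic": "Average",
        "Period": 300,               # 5-minute evaluation window
        "EvaluationPeriods": 2,      # must breach twice in a row
        "Threshold": float(threshold),
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [scaling_policy_arn],  # scale-out policy to trigger
    }

alarm = build_queue_depth_alarm(
    "render-queue-backlog",  # hypothetical alarm name
    "render-jobs",           # hypothetical queue name
    100,
    "arn:aws:autoscaling:us-east-1:111111111111:scalingPolicy:example",
)

# With credentials configured:
# import boto3
# boto3.client("cloudwatch").put_metric_alarm(**alarm)
```

Requiring two consecutive breaches keeps a momentary spike from triggering an unnecessary scale-out.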

Scaling up AWS DynamoDB

While EC2 instances are the most common target of Auto Scaling, other AWS services can benefit from it as well. From an application function perspective, database services (e.g. Amazon DynamoDB) are probably the second most popular to scale. While there are slight differences in scaling policies between services, if you’ve been creating auto scaling plans for your EC2 instances, scaling Amazon DynamoDB or Amazon RDS storage will feel familiar.
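DynamoDB scaling goes through the Application Auto Scaling service: you register the table's read (or write) capacity as a scalable target, then attach a target tracking policy. The sketch below builds both payloads; the table name and capacity bounds are hypothetical, and the boto3 calls are commented out.

```python
# Sketch: auto scale a DynamoDB table's read capacity between 5 and 100
# units, targeting 70% utilization. Table name is a placeholder.

def build_dynamodb_scaling(table_name, min_capacity, max_capacity, target_pct):
    """Build payloads for register_scalable_target() and put_scaling_policy()."""
    resource_id = f"table/{table_name}"
    dimension = "dynamodb:table:ReadCapacityUnits"
    target = {
        "ServiceNamespace": "dynamodb",
        "ResourceId": resource_id,
        "ScalableDimension": dimension,
        "MinCapacity": min_capacity,
        "MaxCapacity": max_capacity,
    }
    policy = {
        "PolicyName": f"{table_name}-read-tracking",
        "ServiceNamespace": "dynamodb",
        "ResourceId": resource_id,
        "ScalableDimension": dimension,
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": float(target_pct),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "DynamoDBReadCapacityUtilization",
            },
        },
    }
    return target, policy

target, policy = build_dynamodb_scaling("orders", 5, 100, 70)  # placeholder table

# With credentials configured:
# import boto3
# aas = boto3.client("application-autoscaling")
# aas.register_scalable_target(**target)
# aas.put_scaling_policy(**policy)
```

The shape mirrors an EC2 plan: a floor, a ceiling, and a utilization target in between, which is why it feels familiar.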

Use Predictive Scaling

Up to now we’ve discussed scaling against resource limits you have predefined, also known as dynamic scaling. AWS also supports predictive scaling, which forecasts load from past system performance on the metrics you have configured and assures that a minimum resource capacity is available based on the predictive model. Because predictive scaling assures only a minimum capacity, you can combine it with dynamic scaling policies so that unexpected load increases are also smoothly managed.


As you can see, Auto Scaling is a powerful way to reduce costs while maximizing performance, and AWS provides a number of useful tools to intelligently scale your environment in relation to demand. Although these tools help you monitor and scale your cloud, you are never fully optimized, because the external environment is not constant. You need to continually monitor and adjust your scaling policies to ensure that your system is always performing at its best.

© All rights reserved.