Amazon Web Services (AWS) is a compelling platform that gives organizations access to a wide range of cloud services. AWS provides integrated visibility into infrastructure costs and lets users explore how their applications consume resources. Autoscaling, the automated process of increasing or decreasing your AWS-managed resources, is one way to adjust a cloud system to better match load and save money.
However, AWS Auto Scaling can be just as harmful as it is beneficial. It is a complex process that requires the right combination of configuration, testing, and monitoring to work correctly. Diligent application monitoring and frequent tuning of your auto scaling plan will help you reap the rewards that are possible. Here are some lessons about AWS autoscaling that we have learned the hard way, so you don’t have to.
When considering a scaling strategy, how quickly your system scales matters. You will save yourself time by building your own AMIs that incorporate the requisite libraries and software components for your specific server instance. This reduces your deployment time by eliminating the wait for libraries and dependencies to download on the newly provisioned server.
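Once an instance has been configured with everything pre-installed, baking it into an AMI is a single EC2 API call. Here is a minimal sketch using boto3; the instance ID, image name, and helper function names are placeholders, not part of any real deployment.

```python
def build_image_request(instance_id, name, description):
    """Assemble parameters for EC2 CreateImage (IDs and names are placeholders)."""
    return {
        "InstanceId": instance_id,
        "Name": name,
        "Description": description,
        "NoReboot": False,  # reboot the source instance for a consistent snapshot
    }


def bake_ami(params):
    # Deferred import so the sketch can be read and tested without AWS credentials.
    import boto3
    ec2 = boto3.client("ec2")
    return ec2.create_image(**params)["ImageId"]
```

The returned AMI ID can then be referenced in your launch template, so newly provisioned servers boot with their dependencies already in place.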
By understanding the metrics that drive your resources, you can learn how to manage them for better performance. This is important because certain metrics, like CPU utilization, directly impact your performance. These values will enable you to determine how you need to scale your resources in relation to your workload.
While there are many open source and commercial options for monitoring cloud systems, AWS CloudWatch is the default choice for AWS and is seamlessly integrated into the AWS ecosystem. It takes only a few clicks in the UI or a single CLI command to turn monitoring on or off. CloudWatch provides metrics about the behavior of the entire Auto Scaling group as well as the performance of individual instances. You can track these metrics continuously and use them to determine when to scale your Auto Scaling groups.
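Turning on group-level metrics from code is equally small. The sketch below, with a placeholder group name, shows the Auto Scaling `EnableMetricsCollection` call via boto3; omitting the `Metrics` parameter enables all group metrics.

```python
def build_metrics_request(group_name, granularity="1Minute"):
    """Parameters for EnableMetricsCollection; the group name is a placeholder."""
    return {"AutoScalingGroupName": group_name, "Granularity": granularity}


def enable_group_metrics(params):
    # Deferred import: no AWS credentials needed just to inspect the request.
    import boto3
    boto3.client("autoscaling").enable_metrics_collection(**params)
```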
Given the topic of this article, this may seem obvious, yet many AWS users have the misperception that AWS Auto Scaling is too difficult to use. In fact, AWS has done a great job of making Auto Scaling easy to get started with, and the small effort to turn it on will reward you with a more resilient system and reduced cloud costs.
Why do you need AWS Auto Scaling? Let’s break it down. Auto Scaling works by defining an Auto Scaling group that manages instances behind a load balancer. Performance remains consistent because capacity increases when load increases (to assure performance) and decreases when load decreases (to save on costs). AWS autoscaling can be used for any application, whether stateful or stateless.
In order to configure your resources, you specify them in an AWS Auto Scaling group. Auto Scaling groups define a resource minimum and maximum that determine when resources will be launched or terminated dynamically. AWS lets you attach Auto Scaling groups to Elastic Load Balancers (ELBs) to make sure that newly created resources are seamlessly discovered and utilized.
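A minimal sketch of such a group, built with boto3, might look like the following. The group name, launch template ID, ELB name, and zone are placeholder values for illustration only.

```python
def build_asg_request(name, launch_template_id, min_size, max_size, elb_names, zones):
    """Parameters for CreateAutoScalingGroup; all names and IDs are placeholders."""
    return {
        "AutoScalingGroupName": name,
        "LaunchTemplate": {"LaunchTemplateId": launch_template_id, "Version": "$Latest"},
        "MinSize": min_size,   # group never shrinks below this
        "MaxSize": max_size,   # group never grows above this
        "LoadBalancerNames": elb_names,  # classic ELBs; use TargetGroupARNs for ALB/NLB
        "AvailabilityZones": zones,
    }


def create_group(params):
    # Deferred import so the request can be built and tested without credentials.
    import boto3
    boto3.client("autoscaling").create_auto_scaling_group(**params)
```

Instances launched by this group register with the listed ELBs automatically, which is what makes new capacity "seamlessly discovered."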
Unless you are running a very generic application, a WordPress server for example, you should consider looking beyond the default metrics provided by CloudWatch. By taking the time to code a custom metric, you can fine-tune app performance based on what specifically matters to you. Even if you define custom metrics for your application, you can still use the default metrics as well. Custom metrics can be published from code with any AWS SDK, for example Python and the Boto3 library, by sending your own datapoints to CloudWatch.
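Here is a hedged sketch of publishing a custom datapoint with boto3's `put_metric_data`. The namespace and metric name are example values; in practice you would emit whatever number actually matters to your application, such as queue depth or active sessions.

```python
def build_metric_datum(metric_name, value, unit="Count"):
    """One CloudWatch datapoint; the metric name is an example, not a real metric."""
    return {"MetricName": metric_name, "Value": float(value), "Unit": unit}


def publish(namespace, datum):
    # Deferred import: the datapoint can be built and inspected without AWS access.
    import boto3
    boto3.client("cloudwatch").put_metric_data(Namespace=namespace, MetricData=[datum])
```

Once published, the custom metric shows up alongside the defaults and can drive alarms and scaling policies like any built-in metric.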
To set your auto scaling policies accurately, you need to define them in relation to your Availability Zones (AZs). Preplanning a percent-based scaling policy that takes into consideration the varied costs between different Regions and AZs can result in optimal performance and reduced costs.
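A percent-based policy can be expressed as a step scaling policy with `PercentChangeInCapacity`. This is a minimal sketch; the group name, policy name, and 20% figure are assumed example values.

```python
def build_percent_policy(group_name, policy_name, percent):
    """Step scaling policy that adjusts the group by a percentage of current capacity.
    All names and the percentage are placeholder values."""
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": policy_name,
        "PolicyType": "StepScaling",
        "AdjustmentType": "PercentChangeInCapacity",
        # One step: when the alarm breaches, grow the group by `percent` percent.
        "StepAdjustments": [
            {"MetricIntervalLowerBound": 0.0, "ScalingAdjustment": percent}
        ],
    }


def put_policy(params):
    # Deferred import so the policy document can be tested without credentials.
    import boto3
    return boto3.client("autoscaling").put_scaling_policy(**params)
```

Scaling by a percentage rather than a fixed instance count keeps the response proportional to the group's current size across zones of different capacity.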
Don’t want to code? You can pair an SQS queue with a CloudWatch alarm on queue length. CloudWatch can trigger a scaling event when the queue exceeds a predefined length.
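The same alarm can also be defined programmatically. This sketch watches the SQS `ApproximateNumberOfMessagesVisible` metric and fires a scaling policy when the backlog passes a threshold; the queue name, policy ARN, and threshold of 100 are assumptions for illustration.

```python
def build_queue_alarm(queue_name, scale_out_policy_arn, threshold=100):
    """CloudWatch alarm on SQS backlog; queue name, ARN, and threshold are placeholders."""
    return {
        "AlarmName": f"{queue_name}-backlog",
        "Namespace": "AWS/SQS",
        "MetricName": "ApproximateNumberOfMessagesVisible",
        "Dimensions": [{"Name": "QueueName", "Value": queue_name}],
        "Statistic": "Average",
        "Period": 300,             # evaluate over 5-minute windows
        "EvaluationPeriods": 1,
        "Threshold": float(threshold),
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [scale_out_policy_arn],  # scaling policy to trigger
    }


def create_alarm(params):
    # Deferred import keeps the sketch testable without AWS credentials.
    import boto3
    boto3.client("cloudwatch").put_metric_alarm(**params)
```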
While EC2 instances are the most common target of Auto Scaling, other AWS services can benefit from it as well. From an application function perspective, database services (e.g. AWS DynamoDB) are probably the second most popular service to scale. While there are slight differences in policies between services, if you’ve been creating auto scaling plans for your EC2 instances, scaling AWS DynamoDB or Amazon RDS storage will feel familiar.
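For DynamoDB, scaling goes through the Application Auto Scaling service rather than the EC2 Auto Scaling API: you register the table's capacity as a scalable target with a minimum and maximum, much like an Auto Scaling group. The table name and capacity bounds below are placeholder values.

```python
def build_dynamodb_target(table_name, min_capacity, max_capacity):
    """Scalable target for a table's read capacity; the table name is a placeholder."""
    return {
        "ServiceNamespace": "dynamodb",
        "ResourceId": f"table/{table_name}",
        "ScalableDimension": "dynamodb:table:ReadCapacityUnits",
        "MinCapacity": min_capacity,
        "MaxCapacity": max_capacity,
    }


def register(params):
    # Deferred import so the target definition can be tested without credentials.
    import boto3
    boto3.client("application-autoscaling").register_scalable_target(**params)
```

A matching target-tracking policy on consumed capacity then plays the same role that a CPU-based policy plays for an EC2 group.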
Up to now we’ve discussed how to scale against resource limits that you have predefined, also known as dynamic scaling. AWS also supports predictive scaling, which forecasts load from past system performance using the metrics you have configured, and assures a minimum resource capacity based on that forecast. Because predictive scaling assures a minimum resource capacity, you can combine it with dynamic scaling policies so that unexpected load increases are also smoothly managed.
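A predictive scaling policy is configured much like the dynamic policies above, just with a different policy type. The sketch below targets 50% average CPU across the group; the group name, policy name, and target value are example assumptions.

```python
def build_predictive_policy(group_name, target_cpu=50.0):
    """PredictiveScaling policy on group CPU; names and target are placeholders."""
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": "cpu-predictive",
        "PolicyType": "PredictiveScaling",
        "PredictiveScalingConfiguration": {
            "MetricSpecifications": [
                {
                    "TargetValue": target_cpu,
                    "PredefinedMetricPairSpecification": {
                        "PredefinedMetricType": "ASGCPUUtilization"
                    },
                }
            ],
            # ForecastAndScale both computes the forecast and acts on it;
            # ForecastOnly lets you review predictions before trusting them.
            "Mode": "ForecastAndScale",
        },
    }


def put_policy(params):
    # Deferred import keeps the policy document testable without credentials.
    import boto3
    return boto3.client("autoscaling").put_scaling_policy(**params)
```

Running in `ForecastOnly` mode first is a low-risk way to check the model against real traffic before letting it provision capacity.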
As you can see, Auto Scaling is a powerful way to reduce costs while maximizing performance, and AWS provides a number of useful tools to intelligently scale your environment in relation to demand. Although these tools help you monitor and scale your cloud, you are never fully optimized, because the external environment is not constant. You need to continually monitor and adjust your scaling policies to ensure that your system is always performing at its best.