Capacity Planning with AWS EC2 and AutoScaling

A structured approach to select the optimum EC2 instance for your workload.

If you are working with AWS, you have probably seen that there are dozens of EC2 instance types to choose from, such as the M series, T series, C series, and so on. Each type offers different features to support different workloads, and each lets you select an instance size ranging from micro, small, medium, and large upward.

You might wonder why all this complexity exists, and why not simply pick a large instance. Instance selection comes with a cost, and within a family the price roughly doubles with each step up in instance size. Since in most cases EC2 instances are configured for Auto Scaling and load balancing, it is important to select the optimal instance type and size as the scaling unit. This article takes you through a structured approach to select the optimum EC2 instance for your workload.

AWS Best Practice: Stop guessing your capacity needs: Eliminate guessing about your infrastructure capacity needs. When you make a capacity decision before you deploy a system, you might end up sitting on expensive idle resources or dealing with the performance implications of limited capacity. With cloud computing, these problems can go away. You can use as much or as little capacity as you need, and scale up and down automatically.

Let's take a web application as an example, one that needs to autoscale and has average memory, CPU, disk I/O, and network requirements.

Step 1: Understanding the nature of your workload and basic capacity requirements

If you have already developed your application, you probably know its base capacity needs, such as how much memory, CPU, disk I/O, and network throughput it consumes on average. This helps to determine the EC2 instance type. However, I would recommend starting with the M type (General Purpose) unless your workload clearly matches a different instance type. To select the size of the instance, it is important to measure the memory and CPU requirements.

For example, let's assume the web application consumes 6 GiB of memory on average. In that case, select m4.large, which has 8 GiB of memory.

Note: There are two families of M type instances available, namely M3 and M4. These refer to the generation of hardware used underneath, so M4 is likely to have a newer generation of CPU, memory, and so on compared to M3. Use the latest generation unless you are highly concerned about cost or have a low to moderate workload.
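The sizing rule in this example (pick the smallest size whose memory covers the measured need) can be sketched in a few lines of Python. This is only an illustration: the function name `smallest_fit` is made up for this sketch, and the vCPU and memory figures are the M4 sizes as published by AWS at the time of writing, so check them against the current instance type list before relying on them.

```python
# M4 sizes as (name, vCPUs, memory in GiB), ordered from smallest to largest.
# Assumption: figures taken from the published M4 specifications; verify them.
M4_SIZES = [
    ("m4.large", 2, 8),
    ("m4.xlarge", 4, 16),
    ("m4.2xlarge", 8, 32),
    ("m4.4xlarge", 16, 64),
]


def smallest_fit(required_mem_gib, required_vcpus=1, sizes=M4_SIZES):
    """Return the first (smallest) size covering both requirements, or None."""
    for name, vcpus, mem_gib in sizes:
        if mem_gib >= required_mem_gib and vcpus >= required_vcpus:
            return name
    return None


# The article's example: an application using 6 GiB of memory on average.
print(smallest_fit(6))
```

The result is only a starting point for the load tests in Step 2, not the final choice.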

Step 2: Run a series of load tests for the instance type with a single EC2 instance

After selecting the instance type in Step 1 based on these assumptions, load test a single instance at each instance size, monitoring how it consumes resources and how many requests each size can serve. Let's assume we have obtained the following results after a significant amount of load testing.

Step 3: Analyze the Load Testing Results

[Figure: load testing results for each instance size]

As you can see, it is reasonable to select m4.xlarge as the unit instance in the Auto Scaling configuration, where m4.xlarge will be replicated for horizontal scaling.

However, it is not always that straightforward: in some situations the cost per request might not be a good differentiator. In those situations, also consider the behavior of your web application, including:
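A comparison like this usually comes down to cost per request: the hourly price of a size divided by the number of requests it served per hour in the load test. Here is a minimal sketch of that calculation. All numbers below are hypothetical placeholders chosen for illustration; they are not the load test results from this article and not real AWS prices, and the function names are made up for this sketch.

```python
# HYPOTHETICAL load test results, for illustration only.
# hourly_usd doubles with each size, mirroring how pricing scales in a family.
RESULTS = [
    {"size": "m4.large", "hourly_usd": 0.10, "requests_per_hour": 300_000},
    {"size": "m4.xlarge", "hourly_usd": 0.20, "requests_per_hour": 800_000},
    {"size": "m4.2xlarge", "hourly_usd": 0.40, "requests_per_hour": 1_200_000},
]


def cost_per_million_requests(row):
    """Hourly price divided by hourly throughput, scaled to one million requests."""
    return row["hourly_usd"] / row["requests_per_hour"] * 1_000_000


def cheapest_unit(results):
    """Return the size with the lowest cost per request."""
    return min(results, key=cost_per_million_requests)["size"]


for row in RESULTS:
    cost = cost_per_million_requests(row)
    print(f'{row["size"]}: ${cost:.3f} per million requests')
print("scaling unit:", cheapest_unit(RESULTS))
```

With your own measurements in place of the placeholder rows, the size with the lowest cost per request is the natural candidate for the scaling unit.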

  • How long it takes to boot an instance and start the web server.
  • Autoscaling threshold rules (e.g. start a new instance once memory consumption passes 70%).
  • The nature of the workload (frequent spikes, steady load, and so on).

In situations where the load is difficult to predict, adjust the Auto Scaling configuration to provision slightly more EC2 capacity than required, based on your SLA requirements, so that unexpected spikes in load can be handled.
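As a rough sketch of what "slightly above the required" can mean in numbers, the snippet below adds a headroom percentage to the expected peak before dividing by the per-instance throughput measured in load testing. The function name, the 20% default headroom, and the sample figures are assumptions made for illustration; pick the headroom from your own SLA.

```python
import math


def instances_needed(peak_rps, rps_per_instance, headroom=0.2):
    """Instances required to serve peak_rps with a fractional safety margin.

    headroom=0.2 means provisioning for 20% more load than the expected peak.
    """
    if rps_per_instance <= 0:
        raise ValueError("rps_per_instance must be positive")
    return math.ceil(peak_rps * (1 + headroom) / rps_per_instance)


# Hypothetical figures: 1000 requests/s at peak, 220 requests/s per instance.
print(instances_needed(1000, 220, headroom=0.0))  # no margin
print(instances_needed(1000, 220))                # 20% margin
```

The resulting count is the kind of value you might use as the desired or minimum capacity of the Auto Scaling group during periods when spikes are expected.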


For more tech tutorials from Ashan, check out his blog on Medium