Three main factors determine the optimal compute solution for any given workload: application design, usage patterns, and configuration settings. In a constant mission to improve performance, architectures can use a variety of compute solutions for different components, enabling different features. Selecting the wrong compute solution for your architecture can reduce performance efficiency, which you want to avoid. In this blog post, we explore strategies for selecting the right compute solution for your workload according to the AWS Well-Architected Framework.
Evaluating the available compute options
You should first focus on building a complete understanding of the performance characteristics of each available compute solution. You must understand the workings, advantages, and tradeoffs of instances, containers, and functions. From a performance standpoint, instances are most useful for long-running applications, especially those that maintain state and have long computational cycles. Functions, in contrast, tend to suit event-initiated, stateless applications that require quick response times. Finally, containers enable efficient use of an instance's resources.
Your choice will depend on which option best matches your application's unique requirements. To make an informed decision, track the compute performance metrics that matter most to your workload. This data-driven approach will allow you to identify where your compute solution might be constrained and explore ways to improve it.
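To make the tradeoffs above concrete, here is a minimal sketch of a selection heuristic. The traits and rules are assumptions for illustration, not AWS guidance:

```python
def suggest_compute_solution(stateful: bool, long_running: bool,
                             event_driven: bool) -> str:
    """Illustrative heuristic mapping coarse workload traits to a
    compute solution; the rules are assumptions for this sketch."""
    if event_driven and not stateful:
        return "function"   # e.g., AWS Lambda: stateless, quick response times
    if long_running or stateful:
        return "instance"   # e.g., Amazon EC2: state, long computational cycles
    return "container"      # e.g., Amazon ECS/EKS: efficient use of instance resources
```

A real decision would weigh many more dimensions (latency targets, cost model, operational overhead), but even a rough rubric like this keeps the discussion grounded in your workload's characteristics.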
Understanding the available compute configuration options
Once you’re clear on your choice of compute solution, it’s key that you focus on understanding its possible configuration options and how each of them could complement your workload. Some examples of these options are instance family, sizes, features (GPU, I/O), function sizes, container instances, and single vs. multi-tenancy.
Among all the available configuration options, select the ones that best fit your performance requirements. The AWS Nitro System enables full consumption of the compute and memory resources of the host hardware.
Collecting compute-related metrics
Recording and tracking the real-world utilization of your cloud resources is one of the best ways to understand how your systems are performing. The resulting data can be used to determine resource requirements more accurately.
When it comes to collecting compute-related metrics, Amazon CloudWatch can be your best friend. It can collect metrics across all the compute resources in your environment. Within your workload, use a combination of CloudWatch and other tools to track and record metrics for your entire system. Pay close attention to CPU utilization, memory, disk I/O, and network to gain insight into potential bottlenecks and get a clear picture of your utilization levels. This is key to understanding how your workload is performing and whether it is using its resources efficiently. These metrics contribute to a data-driven approach that results in better-optimized workload performance.
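As a sketch of what that collection looks like in practice, the function below builds the parameters for a CloudWatch GetMetricStatistics call that fetches CPU utilization for a single EC2 instance. The instance ID and time window are placeholders:

```python
from datetime import datetime, timedelta, timezone

def cpu_utilization_request(instance_id: str, hours: int = 24) -> dict:
    """Build the parameters for a CloudWatch GetMetricStatistics call
    that fetches CPU utilization for one EC2 instance."""
    now = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "StartTime": now - timedelta(hours=hours),
        "EndTime": now,
        "Period": 300,          # 5-minute granularity
        "Statistics": ["Average", "Maximum"],
    }

# With AWS credentials configured, you would pass these parameters to boto3:
#   import boto3
#   cw = boto3.client("cloudwatch")
#   stats = cw.get_metric_statistics(**cpu_utilization_request("i-0123456789abcdef0"))
```

Similar queries against memory, disk, and network metrics (some of which require the CloudWatch agent) round out the utilization picture.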
Determining the required configuration by right-sizing
To choose the optimal configuration, analyze the performance characteristics of your workload and how each relates to memory, network, and CPU usage. This information will allow you to choose the best match for your workload. For example, a memory-intensive workload, such as a database, might be best served by the memory-optimized R-family of instances. In contrast, a bursting workload might benefit more from an elastic container system.
Accurately determining which resources your workload needs leads to both better performance and overall efficiency in your system. Memory-optimized instances are a good fit for systems that require more memory than CPU. Compute-optimized instances, in turn, are a good fit for components that do data processing that isn't memory intensive. Right-sizing not only enables your workload to perform at peak efficiency while using only the required resources; it also ensures you make the correct decision when it comes to budget and cost efficiency.
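One simple way to act on utilization data is to compare the workload's observed memory-to-vCPU ratio against typical instance family shapes (roughly 2 GiB per vCPU for C-family, 4 for M-family, 8 for R-family). The cutoffs below are assumptions for this sketch, not official sizing guidance:

```python
def suggest_instance_family(mem_gib_per_vcpu: float) -> str:
    """Illustrative right-sizing heuristic: pick an EC2 instance family
    from the workload's observed memory-to-vCPU ratio. The cutoffs are
    assumptions chosen between the typical family shapes."""
    if mem_gib_per_vcpu <= 2.5:
        return "c-family (compute-optimized)"
    if mem_gib_per_vcpu <= 6.0:
        return "m-family (general-purpose)"
    return "r-family (memory-optimized)"
```

Feeding this with measured peaks rather than guesses is what makes right-sizing a data-driven exercise instead of an over-provisioning one.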
Using the available elasticity of resources
Expanding or reducing your resources dynamically to meet changing demand is one of the main benefits of the cloud, so use this flexibility to your advantage. Through a variety of mechanisms that put the metrics you collected to good use, a workload in the cloud can automatically respond to changes with the optimal set of resources.
Elasticity is the ability to match the supply of cloud resources to the existing demand at any given moment. All three compute solutions we’ve been discussing provide mechanisms for elasticity either in combination with automatic scaling or as a feature of the service itself. In your architecture, you should be using elasticity to ensure that you are capable of meeting performance requirements at any scale of use.
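For EC2-based workloads, a common elasticity mechanism is a target-tracking policy on an Auto Scaling group. The parameters below sketch such a policy as you would pass it to boto3's `put_scaling_policy`; the group and policy names are placeholders:

```python
# Sketch of a target-tracking scaling policy for an EC2 Auto Scaling group.
# "my-asg" and "keep-cpu-at-50" are illustrative names, not real resources.
policy = {
    "AutoScalingGroupName": "my-asg",
    "PolicyName": "keep-cpu-at-50",
    "PolicyType": "TargetTrackingScaling",
    "TargetTrackingConfiguration": {
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 50.0,  # keep average CPU near 50%
    },
}

# With credentials configured:
#   import boto3
#   boto3.client("autoscaling").put_scaling_policy(**policy)
```

Target tracking handles both scale-up and scale-down automatically, adding and removing instances to hold the metric near the target.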
Metrics for scaling elastic resources up or down need to be validated against the type of workload being deployed. For example, you wouldn't scale a video transcoding application on CPU utilization, since that metric is expected to sit at 100%. In this case, you would probably find it more useful to scale on the depth of the queue of transcoding jobs waiting to be processed.
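A minimal sketch of that idea: size the fleet to the job backlog instead of CPU. The jobs-per-instance figure is a workload-specific assumption, as are the capacity bounds:

```python
import math

def desired_capacity(queue_depth: int, jobs_per_instance: int,
                     min_cap: int = 1, max_cap: int = 20) -> int:
    """Backlog-based scaling sketch: size the fleet to the queue of
    pending transcoding jobs rather than CPU (which sits at 100% by
    design). jobs_per_instance is the backlog one instance can work
    through within its deadline; the value is workload-specific."""
    target = math.ceil(queue_depth / jobs_per_instance) if queue_depth else min_cap
    return max(min_cap, min(max_cap, target))

# Scale-up: 120 queued jobs at 10 per instance -> 12 instances.
# Scale-down: backlog drains to 15 jobs -> 2 instances.
```

The same function drives both directions: as the backlog drains, the desired capacity falls back toward the minimum.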
Finally, when it comes to elasticity, do not forget that scale-down events are just as important as scale-up events, especially when considering costs and budgets. Scaling resources down when demand decreases is key to performance and overall efficiency. To ensure you're prepared, it's best practice to create test scenarios for scale-down events; good results confirm that your workload is behaving as expected.