In my job as a cloud architect working with large enterprises, there is always a specific “moment of truth” when the customer realizes that cloud costs need to be monitored daily to avoid unpleasant surprises in the end-of-month bill. Enterprises can take several steps toward understanding cloud costs and avoiding surprises:
- establish control and governance.
- analyze possible savings.
- plan the required changes to reach identified savings.
The main goal of cloud governance is to bring down the monthly bill, or to keep it level while absorbing the usual growth and new projects. But what if we looked at cost optimization from a different angle? At Microsoft we can calculate the saved carbon emissions using tools such as the Microsoft Sustainability Calculator. So what if our efforts were directed towards “carbon+cost efficiency” rather than simply cost saving?
Depending on the cloud maturity and model adopted by each customer, a few fundamental actions that are directly linked to cost (and carbon) saving can be taken.
Right-sizing is about understanding exactly what your applications need instead of provisioning by guesswork. Wrong-sizing costs more than a higher monthly bill: mirroring the exact sizing of your on-prem infrastructure without applying right-sizing can lead to larger monthly bills, reduced capacity for other customers, and unnecessary electricity usage. Right-sizing is not just picking a VM size for a workload; it also means planning to change the sizing during the day according to the workload and the carbon efficiency of the region where it runs. For a PaaS environment, it can mean changing a service plan according to the time of day and/or expected usage. Companies can even set usage expectations by giving their internal customers and end users an informed choice: a green option for an application, where users accept fewer features or even a slower connection, knowing that they are polluting less.
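To make the idea concrete, here is a minimal sketch of right-sizing from observed usage rather than from the on-prem specification. Everything in it is an assumption made for illustration: the `VmSize` catalog, the prices, and the 20% headroom are hypothetical and do not correspond to real Azure sizes or to any Azure API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmSize:
    name: str
    vcpus: int
    memory_gib: float
    hourly_cost: float  # illustrative price, not a real quote


def right_size(sizes, peak_vcpus, peak_memory_gib, headroom=0.2):
    """Return the cheapest size covering observed peak usage plus headroom.

    Returns None when no size in the catalog is large enough.
    """
    need_cpu = peak_vcpus * (1 + headroom)
    need_mem = peak_memory_gib * (1 + headroom)
    candidates = [
        s for s in sizes
        if s.vcpus >= need_cpu and s.memory_gib >= need_mem
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda s: s.hourly_cost)


# Hypothetical catalog; names and prices are made up.
CATALOG = [
    VmSize("small", 2, 8, 0.10),
    VmSize("medium", 4, 16, 0.20),
    VmSize("large", 8, 32, 0.40),
]

# An on-prem server with 8 cores whose measured peak is 2.5 cores and 9 GiB.
choice = right_size(CATALOG, peak_vcpus=2.5, peak_memory_gib=9)
print(choice.name)
```

Mirroring the on-prem machine would have selected the hypothetical "large" size; sizing from the measured peak selects a smaller one. The same function could be run per time-of-day window to plan resizing during the day.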
Reservations are typically a commercial discount on cloud services in exchange for a yearly or multi-year commitment. The cloud provider is happy with the commitment and the customer is happy with the discount, but this saving does not necessarily equate to a green usage of the reserved resources. To ensure your reservations minimize electricity usage, you need to monitor them frequently and keep them as close to full utilization as possible. Anything lower than 99% utilization must be right-sized.
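The 99% rule can be checked with a few lines of code. The sketch below assumes you have already exported, by whatever means, the used and reserved hours per reservation; the input format and the reservation names are hypothetical.

```python
def underused_reservations(usage, threshold=0.99):
    """Return {reservation name: utilization} for reservations below threshold.

    `usage` maps a reservation name to a (used_hours, reserved_hours) pair.
    """
    flagged = {}
    for name, (used, reserved) in usage.items():
        if reserved <= 0:
            continue  # nothing reserved, nothing to evaluate
        utilization = used / reserved
        if utilization < threshold:
            flagged[name] = utilization
    return flagged


# Hypothetical monthly figures (720 reserved hours each).
sample = {
    "web-tier": (720, 720),
    "batch": (540, 720),
    "db": (716, 720),
}

flagged = underused_reservations(sample)
for name, utilization in flagged.items():
    print(f"{name}: {utilization:.0%} utilized, candidate for right-sizing")
```

Running a check like this on a schedule turns the monitoring advice into a recurring report instead of a manual review.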
Cleanup is the most difficult part of cost management, especially within large organizations and cloud deployments. When you have thousands of VMs and applications running, it’s quite difficult to scour the cloud tools for inconsistencies. For example, if you delete a VM and forget to delete its disks or IP address, those resources remain allocated, impacting your monthly bill as well as your carbon footprint. So cleanup should be the first step toward carbon+cost efficiency. This is typically a task that can be automated and is included in most optimization tools, such as Azure Cost Management.
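As a rough illustration of what such automation looks for, the sketch below scans an inventory for disks and IP addresses whose VM no longer exists. The inventory format (a list of dictionaries with `name`, `type`, and `attached_to` keys) is an assumption made for this example; it is not the output format of Azure Cost Management or of any other tool.

```python
def find_orphans(resources):
    """Return sorted names of disks and public IPs not attached to an existing VM."""
    vm_names = {r["name"] for r in resources if r["type"] == "vm"}
    return sorted(
        r["name"]
        for r in resources
        if r["type"] in {"disk", "public_ip"}
        and r.get("attached_to") not in vm_names
    )


# Hypothetical inventory: "vm-02" was deleted, but its disk and IP were left behind.
inventory = [
    {"name": "vm-01", "type": "vm"},
    {"name": "vm-01-osdisk", "type": "disk", "attached_to": "vm-01"},
    {"name": "vm-02-osdisk", "type": "disk", "attached_to": "vm-02"},
    {"name": "vm-02-ip", "type": "public_ip", "attached_to": "vm-02"},
    {"name": "spare-disk", "type": "disk", "attached_to": None},
]

orphans = find_orphans(inventory)
print(orphans)
```

A report like this is a safer first step than automatic deletion, since an unattached disk may still hold data someone needs.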
The last pillar of carbon+cost management is scheduling operations. This means being able to switch most of your servers, applications, and services off and on again. With PaaS, it could be as simple as changing the service tier to a less costly one during off-peak hours. Initially, many customers are hesitant to switch off applications on a schedule. However, in my experience, customers who test scheduling on a small number of applications quickly see the cost benefits and become open to scheduling more of them. The first step is to enforce some type of scheduling, even if it is just turning the application off on weekends or at night.
Once customers start understanding cloud operations, the next question they should ask is: “Why keep a valuable and costly resource switched on, even for just one hour, if it’s not being used?” A carbon+cost efficient scheduler must consider:
- The time it takes to bring the application down and back up. For example, if the whole restart takes one hour, then the scheduled off window cannot be shorter than two hours.
- The real usage of the workloads. For example, if there is only one user logged into the portal at night, the customer may consider not offering that service or application at night.
- The available options. Scheduling should not be just an on-off choice, but should also include reducing PaaS service plans and resource tiers where possible.
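The three considerations above can be combined into a simple decision rule. The sketch below is one possible reading, not a definitive scheduler: the function name, the `min_users` threshold, and the rule that an off window must be at least twice the restart time (taken from the one-hour/two-hour example above) are all assumptions.

```python
def plan_window(window_hours, restart_hours, active_users,
                min_users=2, has_lower_tier=False):
    """Suggest an action for a candidate off-peak window.

    Returns "keep-running", "switch-off", or "scale-down".
    """
    # Real usage: enough users justify keeping the service as it is.
    if active_users >= min_users:
        return "keep-running"
    # Restart time: only switch off if the window is at least twice
    # as long as a full restart (assumed rule, see lead-in).
    if window_hours >= 2 * restart_hours:
        return "switch-off"
    # Not just on-off: fall back to a cheaper plan or tier when one exists.
    if has_lower_tier:
        return "scale-down"
    return "keep-running"


print(plan_window(window_hours=8, restart_hours=1, active_users=1))
print(plan_window(window_hours=1.5, restart_hours=1, active_users=0,
                  has_lower_tier=True))
print(plan_window(window_hours=8, restart_hours=1, active_users=50))
```

The order of the checks is a design choice: usage is evaluated first so that a busy application is never switched off, however long the window is.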
Carbon+cost management should become a new standard that involves the entire company in becoming more efficient and greener. Application owners and developers should be involved in keeping infrastructure as close to carbon neutral as possible, starting with cleanup, scheduling, and right-sizing, and then building the principles of sustainable software into the application itself. It’s important to look beyond the raw performance and financial costs of your infrastructure and start considering the ethical cost of how much carbon it emits. Visit Azure’s cloud cost optimization page for cost optimization techniques and suggestions, and check this blog periodically to preview our new tools and ideas.
Originally published in the Microsoft Tech Blog