Cloud Cost Management with ML-based Resource Predictions (Part I)


As enterprises go through digital transformation to improve efficiency, increase value, and drive innovation, the adoption of cloud and Kubernetes has accelerated. However, there are many challenges. One of the major concerns when moving applications and services to the cloud is cost, and managing cloud costs has become a challenging task for businesses and organizations. Gartner published a report on this topic and proposed a well-defined framework for managing and optimizing the costs of public cloud services [1].

Among the key findings in this report is that most organizations are not prepared to take advantage of the savings that efficient use of cloud services offers and are likely to overspend. The report lists a series of recommendations and a Guidance Framework for managing cloud spending on an ongoing basis. The framework defines five distinct areas: Plan, Track, Reduce, Optimize, and Evolve. It provides a logical flow for developing and implementing capabilities to manage cloud spending.

ProphetStor uses machine-learning technologies as a unique approach to help organizations solve the cloud overspending problem. In this article, we demonstrate how ProphetStor's solution implements many of the recommendations in Gartner's Guidance Framework and how it can benefit customers using SUSE Rancher-managed clusters. Specifically, it forecasts resource usage from past operational metrics and uses those predictions for cost and budget planning, for tracking both past and predicted future usage, for reducing application cost by right-sizing resource allocation, and for optimizing performance and cost with intelligent horizontal pod autoscaling.

Planning for Resource Capacity and Budget

As stated in the Guidance Framework, any cost management task requires organizations to plan budgets and forecast consumption as accurately as possible. This applies both to deploying new applications in the public cloud and to migrating existing applications from on-premises into the public cloud. Using resource utilization metrics collected via metrics services such as Prometheus, Datadog, or Sysdig, the solution's AI engine predicts future resource usage and the projected cost for an entire cluster or an individual namespace or application.
At the beginning of any planning and modeling effort, the consumption forecast is based on assumptions, and as a result, it probably won't be a good match for the actual bill. The solution's machine learning-based algorithm builds a more accurate consumption forecast model, suited to continuous planning and budgeting on an ongoing basis.
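
To make the idea of a metrics-driven consumption forecast concrete, here is a minimal Python sketch. It is not ProphetStor's algorithm: it swaps in a plain least-squares trend line as a stand-in for the machine-learning model, and the usage samples, the price per core-hour, and all function names are hypothetical.

```python
from statistics import mean


def fit_linear_trend(values):
    """Least-squares line through (0, v0), (1, v1), ...; returns (slope, intercept)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    xs = list(range(n))
    x_bar, y_bar = mean(xs), mean(values)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def forecast_usage(values, horizon):
    """Extrapolate the fitted trend `horizon` steps past the last sample."""
    slope, intercept = fit_linear_trend(values)
    n = len(values)
    # Usage cannot be negative, so clamp the extrapolation at zero.
    return [max(0.0, intercept + slope * (n + i)) for i in range(horizon)]


def projected_cost(forecast, price_per_unit_hour):
    """Cost of the forecast period, assuming one sample per hour."""
    return sum(forecast) * price_per_unit_hour


# Hypothetical hourly CPU-core usage for one namespace.
history = [2.0, 2.5, 3.0, 3.5]
forecast = forecast_usage(history, horizon=3)
cost = projected_cost(forecast, price_per_unit_hour=0.04)
```

In practice the history would come from a metrics service such as Prometheus, and a real model would also need to capture seasonality and bursts that a straight line cannot.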

The Guidance Framework also proposes automatic forecasting in the CI/CD process. As many organizations adopt a “shift left” strategy that places the onus for quality, reliability, and uptime on application delivery teams, those teams are increasingly expected to forecast costs, optimize resources, and implement continuous optimization. This requires integrating forecasting into an automated CI/CD process. The solution's machine learning-based application resource forecasting can be integrated into such a process through open APIs, which return the forecast and recommendations for application resources. Furthermore, CI/CD integration sample scripts for any application are available directly from the GUI, making integration even more straightforward.
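
One way a pipeline step might consume such recommendations is sketched below in Python. The payload shape, the field names, and the 20% tolerance are assumptions made for illustration only; they do not describe the product's actual API or its sample scripts.

```python
import json


def check_requests(requested, recommended, tolerance=0.2):
    """Return findings for resources whose request deviates from the
    recommendation by more than `tolerance` (expressed as a fraction)."""
    findings = []
    for resource, rec in recommended.items():
        if rec <= 0:
            continue  # nothing meaningful to compare against
        req = requested.get(resource)
        if req is None:
            findings.append(f"{resource}: no request set, recommended {rec}")
            continue
        deviation = (req - rec) / rec
        if abs(deviation) > tolerance:
            direction = "over" if deviation > 0 else "under"
            findings.append(
                f"{resource}: {direction}-provisioned by {abs(deviation):.0%} "
                f"(requested {req}, recommended {rec})"
            )
    return findings


# Hypothetical recommendation payload, as a forecasting service might return it.
recommended = json.loads('{"cpu_millicores": 500, "memory_mib": 1024}')
# Hypothetical requests taken from the deployment manifest under review.
requested = {"cpu_millicores": 1000, "memory_mib": 1100}

findings = check_requests(requested, recommended)
exit_code = 1 if findings else 0  # a non-zero exit code would fail the CI step
```

Returning findings rather than exiting directly keeps the check easy to test, and lets a team choose between failing the build and merely posting a warning.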

Tracking Resource Usage and Cost Trends

Once the budget is established and applications are deployed to a public cloud environment, continuously tracking and maintaining visibility of cloud spending is essential. Many organizations save money simply by gaining visibility into who is spending money and on which project. However, visibility should not be limited to spending to date; it should also include cost trends based on the application's predicted workload. Knowing the predicted workload of an application has two primary benefits. First, users can learn the projected cost and whether it is still within budget. Second, users can tell whether an application is under-provisioned, which carries performance risk, or over-provisioned, which wastes spend.
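
Both checks can be expressed in a few lines of Python. This is a minimal sketch under stated assumptions: the 10% headroom, the 50% waste threshold, the figures, and the function names are hypothetical choices, not values taken from the Guidance Framework or from the product.

```python
def budget_status(spent_to_date, predicted_daily_costs, budget):
    """Combine actual spend with predicted spend and compare to the budget."""
    projected_total = spent_to_date + sum(predicted_daily_costs)
    return {
        "projected_total": projected_total,
        "variance": projected_total - budget,  # positive means over budget
        "on_budget": projected_total <= budget,
    }


def provisioning_status(allocated, predicted_peak, headroom=0.10, waste_threshold=0.50):
    """Classify an allocation against the predicted peak usage."""
    # Not enough room above the predicted peak: performance is at risk.
    if predicted_peak * (1 + headroom) > allocated:
        return "under-provisioned"
    # Predicted peak uses less than half of the allocation: spend is wasted.
    if predicted_peak < allocated * waste_threshold:
        return "over-provisioned"
    return "right-sized"


# Hypothetical month: $600 spent so far, 10 days left at a predicted $40 per day.
status = budget_status(600.0, [40.0] * 10, budget=900.0)
cpu_state = provisioning_status(allocated=8.0, predicted_peak=2.0)
```

With these example figures the projected total exceeds the budget by $100, and the 8-core allocation is flagged as over-provisioned because the predicted peak is only 2 cores.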
The tracking of resource usage and cost should not be limited to applications. As the Guidance Framework suggests, it is crucial for an organization to track cloud spend by project, especially since projects share common cloud resources. Understanding cost on a per-namespace basis allows users to track the cost, and its projection, for a specific project. It also gives organizations a way to quickly implement chargeback and showback strategies, at both the cluster level and the application level.
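
As an illustration of per-namespace showback, the following Python sketch splits a shared cluster bill in proportion to each namespace's resource usage. Proportional allocation is only one possible policy and is assumed here for simplicity; the namespace names and figures are hypothetical.

```python
def allocate_cost(cluster_cost, usage_by_namespace):
    """Split a shared cluster cost across namespaces in proportion to usage."""
    total_usage = sum(usage_by_namespace.values())
    if total_usage <= 0:
        raise ValueError("total usage must be positive to allocate cost")
    return {
        namespace: cluster_cost * usage / total_usage
        for namespace, usage in usage_by_namespace.items()
    }


# Hypothetical monthly CPU core-hours per namespace on a cluster that cost $1,000.
usage = {"checkout": 30.0, "search": 50.0, "batch": 20.0}
showback = allocate_cost(1000.0, usage)
```

Feeding predicted usage instead of measured usage into the same function would give the projected cost per project rather than the historical one.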

(Part II covers continuous rightsizing along with performance and cost optimization.)

SUSE Rancher Apps and Marketplace

Managing cloud costs is an essential and challenging task for organizations using cloud services to drive their business with greater efficiency. In partnership with SUSE, ProphetStor provides an effective cloud cost management solution for customers running applications on SUSE Rancher-managed clusters. Its ML-based cost management implements some of the most valuable recommendations from the Guidance Framework and brings substantial value to users adopting this framework. ProphetStor is currently available on SUSE Rancher Apps and Marketplace and is fully supported on both a SUSE Rancher instance and a Rancher open source project deployment.
For more information on, please visit
This blog was originally published on SUSE Blog.

Ming Sheu

EVP of Products


[1] Gartner Research: How to Manage and Optimize Costs of Public Cloud IaaS and PaaS, by Marco Meinardi and Traverse Clayton