The world’s most influential digital platforms, such as Amazon, Netflix, Facebook, and LinkedIn, constantly (machine) learn to offer better recommendations and advice. It is becoming clear that the best advice we now receive is more likely to come from intelligent machines than from smart people [2,3]. “Recommender systems are the most important AI system of our time, the most important machine-learning pipeline today,” Nvidia CEO and co-founder Jensen Huang said in Time Magazine in 2021 [4]. “Recommender systems predict your needs and preferences from past interactions with you, your explicit preferences, and learned preferences using collaborative and content filtering methods.” Recommendation engines represent a global revolution in how choice can be personalized, packaged, presented, experienced, and understood.
With recommendation engines (we use “recommendation systems” and “recommendation engines” interchangeably) in the consumer world, people can make impactful decisions based on deeper insights and recommendations that were not available before. However, although recommendation engines are readily available in the consumer sector, the IT world has yet to have one that helps users automate and optimize operations in the Cloud, a task that is considered complicated and typically requires the involvement of certified cloud architects and engineers.
The State of Digital Transformation
ProphetStor’s DataProphet Recommendation Engine Is the Core of the Optimization Solution
Visibility
Datadog, Prometheus, Sysdig, Splunk, Dynatrace, etc., are the leading solution providers of visibility in the Cloud. They offer a single pane of glass that presents data collected from multiple sources, visualizes operation data, and provides clues about operational trends. This approach helps customers regain visibility by bringing together data that was previously scattered across multiple visibility solutions.
Cloud service providers also offer monitoring services and solutions to customers. Examples are Azure Monitor, CloudWatch (AWS), GreenLake (HPE), Cloud Monitoring (Google), etc. Because the data they collect is limited to the instances (such as the utilization of resources in the instances), they do not offer metrics related to applications or virtualization platforms, logs and traces from various elements, or correlations for workload dynamics. As a result, they cannot be used effectively for planning, issue isolation, or mitigation. In addition, these solutions do not offer recommendations that help users choose resources for application resilience and performance acceleration.
Security
Efficiency
The ways to use resources in the Cloud and on-prem are very different. In the conventional on-prem infrastructure, it takes weeks, if not months, to procure, install, configure, and integrate the equipment before it can be operational. It is less likely that the resources can be adapted to the dynamic nature of the workload. In addition, the sunk cost of the on-prem equipment makes it less necessary to watch for the efficient usage of the resources.
On the other hand, when applications run in the Cloud, the cost of operation depends on the hour-by-hour, or even minute-by-minute, usage of cloud resources. The cost can only be contained by shifting the mindset from CAPEX to OPEX. As a result, keeping track of the cost and visibility of operations becomes everyone’s responsibility in an enterprise, and adapting resources to achieve the seemingly conflicting objectives of operational resilience and low cost becomes complicated.
Many users adopt an over-provisioning strategy to address resilience, allocating fixed resources much larger than necessary in the hope of avoiding service disruptions. This defeats the purpose of OPEX, and a workload spike that exceeds the fixed allocation can still cause the application to stop due to insufficient resources.
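The trade-off described above can be made concrete with a minimal sketch that compares a fixed, over-provisioned allocation with one that follows demand hour by hour. The workload figures, the unit price, the 20% headroom, and all function names are hypothetical and chosen only for illustration; they are not taken from any provider's price list or from Federator.ai. The demand-following case also assumes demand is known in advance, which a prediction engine can only approximate.

```python
# Hypothetical hourly CPU demand (in vCPUs) for one day, with a one-hour spike.
HOURLY_DEMAND = [2, 2, 2, 2, 3, 4, 6, 8, 8, 7, 6, 6,
                 6, 7, 8, 20, 8, 6, 5, 4, 3, 2, 2, 2]
PRICE_PER_VCPU_HOUR = 0.04  # assumed unit price, illustration only
FIXED_VCPUS = 16            # assumed over-provisioned fixed allocation


def fixed_allocation_cost(demand, allocated, price):
    """Cost of reserving the same number of vCPUs for every hour."""
    return allocated * len(demand) * price


def demand_following_cost(demand, price, headroom=0.2):
    """Cost when each hour's allocation is that hour's demand plus headroom."""
    return sum(d * (1 + headroom) for d in demand) * price


def hours_under_provisioned(demand, allocated):
    """Number of hours in which demand exceeds the fixed allocation."""
    return sum(1 for d in demand if d > allocated)


fixed = fixed_allocation_cost(HOURLY_DEMAND, FIXED_VCPUS, PRICE_PER_VCPU_HOUR)
adaptive = demand_following_cost(HOURLY_DEMAND, PRICE_PER_VCPU_HOUR)
shortfall = hours_under_provisioned(HOURLY_DEMAND, FIXED_VCPUS)

print(f"fixed {FIXED_VCPUS} vCPUs: ${fixed:.2f}/day, {shortfall} hour(s) short")
print(f"demand + 20% headroom: ${adaptive:.2f}/day")
```

With these assumed numbers, the fixed allocation costs more than twice as much per day as the demand-following one and still falls short during the spike hour.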
ProphetStor’s mission is to optimize the performance and cost of cloud operations in a machine-based, proactive manner. Federator.ai takes the operation metadata, analyzes it to discover correlations and impacts, (machine) learns and builds operation models, and recommends resource allocation and orchestration for operational resilience and efficiency.
During this process, we resolved many issues that had been deemed too computationally hard to yield a viable solution:
- Conventional data science approaches look for correlations across ALL data sources. This inevitably becomes computation-intensive when processing time series from multiple data sources in multiple layers. ProphetStor’s DataProphet Recommendation Engine focuses on multi-layer correlations from top to bottom, avoiding unnecessary calculations of relations among sources that are not needed in decision-making. The resulting solution reduces the computation needed for an effective recommendation engine to develop actionable directives by a factor of more than 1,000. For a detailed description of the multi-layer correlation, please see the whitepaper in [7].
- ProphetStor’s CrystalClear Time Series Analysis Engine is up to 23 times more efficient than Facebook Prophet [5] and LinkedIn Greykite [6], with a much better Mean Absolute Percentage Error (MAPE), as shown in [7], for use in the multi-layer correlation computations.
Combining the two points above, Federator.ai offers a cost-effective, computationally feasible, and scalable solution for full-stack insight, suitable for service mesh or data mesh environments, that saves on the order of a thousandfold in computation and makes the recommendation engine feasible.
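For reference, the MAPE metric cited in the comparison above is conventionally the average of the absolute forecast errors, each expressed as a percentage of the actual value. The sketch below shows that standard definition only; the sample values are hypothetical, and skipping points whose actual value is zero is one common convention, not necessarily the one used in [7].

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, returned in percent.

    Points whose actual value is zero are skipped, because the
    percentage error is undefined there (an assumed convention).
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("MAPE is undefined when all actual values are zero")
    return 100.0 * sum(abs((a - p) / a) for a, p in pairs) / len(pairs)


# Hypothetical observed vs. forecast CPU utilization (percent).
observed = [50.0, 60.0, 80.0, 40.0]
forecast = [55.0, 57.0, 72.0, 44.0]
print(f"MAPE = {mape(observed, forecast):.2f}%")
```

A lower MAPE means the forecast tracks the observed workload more closely, which matters here because resource recommendations are derived from the predicted workload.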
With ProphetStor’s DataProphet Recommendation Engine, we can turn passive management, which waits for issues to happen, into proactive resource orchestration that addresses both the resilience and the cost of operation in the next phase of the IT journey to Cloud Native.
The DataProphet Recommendation Engine is an innovative way of handling the so-called North-South Insight (across layers of the IT infrastructure, from the application down to the server or Cloud instance) and East-West Dynamics (how applications and microservices react to workload dynamics). In addition, it applies time-based analysis to gain foresight into future dynamics. As a result, it is a comprehensive platform that serves as a foundation for working with other market solutions. Again, without the computation savings from this design, full-scale insight into the past, present, and future would not be economical enough to be useful. Federator.ai, with the DataProphet Recommendation Engine at its core, provides recommendations for operations.
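The whitepaper in [7] describes the actual multi-layer correlation method. The sketch below only illustrates, under stated assumptions, why limiting correlation to adjacent layers along known dependencies needs less work than correlating every pair of sources. The topology, the entity names, the number of metrics per entity, and the counting rule are all hypothetical and are not ProphetStor's algorithm.

```python
# Hypothetical dependency map: each entity lists the entities directly below it
# (application -> services -> pods -> nodes).
CHILDREN = {
    "app": ["svc-a", "svc-b"],
    "svc-a": ["pod-a1", "pod-a2"],
    "svc-b": ["pod-b1", "pod-b2"],
    "pod-a1": ["node-1"],
    "pod-a2": ["node-2"],
    "pod-b1": ["node-1"],
    "pod-b2": ["node-2"],
    "node-1": [],
    "node-2": [],
}
METRICS_PER_ENTITY = 10  # assumed number of time series per entity


def all_pairs_count(children, metrics_per_entity):
    """Correlations needed if every series is compared with every other one."""
    total_series = len(children) * metrics_per_entity
    return total_series * (total_series - 1) // 2


def top_down_count(children, metrics_per_entity):
    """Correlations needed if each entity is compared only with the
    entities directly below it in the dependency map."""
    edges = sum(len(kids) for kids in children.values())
    return edges * metrics_per_entity * metrics_per_entity


all_pairs = all_pairs_count(CHILDREN, METRICS_PER_ENTITY)
top_down = top_down_count(CHILDREN, METRICS_PER_ENTITY)
print(f"all pairs: {all_pairs}, top-down along dependencies: {top_down}")
```

In this toy topology the saving is modest, but the all-pairs count grows quadratically with the number of series while the top-down count grows with the number of dependency edges, so the gap widens as the environment grows.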
Here are some examples of application scenarios:
- Working with monitoring services/solutions, such as Datadog, Sysdig, Prometheus, Azure Monitor, CloudWatch (AWS), GreenLake (HPE), Cloud Monitoring (Google), etc.: DataProphet interacts with the data sources through a standard data adaptation layer via published APIs. It uses the data retained by the monitoring solutions, fetching months of historical data instead of waiting for sufficient operation data to be collected. This dramatically shortens the time-to-value from days to hours. Federator.ai turns the collected metadata into intelligent operation recommendations to accelerate applications, enable planning, remove uncertainty, answer “what-if” questions for workload accommodations, and optimize the operation cost in an on-prem, hybrid cloud, or multi-cloud environment. In other words, Federator.ai can effectively turn visibility into continuous optimization in operations.