Inside Salesforce's Scalable Time Series Forecasting AI Platform

By Ahad Shoaib & Kyle Gilson.

Salesforce operates data centers worldwide and continuously monitors infrastructure health metrics in real time. Accurate demand forecasting is essential for provisioning infrastructure capacity. Insufficient capacity can lead to customer-impacting incidents, while excess capacity may cause budget overruns. Teams such as capacity planning, finance, and performance engineering depend on reliable forecasts to ensure cloud infrastructure scales effectively, maintaining high availability and cost efficiency.

In early 2023, the Infrastructure Data Science (InfraDS) team faced a challenge: expanding infrastructure health forecasting to cover all 100+ services at Salesforce, rather than the five critical services previously focused on. Drastically scaling the number of data scientists was clearly not the right answer. Instead, the team built a new configuration-driven Time Series Forecasting Platform designed to manage this increased scale.

As a result, the platform has grown from five to more than 70 forecasting use cases and generates millions of time series forecasts daily. Moreover, the time required to deploy new models has decreased from weeks to days. This expansion illustrates how Salesforce has scaled its time series AI platform to meet the demands of its multi-cloud, billion-dollar infrastructure.

Forecasting at scale presents unique challenges due to the lack of a universal modeling approach. Each new use case compels data scientists to balance model accuracy, hierarchical coherence, awareness of concept drift, and resilience. For instance,...
