EFlow - Wrangling Millions of ML Flows
Salesforce Einstein operates many machine learning applications that cater to a variety of use cases inside Salesforce — vision and language, but also classic machine learning (ML) approaches that use tabular data to classify customer cases, score leads, and more. We also offer many of these ML solutions and applications to our customers. Each one of our ML applications needs to work reliably for potentially any of our business customers, so we face the challenging problem of scaling to tens or, in some cases, hundreds of thousands of independent machine learning model lifecycles, with at least one model per customer. Furthermore, we have strong isolation requirements for our customers' data, meaning the lifecycles of both the data and the models that depend on that data need to be independent to meet compliance and legal requirements. We call this the multi-tenancy scaling problem. It defines the fundamental scale challenge we face across our stack, first and foremost in our ML operations and infrastructure, for both offline and online environments and services, but also in lower-level model architecture design and the choice of machine learning approaches.
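To make the multi-tenancy scaling problem concrete, here is a minimal sketch in Python. The names (`TenantModelLifecycle`, `LifecycleState`, the tenant and application lists) are hypothetical illustrations, not EFlow's actual API; the point is only how quickly "at least one model per customer, per application" grows into a very large set of fully independent lifecycles.

```python
from dataclasses import dataclass
from enum import Enum


class LifecycleState(Enum):
    """Hypothetical lifecycle states; EFlow's real state model is not shown here."""
    AWAITING_DATA = "awaiting_data"
    TRAINING = "training"
    SERVING = "serving"
    RETIRED = "retired"


@dataclass
class TenantModelLifecycle:
    """One independent model lifecycle per tenant, so data and models never cross tenant boundaries."""
    tenant_id: str
    application: str
    state: LifecycleState = LifecycleState.AWAITING_DATA


# Illustrative scale only: at least one model per customer, per ML application.
customer_ids = [f"org-{i}" for i in range(100_000)]
ml_applications = ["case_classification", "lead_scoring"]

lifecycles = {
    (tenant, app): TenantModelLifecycle(tenant_id=tenant, application=app)
    for tenant in customer_ids
    for app in ml_applications
}

print(f"{len(lifecycles):,} independent model lifecycles to manage")  # 200,000
```

Each of these lifecycles has to be scheduled, monitored, retried, and audited on its own, which is what turns an otherwise routine workflow problem into the scale challenge described above.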
Multi-tenancy and EFlow
The multi-tenancy problem and the need for independent models, at least one per customer, translate operationally into managing hundreds of thousands of ML flows reliably, with minimal disruption and overhead. Over several years we managed, operated, and evaluated several obvious options and tools prevalent in workflow management, including Airflow, Argo, and Azkaban. While operating them, we realized the hard way that these historically p...