Continuous Integration and Deployment for Machine Learning Online Serving and Models

Introduction

At Uber, we have witnessed a significant increase in machine learning adoption across organizations and use cases over the last few years. Our machine learning models enable a better customer experience, help prevent safety incidents, and ensure market efficiency, all in real time. The figure above gives a high-level view of CI/CD for models and the service binary.

Note that we have continuous integration (CI) and continuous deployment (CD) for both models and services, as shown above in Figure 1. We arrived at this solution after several iterations, addressing some of the MLOps challenges that emerged as the number of models trained and deployed grew rapidly. The first challenge was to support a large volume of model deployments every day while keeping the Real-time Prediction Service highly available. We discuss our solution in the Model Deployment section.

The second challenge was memory: the footprint of a Real-time Prediction Service instance grows as newly retrained models are deployed. A large number of models also increases the time required to download and load models when an instance starts or restarts. We observed that a large portion of older models received no traffic once newer models were deployed. We discuss our solution in the Model Auto-Retirement section.

The third challenge is associated with model rollout strategies. Machine learning engineers may choose to roll out models through different stages, such as shadow, testing, or experimentation. We observed some common patterns in model rollout strategies and decided to incorporate them into the Real...
