Model Excellence Scores: A Framework for Enhancing the Quality of Machine Learning Systems at Scale

Machine learning (ML) is integral to Uber’s operational strategy, influencing a range of business-critical decisions. This includes predicting rider demand, identifying fraudulent activities, enhancing Uber Eats’ food discovery and recommendations, and refining estimated times of arrival (ETAs). Despite the growing ubiquity and impact of ML in various organizations, evaluating model “quality” remains a multifaceted challenge. A notable distinction exists between online and offline model assessment. Many teams primarily focus on offline evaluation, occasionally complementing this with short-term online analysis. However, as models become more integrated and automated in production environments, continuous monitoring and measurement are often overlooked.

Commonly, teams concentrate on performance metrics such as AUC and RMSE, while neglecting other vital factors like the timeliness of training data, model reproducibility, and automated retraining. This lack of comprehensive quality assessment leads to limited visibility for ML engineers and data scientists regarding the various quality dimensions at different stages of a model’s lifecycle. Moreover, this gap hinders organizational leaders from making fully informed decisions regarding the quality and impact of ML projects.
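To make the distinction concrete, here is a minimal sketch of reporting a performance metric (RMSE) alongside one of the neglected dimensions (training data timeliness). The names `QualityReport` and `assess`, and the 30-day `max_age` threshold, are hypothetical choices for illustration; they are not taken from Uber's framework.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import sqrt


@dataclass
class QualityReport:
    """Hypothetical container pairing a performance metric with a freshness check."""
    rmse: float
    training_data_age: timedelta
    data_is_fresh: bool


def rmse(y_true, y_pred):
    """Root mean squared error over two equal-length, non-empty sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = ((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return sqrt(sum(squared_errors) / len(y_true))


def assess(y_true, y_pred, last_trained_on, now, max_age=timedelta(days=30)):
    """Report RMSE together with the age of the training data.

    max_age is an assumed threshold, not a value from the source article.
    """
    age = now - last_trained_on
    return QualityReport(
        rmse=rmse(y_true, y_pred),
        training_data_age=age,
        data_is_fresh=age <= max_age,
    )


report = assess(
    y_true=[3.0, 5.0],
    y_pred=[2.0, 6.0],
    last_trained_on=datetime(2024, 1, 1),
    now=datetime(2024, 3, 1),
)
print(report)
```

In this example the model can score well on RMSE while still being flagged for stale training data, which is the kind of blind spot a performance-only view would miss.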

To bridge this gap, we propose defining distinct dimensions for each phase of a model’s lifecycle, encompassing prototyping, training, deployment, and prediction (see Figure 1). By integrating the Service Level Agreement (SLA) concept, we aim to establish a standard for measuring and ensuring ML model quality. Additionally, we are developing a unified system to track and visualiz...
