Enabling Offline Inference at Uber Scale
At Uber, we use data from user support interactions to identify gaps in our products and create better, more delightful experiences for our users. Support interactions with customers include information about broken product experiences, any technical or operational issues they faced, and even their general sentiment towards the product and company. Understanding the root cause of a broken product experience requires additional context, such as details of the trip or the order. For example, the root cause of a customer issue about a delayed order might be a bad route given to the courier. In this case, we would want to attribute the poor customer experience to courier routing errors so that the Maps team can fix them.
Initially, we had human agents review a statistically significant sample of resolved support interactions. They would manually verify and label the resolved support issues and attribute root causes to different categories and subcategories of issue types. We wanted to build a proof-of-concept (POC) that automates and scales this manual process by applying ML and NLP algorithms daily to the semi-structured or unstructured data from all support interactions.
This article describes the approach we took and the end-to-end design of the data processing and ML pipelines for our POC, which we optimized to make such high-scale offline inference workflows easy for the engineers and data scientists on the team to build and maintain.
Problem
For our POC, we wanted to build out a reporting dashboard powered by offline ML inferences to identify the root cause...