Supercharging A/B Testing at Uber

出处：www.uber.com

存档：存档

译文：中文

摘要

In early 2020, we took a deeper look at this ecosystem. We discovered that a large percentage of the experiments had fatal problems and often needed to be rerun. Obtaining high-quality results required an expert-level understanding of experimentation and statistics, and an inordinate amount of toil (custom analysis, pipelining, etc.). This slowed down decision-making, and re-running poorly conducted experiments was common.

After assessing the customer problems and the internals of Morpheus we concluded that the core abstractions supported only a very narrow set of experiment designs correctly, and even minute deviations from such designs resulted in incomparable cohorts of users in control and treatment and compromised experiment results. To give a very simple example, while gradually rolling out an experiment that was split 30/70 between control and treatment, due to peculiarities of rollout and treatment assignment logic it would be ok to roll it out to 10% of users but not to 5%. Furthermore, the system was not able to support advanced experiment configurations needed to support Uber’s diverse use cases, or other advanced functionality at scale such as monitoring/rolling back experiments that were negatively impacting business metrics. So we decided to build a new platform from scratch with correct abstractions.

阅读原文

xiaozi 于 2022-07-22 分享

8195

关联话题： #Uber

欢迎在评论区写下你对这篇文章的看法。

Supercharging A/B Testing at Uber

Supercharging A/B Testing at Uber

摘要

评论

文库