Supercharging A/B Testing at Uber

Introduction

“Immensely laborious calculations on inferior data may increase the yield from 95 to 100 percent. A gain of 5 percent, of perhaps a small total. A competent overhauling of the process of collection, or of the experimental design, may often increase the yield ten- or twelve-fold, for the same cost in time and labor. To consult the statistician after an experiment is finished is often merely to ask him to conduct a post mortem examination. He can perhaps say what the experiment died of. To utilize this kind of experience he must be induced to use his imagination, and to foresee in advance the difficulties and uncertainties with which, if they are not foreseen, his investigations will be beset.” (R. A. Fisher’s Presidential address to the 1st Indian Statistical Congress)

While the statistical underpinnings of A/B testing are a century old, building a correct and reliable A/B testing platform and culture at large scale is still a massive challenge. As Fisher's observation above suggests, carefully constructing the building blocks of an A/B platform and ensuring that the data collected is correct are critical to trustworthy experiment results, yet both are easy to get wrong. Uber went through this journey, and this blog post describes why and how we rebuilt our A/B testing platform.

Uber had an experimentation platform called Morpheus, built more than seven years ago in the company's early days to handle both feature flagging and A/B testing. Uber has since significantly outgrown Morpheus in terms of scale, users, use cases, etc.
