Pushing the Boundaries of Experimentation at DoorDash with Interleaving Designs

We've traditionally relied on A/B testing at DoorDash to guide our decisions. However, when precision and speed are crucial, this method often falls short. A/B tests have limited sensitivity (the ability to detect real differences between groups), which can leave users exposed to suboptimal changes for extended periods. For example, in our search and ranking use cases, achieving reliable results often required several weeks of testing across all traffic. That not only delays the introduction of new ideas but also prolongs the negative impact of underperforming changes.

Interleaving design offers significantly higher sensitivity, more than 100 times that of traditional methods, by testing multiple conditions simultaneously on the same user, as shown in Figure 1. It generally gives a more accurate and granular picture of user preferences, allowing us to iterate more quickly and with higher confidence.

Figure 1: In a traditional A/B design, users see only one treatment variant. In an interleaving design, users are exposed to multiple treatments simultaneously, which significantly improves test sensitivity.
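
To make the idea in Figure 1 concrete, here is a minimal sketch of team-draft interleaving, one widely used interleaving scheme from the search-evaluation literature. It is an illustration under stated assumptions and not necessarily the method DoorDash uses: the function names (`team_draft_interleave`, `credit_clicks`) and the store IDs are hypothetical. Two rankers take turns contributing their best remaining item to a single list shown to one user, and each click is credited to the ranker that contributed the clicked item.

```python
import random
from typing import Dict, List, Sequence, Tuple


def team_draft_interleave(
    ranking_a: Sequence[str],
    ranking_b: Sequence[str],
    rng: random.Random,
) -> Tuple[List[str], Dict[str, str]]:
    """Merge two rankings into one list and record which ranker
    ("A" or "B") contributed each item. Hypothetical helper."""
    rankings = {"A": ranking_a, "B": ranking_b}
    total_items = len(set(ranking_a) | set(ranking_b))
    interleaved: List[str] = []
    team_of: Dict[str, str] = {}
    picks = {"A": 0, "B": 0}

    while len(interleaved) < total_items:
        # The ranker with fewer picks goes next; a coin flip breaks ties.
        if picks["A"] != picks["B"]:
            order = ["A", "B"] if picks["A"] < picks["B"] else ["B", "A"]
        else:
            order = ["A", "B"] if rng.random() < 0.5 else ["B", "A"]

        for team in order:
            # Take this ranker's highest-ranked item not yet placed.
            pick = next((x for x in rankings[team] if x not in team_of), None)
            if pick is not None:
                interleaved.append(pick)
                team_of[pick] = team
                picks[team] += 1
                break  # one pick per turn; fall through only if exhausted

    return interleaved, team_of


def credit_clicks(clicked: Sequence[str], team_of: Dict[str, str]) -> Dict[str, int]:
    """Count clicks per ranker for a single user session."""
    wins = {"A": 0, "B": 0}
    for item in clicked:
        if item in team_of:
            wins[team_of[item]] += 1
    return wins


# Example with made-up store IDs: one user sees results from both rankers.
merged, team_of = team_draft_interleave(
    ["s1", "s2", "s3"], ["s3", "s1", "s4"], random.Random(0)
)
session_credit = credit_clicks(["s3"], team_of)
```

Because both rankers are compared within the same session, differences between users (appetite, location, time of day) affect both sides equally, which is the intuition behind the sensitivity gain described above.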

In this post, we dive into how we’ve implemented interleaving designs at DoorDash. We also explore how we’ve refined the design to be even more sensitive than what is reported in the industry (see Table 1), discuss the challenges we’ve faced, and provide recommendations for handling those challenges.

Table 1: This table highlights reported sensitivity improvements across various companies that used interleaving. In this post, we explore why DoorDash has...
