Trusting the Untestable: Validation and Diagnostics for Doubly Robust Models

[Lyft Engineering](https://eng.lyft.com/?source=post_page---publication_nav-25cd379abb8-00853df009df---------------------------------------)

Stories from Lyft Engineering.

Written by Ross Chu and Shima Nassiri

The Causal Frontier: Measurement Beyond Randomization

The gold standard for determining the causal impact of a policy or product change at a company like Lyft is the A/B test (randomized experiment). By randomly assigning users to a treatment or control group, A/B tests remove systematic differences between the groups in expectation, yielding unbiased estimates of the Average Treatment Effect (ATE). However, many critical business questions and large-scale initiatives simply cannot be randomized, which forces scientists to move past traditional experimentation and rely on quasi-experimental methods.
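
To make the contrast concrete, here is a minimal simulation sketch, not taken from the original post. It assumes a single hypothetical confounder `x`, a linear outcome, and a true ATE of 2.0; the function name `simulate` and all parameter values are illustrative. It compares a naive difference in means under randomized assignment with the same estimator when assignment depends on `x`.

```python
import numpy as np


def simulate(n=200_000, true_ate=2.0, seed=0):
    """Return (randomized_estimate, confounded_estimate) of the ATE.

    All quantities here are hypothetical and chosen only for illustration.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)        # confounder that also drives the outcome
    noise = rng.normal(size=n)

    # Randomized assignment: a coin flip, independent of x.
    t_rand = rng.integers(0, 2, size=n)

    # Non-randomized assignment: units with larger x are more likely treated.
    p_treat = 1.0 / (1.0 + np.exp(-2.0 * x))
    t_conf = rng.binomial(1, p_treat)

    def diff_in_means(t):
        # Outcome depends on treatment, the confounder, and noise.
        y = true_ate * t + 3.0 * x + noise
        return y[t == 1].mean() - y[t == 0].mean()

    return diff_in_means(t_rand), diff_in_means(t_conf)


rand_est, conf_est = simulate()
print(f"randomized: {rand_est:.2f}, confounded: {conf_est:.2f}")
```

Under these assumptions the randomized estimate lands near the true effect, while the confounded estimate is inflated because treated units have systematically larger `x`. That gap is the bias quasi-experimental methods try to remove.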

We rely on non-randomized measurement in several key scenarios across Lyft:

  • Partnerships and Policies: The incremental impact of a partnership (e.g., linking two company accounts) usually has to be assessed without randomized assignment. These collaborations require coordinated operational work across both companies and are typically announced or promoted broadly, which makes controlled randomization impractical.
  • Long-Term Effect (LTE): Measuring effects that unfold over a lon...