Causal Forecasting at Lyft (Part 1)
By Duane Rich and Sameer Manek
Efficiently managing our marketplace is a core objective of Lyft Data Science. That means providing meaningful financial incentives to drivers in order to supply affordable rides while keeping ETAs low under changing market conditions — no easy task!
Lyft’s tool chest contains a variety of market management products: rider coupons, driver bonuses, and pricing, to name a few. Using these efficiently requires a strong understanding of their downstream consequences — everything from counts of riders opening the Lyft app (“sessions”) to financial metrics.
To complicate the science further, our data is heavily confounded by our previous decisions, so a merely correlational model would fail us. Sifting out causal relationships is the only option for making smart, forward-looking decisions.
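To make the confounding problem concrete, here is a minimal simulation sketch. It is not Lyft's model: the variables (`demand`, `coupon`, `rides`), the coefficients, and the past policy of issuing more coupons when demand is low are all assumptions invented for illustration. It compares a naive regression of rides on coupons with one that adjusts for the confounder by regressing both variables on demand first and then regressing the residuals on each other.

```python
import random


def slope(x, y):
    """OLS slope of y on x (intercept included implicitly by centering)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var


def residuals(x, y):
    """What is left of y after removing its linear dependence on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = slope(x, y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]


def simulate(n=20000, true_effect=2.0, seed=0):
    """Hypothetical market: demand drives both coupon spend and rides."""
    rng = random.Random(seed)
    demand = [rng.gauss(0, 1) for _ in range(n)]
    # Assumed past policy: spend more on coupons when demand is low.
    coupon = [-1.0 * d + rng.gauss(0, 1) for d in demand]
    # Assumed ground truth: coupons raise rides by `true_effect` per unit.
    rides = [
        true_effect * c + 5.0 * d + rng.gauss(0, 1)
        for c, d in zip(coupon, demand)
    ]
    return demand, coupon, rides


demand, coupon, rides = simulate()

# Correlational estimate: ignores that demand influenced coupon spend.
naive = slope(coupon, rides)

# Adjusted estimate: partial out demand from both sides, then regress.
adjusted = slope(residuals(demand, coupon), residuals(demand, rides))

print(f"naive slope:    {naive:.2f}")
print(f"adjusted slope: {adjusted:.2f}")
```

Under these made-up coefficients the naive slope comes out negative (about -0.5 in expectation), suggesting coupons reduce rides, while the adjusted slope lands near the assumed true effect of 2.0. The adjustment only works here because the confounder is observed and the relationships are linear; real marketplace data is rarely that convenient.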
In two blog posts, we’ll explain our solution, an internal product we will refer to as Lyft’s “Causal Forecasting System¹”. The first (this post) will discuss the business problem, basic principles, and modeling techniques. The second will describe the software we use to implement these ideas and apply them at scale. In doing so, we’ll cover our use of causal inference, causal modeling, and PyTorch to develop a large model that contains Lyft’s consensus view of our business and ultimately drives large capital allocation decisions. Let’s begin!
The Task
Lyft is internally organized around product-focused teams. These include teams focused on driver bonuses, rider coupons, pricing, and activating new drivers. Each is primarily focused on metrics dire...