Using Marketplace Marginal Values to Address Interference Bias

As illustrated in Figure 2, when supply is abundant, the marginal value of having rider R1 matches its face value of $6. When supply is scarce, however, the limited resource is allocated to rider R2, so both the marginal and face values for R1 become zero. For rider R2, the face value is $10, but its marginal value is only the additional $4 gained by having rider R2 in the market. This demonstrates how the marginal value inherently accounts for resource contention. By aggregating the marginal values across the treatment and control groups and taking the difference, one can derive an unbiased estimator of the average treatment effect.
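To make the arithmetic concrete, here is a minimal sketch (the helper `best_total` is a hypothetical stand-in, not Lyft's dispatch code) that reproduces the Figure 2 numbers by defining a rider's marginal value as the matching objective with the rider present minus the objective without them:

```python
# A rider's marginal value is the matching objective with that rider present
# minus the objective without them. `best_total` is a toy stand-in for the
# dispatch objective: serve the highest-value rides up to driver capacity.
def best_total(ride_values, n_drivers):
    return sum(sorted(ride_values, reverse=True)[:n_drivers])

# Abundant supply (two drivers): R1's marginal value equals its $6 face value.
mv_r1_high = best_total([6, 10], 2) - best_total([10], 2)   # 6

# Scarce supply (one driver): the driver serves R2, so R1 adds nothing,
mv_r1_low = best_total([6, 10], 1) - best_total([10], 1)    # 0
# and R2's marginal value is only the incremental $4 over serving R1.
mv_r2_low = best_total([6, 10], 1) - best_total([6], 1)     # 4
```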

How to compute MMVs?

As previously mentioned, shadow prices in the dispatch optimization problem can be used to obtain the MMVs. The primal dispatch problem can be described as follows:

maximize Σᵢ Σⱼ πᵢⱼ xᵢⱼ
subject to Σᵢ xᵢⱼ ≤ 1 for every driver j,
           Σⱼ xᵢⱼ ≤ 1 for every rider i,
           xᵢⱼ ∈ {0, 1}

Here xᵢⱼ is a binary variable that equals 1 if driver j is matched to rider i, and 0 otherwise, and πᵢⱼ represents the score (e.g., profit) of matching driver j to rider i. The first constraint ensures that a driver is matched with at most one ride per matching cycle (more on this later), and the second constraint ensures that a rider is matched with at most one driver. Solving this optimization gives the optimal matching of drivers to riders. We can relax the integrality constraint to xᵢⱼ ≥ 0 and obtain a linear relaxation of the above problem, whose dual is:

minimize Σⱼ μⱼ + Σᵢ λᵢ
subject to μⱼ + λᵢ ≥ πᵢⱼ for all i, j,
           μⱼ ≥ 0, λᵢ ≥ 0

The dual variable μⱼ is associated with the driver constraint (the first primal constraint), and λᵢ is associated with the rider constraint. This means that for each driver j there is an associated dual variable μⱼ, and likewise a λᵢ for each rider i. To find the MMV values, we build the dispatch graph for a matching cycle, solve it, and then efficiently compute the incremental values via the duals. Denote the optimal objective value by Π(d, s), where d and s represent the demand and supply, respectively. Assume that the treatment effect results in increasing the demand by e. The global effect of such a treatment can then be expressed as follows:

Δ = Π(d + e, s) − Π(d, s)

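As a sketch of how the duals can be obtained in practice, the one-driver, two-rider toy market from Figure 2 can be solved with SciPy's `linprog` (HiGHS backend); this is an illustration, not Lyft's production solver. Note that `linprog` minimizes, so the objective and the reported marginals are negated:

```python
import numpy as np
from scipy.optimize import linprog

# Toy market: one driver, two riders with ride scores $6 (R1) and $10 (R2).
# pi[i, j] = score of matching rider i to driver j.
pi = np.array([[6.0], [10.0]])
n_riders, n_drivers = pi.shape

# LP relaxation of the matching problem; x[i, j] flattened row-major.
c = -pi.ravel()  # negate: linprog minimizes, we want to maximize total score

# One "at most one match" constraint per driver and per rider.
A = np.zeros((n_drivers + n_riders, n_riders * n_drivers))
for j in range(n_drivers):  # driver j takes at most one rider
    A[j, [i * n_drivers + j for i in range(n_riders)]] = 1.0
for i in range(n_riders):   # rider i gets at most one driver
    A[n_drivers + i, i * n_drivers:(i + 1) * n_drivers] = 1.0
b = np.ones(n_drivers + n_riders)

res = linprog(c, A_ub=A, b_ub=b, bounds=(0, None), method="highs")

# HiGHS marginals are sensitivities of the *minimization* objective, so
# negate to recover nonnegative duals: mu for drivers, lambda for riders.
duals = -res.ineqlin.marginals
mu, lam = duals[:n_drivers], duals[n_drivers:]
print("optimal matching value:", -res.fun)  # 10.0: the driver serves R2
print("driver duals mu:", mu)
print("rider duals lambda:", lam)
```

Strong duality guarantees that Σⱼ μⱼ + Σᵢ λᵢ equals the optimal matching value, and μⱼ + λᵢ ≥ πᵢⱼ holds for every edge. This tiny instance is degenerate, so the solver may return any optimal split of the $10 between the driver's μ and R2's λ.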
Now, to estimate Δ, we can run a 50/50 rider-split A/B test where each group serves half the demand, so the treatment group sees demand dₑ = (d + e)/2 and the control group sees d/2. To first order, we then have

Π(dₑ, s) − Π(d/2, s) ≈ (e/2) λ*

The global average treatment effect can then be estimated as:

Δ̂ = 2 [Π(dₑ, s) − Π(d/2, s)] ≈ e λ*

where λ* is the optimal rider dual, or shadow price. This analysis is a first-order Taylor approximation; for details, see Proposition 5 of Bright et al. (2024). The difference in the objective function across treatment and control groups can thus be expressed through the shadow price. In the paper, the authors also ran simulations showing that this shadow-price estimator mitigates the overestimation produced by default estimates from standard A/B tests while lowering the noise level (refer to Figure 8 in Bright et al. (2024)).
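As a hypothetical sketch of the resulting estimator (the numbers and the extrapolation by total demand are illustrative assumptions, following the group-aggregation description above):

```python
import numpy as np

# Hypothetical per-rider shadow prices (rider duals) from solved matching
# cycles, with each rider's A/B assignment from a 50/50 rider split.
lam = np.array([4.0, 0.0, 2.5, 3.0, 0.0, 1.5])
in_treatment = np.array([True, False, True, False, True, False])

# Aggregate marginal values per group and take the difference; scaling the
# per-rider difference by total demand extrapolates the 50/50 split to a
# market-level effect.
effect_per_rider = lam[in_treatment].mean() - lam[~in_treatment].mean()
global_effect = effect_per_rider * len(lam)
```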

Matching Cycle

Next, we need to decide how often to solve these optimization problems, essentially determining the length of the matching cycle. If the matching cycle is too short, contention can occur between cycles. For instance, a driver who isn’t matched in Cycle 1 might be available in the next cycle, or a rider choosing the wait-and-save option might wait several cycles before being matched. At Lyft, we use a 1-hour mega cycle to solve the dispatch optimization problem for all eligible riders and drivers within that period. This cycle length helps significantly reduce concerns about contention between cycles.

Secondary Metrics

Finally, if we want to assess the MMV-corrected impact of a treatment on metrics beyond those defined by the dispatch objective function, we can compute an edge or ride cost νᵢⱼ for each completed ride. Taking the linear relaxation of the primal problem and applying complementary slackness, for every matched pair (xᵢⱼ* = 1) we have:

μⱼ + λᵢ = νᵢⱼ

Assuming non-degeneracy, we can then solve this system of equations for the optimal dual values and use them to estimate the average treatment effect in the same way as before.

Productionalizing MMV in an Experimentation Platform

To implement MMV in an experimentation platform, we solve the matching optimization problem for passenger and driver duals on an hourly basis, as previously described, and store these values in a table. This data is then used to calculate MMV-corrected values for drivers and riders. These metrics are included in experiment reports alongside other metrics, with standard variance-reduction techniques like CUPED applied to them. Below is an example of how an MMV-corrected metric might appear in a driver-randomized experiment. Here, riders are the contended resource, which causes the interference bias and overestimation discussed earlier. The MMV correction dampens the measured effect by accounting for contention over the limited resource (in this case, the pool of riders).

It’s important to note that there are limitations to the use cases for MMV. For instance, MMV cannot be applied in situations where the target population for randomization is not drivers or passengers. An example of this would be mapping experiments where the route is associated with the ride itself, rather than being specific to drivers or passengers, making MMV-corrected metrics unsuitable.

MMVs in Practice

Other experimental designs like time-split, region-split, or a combination of time- and user-splits often fall short in addressing the majority of experimentation needs. They tend to lack sufficient power, are costly to implement, and can take several weeks to execute. In contrast, the MMV approach can adjust the effect sizes in user-split experiments where interference is a concern. This is particularly important in cases with significant network effects, as the change in effect magnitude can be substantial, potentially altering launch decisions under MMV correction.

At Lyft, we’ve had instances where both time- and user-split experiments were conducted for the same initiative to capture both market-level and long-term effects. We compared the MMV-corrected user-split outcomes with the time-split outcomes in three historical cases where a time-split counterpart was available. After applying MMV correction, we observed greater alignment with the time-split results.

Additionally, a comprehensive backtest across various user-split experiments was conducted, comparing MMV-corrected completed rides with traditional metrics. In 10% of these comparisons, the launch decision could change when using MMV-corrected values. These cases were evenly split between false positives (launching based on traditional values when MMV-corrected values didn’t meet launch criteria) and false negatives.

Moreover, in resource-constrained experiments where MMV-corrected results show a lower magnitude than naive user-split results, we anticipate an average 45% reduction in outcome magnitude, based on this numerical study. The decrease occurs because the contribution of each ride is divided between the rider and the driver when calculating marginal values, which corrects the overestimation of effects due to interference bias.
