Launch Control at Nextdoor

How engineers configure and deploy A/B tests and feature flags

In this article, we share our experience building Launch Control, Nextdoor’s combined feature flagging and experiment configuration tool. One of Nextdoor’s core values is “Experiment and Learn Quickly”, and one of our engineering principles is “Move Fast — Build Iteratively”. We believe fast iteration on our products and features is a great way to bring better value to our neighbors around the world. Teams at Nextdoor routinely use data from countless experiments to inform product improvements. Moreover, in an environment where it’s impractical to ship native mobile apps more than about once per week, we also make frequent use of feature flags as a way to safely and gradually release new products to our neighbors. Both of these needs — experimentation and feature flagging — require robust internal tools and strong developer education to be used at scale.

One of the most distinctive things about Launch Control is that it was built through a strong, ongoing cross-functional collaboration between engineers across many different teams. Its creation came about when a backend product engineer identified opportunities to make the legacy AB and Feature Config tools better. Although they didn’t officially work on internal tools, we strongly believe in ownership and empowerment at Nextdoor, so we ensured they had the space and support to quickly iterate on a prototype. Once this prototype had enough features to get adoption, we made more and more room for that engineer to contribute, and made sure to recognize their impact on Nextdoor engineering.

After this, Launch Control grew into a shared labor of love across Nextdoor — it has key contributions from many of our best engineers across different teams and stacks, from its core backend components, through a delightful React-powered user interface, all the way to its APIs that integrate experiments and feature flags into our Android and iOS mobile apps. A recurring ritual in Launch Control development is to empower and support engineers who identify ongoing feature improvements to make those improvements themselves. This generates a strong sense of camaraderie across teams and helps spread technical knowledge of how the Launch Control stack works.

Evolving two internal tools into one

Before Launch Control, we used two separate tools for experimentation and feature flagging, creatively named Feature Config and AB. These tools suffered from a number of technical limitations. For example, Feature Config allowed engineers to use a rich set of user features, such as their Nextdoor neighborhood, country, or app version, but only produced a binary true/false decision, making it impractical for experiments which need multiple treatment groups. In contrast, the AB tool could output custom treatment groups, but its targeting capabilities were limited to basic percentage-based rollouts.

Additionally, both tools had sparse and uninviting user interfaces, making it difficult for non-technical users to read, or contribute to, experiments and flags. Launch Control was designed to supersede these tools, with a friendly user interface, and a rich set of targeting capabilities, as well as support for arbitrary treatment groups.

Launch Control: easy to understand user targeting

At its core, each individual Launch Control experiment represents a function whose inputs are a set of parameters about a Nextdoor neighbor, such as their id, city, mobile platform, etc., and whose output is a specific treatment group:

Each Launch Control experiment encodes a mapping between parameters about a Nextdoor neighbor and a specific treatment group for that experiment.
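In code terms, one can picture the inputs as a small record of neighbor attributes and the output as a named treatment group. Below is a minimal sketch; the class and field names are purely illustrative, not Nextdoor's actual schema:

from dataclasses import dataclass

@dataclass
class NeighborContext:
    # Illustrative fields only; real experiments can target many more attributes.
    user_id: str
    city: str
    country: str
    mobile_platform: str
    app_version: str

Conceptually, each experiment is then a function from a NeighborContext to a treatment group name such as 'control' or 'treatment'.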

When deciding how to encode these functions, we tried to strike a balance between simplicity and expressive power. Our legacy Feature Config tool, for example, allowed arbitrarily nested boolean AND/OR clauses that checked for things such as country allow-lists and percentage rollouts. Unfortunately, that tool made it difficult to express complex relationships in a readable way, leading us to occasionally mistarget rollouts, and discouraging non-technical employees from participating in experimentation.

For Launch Control, we found that a non-nesting linear sequence of Condition Blocks, each of which is a combination of targeting features, is a great balance between readability and targeting power. When evaluating a particular Launch Control, the algorithm iterates over an experiment’s Condition Blocks, stopping at the first block which successfully captures a user. That block then assigns a specific treatment group to the user, and evaluation ends. In the case that no Condition Block captures a user, Launch Control automatically returns the reserved “untreated” treatment group for that user, indicating that they are not part of the A/B test, or should not get the feature flag in question.

Condition Blocks are and-combinations of individual targeting features, evaluated from top to bottom. Launch Control assigns a treatment group based on the first Condition Block that “captures” a user.
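A minimal sketch of that first-match evaluation loop, assuming the hypothetical NeighborContext above and Condition Block objects that expose a captures() check and a treatment_group name (again, an illustration rather than the actual implementation):

UNTREATED = 'untreated'  # reserved group: not in the A/B test, or no feature flag

def evaluate_experiment(condition_blocks, neighbor):
    # Walk the Condition Blocks from top to bottom; the first block that
    # captures the neighbor decides their treatment group, and evaluation stops.
    for block in condition_blocks:
        if block.captures(neighbor):
            return block.treatment_group
    # No block captured the neighbor, so they remain untreated.
    return UNTREATED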

Condition Blocks: the heart of Launch Control

Launch Control supports many different types of Condition Blocks, which allow for targeting based on a variety of user features:

  • Individual allow-lists: These blocks allow us to target individual Nextdoor users. This is particularly useful in the early stages of developing a new feature, as we can target internal team members for dogfooding and testing long before the feature is ready for prime-time.
  • Percentage-based rollouts: Percentage rollouts allow us to encode things like standard A/B tests, where we introduce a new feature to a small subset of our users, comparing those users’ key metrics with a similar-sized control group. In addition, percentage rollouts also give us the ability to gradually release improvements to our users, a few percentage points at a time.
  • Geo-targeting: Some Condition Blocks allow us to target users based on their neighborhood’s city, state, and country. This allows us to iterate on features we decide to launch at different times for different markets (e.g., if we want to iterate on copy for international markets separately from the US).
  • Delegation: It’s possible to configure a Condition Block that delegates its decision to another Launch Control experiment. This allows teams to build a hierarchy of flags, providing an easy-to-use mechanism for progressively broader feature switches (a minimal sketch of a few block types follows this list).
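As a rough illustration, a few of these block types might plug into the captures() check from the evaluation loop above as shown below; all class and field names are hypothetical, and a delegation block would similarly capture a user by calling back into another experiment's evaluation:

from dataclasses import dataclass, field

@dataclass
class IndividualAllowListBlock:
    treatment_group: str
    allowed_user_ids: set = field(default_factory=set)

    def captures(self, neighbor) -> bool:
        # Capture only the specific neighbors on the allow-list.
        return neighbor.user_id in self.allowed_user_ids

@dataclass
class GeoTargetingBlock:
    treatment_group: str
    countries: set = field(default_factory=set)

    def captures(self, neighbor) -> bool:
        # Capture neighbors whose neighborhood is in one of the targeted countries.
        return neighbor.country in self.countries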

There is some interesting nuance in how a stochastic, percentage-based Condition Block needs to work. For most features, we expect the same user to always be either in, or out, of an experiment. It would be unreasonable to make a random selection every time a user is evaluated in a condition block, as users would be surprised when they occasionally find themselves jumping between having and not having a new product experience! Launch Control ensures this random-but-deterministic behavior by using a hash of users’ ids to resolve membership in probabilistic blocks. Briefly, that computation looks like this:

from hashlib import sha1

hashed_value = int(sha1(user_id.encode()).hexdigest(), 16)
hashed_probability = (hashed_value % 10000) / 10000  # deterministic value in [0, 1)
return hashed_probability < rollout_percentage

The method above ensures users always get consistent treatment groups from each Condition Block, without the need to store an explicit mapping between users and treatment groups on a database table or in memory at any time. For completeness, Launch Control does also offer a “Dice” block, which is a truly random determination, although its use cases are relatively rare.

However, even with a hash-based approach, there is still an additional trap we need to avoid: although the same user should get the same treatment group for a particular Condition Block, we expect users to get different treatment groups for different Condition Blocks. At any time, there may be dozens of different features, all configured as, e.g., 50/50 splits. If we rely on user ids alone for hashing, we would run into users that end up either in all features, or no features at all! Launch Control avoids this by concatenating a per-Condition Block salt to user ids before hashing them. This guarantees each user always gets the same result from a particular Condition Block, while getting potentially different results from different Condition Blocks. When we incorporate a salt into the evaluation, the previous algorithm looks like this:

salted_string = f'{condition_block_salt}{user_id}'
hashed_value = int(sha1(salted_string.encode()).hexdigest(), 16)
hashed_probability = (hashed_value % 10000) / 10000
return hashed_probability < rollout_percentage
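As a quick, purely illustrative check of this property (the helper name, user id, and salt values below are hypothetical), wrapping the snippet above in a small function makes the independence between blocks easy to see:

from hashlib import sha1

def in_rollout(condition_block_salt, user_id, rollout_fraction):
    # Same hash-and-threshold logic as above, packaged as a helper for illustration.
    hashed_value = int(sha1(f'{condition_block_salt}{user_id}'.encode()).hexdigest(), 16)
    return (hashed_value % 10000) / 10000 < rollout_fraction

# The same neighbor can land inside one 50% block and outside another,
# because each Condition Block contributes its own salt to the hash.
print(in_rollout('block_a_salt', '123456789', 0.5))
print(in_rollout('block_b_salt', '123456789', 0.5))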

The targeting engine described above is only one component of Launch Control. We also provide performant, easy-to-use APIs for engineers to query experiments across our tech stack, in our application backend as well as our React frontend and native Android and iOS clients.

Backend vs. Frontend Launch Control APIs

Nextdoor engineers query Launch Control experiments in many places in our codebase: in our backend as we serve web requests to our clients, as well as in our desktop and mobile-web front ends, and finally in our native Android and iOS applications. In all cases, we provide APIs with two specific goals in mind: ease of use and fast, reliable performance. It is imperative that evaluating an experiment take no more than a few hundred microseconds, as complex user features like our main feed require dozens of individual flags and experiments to render.

Launch Control experiments are stored internally as Nextdoor Sitevars. Because of this, our backend containers automatically have access to cached, in-host payloads for all relevant experiment definitions. This makes it relatively easy to have the backend Launch Control APIs be performant. Evaluation typically involves a small number of CPU operations, and thanks to the Sitevars cache, requires no RPCs or network calls. On top of this, Launch Control also maintains a per-web-request cache of evaluated experiments, ensuring that each experiment is evaluated at most once per user, per request.
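A minimal sketch of what such a per-request cache can look like, assuming a hypothetical evaluator callable (this is not Nextdoor's actual API):

from typing import Callable, Dict, Tuple

class RequestScopedCache:
    """Per-web-request memo: each experiment is evaluated at most once per user."""

    def __init__(self, evaluate: Callable[[str, str], str]):
        self._evaluate = evaluate                      # pure CPU evaluation, no RPCs
        self._memo: Dict[Tuple[str, str], str] = {}

    def get_treatment(self, experiment_name: str, user_id: str) -> str:
        key = (experiment_name, user_id)
        if key not in self._memo:
            self._memo[key] = self._evaluate(experiment_name, user_id)
        return self._memo[key]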

On the other hand, the frontend and mobile APIs present more of a challenge. Since this code runs far from our servers, it’s impractical to cache and constantly synchronize all experiment definitions. It would also be prohibitively expensive to make each individual experiment API call involve a network operation. Because of this, we provide two flavors of Launch Control APIs on our clients.

The most commonly used flavor is a local API, which relies on the client having a known, pre-fetched list of experiments available in memory at all times. Clients run a single network request which fetches all experiment evaluations in this list at key application lifecycle moments (such as user login, session refresh, etc.), and those results are then available for code to query against in a synchronous fashion. We also provide an asynchronous API, which does allow engineers to make individual network requests for each experiment. This API only exposes non-blocking components, however (such as Observables on Android and Promises on web), to make its asynchronous nature clear and self-enforcing. While the local API is used for the most common experiments users are exposed to, the asynchronous flavor helps prevent rarely used experiments from excessively bloating the prefetch list. Finally, while it is out of scope for this article, Launch Control also provides specific APIs for engineers to run experiments in special cases, such as login and sign-up screens, where we do not yet have a particular user available.
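To sketch the shape of the local flavor (the class name, method names, and fallback behavior below are assumptions, not Nextdoor's actual client API), the client holds a prefetched map of evaluations, refreshes it with a single network call at key lifecycle moments, and answers all subsequent queries synchronously from memory:

class LocalExperimentStore:
    def __init__(self, prefetch_list):
        self._prefetch_list = prefetch_list   # experiments known to the client ahead of time
        self._evaluations = {}                # experiment name -> treatment group

    def refresh(self, fetch_evaluations):
        # One network request (e.g., at login or session refresh) fetches every
        # evaluation in the prefetch list in a single round trip.
        self._evaluations = fetch_evaluations(self._prefetch_list)

    def get_treatment(self, experiment_name):
        # Synchronous, in-memory lookup; falling back to the reserved 'untreated'
        # group here is purely for illustration.
        return self._evaluations.get(experiment_name, 'untreated')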

Other Features

The most important components of Launch Control are its targeting algorithms and query APIs, as described above. However, it also has a number of convenience and ease-of-use features, designed to make experimentation and feature flagging at Nextdoor a delightful experience. We strongly believe that engineers deserve tools that are as good as the products we ship to our neighbors around the world. Some of the other important Launch Control features include:

  • Contextual editing UI: The UI for each Condition Block type is unique and has deep knowledge of that block’s context. For example, allow-lists for individual users expose a generic search and typeahead. This allows us to add users by their name, email, or other traits, eliminating the need for people to memorize ids, and empowering cross-functional partners such as Product Managers and Designers to add themselves to experiments directly.
  • Versioning and fast reverts: Launch Control experiments are stored internally as Nextdoor Sitevars, which have built-in support for versioning. Whenever a Launch Control is updated, we store a new version of it, with useful metadata such as edit time and author. This allows us to easily navigate through the full history of an experiment, eliminating the need to maintain redundant documentation on when a particular rollout changed. In addition, storing every version of an experiment also allows us to quickly revert features to known-good states in case anything goes unexpectedly sideways with a release.
  • Built-in observability and update subscriptions: Every Launch Control automatically publishes useful real time statistics whenever it’s evaluated, for all users. This allows teams to verify, in real time, that their experiments are going out to the expected number of users, and in the right proportions. In addition, employees can also subscribe to Launch Control experiments, so they get automatically notified when an experiment is updated.

Conclusion

Launch Control experiments drive many of our recent Nextdoor product improvements, such as changes to Notifications, Search, and our Business Experience. By striking a balance between expressive power, simplicity, and usability, we were able to collaboratively build a tool that experienced widespread internal adoption.

Have you worked on, or used, feature targeting and A/B testing tools in your career? Let us know in the comments below what you’ve learned works and what doesn’t, and if you’re excited about collaborating with other great engineers on impactful work like this, check out our careers page! We have open opportunities across different teams and functions.
