Improving Instagram notification management with machine learning and causal inference
- We’re sharing how Meta is applying statistics and machine learning (ML) to improve notification personalization and management on Instagram – particularly on daily digest push notifications.
- By using causal inference and ML to identify highly active users who are likely to see more content organically, we have been able to reduce the number of notifications sent while also improving overall user experience.
On Instagram, notifications are an important communication channel between Instagram and our users. As the number of notification types has grown, so has the need to personalize them, so that people do not receive too many notifications or ones they do not find important.
At Meta, we have been applying statistics and machine learning (ML) for notification personalization and management on Instagram. Today, we would like to share an example of how we used causal inference and ML to control sending for daily digest push notifications.
Moving beyond click-through rate models
A daily digest push notification about stories is a type of notification that lists a digest of stories that are shared and ready for someone to view. When such a notification is delivered to someone’s device, they may click on the notification to view the content. Traditionally, an ML model called a click-through rate (CTR) model is used to predict how likely someone is to click on a notification. CTR models have been working well in many applications across the industry. The predicted click probability is used as a proxy to indicate the notification’s quality to the user. If the predicted click probability is too low, the notification will be dropped in the middle of its sending flow and the user won’t receive the notification because it has been deemed low quality.
CTR model-based filtering worked well for daily digest notifications in the sense that the actual average click rate with the CTR model was significantly higher than without it. However, we also noticed that with the CTR model a large portion of the daily digest notifications went to users who are relatively active on Instagram. Many highly active users would view the corresponding stories organically even if we never sent them a daily digest notification. This opens up an opportunity to provide a better user experience by sending fewer notifications to active users who are likely to view the listed stories organically.
The challenge is identifying these users. Some users are active precisely because they receive these notifications; if we cut notifications for them, they may become less active. In other words, if we do not select the right users for sending reductions, we risk a decline in user engagement.
Causal inference and ML
Essentially, it’s a user selection problem. We would like to maximize the efficiency of sent notifications by selecting proper user cohorts. The solution we adopted to tackle this problem is the combination of causal inference and ML.
For problem formulation, let’s assume there is a fixed computational cost to send each daily digest notification, and a total budget to spend on these notifications. This turns user selection into a budget allocation problem. The key to solving it is estimating the incremental value of sending a daily digest notification compared to not sending it. For example, the incremental value for user $i$ can be defined in terms of user activeness: $u_i = \Pr_i(\text{active} \mid do(\text{send notification})) - \Pr_i(\text{active} \mid do(\text{drop notification}))$. Some user cohorts would be active even without receiving the daily digest notifications, so their incremental values are small; sending the digest to these cohorts is inefficient and may even feel like spam. For a better product experience and efficiency, we can sort the notifications by incremental value in descending order and send only the top ones, which maximizes the overall incremental value within the limited budget (sending volume).
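The sort-and-select step can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: every notification costs the same, so the budget is just a count, and the incremental values are already known. The function name `select_notifications` and the sample values are hypothetical and do not come from Instagram's system.

```python
from typing import List


def select_notifications(uplifts: List[float], budget: int) -> List[int]:
    """Return the indices of the notifications to send under a volume budget.

    Assumes (hypothetically) that each notification has the same fixed cost,
    so the budget is simply the maximum number of notifications to send.
    """
    # Rank candidate indices by estimated incremental value, highest first.
    ranked = sorted(range(len(uplifts)), key=lambda i: uplifts[i], reverse=True)
    # Keep only as many as the budget allows.
    return ranked[:max(budget, 0)]


# Made-up incremental values for five candidate notifications.
uplifts = [0.02, 0.30, 0.01, 0.15, 0.08]
print(select_notifications(uplifts, budget=2))  # [1, 3]
```

With a uniform cost per notification, taking the top entries of the sorted list is optimal; with varying costs the problem would become a knapsack variant instead.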
The next question is how to estimate the incremental value before we make the send or drop decision. It is a challenging question because for the same notification, we can either send it or drop it; there is no way to observe both scenarios. Essentially, this is a causal inference problem and uplift modeling techniques can be used. To apply uplift models, we designed a randomized experiment in which each notification was randomly sent or dropped, as illustrated in Figure 1.
Figure 1: A randomized experiment for uplift modeling, where a notification eligible to send will be dropped randomly with 50 percent probability.
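To make the role of randomization concrete, here is a small simulation sketch. It is not the neural network uplift model described in this post: it swaps in the simplest possible estimator, a per-cohort difference in activity rates between the send and drop arms. The cohort names and activity probabilities are invented purely for illustration.

```python
import random
from collections import defaultdict


def run_experiment(users, p_send=0.5, seed=0):
    """Randomly send or drop each eligible notification and record activity.

    `users` is a list of (cohort, p_active_if_sent, p_active_if_dropped)
    tuples; these probabilities are hypothetical simulation inputs.
    """
    rng = random.Random(seed)
    # counts[cohort][arm] = [number of users, number of active users]
    counts = defaultdict(lambda: {"send": [0, 0], "drop": [0, 0]})
    for cohort, p_if_sent, p_if_dropped in users:
        # Coin flip: drop the notification with probability 1 - p_send.
        arm = "send" if rng.random() < p_send else "drop"
        p_active = p_if_sent if arm == "send" else p_if_dropped
        counts[cohort][arm][0] += 1
        counts[cohort][arm][1] += int(rng.random() < p_active)
    return counts


def estimate_uplift(counts):
    """Difference in observed activity rate between send and drop arms."""
    uplift = {}
    for cohort, arms in counts.items():
        rate_send = arms["send"][1] / arms["send"][0]
        rate_drop = arms["drop"][1] / arms["drop"][0]
        uplift[cohort] = rate_send - rate_drop
    return uplift


# Two invented cohorts: one active regardless, one that responds to the digest.
users = [("highly_active", 0.92, 0.90)] * 20000 + [("casual", 0.50, 0.30)] * 20000
estimates = estimate_uplift(run_experiment(users))
for cohort, value in estimates.items():
    print(f"{cohort}: estimated uplift = {value:.3f}")
```

Because assignment is random, the two arms are comparable, and the rate difference is an unbiased estimate of the average incremental value within each cohort. An uplift model generalizes this idea from coarse cohorts to user-level predictions.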
Based on the data collected from this randomized experiment, we developed a neural network-based uplift model to predict the incremental impact between not sending and sending the daily digest notifications about stories at user level. Given the estimates of incremental impact for all notifications, the solution of the above budget allocation problem is trivial. However, in practice the notifications are generated and scored online and thus we cannot have incremental impact estimates ready for all candidate notifications in advance.
As a consequence, we need an online approach to determine which notifications to send or drop. One simple but effective solution is to compare the online generated score with a fixed threshold – if the score is higher than the threshold we can send it. By doing so, we intend to maintain a fixed notification sending rate r where 0 < r < 1.
When we applied this approach in online testing, we observed fluctuations in the sending rate, because the uplift (incremental impact) estimates generated by ML models can shift over time for various reasons. To stabilize the sending rate, we use an online quantile computation service to transform the raw uplift estimates into an approximately standard uniform distribution while preserving their order. To keep the sending rate at $r$, we simply compare the transformed uplift estimate $Z$ with $1 - r$ and send the notification when $Z \ge 1 - r$: since $Z \sim U(0, 1)$, we have $\Pr(Z \ge 1 - r) = r$. This process is illustrated in Figure 2.
Figure 2: Order-preserving score transformation for sending rate control
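The effect of the order-preserving transformation can be sketched as follows. The post does not describe how the online quantile computation service works internally, so this example substitutes a simple sliding-window empirical CDF; the class `QuantileTransformer`, the window size, and the simulated score distributions are all assumptions made for illustration.

```python
import bisect
import random
from collections import deque


class QuantileTransformer:
    """Maps a raw score to its empirical quantile over a sliding window."""

    def __init__(self, window_size=2000):
        self.window_size = window_size
        self.window = deque()    # scores in arrival order
        self.sorted_scores = []  # the same scores, kept sorted

    def update(self, score):
        self.window.append(score)
        bisect.insort(self.sorted_scores, score)
        if len(self.window) > self.window_size:
            # Evict the oldest score from both structures.
            oldest = self.window.popleft()
            del self.sorted_scores[bisect.bisect_left(self.sorted_scores, oldest)]

    def transform(self, score):
        if not self.sorted_scores:
            return 0.0  # no history yet: arbitrary choice, treat as lowest rank
        rank = bisect.bisect_right(self.sorted_scores, score)
        return rank / len(self.sorted_scores)


def should_send(z, r):
    """Send when the transformed score Z satisfies Z >= 1 - r."""
    return z >= 1.0 - r


def simulate(r=0.3, seed=0):
    rng = random.Random(seed)
    qt = QuantileTransformer()
    # Warm up on an initial score distribution and calibrate a fixed threshold.
    warmup = [rng.gauss(0.0, 1.0) for _ in range(2000)]
    for s in warmup:
        qt.update(s)
    fixed_threshold = sorted(warmup)[int((1.0 - r) * len(warmup))]
    sent_fixed = sent_quantile = total = 0
    for step in range(20000):
        s = rng.gauss(0.5, 2.0)  # the score distribution has shifted
        z = qt.transform(s)
        qt.update(s)
        if step >= 10000:  # measure after the window has adapted
            total += 1
            sent_fixed += s > fixed_threshold
            sent_quantile += should_send(z, r)
    return sent_fixed / total, sent_quantile / total


fixed_rate, quantile_rate = simulate()
print(f"fixed threshold: {fixed_rate:.3f}, quantile transform: {quantile_rate:.3f}")
```

In this simulation the fixed threshold drifts well away from the target rate once the score distribution shifts, while the quantile-transformed decision stays close to `r`. A production service would likely use an approximate streaming quantile structure rather than an exact sorted window.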
Better notifications with causal inference and ML
By applying this model and targeting the users and notifications with high incremental impact, we reduced the sending volume substantially compared to using the CTR model, with no decline in user engagement. The benefit of this work is twofold: improved user experience and reduced resource usage.
In the Instagram Notifications Systems team, ML and statistics have been applied in different areas to improve user notification experience. If you want to learn more about this work or are interested in joining one of our engineering teams, please visit our careers page, and follow us on Facebook.