From Pre-trained to Fine-tuned: Nextdoor’s Path to Effective Embedding Applications

Abstract

Most ML models at Nextdoor are driven by a large number of features, the majority of which are either continuous or discrete. Personalized features usually stem from historical aggregations or real-time summaries of interaction features, typically captured through logged tracking events. However, representing content through a deep understanding of the information behind it (text and images) is crucial for modeling nuanced user signals and for better personalizing complex user behavior across many of our products. In the rapidly evolving field of NLP, using transformer models to perform representation learning effectively and efficiently has become increasingly important for understanding users and improving their product experience.

To that end, we have built many entity embedding models spanning entities such as posts, comments, users, search queries, and classifieds. We first built a deep understanding of content, then used it to derive embeddings for meta entities such as users, based on the content they previously interacted with. These representations have proven crucial for extracting meaningful features for some of the largest ML ranking systems at Nextdoor, such as notification scoring and feed ranking. By making them readily available and building them to scale, we can reliably drive adoption of state-of-the-art techniques and put them in the hands of ML engineers, who can then rapidly build performant models across the company.
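
To make the idea of deriving a user embedding from past interacted content concrete, here is a minimal sketch. It assumes mean pooling followed by L2 normalization, which is one common choice and is not confirmed by the source as Nextdoor's method. The function names, the toy two-dimensional vectors, and the post IDs are all hypothetical and chosen only for illustration.

```python
import math
from typing import Dict, List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; an all-zero vector is returned as zeros."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return [0.0 for _ in vec]
    return [x / norm for x in vec]


def user_embedding(
    interacted_ids: Sequence[str],
    content_embeddings: Dict[str, List[float]],
) -> List[float]:
    """Average the embeddings of content a user interacted with, then normalize.

    Assumption: plain mean pooling. A production system might instead weight
    interactions by recency or type, or learn the aggregation.
    """
    vectors = [content_embeddings[i] for i in interacted_ids if i in content_embeddings]
    if not vectors:
        raise ValueError("no known content embeddings for this user")
    dim = len(vectors[0])
    mean = [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]
    return l2_normalize(mean)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity, usable as a ranking feature between user and content."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


# Hypothetical toy content embeddings (real ones would come from a transformer encoder).
post_embeddings = {
    "post_a": [1.0, 0.0],
    "post_b": [0.0, 1.0],
    "post_c": [1.0, 1.0],
}

user_vec = user_embedding(["post_a", "post_b"], post_embeddings)
score_c = cosine_similarity(user_vec, post_embeddings["post_c"])
score_a = cosine_similarity(user_vec, post_embeddings["post_a"])
```

In this toy setup the user who interacted with `post_a` and `post_b` ends up closer to `post_c` than to `post_a` alone, which is the kind of similarity signal a downstream ranking model could consume as a feature.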
