Establishing a Large Scale Learned Retrieval System at Pinterest
Bowen Deng | Machine Learning Engineer, Homefeed Candidate Generation; Zhibo Fan | Machine Learning Engineer, Homefeed Candidate Generation; Dafang He | Machine Learning Engineer, Homefeed Relevance; Ying Huang | Machine Learning Engineer, Curation; Raymond Hsu | Engineering Manager, Homefeed CG Product Enablement; James Li | Engineering Manager, Homefeed Candidate Generation; Dylan Wang | Director, Homefeed Relevance; Jay Adams | Principal Engineer, Pinner Curation & Growth
Introduction
At Pinterest, our mission is to bring everyone the inspiration to create a life they love. Finding the right content online and serving it to the right audience plays a key role in this mission. Modern large-scale recommendation systems usually consist of multiple stages: retrieval selects candidates from a pool of billions of items, and ranking predicts which of the items in the trimmed candidate set produced by the earlier stages a user is likely to engage with [2]. Fig 1 illustrates the general multi-stage recommendation funnel design at Pinterest.
Fig 1. General multi-stage recommendation system design at Pinterest. We retrieve candidates from a corpus of billions of Pins and narrow them down to thousands of candidates for the ranking model to score, and finally generate the feeds for Pinners. “CG” is short for candidate generation, and “LWS” is short for Light-weight Scoring, which is our pre-ranking model.
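The funnel in Fig 1 can be sketched as a chain of top-k selections, each stage using a more expensive scorer on a smaller candidate set. This is a toy illustration only, not Pinterest's implementation: the function names (`run_funnel`, `top_k`, `toy_full_score`), the random embeddings, the stage sizes, and both scoring functions are assumptions made for the example. A production retrieval stage would also typically use approximate nearest neighbor search rather than the exhaustive scan shown here.

```python
import random
from typing import Callable, Dict, Iterable, List

Vector = List[float]
Scorer = Callable[[Vector, Vector], float]


def dot(u: Vector, v: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def top_k(ids: Iterable[int], score: Callable[[int], float], k: int) -> List[int]:
    """Return the k highest-scoring ids, best first."""
    return sorted(ids, key=score, reverse=True)[:k]


def run_funnel(
    user: Vector,
    corpus: Dict[int, Vector],
    light_score: Scorer,
    full_score: Scorer,
    n_retrieve: int,
    n_prerank: int,
    n_feed: int,
) -> List[int]:
    # Stage 1 (CG): cheap similarity over the whole corpus.
    retrieved = top_k(corpus, lambda pid: dot(user, corpus[pid]), n_retrieve)
    # Stage 2 (LWS): lightweight pre-ranking model on the retrieved set.
    preranked = top_k(retrieved, lambda pid: light_score(user, corpus[pid]), n_prerank)
    # Stage 3 (ranking): the most expensive model sees the fewest candidates.
    return top_k(preranked, lambda pid: full_score(user, corpus[pid]), n_feed)


def toy_full_score(user: Vector, item: Vector) -> float:
    """Hypothetical stand-in for a heavy ranking model."""
    return dot(user, item) - 0.1 * abs(item[0])


# Toy data: 1,000 items with 8-dimensional random embeddings.
rng = random.Random(0)
corpus = {pid: [rng.gauss(0, 1) for _ in range(8)] for pid in range(1000)}
user = [rng.gauss(0, 1) for _ in range(8)]

feed = run_funnel(user, corpus, light_score=dot, full_score=toy_full_score,
                  n_retrieve=200, n_prerank=50, n_feed=10)
print(feed)
```

Each stage only sees what the previous stage passed along, so an item that retrieval misses can never be recovered by the ranking model, however accurate that model is.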
The Pinterest ranking model is a powerful transformer based model learned from a raw user en...