Feature Caching for Recommender Systems with Cachelib
Li Tang; Sr. Software Engineer | Saurabh Vishwas Joshi; Sr. Staff Software Engineer | Zhiyuan Zhang; Sr. Manager, Engineering |
At Pinterest, we operate a large-scale online machine learning inference system, where feature caching plays a critical role in achieving optimal efficiency. In this blog post, we will discuss our decision to adopt the Cachelib project by Meta Open Source (“Cachelib”) and how we have built a high-throughput, flexible feature cache by leveraging and expanding upon the capabilities of Cachelib.
Background
Recommender systems are fundamental to Pinterest’s mission to inspire users to create a life they love. At a high level, our recommender models predict user and content interactions based on ML features associated with each user and Pin.
These ML features are stored in an in-house feature store as key-value datasets: the keys are identifiers for various entities such as pins, boards, links, and users, while the values follow a unified feature representation, defined in various schema formats including the Thrift project by Meta Open Source and the Flatbuffers project by Google Open Source.
A primary challenge involves fetching ML features to support model inference, which occurs millions of times per second. To optimize for cost and latency, we extensively utilize cache systems to lighten the load on the feature store.
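The lookup pattern this implies is cache-aside: consult an in-process cache first, and call out to the feature store only on a miss. A minimal sketch, assuming a simple string-keyed cache (`FeatureCache` and `FetchFeature` are illustrative names, not Pinterest's actual API):

```cpp
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Toy in-process cache; the real systems discussed in this post use
// LevelDB-based or Cachelib-based caches instead of a plain hash map.
class FeatureCache {
 public:
  std::optional<std::string> Get(const std::string& key) const {
    auto it = cache_.find(key);
    if (it == cache_.end()) return std::nullopt;  // cache miss
    return it->second;                            // cache hit
  }
  void Put(const std::string& key, std::string value) {
    cache_[key] = std::move(value);
  }

 private:
  std::unordered_map<std::string, std::string> cache_;
};

// Cache-aside fetch: hit the (expensive) remote feature store only
// when the local cache misses, then backfill the cache.
template <typename StoreFn>
std::string FetchFeature(FeatureCache& cache, const std::string& key,
                         StoreFn&& fetch_from_store) {
  if (auto hit = cache.Get(key)) return *hit;
  std::string value = fetch_from_store(key);  // remote feature store call
  cache.Put(key, value);
  return value;
}
```

At millions of lookups per second, every cache hit is one fewer round trip to the feature store, which is where the cost and latency savings come from.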
Before adopting Cachelib, we had two in-process cache options within our C++ services:
- LevelDB (project by Google Open Source)-based ep...