Optimizing Network Efficiency for ML Workloads (Part 1): Feature Trimmer

Guangtong Bai | Staff Software Engineer, Product ML Infrastructure*; Shantam Shorewala | Software Engineer II, Product ML Infrastructure*; Chi Zhang | Staff Software Engineer, AI Platform*; Neha Upadhyay | Software Engineer II, AI Platform*; Haoyang Li | Director, Product ML Infrastructure

*These authors contributed equally to this article.

Background

At Pinterest, our online ML serving systems employ a root-leaf architecture. At a high level, the architecture looks as follows:

Figure 1: Root-leaf Architecture of Online ML Serving Systems at Pinterest

In the diagram, “Client Service” is responsible for recommending organic or promoted Pins to users. To determine whether a given Pin is relevant to a particular user request, the client service sends a score request to the online ML serving system, where the Pin is scored by a set of ML models, each of which scores one aspect of “relevancy”.

The online ML serving system is composed of two parts:

  1. Root: This component handles initial feature processing. Its responsibilities include retrieving the necessary features from the feature store, performing any required preprocessing, and fanning out the scoring requests to the leaf partitions.
  2. Leaf: This is where the actual model inference takes place, typically on GPU machines. It is structured into multiple partitions, each of which hosts a related group of models, such as one production model and several experiment...
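
The division of labor between root and leaf can be illustrated with a minimal Python sketch. It is a toy model, not Pinterest's implementation: the class names `Root` and `LeafPartition`, the dictionary-backed feature store, the clamping used as "preprocessing", and the callables standing in for models are all hypothetical. A real root would issue the fan-out as concurrent network calls rather than an in-process loop.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Features = Dict[str, float]


@dataclass
class LeafPartition:
    """Hypothetical leaf partition hosting a related group of models."""
    name: str
    models: Dict[str, Callable[[Features], float]]

    def score(self, features: Features) -> Dict[str, float]:
        # Run every hosted model on the same feature payload.
        return {model_name: model(features) for model_name, model in self.models.items()}


class Root:
    """Hypothetical root: fetch features, preprocess, fan out to leaves."""

    def __init__(self, feature_store: Dict[str, Features], leaves: List[LeafPartition]):
        self.feature_store = feature_store  # stand-in for a real feature store client
        self.leaves = leaves

    def preprocess(self, features: Features) -> Features:
        # Placeholder preprocessing (an assumption): clamp each value to [0, 1].
        return {key: min(max(value, 0.0), 1.0) for key, value in features.items()}

    def handle_score_request(self, pin_id: str) -> Dict[str, Dict[str, float]]:
        features = self.preprocess(self.feature_store[pin_id])
        # Fan out: every leaf partition receives the same preprocessed features.
        return {leaf.name: leaf.score(features) for leaf in self.leaves}
```

Note that in this sketch every leaf partition receives the full feature payload regardless of which features its models actually read.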