A Glimpse of the Redesigned Goku-Ingestor vNext at Pinterest

Xiao Li, Kapil Bajaj, Monil Mukesh Sanghavi and Zhenxiao Luo

Introduction

In the dynamic arena of real-time analytics, the need for precision and speed is non-negotiable. Pinterest’s real-time metrics asynchronous data processing pipeline, powering Pinterest’s time series database Goku, stood at the crossroads of opportunity. The mission was clear: identify bottlenecks, innovate relentlessly, and propel our real-time analytics processing capabilities into an era of unparalleled efficiency.

Background

The Goku-Ingestor is an asynchronous data processing pipeline that multiplexes metrics data. It performs data validation, denylist processing, and sharding, deserializes multiple metrics formats, and serializes the data into a customized Time Series Database (TSDB) format that can be consumed by the downstream storage engine, Goku.
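The stages listed above can be pictured with a minimal sketch. This is not Pinterest's implementation: the two input formats, the denylist prefix, the shard count, the CRC32 sharding, and the pipe-delimited output are all assumptions chosen for illustration, and every name here (`DataPoint`, `ingest`, and so on) is hypothetical.

```python
import json
import math
import zlib
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional

# Hypothetical settings; the real denylist and shard layout are not described in the article.
DENYLIST_PREFIXES = ("debug.",)
NUM_SHARDS = 4


@dataclass(frozen=True)
class DataPoint:
    metric: str
    timestamp: int
    value: float


def deserialize(raw: str) -> Optional[DataPoint]:
    """Parse one of two assumed formats: a JSON object or 'metric timestamp value'."""
    try:
        if raw.lstrip().startswith("{"):
            obj = json.loads(raw)
            return DataPoint(str(obj["metric"]), int(obj["timestamp"]), float(obj["value"]))
        metric, ts, value = raw.split()
        return DataPoint(metric, int(ts), float(value))
    except (ValueError, KeyError, TypeError):
        return None  # unparseable input is dropped


def is_valid(point: DataPoint) -> bool:
    """Basic validation: non-empty name, positive timestamp, finite value."""
    return bool(point.metric) and point.timestamp > 0 and math.isfinite(point.value)


def is_denied(point: DataPoint) -> bool:
    """Denylist check by metric-name prefix (an assumed policy)."""
    return point.metric.startswith(DENYLIST_PREFIXES)


def shard_for(point: DataPoint) -> int:
    """Deterministic shard assignment from a CRC32 of the metric name."""
    return zlib.crc32(point.metric.encode("utf-8")) % NUM_SHARDS


def serialize(point: DataPoint) -> bytes:
    """Stand-in for the customized TSDB format, which the article does not specify."""
    return f"{point.metric}|{point.timestamp}|{point.value}".encode("utf-8")


def ingest(raw_lines: Iterable[str]) -> Dict[int, List[bytes]]:
    """Run deserialize -> validate -> denylist -> shard -> serialize over the input."""
    shards: Dict[int, List[bytes]] = {}
    for raw in raw_lines:
        point = deserialize(raw)
        if point is None or not is_valid(point) or is_denied(point):
            continue
        shards.setdefault(shard_for(point), []).append(serialize(point))
    return shards
```

Calling `ingest` on a mix of JSON and plain-text lines returns a mapping from shard number to serialized records, with malformed and denylisted entries silently dropped. A production pipeline would presumably count or log those drops rather than discard them.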

Pinterest metrics system

Goku-Ingestor has been running and evolving for close to a decade. It did the work fairly well despite some caveats that became pain points for the real-time analytics platform.

High Fleet Cost for Perceived Throughput

We measure the throughput of Goku-Ingestor in data points per minute. Goku-Ingestor generally handles 2.5 billion to 5 billion data points per minute. To achieve this throughput, Goku-Ingestor uses thousands of memory-optimized EC2 instances and incurs a high infrastructure cost, even though per-host throughput is less than 0.5 mbps.
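To put these figures on a per-second and per-host footing, the arithmetic can be sketched as below. The article only says "thousands" of instances, so the host count used in the usage note is a purely hypothetical input, not a reported number, and the function names are illustrative.

```python
def per_second(points_per_minute: float) -> float:
    """Convert a per-minute data point rate to a per-second rate."""
    return points_per_minute / 60


def per_host_per_second(points_per_minute: float, host_count: int) -> float:
    """Average per-host rate; host_count is an assumed input, not a figure from the article."""
    if host_count <= 0:
        raise ValueError("host_count must be positive")
    return per_second(points_per_minute) / host_count
```

For the stated range, `per_second(2.5e9)` and `per_second(5e9)` give roughly 41.7 million and 83.3 million data points per second across the fleet. With an assumed fleet of 2,000 hosts, `per_host_per_second(5e9, 2000)` works out to about 41,667 data points per second per host at the upper end.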

Reliability Issues

In the initial months of 2023, certain problems arose as a res...
