Pinot Real-Time Ingestion with Cloud Segment Storage

Apache Pinot is an open-source OLAP data analytics engine that lets users query data ranging from a few seconds old to a few years old. Pinot's ability to ingest real-time data and make it available for low-latency queries is the key reason it has become an important component of Uber's data ecosystem. Many products built at Uber require real-time data analytics to operate in our mobile marketplace for shared rides and food delivery. For example, the chart in Figure 1 shows the breakdown of Uber Eats job states over a period of minutes. Our Uber Eats city operators need such insights to balance marketplace supply and demand and to detect ongoing issues.

Figure 1: Uber Eats job state breakdown in the past X minutes

At a high level, Figure 2 (Credit: Apache Pinot Docs) shows a Pinot cluster consisting of several components: the Pinot controller, Pinot server, Pinot broker, and a segment store. The Pinot controller is a metadata service that controls the state of a Pinot cluster. The Pinot server is the data node that ingests and stores both real-time and batch data; it also serves queries sent by Pinot brokers. The Pinot broker is the query gateway that scatters a user query to Pinot servers and gathers the results. The segment store holds staged Pinot data segments for both data backup and download.
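
The broker's scatter-gather role can be illustrated with a toy model. The sketch below is not Pinot code and does not use any Pinot API: `ToyServer`, `ToyBroker`, `count_by`, and the job state values are hypothetical names chosen for illustration. It only assumes that each server computes a partial result over its own data and that the broker merges those partial results.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


class ToyServer:
    """Hypothetical stand-in for a Pinot server holding some rows."""

    def __init__(self, rows):
        self.rows = rows  # each row is a dict such as {"state": "delivered"}

    def count_by(self, column):
        # Partial aggregation over this server's local data only.
        return Counter(row[column] for row in self.rows)


class ToyBroker:
    """Hypothetical stand-in for a Pinot broker."""

    def __init__(self, servers):
        self.servers = servers

    def count_by(self, column):
        # Scatter: send the same sub-query to every server in parallel.
        with ThreadPoolExecutor() as pool:
            partials = list(pool.map(lambda s: s.count_by(column), self.servers))
        # Gather: merge the partial counts into a single result.
        total = Counter()
        for partial in partials:
            total.update(partial)
        return dict(total)


# Two servers, each holding a slice of made-up job state rows.
servers = [
    ToyServer([{"state": "created"}, {"state": "delivered"}]),
    ToyServer([{"state": "delivered"}, {"state": "cancelled"}]),
]
broker = ToyBroker(servers)
result = broker.count_by("state")
print(result)
```

In this toy model the caller talks only to the broker and never needs to know which server holds which rows, which mirrors the gateway role described above.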

Figure 2. Pinot High-Level Architecture

Pinot’s real-time consumer in each Pinot server organizes the incoming stream data into smaller chunks called Pinot segments. During data ...
