Enabling Infinite Retention for Upsert Tables in Apache Pinot
Apache Pinot™ was originally designed as an append-only OLAP (online analytical processing) database. It was later redesigned to support upserts, which are UPdates plus inSERTs: you can update the record for an existing primary key or insert a record with a new primary key. Deletion is a natural extension of upserts. Use cases that retain data indefinitely still need to remove records according to specific business needs, so that memory and disk usage stay efficient.
This blog highlights recent feature developments in Apache Pinot that now support deletions at both memory and disk levels. It also shows how these developments have enabled Uber to sustainably support infinite retention for Pinot upsert use cases.
Upsert is a Pinot feature used for point updates, backfills, and data correction, among other things.
Figure 1 presents a high-level overview of upsert architecture, highlighting how upserts are highly memory-intensive.
Figure 1: High-level architecture of upsert in Pinot.
Upsert-Metadata is an in-memory hashmap that maintains a mapping of Record-Primary-keys to Record-locations. The Record-Primary-key, a unique identifier, is used for partitioning upstream Kafka and serves as a reference for updates if they already exist in the Upsert-Metadata map. The Record-location points to the segment where the latest record for a given Record-Primary-key is stored. This entire Upsert-Metadata mapping is kept in memory for fast upsert operation, contributing to the high memory usage of upserts. To illustrate the memory-intensive nature of upserts, at Uber, our standard host with 376 GiB of memory and 1.1 TiB of disk storage experiences 80% memory utilization and approxim...
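To make the Upsert-Metadata idea concrete, here is a minimal Python sketch of a primary-key-to-location map. It is an illustration only, not Pinot's actual implementation: the names `UpsertMetadata`, `RecordLocation`, `upsert`, and `locate` are hypothetical, and the `doc_id` field is an assumption added to show that a location can identify a specific record inside a segment.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class RecordLocation:
    """Where the latest record for a primary key lives (hypothetical type)."""
    segment: str  # name of the segment holding the latest record
    doc_id: int   # assumption: position of the record inside that segment


class UpsertMetadata:
    """Illustrative in-memory map of primary key -> latest record location."""

    def __init__(self) -> None:
        self._locations: Dict[str, RecordLocation] = {}

    def upsert(self, primary_key: str, segment: str, doc_id: int) -> Optional[RecordLocation]:
        """Point the key at its newest record.

        Returns the previous location if the key already existed (an update),
        or None if the key is new (an insert).
        """
        previous = self._locations.get(primary_key)
        self._locations[primary_key] = RecordLocation(segment, doc_id)
        return previous

    def locate(self, primary_key: str) -> Optional[RecordLocation]:
        """Return the location of the latest record for the key, if any."""
        return self._locations.get(primary_key)

    def __len__(self) -> int:
        # One entry per distinct primary key, regardless of update count.
        return len(self._locations)


metadata = UpsertMetadata()
metadata.upsert("trip-1", "segment_0", 0)   # insert: new key
metadata.upsert("trip-2", "segment_0", 1)   # insert: new key
metadata.upsert("trip-1", "segment_1", 0)   # update: key now points to segment_1
```

In this sketch the map holds one entry per distinct primary key, so its size tracks the number of keys ever seen rather than the number of updates. That is why, without some way to delete entries, the map keeps growing under indefinite retention.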