Riverbed Data Hydration: Part 1
A deep dive into the streaming side of Riverbed, the Lambda architecture framework that optimizes how data is consumed from system-of-record data stores and used to update secondary read-optimized stores at Airbnb.
Overview
In our previous blog post we introduced the motivation and high-level architecture of Riverbed. As a recap, Riverbed is a part of Airbnb's tech stack designed to streamline and optimize how data is consumed from system-of-record data stores and used to update secondary read-optimized stores. The framework is built around the concept of "materialized views": denormalized representations of data that can be queried in a predictable, efficient manner. The primary goal of Riverbed is to improve scalability, enable more efficient data fetching patterns, and provide enhanced filtering and search capabilities for a better user experience. It achieves this by keeping the read-optimized store up to date with the system-of-record data stores, and by making it easier for developers to build and manage pipelines that stitch together data from various data sources.
In this blog post, we will delve deeper into the streaming aspect of the Lambda architecture framework. We'll walk through its critical components step by step and explain how it constructs the materialized view from the Change Data Capture (CDC) events of various online data sources and writes it to the sink. Specifically, we'll take a closer look at the join transformation within the Notification Pipeline, illustrating how we designed a DAG-like data structure to join different data sources together in a memory-efficient manner.
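To make the idea of building a materialized view from CDC events concrete, here is a minimal in-memory sketch. It is not Riverbed's implementation: the `CdcEvent` shape, the `listings` and `hosts` tables, and the `ListingView` class are all hypothetical names chosen for illustration, and a real pipeline would read from a change stream and write to a durable read-optimized store rather than to Python dictionaries.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class CdcEvent:
    """A simplified change event (hypothetical shape, for illustration only)."""
    table: str                             # source table: "listings" or "hosts"
    op: str                                # "upsert" or "delete"
    key: int                               # primary key of the changed row
    row: Optional[Dict[str, Any]] = None   # new row image for upserts


@dataclass
class ListingView:
    """Maintains one denormalized document per listing id."""
    listings: Dict[int, Dict[str, Any]] = field(default_factory=dict)
    hosts: Dict[int, Dict[str, Any]] = field(default_factory=dict)
    view: Dict[int, Dict[str, Any]] = field(default_factory=dict)  # stands in for the sink

    def apply(self, event: CdcEvent) -> None:
        """Apply one change event, then refresh every affected view document."""
        if event.table == "listings":
            source = self.listings
        elif event.table == "hosts":
            source = self.hosts
        else:
            raise ValueError(f"unknown table: {event.table}")

        if event.op == "delete":
            source.pop(event.key, None)
        else:
            source[event.key] = dict(event.row or {})

        # A listing change affects one document; a host change affects
        # every listing that references that host.
        if event.table == "listings":
            affected = [event.key]
        else:
            affected = [
                listing_id
                for listing_id, listing in self.listings.items()
                if listing.get("host_id") == event.key
            ]
        for listing_id in affected:
            self._rebuild(listing_id)

    def _rebuild(self, listing_id: int) -> None:
        """Join the listing with its host and write the result to the view."""
        listing = self.listings.get(listing_id)
        if listing is None:
            self.view.pop(listing_id, None)
            return
        host = self.hosts.get(listing.get("host_id"), {})
        self.view[listing_id] = {
            "listing_id": listing_id,
            "title": listing.get("title"),
            "host_name": host.get("name"),
        }


view = ListingView()
view.apply(CdcEvent("hosts", "upsert", 7, {"name": "Ana"}))
view.apply(CdcEvent("listings", "upsert", 1, {"title": "Loft", "host_id": 7}))
view.apply(CdcEvent("hosts", "upsert", 7, {"name": "Ana M."}))
```

After the three events above, the document for listing 1 carries the updated host name even though the last event touched only the `hosts` table. Propagating a change from one source into every document that joins against it is the part the join transformation has to do efficiently at scale.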