How Airbnb Enables Consistent Data Consumption at Scale
By: Amit Pahwa, Cristian Figueroa, Donghan Zhang, Haim Grosman, John Bodley, Jonathan Parks, Jenny Liu, Krishna Bhupatiraju, Maggie Zhu, Mike Lin, Philip Weiss, Robert Chang, Shao Xie, Sylvia Tomiyama, Toby Mao, Xiaohui Sun
Introduction
In the first post of this series, we highlighted the role Minerva plays in transforming how Analytics works at Airbnb. In the second post, we dove into Minerva’s core compute infrastructure and explained how we enforce data consistency across datasets and teams. In this third and final post, we will focus our story on how Minerva drastically simplifies and improves the data consumption experience for our users. Specifically, we will showcase how a unified metric layer, which we call the Minerva API, helps us build versatile data consumption experiences tailored to users with a wide range of backgrounds and varying levels of data expertise.
A Metric-Centric Approach
When data consumers use data to frame a business question, they typically think in terms of metrics and dimensions. For example, a business leader may wonder what percentage of bookings (a metric) is made up of long-term stays (a dimension). To answer this question, she needs to find the right set of tables from which to query (where), apply the necessary joins or filters (how), and then finally aggregate the events (how) to arrive at an answer that is, hopefully, correct.
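To make the "where, how, how" steps concrete, here is a minimal Python sketch of the long-term stay question answered by hand. Everything in it is hypothetical and chosen only for illustration: the `bookings` rows, the `status` and `nights` fields, and the 28-night threshold for a long-term stay are assumptions, not Airbnb's actual schema or definitions.

```python
from collections import Counter

# Where: a hypothetical fact table with one row per booking event.
bookings = [
    {"booking_id": 1, "status": "confirmed", "nights": 3},
    {"booking_id": 2, "status": "confirmed", "nights": 30},
    {"booking_id": 3, "status": "confirmed", "nights": 45},
    {"booking_id": 4, "status": "confirmed", "nights": 2},
    {"booking_id": 5, "status": "cancelled", "nights": 60},
]

# Assumed threshold; the real business definition may differ.
LONG_TERM_MIN_NIGHTS = 28


def stay_type(nights: int) -> str:
    """Derive the dimension value for a single booking."""
    return "long_term" if nights >= LONG_TERM_MIN_NIGHTS else "short_term"


def booking_share_by_stay_type(rows):
    """Return each stay type's share of confirmed bookings."""
    # How (filter): keep only the events that count toward the metric.
    confirmed = [row for row in rows if row["status"] == "confirmed"]
    # How (aggregate): count bookings per dimension value.
    counts = Counter(stay_type(row["nights"]) for row in confirmed)
    total = sum(counts.values())
    return {key: count / total for key, count in counts.items()}


shares = booking_share_by_stay_type(bookings)
print(shares)
```

Each choice in the sketch (which table to read, which rows to filter out, where the long-term threshold sits) is a decision that two analysts could make differently, which is how the same question ends up with inconsistent answers.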
While many traditional BI tools attempt to ...