Sparkle: Standardizing Modular ETL at Uber
Uber’s data ecosystem is a complex and diverse big data landscape that operates at exabyte scale. It combines a wide variety of tools, each serving a specific need: an ingestion layer (Apache Kafka®), real-time compute (Apache Flink®), real-time analytics (Apache Pinot™), a batch compute and aggregation layer (Spark ETL, Presto ETL, uWorc), batch analytics (Query Builder), ML Studio (for building ML models), visualization (Tableau, Google Studio), and several types of data stores (DocStore, MySQL™, Apache Hive™, Apache Hudi, TerraBlob), among others.
In 2023, the Uber Data platform migrated all batch workloads to Apache Spark™-based computation. More than 20,000 critical pipelines and datasets power these batch workloads, and more than 3,000 engineers are responsible for creating pipelines and owning datasets.
Figure 1: Data Technology Stack at Uber.
Uber has standardized its backend development flow, in which thousands of backend engineers build and manage more than 5,000 services. JFX is the application framework built on top of Java Spring Boot, and UberFx is the equivalent framework for services written in Go; both are intended to improve developer productivity. These frameworks make it easy to write composable, testable apps using dependency injection, and they remove boilerplate, global state, and package-level init functions. Service owners no longer need to install and manage individual libraries manually, because multiple components are provided as a package out of the box during service bootstrapping.
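To make the dependency injection idea concrete, here is a minimal sketch in plain Python. It does not use the JFX or UberFx APIs; every name in it (`Config`, `ListLogger`, `GreetingService`, `build_service`) is hypothetical and chosen only for illustration. It shows a component receiving its collaborators through its constructor rather than reading global state, with a single function acting as the place where everything is wired together.

```python
from dataclasses import dataclass
from typing import Protocol


class Logger(Protocol):
    """Interface the service depends on (hypothetical, for illustration)."""

    def info(self, message: str) -> None: ...


class ListLogger:
    """Hypothetical logger that records messages in memory, handy in tests."""

    def __init__(self) -> None:
        self.messages = []

    def info(self, message: str) -> None:
        self.messages.append(message)


@dataclass
class Config:
    """Hypothetical configuration component."""

    service_name: str


class GreetingService:
    """Receives its dependencies through the constructor, not from globals."""

    def __init__(self, config: Config, logger: Logger) -> None:
        self._config = config
        self._logger = logger

    def greet(self, user: str) -> str:
        greeting = f"Hello, {user}, from {self._config.service_name}"
        self._logger.info(greeting)
        return greeting


def build_service(logger: Logger) -> GreetingService:
    """Composition root: the one place where components are wired together."""
    return GreetingService(Config(service_name="demo-service"), logger)


if __name__ == "__main__":
    print(build_service(ListLogger()).greet("rider"))
```

Because `GreetingService` only sees the `Logger` interface, a test can pass in `ListLogger` and inspect the recorded messages, which is the testability benefit described above. A real framework would presumably automate the wiring done by hand in `build_service`.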