Database Federation: Decentralized and ACL-Compliant Hive™ Databases

One of Uber’s data warehouses powering the Delivery business outgrew its original design. More than 16,000 Hive datasets and 10 petabytes of data from multiple business domains lived inside a single, monolithic database, owned and operated by a centralized Delivery Data Solutions team. While this one-big-bucket setup once simplified onboarding and discovery, scale and organizational growth turned the same design into a liability. The monolithic design had many limitations.

Metadata corruption or resource spikes initiated by one team could cascade across the entire database, disrupting unrelated tier-1 workloads and critical business use cases. 

Resource Contention and Noisy Neighbors 

Unbounded, ad-hoc datasets and uneven dataset-count growth also competed for the same Metastore, Apache HDFS™, and compute quotas—degrading query latency for everyone.

Any database-level task (ACL updates, DDL fixes, TTL enforcement, lineage audits, incident response) had to flow through the central Data Solutions team, slowing mitigation and burdening a single on-call surface. 

With thousands of heterogeneous datasets in a single namespace, it was hard to apply domain-specific data quality rules, track ownership, and set meaningful alerting thresholds.

The monolithic database also granted broad read/write permissions to most teams and services, violating least-privilege principles and amplifying the blast radius of accidental or malicious changes.

