Efficiently Managing the Supply and Demand on Uber’s Big Data Platform

With Uber’s business growth and the fast adoption of big data and AI, Big Data scaled to become our most costly infrastructure platform. To reduce operational expenses, we developed a holistic framework with three pillars: platform efficiency, supply, and demand. Here, supply describes the hardware resources made available to run big data storage and compute workloads, and demand describes those workloads. In this post, we share our work on managing supply and demand. For more context on the larger initiative and on improvements in platform efficiency, please refer to our earlier posts: Challenges and Opportunities to Dramatically Reduce the Cost of Uber’s Big Data, and Cost-Efficient Open Source Big Data Platform at Uber.

Supply

Given that the vast majority of Uber’s infrastructure is on-prem, we will start with some of the open technologies that we applied onsite.

Cheap and Big HDDs

While the focus of the storage market has shifted considerably from HDD to SSD over the last 5 years, HDDs still offer a capacity/IOPS ratio better suited to big data workloads. One reason is that most big data workloads are sequential scans rather than random seeks. Still, the conventional wisdom is that bigger HDDs, with fewer IOPS per TB, can negatively affect the performance of big data applications.
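
To make the IOPS/TB concern concrete, here is a small illustrative calculation. It assumes that a single HDD delivers roughly the same number of random IOPS regardless of its capacity, so IOPS per TB shrinks as drives grow. The figure of 100 IOPS per drive and the capacities used are assumptions for illustration only, not measurements from Uber's fleet, and `iops_per_tb` is a hypothetical helper.

```python
def iops_per_tb(drive_iops: float, capacity_tb: float) -> float:
    """Return the random IOPS available per terabyte on a single drive."""
    if capacity_tb <= 0:
        raise ValueError("capacity_tb must be positive")
    return drive_iops / capacity_tb


# Assumption: an illustrative per-drive random IOPS budget that stays
# constant as capacity grows. Not a measured value.
ASSUMED_DRIVE_IOPS = 100.0

if __name__ == "__main__":
    # Hypothetical drive capacities in TB.
    for capacity_tb in (4, 8, 16):
        ratio = iops_per_tb(ASSUMED_DRIVE_IOPS, capacity_tb)
        print(f"{capacity_tb} TB drive: {ratio:.2f} IOPS/TB")
```

Under these assumptions, quadrupling capacity from 4 TB to 16 TB cuts the IOPS available per stored terabyte to a quarter, which is why larger drives are conventionally seen as a risk for workloads that depend on random access.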

Our HDFS clusters at Uber have many thousands of machines, each with dozens of HDDs. At first we believed that the IOPS/TB would be an unavoidable problem, but our investigation showed that it can be mitigated.
