Future-proofing our metadata stack with Panda, a scalable key-value store

Metadata is crucial for serving user requests. It also takes up a lot of space—and as we’ve grown, so has the amount of metadata we’ve had to store. This isn’t a bad problem to have, but we knew it was only a matter of time before our metadata stack would need an overhaul.

Dropbox operates two large-scale metadata storage systems powered by sharded MySQL. One is the Filesystem, which contains metadata related to files and folders. The other is Edgestore, which powers all other internal and external Dropbox services. Both operate at a massive scale: they run on thousands of servers, store petabytes of data on SSDs, and serve tens of millions of queries per second with single-digit millisecond latency.

A few years ago, however, we realized we would soon need a more cost-effective and performant way to keep up with Edgestore’s growth. Edgestore was originally built directly on sharded MySQL, with each of our storage servers holding multiple database shards. Data within Edgestore was evenly distributed, so most of the disks reached capacity at around the same time. When it came time to expand, we split each machine in two, with each new machine holding half of the shards. But by 2019, with another split of Edgestore looming, it was clear this strategy would come with a significant cost. We also faced a second risk: an outlier shard could outgrow the capacity of a single machine, and we would have no way to split that shard.
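To make the expansion strategy concrete, here is a minimal sketch of splitting a machine in two and of the outlier-shard problem. It is a toy model, not Dropbox's implementation: the `Machine` class, the function names, and all sizes are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Machine:
    """Hypothetical model of a storage server holding several database shards."""
    capacity_gb: int
    shards: dict[str, int] = field(default_factory=dict)  # shard name -> size in GB

    def used_gb(self) -> int:
        return sum(self.shards.values())


def split_machine(machine: Machine) -> tuple[Machine, Machine]:
    """Split one machine into two, each taking half of the shards."""
    names = sorted(machine.shards)
    half = len(names) // 2
    left = Machine(machine.capacity_gb, {n: machine.shards[n] for n in names[:half]})
    right = Machine(machine.capacity_gb, {n: machine.shards[n] for n in names[half:]})
    return left, right


def unsplittable_shards(machine: Machine) -> list[str]:
    """Shards larger than a whole machine; moving shards around cannot fix these."""
    return [n for n, size in machine.shards.items() if size > machine.capacity_gb]


# Evenly sized shards (illustrative numbers): a split halves the load per machine.
full = Machine(capacity_gb=1000, shards={"s0": 240, "s1": 250, "s2": 245, "s3": 255})
a, b = split_machine(full)

# An outlier shard: no assignment of whole shards to machines makes it fit.
skewed = Machine(capacity_gb=1000, shards={"big": 1200, "small": 100})
stuck = unsplittable_shards(skewed)
```

In this model every split doubles the number of machines, which is where the cost comes from, and a shard such as `big` stays a problem no matter how many times the fleet is split, because a shard is the smallest unit that can be moved.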

We decided it was time to rethink the storage layer for our metadata stack. Critical...
