How Uber Serves Over 40 Million Reads per Second from Online Storage Using an Integrated Cache

Docstore is Uber’s in-house, distributed database built on top of MySQL®. Storing tens of PBs of data and serving tens of millions of requests per second, it is one of the largest database engines at Uber and is used by microservices from all business verticals. Since its inception in 2020, Docstore's users and use cases have been growing, and so have its request volume and data footprint.

The growing number of demands from business verticals and offerings introduces complex microservices and dependency call graphs. As a result, applications demand low latency, higher performance, and scalability from the database, while simultaneously generating higher workloads.

Most of the microservices at Uber use databases backed by disk-based storage in order to persist data. However, every database faces challenges serving applications that require low-latency read access and high scalability.

This reached a boiling point when one use case required much higher read throughput than any of our existing users. Docstore could have accommodated their needs, as it is backed by NVMe SSDs, which provide low latency and high throughput. However, using Docstore in this scenario would have been cost-prohibitive and would have introduced many scaling and operational challenges.

Before diving into the challenges, let’s understand the high-level architecture of Docstore.

Docstore is mainly divided into three layers: a stateless query engine layer, a stateful storage engine layer, and a control plane. For the scope of this blog, we will talk about its query and storage engine layers.
