Improving MySQL® Cluster Uptime: Making MGR Viable at Scale

This is the second post in a two-part series describing how Uber adopted MySQL® Group Replication (MGR) to improve MySQL cluster uptime. In the first part, we explored the architectural shift within Uber’s MySQL infrastructure: from a reactive, externally driven failover system to an internal, consensus-based architecture powered by MGR.

We introduced the concept of a three-node consensus group running in single-primary mode, explained the advantages of Paxos-based elections, and discussed how this setup addresses the core reliability challenges we previously faced.

Figure 1: Architecture of a consensus-based MySQL cluster.

In this part, we take this story further: How did we build this system in a scalable, operationally sound way? How do we safely onboard and offboard clusters? What happens when a node fails or is replaced?

Let’s dive into the implementation, automation, failover logic, and benchmarking that made MGR viable at Uber’s scale.

To manage these clusters, we developed a control plane that orchestrates all the key operations. These processes are fully automated, allowing us to run a complex, large-scale environment with minimal manual intervention.

Moving an existing MySQL cluster to the new highly available setup is a multi-step process. We automated these steps into a single one-click onboarding workflow that follows the standard MGR setup guide.

  1. Our automated control plane selects a single, healthy node to become the bootstrap node ...
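
As a rough illustration of what a workflow built on the standard MGR setup guide might look like, the sketch below picks a healthy bootstrap node and produces the ordered SQL statements that the public MySQL procedure uses to bootstrap a group and join the remaining members. It is not Uber's control plane: the `Node` type, the host names, the tie-break by host name, and the choice to skip unhealthy nodes are all assumptions made for illustration. Only the SQL statements come from the MySQL documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    # Hypothetical node record; a real control plane would track far more state.
    host: str
    healthy: bool


def pick_bootstrap_node(nodes):
    """Return one healthy node; ties are broken by host name (an assumption)."""
    candidates = [n for n in nodes if n.healthy]
    if not candidates:
        raise ValueError("no healthy node available to bootstrap the group")
    return min(candidates, key=lambda n: n.host)


def onboarding_plan(nodes):
    """Build an ordered list of (host, sql) steps following the standard MGR guide."""
    bootstrap = pick_bootstrap_node(nodes)
    plan = [
        # Only the bootstrap node creates the group; the flag is switched off
        # right away so a restart cannot bootstrap a second group.
        (bootstrap.host, "SET GLOBAL group_replication_bootstrap_group = ON"),
        (bootstrap.host, "START GROUP_REPLICATION"),
        (bootstrap.host, "SET GLOBAL group_replication_bootstrap_group = OFF"),
    ]
    # Remaining healthy nodes join the existing group one at a time.
    for node in sorted(nodes, key=lambda n: n.host):
        if node.healthy and node != bootstrap:
            plan.append((node.host, "START GROUP_REPLICATION"))
    return plan


if __name__ == "__main__":
    cluster = [Node("db-a", False), Node("db-b", True), Node("db-c", True)]
    for host, sql in onboarding_plan(cluster):
        print(f"{host}: {sql}")
```

Returning a plan rather than executing statements keeps the sketch free of any database driver and makes the ordering easy to inspect before anything runs.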