Debugging Systems in the Cloud: MySQL, Kubernetes, and Cgroups

By Rodrigo Saito, Akshay Suryawanshi, and Jeremy Cole

KateSQL is Shopify’s custom-built Database-as-a-Service platform. It runs on top of Google Cloud’s Kubernetes Engine (GKE) and currently manages several hundred production MySQL instances across different Google Cloud regions and many GKE clusters.

Earlier this year, we found a performance-related issue with KateSQL: some Kubernetes Pods running MySQL would start up and shut down more slowly than other similar Pods with the same data set. This partially impaired our ability to replace MySQL instances quickly when executing maintenance tasks like config changes or upgrades. While investigating, we found several factors that could be contributing to this slowness.

The root cause was a bug in the Linux kernel memory cgroup controller. This post provides an overview of how we investigated the root cause and leveraged Shopify’s partnership with Google Cloud Platform to help us mitigate it.

The Problem

KateSQL has an operational procedure called instance replacement. It involves creating a new MySQL replica (a Kubernetes Pod running MySQL) and then stopping the old one, repeating this process until all the running MySQL instances in the KateSQL Platform are replaced. KateSQL’s instance replacement operations revealed inconsistent MySQL Pod creation times, ranging from 10 to 30 minutes. The Pod creation time includes the time needed to:

  • spin up a new GKE node (if needed)
  • create a new Persistent Disk with data from the ...
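
As a rough illustration of the instance replacement procedure described above, the loop can be sketched in Python. The names `replace_instances`, `create_replica`, and `stop_instance` are hypothetical and are not part of KateSQL; the sketch assumes only the ordering stated in the text (create the new replica, then stop the old one, and repeat for every instance).

```python
from typing import Callable, List


def replace_instances(
    instances: List[str],
    create_replica: Callable[[str], str],
    stop_instance: Callable[[str], None],
) -> List[str]:
    """Replace instances one at a time, creating each new replica
    before stopping the instance it replaces.

    Both callbacks are hypothetical stand-ins for platform operations.
    """
    replacements = []
    for old in instances:
        # Hypothetical: provision a new MySQL Pod to replace `old`.
        new = create_replica(old)
        # Hypothetical: stop the old Pod only after the new one exists.
        stop_instance(old)
        replacements.append(new)
    return replacements
```

In a sketch like this, the time spent inside `create_replica` corresponds to the Pod creation time discussed here, so timing that call per instance would be one way to surface the kind of inconsistency the post describes.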