Slack's Incident on 2-22-22

By Laura Nolan, with contributions from Glen D. Sanford, Jamie Scheinblum, and Chris Sullivan.

Assessing conditions

Slack experienced a major incident on February 22, 2022, during which many users, including the author, were unable to connect to Slack. That certainly made my role as Incident Commander more challenging!

This incident was a textbook example of complex systems failure: it had a number of contributing factors and part of the incident involved a cascading failure scenario.

Just after 6 a.m. Pacific Time, a number of things happened almost simultaneously: we began to receive user tickets about problems connecting to Slack, some internal users experienced problems using Slack, and a number of our engineering teams received pages or alerts about problems.

When a user begins a new Slack session (after restarting their client or being disconnected from the network for some time), the Slack client performs a process called booting, which is described in Mark Christian’s blog post Getting to Slack faster with incremental boot. During the client boot process, data such as channel listings, user and team preferences, and most recent conversations is fetched from Slack’s servers and cached on the client. Slack isn’t usable until your client is booted, and if Slack can’t boot, you get an error page.
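
The all-or-nothing nature of boot described above can be illustrated with a small sketch. This is not Slack's client code: the resource names, `ClientState`, and `boot_client` are all hypothetical, and the real boot process is incremental, as the referenced post explains. The sketch only shows the behaviour that matters for this incident: if any boot fetch fails, the client has nothing usable and falls back to an error page.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict

# Hypothetical names for the kinds of data the post says are fetched at boot.
BOOT_RESOURCES = ("channels", "preferences", "recent_conversations")


@dataclass
class ClientState:
    cache: Dict[str, Any] = field(default_factory=dict)
    booted: bool = False
    error: str = ""


def boot_client(fetch: Callable[[str], Any]) -> ClientState:
    """Fetch every boot resource and cache it; any failure means no boot."""
    state = ClientState()
    try:
        for name in BOOT_RESOURCES:
            state.cache[name] = fetch(name)
    except Exception as exc:
        # The client is unusable until boot completes, so discard the
        # partial cache and record what the error page would report.
        state.cache.clear()
        state.error = f"boot failed: {exc}"
        return state
    state.booted = True
    return state
```

Passing a `fetch` callable that raises for one resource (for example, to simulate an overloaded backend) yields a state with `booted` set to `False` and an empty cache, which corresponds to the error page users saw.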

Assessing conditions at the beginning of the incident, we found that we were seeing more load than usual on parts of our database system. Slack stores its data in Vitess, a horizontal scaling system for MySQL (see Scali...
