Enhancing Distributed System Load Shedding with TCP Congestion Control Algorithm

A busy port where hundreds of containers wait to be loaded to ships or trailers, photo by CHUTTERSNAP on Unsplash

Introduction

Our team is responsible for sending out communications to all our customers at Zalando, e.g. confirming a placed order, informing about new content from a favourite brand, or announcing sales campaigns. While preparing those messages, as well as while sending them out via different service providers, we have to deal with limited resources. We cannot always process every requested communication as fast as it arrives, which occasionally leads to a backlog of requests.

But not all communication is equally important. Our business stakeholders have asked us to ensure that we process the communication which supports critical business operations within the given service level objectives (SLOs).

This led us to investigate the solution space for load shedding. Load shedding has already been addressed in Skipper. But our system is event driven: all requests we process are delivered as events via Nakadi, so Skipper's feature does not help here. But why not use the same underlying idea?

We know that we meet our SLOs as long as our system runs within its normal limits. If we controlled the ingestion of message requests into our system, we would be able to process them in a timely manner. Additionally, we would need to combine this control of ingestion with prioritization of those requests which support critical business operations.
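To make the idea of controlled ingestion plus prioritization more concrete, here is a minimal sketch. It assumes an additive-increase/multiplicative-decrease (AIMD) rule, the scheme commonly associated with TCP congestion control, and it is not taken from our production code. The names `AimdLimiter`, `Request` and `admit`, and all numeric defaults, are hypothetical and chosen only for illustration.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class AimdLimiter:
    """Hypothetical ingestion limit adjusted with an AIMD rule."""
    limit: float = 10.0           # requests admitted per cycle (assumed unit)
    min_limit: float = 1.0
    max_limit: float = 100.0
    increase: float = 1.0         # additive step while the system is healthy
    decrease_factor: float = 0.5  # multiplicative cut when overloaded

    def on_healthy(self) -> None:
        # Additive increase: probe slowly for more capacity.
        self.limit = min(self.max_limit, self.limit + self.increase)

    def on_overload(self) -> None:
        # Multiplicative decrease: back off quickly under pressure.
        self.limit = max(self.min_limit, self.limit * self.decrease_factor)


@dataclass(order=True)
class Request:
    priority: int                        # lower value = more critical
    seq: int                             # arrival order breaks ties
    payload: str = field(compare=False)


def admit(backlog: list[Request], limiter: AimdLimiter) -> list[Request]:
    """Remove up to int(limiter.limit) requests, most critical first."""
    heapq.heapify(backlog)
    count = min(int(limiter.limit), len(backlog))
    return [heapq.heappop(backlog) for _ in range(count)]


if __name__ == "__main__":
    limiter = AimdLimiter(limit=4.0)
    limiter.on_overload()  # an overload signal halves the limit
    backlog = [
        Request(2, 0, "sales campaign"),
        Request(0, 1, "order confirmation"),
        Request(1, 2, "brand news"),
    ]
    for request in admit(backlog, limiter):
        print(request.payload)
```

In this sketch, whatever is not admitted simply stays in the backlog for a later cycle, so a shrinking limit delays low-priority requests first.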

Overview of the System

First, let me introduce you to the system under load.

Communication Platform Overview

Nakadi is a distributed event bus that offers a RESTful API on top of Kafka-like queues. This component serves a couple of thousands of event typ...
