Investigation of a Cross-Region Network Performance Issue

Hechao Li, Roger Cruz

Cloud Networking Topology

Netflix operates a highly efficient cloud computing infrastructure that supports a wide array of applications essential for our SVOD (Subscription Video on Demand), live streaming, and gaming services. Our infrastructure is hosted on Amazon Web Services (AWS) across multiple geographic regions worldwide. This global distribution allows our applications to deliver content more effectively by serving traffic closer to our customers. Like any distributed system, our applications occasionally require data synchronization between regions to maintain seamless service delivery.

The following diagram shows a simplified cloud network topology for cross-region traffic.

The Problem At First Glance

Our Cloud Network Engineering on-call team received a request to address a network issue affecting an application with cross-region traffic. Initially, it appeared that the application was experiencing timeouts, likely due to suboptimal network performance. As we all know, the longer the network path, the more devices the packets traverse, and the greater the likelihood of issues. In this incident, the client application was located in an internal subnet in the US region, while the server application was located in an external subnet in a European region. It was therefore natural to blame the network, since packets had to travel long distances through the internet.

As network engineers, our initial reaction when the network is blamed is typically, “No, it can’t be the network,” and our task is to prove it. Given that there were no recent changes to the network infrastructure and no reported AWS issues...
