Avoiding CPU Throttling in a Containerized Environment

摘要

At Uber, all stateful workloads run on a common containerized platform across a large fleet of hosts. Stateful workloads include MySQL®, Apache Cassandra®, ElasticSearch®, Apache Kafka®, Apache HDFS™, Redis™, Docstore, Schemaless, etc., and in many cases these workloads are co-located on the same physical hosts.

With 65,000 physical hosts, 2.4 million cores, and 200,000 containers, increasing utilization to reduce cost is an important and continuous effort. Until recently efforts were blocked due to CPU throttling, which indicates that not enough resources have been allocated.

It turned out that the issue was how the Linux kernel allocates time for processes to run. In this post we will describe how switching from CPU quotas to cpusets (also known as CPU pinning) allowed us to trade a slight increase in P50 latencies for a significant drop in P99 latencies. This in turn allowed us to reduce fleet-wide core allocation by up to 11% due to less variance in resource requirements.

欢迎在评论区写下你对这篇文章的看法。

评论

ホーム - Wiki
Copyright © 2011-2024 iteam. Current version is 2.134.0. UTC+08:00, 2024-09-28 08:25
浙ICP备14020137号-1 $お客様$