Improving Distributed Caching Performance and Efficiency at Pinterest
Kevin Lin | Software Engineer, Storage and Caching
Introduction
Pinterest’s distributed caching system, built on top of the open-source technologies memcached and mcrouter, is a critical component of the production infrastructure stack. Pinterest’s cache-as-a-service platform is responsible for driving down application latency across the board, reducing the overall cloud cost footprint, and ensuring adherence to strict sitewide availability targets.
Today, Pinterest’s memcached fleet spans over 5000 EC2 instances across a variety of instance types optimized along compute, memory, and storage dimensions. Collectively, the fleet serves up to ~180 million requests per second and ~220 GB/s of network throughput over a ~460 TB active in-memory and on-disk dataset, partitioned among ~70 distinct clusters.
As a core driver of reduced sitewide latency, the distributed caching tier is subject to stringent performance and latency requirements. Additionally, a key consequence of the sheer size of the fleet is that even small efficiency optimizations have an outsized impact on the total service cost footprint. Several years of operational experience running memcached at scale in production have provided unique insight into practical optimizations for driving improved performance and efficiency across the entire caching stack.
In this article, we will share some context on the observability and performance testing tools that enable optimization exploration work, followed by a deep dive into practical optimizations currently running in our production environment along dimensions of hardware selection strategy, compute ef...