Migration at Scale: Moving Marketing Cloud Caching from Memcached to Redis at 1.5M RPS Without Downtime

By Paladi Sandhya Madhuri, Rakesh Chhabra, Piyush Pruthi, Sumit Sahrawat, Ankit Jain, and Basaveshwar Hiremath.

In our Engineering Energizers Q&A series, we highlight the engineering minds driving innovation across Salesforce. Today’s edition features Paladi Sandhya Madhuri, a Senior Software Engineer on the Marketing Cloud Caching team, whose work involves evolving the platform’s core caching infrastructure to support high-volume, latency-sensitive workloads, including a live migration handling approximately 1.5 million cache events per second across over 50 applications.

Explore how the team executed a zero-downtime migration under live production traffic, preserving application behavior while changing the underlying cache engine, managing hot-key pressure from Redis at scale, and validating stable performance and reliability by sustaining end-to-end P50 latency near 1 millisecond and P99 latency around 20 milliseconds throughout the transition.

The team embarked on a mission to modernize the Marketing Cloud’s core caching layer. Their goal was to maintain availability, security, and performance at scale, which was a significant undertaking.

The existing Memcached-based system, while supporting a vast application ecosystem, presented challenges with its lack of native replication and built-in user authentication, making it difficult to meet evolving uptime, security, and maintenance demands. The absence of replication meant that a failure of a Memcached node required rebuilding the entire cache, leading to increased latency and significant load on the underlying databas...
