3 Innovations While Unifying Pinterest's Key-Value Storage
Jessica Chan | Engineering Manager, MySQL & Key-Value Storage
Engineers hate migrations. What do engineers hate more than migrations? Data migrations. Especially critical, terabyte-scale, online serving migrations which, if done badly, could bring down the site, enrage customers, or cripple hundreds of critical internal services.
So why did the Key-Value Systems Team at Pinterest embark on a two-year real-time migration of all our online key-value serving data to a single unified storage system? Because the cost of not migrating was too high. In 2019, Pinterest had four separate key-value systems, owned by different teams, with different APIs and feature sets. This resulted in duplicated development effort, high operational overhead and incident counts, and confusion among engineering customers.
In unifying all of Pinterest’s 500+ key-value use cases (over 4PB of unique data serving hundreds of millions of QPS) onto a single interface, we not only made huge gains in reducing system complexity and lowering operational overhead, we also achieved a 40–90% performance improvement by moving to the most efficient storage engine, and we saved the company a significant amount in annual costs by moving to the most cost-effective replication and versioning architecture.
In this blog post, we dive into three of the many innovations that helped us notch these wins.
Before this effort, Pinterest had four key-value storage systems:
- Terrapin: a read-only, batch-load, key-value storage built a...