Architecting backends to serve millions of RPS

1. Architecting backends to serve millions of RPS Conor Gallagher
2. What problem is being solved and why?
3. Problem Statement: Zalando adopted streaming architectures in the mid-2010s. Pushing Product data into an event bus did not make it easy for our various business units to interact with that data. Would it be possible to serve Product data centrally via an API, and make it so performant that the distributed Product datastores become redundant?
4. Requirements: The new Product Read API will be a tier-one service serving Product data to Zalando’s Fashion Stores and internal systems across Europe.
● Low latency: < 50 ms p99 per single GET
● High throughput: millions of requests per second
● High availability
● Support for batch retrieval
● Handle hot products
5. Hot Product
6. Product Read API (PRAPI)
7.
8. Single GET Performance
9. Batch GET Performance
10. How do we achieve this performance?
11. Load Balancing
12. Hash some part of the incoming request to determine its location on the ring. The load balancer always sends traffic to the closest pod to the right of that location on the ring.
13. Consistent Hash Load Balancing for Products: Use the product-id from the request URL as input to the load-balancing strategy, so that requests for a given product are consistently routed to the same pod. Because a Product Catalog is only ever partially hot, a small bounded cache on each pod with a short TTL would have a huge impact. By hot, we mean popular products: think of basic white t-shirts, or new Nike shoes in a campaign.
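To make the "small bounded cache with a short TTL" idea concrete, here is a minimal sketch. It is not PRAPI's actual cache: the class name `BoundedTtlCache`, the LRU eviction policy, and the injectable `clock` parameter are all assumptions made for illustration.

```python
import time
from collections import OrderedDict


class BoundedTtlCache:
    """Hypothetical per-pod cache: at most max_entries items, each valid for ttl_seconds."""

    def __init__(self, max_entries, ttl_seconds, clock=time.monotonic):
        self._max_entries = max_entries
        self._ttl_seconds = ttl_seconds
        self._clock = clock  # injectable so the TTL can be tested without sleeping
        self._entries = OrderedDict()  # key -> (expires_at, value), oldest first

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Short TTL: stale product data is dropped and must be reloaded.
            del self._entries[key]
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._entries[key] = (self._clock() + self._ttl_seconds, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max_entries:
            # Bounded size: evict the least recently used entry (assumed policy).
            self._entries.popitem(last=False)
```

Because consistent routing sends each hot product to the same pod, even a cache this small can absorb most of the reads for that product.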
14. Extract the product-id from the path of the incoming request. Hash the product-id to determine its location on the ring and the pod that will get the traffic.
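The two steps on this slide can be sketched as follows. This is an illustration only, not Skipper's implementation: the `/products/{product-id}` path shape, the SHA-256-based `ring_position` function, and the names `HashRing` and `route` are assumptions.

```python
import bisect
import hashlib
import re


def ring_position(key: str) -> int:
    """Map a string to a position on the ring (the hash function is an assumption)."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    def __init__(self, pods):
        # Each pod sits at one position on the ring in this simplified sketch.
        self._points = sorted((ring_position(pod), pod) for pod in pods)
        self._positions = [position for position, _ in self._points]

    def pod_for(self, key: str) -> str:
        # First pod at or after the key's position, wrapping around the ring.
        index = bisect.bisect_left(self._positions, ring_position(key))
        return self._points[index % len(self._points)][1]


# Hypothetical URL layout: the product-id is the path segment after /products/.
PRODUCT_PATH = re.compile(r"^/products/(?P<product_id>[^/?]+)")


def route(ring: HashRing, path: str) -> str:
    match = PRODUCT_PATH.match(path)
    if match is None:
        raise ValueError(f"no product-id in path: {path!r}")
    return ring.pod_for(match.group("product_id"))
```

Hashing only the product-id, rather than the whole URL, means query parameters do not change which pod serves a product.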
15. Skipper is configured to round-robin batch requests across a dedicated Batch deployment. This deployment makes N parallel, consistently routed requests to the Single Get deployment.
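The fan-out performed by the Batch deployment could look roughly like the sketch below. The function names, the `max_parallel` limit, and the de-duplication of ids are assumptions; `fetch_product` is a stand-in for the real HTTP call to the Single Get deployment.

```python
import asyncio


async def fetch_product(product_id):
    """Hypothetical stand-in for one consistently routed single-GET request."""
    await asyncio.sleep(0)  # yield to the event loop, as a real network call would
    return {"id": product_id}


async def batch_get(product_ids, fetch=fetch_product, max_parallel=16):
    """Resolve a batch by issuing parallel single-GET requests."""
    semaphore = asyncio.Semaphore(max_parallel)  # cap in-flight requests (assumed)

    async def fetch_one(product_id):
        async with semaphore:
            return await fetch(product_id)

    # Drop duplicate ids while keeping the first-seen order.
    unique_ids = list(dict.fromkeys(product_ids))
    products = await asyncio.gather(*(fetch_one(pid) for pid in unique_ids))
    return dict(zip(unique_ids, products))
```

Because every single GET is routed by product-id, a batch request benefits from the same per-pod caches as direct single-GET traffic.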
16. Consistent Hash Load Balancing for Products
Scaling activities should not cause mass cache invalidation: https://github.com/zalando/skipper/issues/1712
● Enter each pod at 100 random locations on the ring.
● Results in roughly 1/N of cache entries being invalidated per scaling event, where N is the total number of pods.
Avoid overloading a single pod. Requests should spill over consistently into neighbouring pods: https://github.com/zalando/skipper/issues/1769
● Introduce Bounded Load, with a configurable load factor.
● A single pod can only ever serve a fixed multiple (the load factor, 1.5) of the average number of requests.
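Both ideas on this slide can be combined in one small sketch. It is an illustration under stated assumptions, not Skipper's code: the class name `BoundedLoadRing`, the SHA-256 hash, the use of hashed `pod#replica` labels as the "random" locations, and the in-flight counting are all hypothetical choices.

```python
import bisect
import hashlib
import math


def ring_position(key: str) -> int:
    """Map a string to a position on the ring (the hash function is an assumption)."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class BoundedLoadRing:
    def __init__(self, pods, virtual_nodes=100, load_factor=1.5):
        if load_factor < 1.0:
            raise ValueError("load_factor must be at least 1.0")
        self.load_factor = load_factor
        self.in_flight = {pod: 0 for pod in pods}
        # Each pod enters the ring at `virtual_nodes` pseudo-random locations.
        self._points = sorted(
            (ring_position(f"{pod}#{replica}"), pod)
            for pod in pods
            for replica in range(virtual_nodes)
        )
        self._positions = [position for position, _ in self._points]

    def _start_index(self, key):
        index = bisect.bisect_left(self._positions, ring_position(key))
        return index % len(self._points)

    def owner(self, key):
        """Pod that owns the key when load is ignored."""
        return self._points[self._start_index(key)][1]

    def acquire(self, key):
        """Pick a pod for one request, spilling over to neighbours when full."""
        total_after = sum(self.in_flight.values()) + 1
        limit = math.ceil(self.load_factor * total_after / len(self.in_flight))
        start = self._start_index(key)
        for step in range(len(self._points)):
            pod = self._points[(start + step) % len(self._points)][1]
            if self.in_flight[pod] < limit:
                self.in_flight[pod] += 1
                return pod
        raise RuntimeError("unreachable while load_factor >= 1.0")

    def release(self, pod):
        """Call when the request finishes."""
        self.in_flight[pod] -= 1
```

Adding a pod to such a ring only moves the keys that now land on the new pod's locations, and the spill-over walk always visits neighbours in the same ring order, so overflow traffic for a hot product also hits a stable set of pods.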
17. Async vs Non-Blocking: What’s the difference?
18. Non-Blocking IO
● Product-Sets -> Products API (NIO JAX-RS client)
● DynamoDB Client (Async NIO using Netty)
19. NIO - Resource Utilisation Under Load: 10,000 outbound requests per second
● 4 CPU cores request limit
● 16 active threads
● 20% CPU utilisation
20. Caffeine Async Loading Cache: https://github.com/ben-manes/caffeine
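The main benefit of an async loading cache for hot products is that concurrent requests for the same key share a single in-flight load. The sketch below illustrates that idea in Python with asyncio. It is not Caffeine's API: the class name and methods are hypothetical, and size bounds and TTL are left out to keep the example short.

```python
import asyncio


class AsyncLoadingCache:
    """Hypothetical sketch: caches the in-flight load itself, not just its result."""

    def __init__(self, loader):
        self._loader = loader  # async function: key -> value
        self._futures = {}     # key -> asyncio.Task for the load

    async def get(self, key):
        future = self._futures.get(key)
        if future is None:
            # First caller starts the load; later callers await the same task.
            future = asyncio.ensure_future(self._loader(key))
            self._futures[key] = future
            future.add_done_callback(lambda done, k=key: self._evict_if_failed(k, done))
        # shield() stops one cancelled caller from cancelling the shared load.
        return await asyncio.shield(future)

    def _evict_if_failed(self, key, future):
        # Do not keep failed loads cached, so the next request retries.
        if future.cancelled() or future.exception() is not None:
            if self._futures.get(key) is future:
                del self._futures[key]
```

With this shape, a burst of requests for one hot product triggers a single datastore read instead of one read per request.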
21.
22. Garbage Collection (GC) Tuning
23. GC Tuning - Before
24. GC Tuning - Fix
25. GC Tuning - After
26. Questions?
