Auto-scaling your API Insights and Tips from the Zalando Team

1. Auto-scaling your API: Insights and Tips from the Zalando Team. Sean Patrick Floyd (@oldJavaGuy), Luis Mineiro (@voidmaze). JBCNConf 2015, Barcelona, Spain
2. One of Europe’s largest online fashion retailers: 15 countries; 3 fulfilment centers; 15+ million active customers; 2.2+ billion € revenue (2014); 130+ million visits per month; 8.000+ employees; tech hubs in Berlin, Dublin, Dortmund and Helsinki. Visit us: tech.zalando.com
3. Our Scale 2 datacenters thousands of production instances serving 15 countries Based on https://flic.kr/p/bX5E4c
4. Zalando stack Credits to our colleague Kolja Wilcke
5. Credits to our colleague Kolja Wilcke
6.
7. Credits to our colleague Kolja Wilcke
8. The Shop Monolith
9. We call it “Jimmy” http://blog.codinghorror.com/new-programming-jargon/
10. Thousands of Java classes, undocumented features Business logic on all layers (including the database) https://flic.kr/p/nBhvfy
11. It’s 2013… Zalando wants to sponsor hackathons… We need a REST API! https://flic.kr/p/tW4sus
12.
13. Hackathon API
14. First Step Let’s create some REST-(ish) Spring controllers inside our monolithic web application Deploy a couple of Jimmy instances just to serve requests for this “API”
15. Pros The infrastructure is already there! A few days of coding and we’re all set
16. Cons This “API” cannot be deployed independently of Jimmy. Jimmy is infested with business logic everywhere.
17. New Requirements
18.
19. New Requirements We also needed this new API for our existing frontends (very high traffic). Some third-party apps also wanted to use this API as their backend.
20. Shop Public API We decided to build a “real” API as a separate standalone project. Of course, there were some dependencies…
21. Data Sources 1. Catalog - SOLr 2. Stock - Memcached 3. Reviews - REST(-ish) API 4. Recommendations - RPC 5. Database - PostgreSQL [Diagram labels: Shop Public API; Catalog, Stock, Reviews, Recommendations, Database]
22. All of them in the same data centers. This should be no problem…
23. Tech Company
24. Paradigm Shift. Until 2014: We’re a Fashion Company that has a lot of tech knowledge. 2015: Suddenly we’re a Tech Company, providing “Fashion as a Service”
25. We established 5 principles
26. API FIRST
27. API First Document and peer review your API in a format like Swagger before writing a single line of code. Ideally, generate either your server interfaces or your test data (or both) from the Swagger spec.
28. REST
29. REST Manipulate resources, don't call methods Expose nouns, not verbs Use HTTP verbs GET, PUT, POST, DELETE, PATCH
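A minimal sketch of "expose nouns, not verbs", assuming a hypothetical `articles` resource held in memory (the name, the `handle` function and the status codes chosen are illustrative, not taken from the Zalando API). The path only names the resource; the HTTP verb selects the operation.

```python
import itertools

# Hypothetical in-memory resource store; "articles" is an illustrative name.
articles = {}
_next_id = itertools.count(1)


def handle(method, path, body=None):
    """Dispatch on the HTTP verb; the path only identifies a resource (a noun)."""
    parts = path.strip("/").split("/")
    if parts[0] != "articles" or len(parts) > 2:
        return 404, None

    # Collection resource: /articles
    if len(parts) == 1:
        if method == "GET":
            return 200, list(articles.values())
        if method == "POST":
            article_id = str(next(_next_id))
            articles[article_id] = dict(body, id=article_id)
            return 201, articles[article_id]
        return 405, None

    # Single resource: /articles/<id>
    article_id = parts[1]
    if method == "PUT":
        # PUT creates or fully replaces the resource at this URL.
        articles[article_id] = dict(body, id=article_id)
        return 200, articles[article_id]
    if article_id not in articles:
        return 404, None
    if method == "GET":
        return 200, articles[article_id]
    if method == "PATCH":
        # PATCH applies a partial update.
        articles[article_id].update(body)
        return 200, articles[article_id]
    if method == "DELETE":
        del articles[article_id]
        return 204, None
    return 405, None
```

Note that there is no `/getArticle` or `/deleteArticle` path: every operation targets `/articles` or `/articles/<id>`.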
30. SAAS
31. Identity Management Our backend services don’t expose APIs yet Company-wide IAM strategy is not ready yet Can only expose read-only features for now
32. MICRO- SERVICES
33. Microservices We already have a Service-Oriented Architecture It was mostly SOAP, though… And definitely not micro
34. CLOUD
35. Road to AWS Some challenges ahead… 1. Backend services not available yet 2. SOLr also not available 3. We can’t just move the databases there
36. How we did it
37. Challenges Catalog and Stock datasources are latency critical and they’re owned by different teams!
38. Can we solve that?
39. Step 1 - move critical datasources to AWS
40. “Move” Data Sources to AWS We can’t just move them! Jimmy also needs them in the data centers, you insensitive clod! Ok… we just replicate them.
41. and then…?
42. Replicating Data Sources. SOLr has its own replication mechanism over HTTP. memcached should be easy …
43. You wish!
44. Step 2 - Build memcached replication
45. Stock Relay [Diagram labels: Datacenter with Memcached #1, #2, #3 … and Stock Relay(s) handling SET / DELETE; SNS Topic; in each region (us-east-1, eu-west-1, eu-central-1): SQS Queue, Stock Repeater(s), Memcached ElastiCache, and Shop Public API issuing GET]
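The Stock Relay slide can be read as a fan-out: writes seen in the data center go to one topic, each region has its own queue, and a repeater applies the queued operations to that region's cache. The sketch below simulates that reading entirely in memory. The `Topic` class, the `relay` and `repeat` functions and the message shape are assumptions made for illustration; they stand in for SNS, SQS, the Stock Relay and the Stock Repeater and are not the tools from the talk.

```python
from collections import deque


class Topic:
    """In-memory stand-in for the SNS topic: copies each message to every subscribed queue."""

    def __init__(self):
        self.queues = []

    def subscribe(self, queue):
        self.queues.append(queue)

    def publish(self, message):
        for queue in self.queues:
            queue.append(message)


def relay(topic, op, key, value=None):
    """Stock Relay (assumed behaviour): forward one SET or DELETE seen in the data center."""
    topic.publish({"op": op, "key": key, "value": value})


def repeat(queue, cache):
    """Stock Repeater (assumed behaviour): drain one region's queue into that region's cache."""
    while queue:
        message = queue.popleft()
        if message["op"] == "SET":
            cache[message["key"]] = message["value"]
        elif message["op"] == "DELETE":
            cache.pop(message["key"], None)


regions = ["us-east-1", "eu-west-1", "eu-central-1"]
topic = Topic()
queues = {region: deque() for region in regions}   # stand-ins for the per-region SQS queues
caches = {region: {} for region in regions}        # stand-ins for the per-region memcached
for region in regions:
    topic.subscribe(queues[region])

# Hypothetical stock updates; the keys are made up.
relay(topic, "SET", "sku-1", 5)
relay(topic, "SET", "sku-2", 0)
relay(topic, "DELETE", "sku-2")

for region in regions:
    repeat(queues[region], caches[region])
```

One queue per region keeps the regions independent: a slow repeater in one region delays only that region's cache.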
46. and then…?
47. AWS SOLr Repeater [Diagram labels: Datacenter Master; AWS Master Repeater; regions us-east-1, eu-west-1, eu-central-1 with Region Repeater, SOLr Slaves and Shop Public API]
48. Autoscaling SOLr and API [Diagram labels: Region repeater; Autoscaling Group with Slave #1, Slave #2, … Slave n; Autoscaling Group with API Node #1, API Node #2, … API Node n; https://api.zalando.com. Credit: http://www.docstoc.com/docs/109290533/Lucid-Imagination# by Erick Erickson]
49. Scaling Limitations [Diagram: one Region repeater and a crowd of slaves in a single Autoscaling Group. Edited from https://flic.kr/p/zuBXM]
50. Y U no scale!
51. S3 Bucket for Replication [Diagram labels: Datacenter Master; Master Repeater; regions us-east-1, eu-west-1, eu-central-1, each with SOLr Slaves and Shop Public API]
52. I see…
53. “Onion layers” for Replication
• Start with a single repeater layer - layer0-repeater.solr
• Set up a Route53 CNAME that links to it - repeater.solr
• Entire slave fleet in the ASG is always configured to replicate from repeater.solr
[Diagram labels: slave.solr; Slave #1, Slave #2, … Slave n; Autoscaling Group; CNAME repeater; layer0-repeater.solr; layer0.solr]
54. “Onion layers” for Replication
• Set up CloudWatch alarms for relevant metrics in the repeater layer. Give them some slack space.
• Configure them to send notifications to the onion-layers-topic.
• Bring up your instance of onion-layers.
[Diagram labels: slave.solr; Slave #1, Slave #2, … Slave n; Autoscaling Group; CNAME repeater; layer0-repeater.solr; CloudWatch]
55. “Onion layers” for Replication
• onion-layers creates a new ASG layer. Calculates currentLayer = currentLayer + 1
• The new layer starts replicating from layer<currentLayer-1>-repeater
• Adds a new Route53 recordset layer<currentLayer>-repeater.
[Diagram labels: layer1-repeater.solr; layer0-repeater.solr; onion-layers; CloudWatch; CNAME repeater; slave.solr; Autoscaling Groups]
56. “Onion layers” for Replication
• After replication, the CNAME repeater is updated to link to layer<currentLayer>-repeater.
• onion-layers acts on alarms for the connection between current layer and previous layer.
• You can still have your own AS alarms for the connection between slaves and current layer.
[Diagram labels: layer1-repeater.solr; layer0-repeater.solr; CNAME repeater; slave.solr; CloudWatch; Autoscaling Groups]
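The bookkeeping behind the onion layers slides can be sketched as a small state machine. This is an illustration under stated assumptions, not the onion-layers tool itself: the `OnionLayers` class, its method names and the `layerN-asg.internal` targets are hypothetical, and a plain dict stands in for Route53 record sets. Only the names `layer<N>-repeater.solr`, `repeater.solr` and the `currentLayer + 1` step come from the slides.

```python
class OnionLayers:
    """Tracks repeater layers and the shared CNAME that the slave fleet replicates from."""

    def __init__(self, domain="solr"):
        self.domain = domain
        self.current_layer = 0
        # DNS name -> target. A dict standing in for Route53 record sets.
        # The "layerN-asg.internal" targets are hypothetical placeholders.
        self.records = {
            self.layer_name(0): "layer0-asg.internal",
            self.cname(): self.layer_name(0),
        }
        # Layer name -> the layer it replicates from (None means the upstream master).
        self.replicates_from = {self.layer_name(0): None}

    def layer_name(self, n):
        return f"layer{n}-repeater.{self.domain}"

    def cname(self):
        return f"repeater.{self.domain}"

    def add_layer(self):
        """On an alarm: create the next layer, replicating from the previous one."""
        self.current_layer += 1
        new_layer = self.layer_name(self.current_layer)
        self.replicates_from[new_layer] = self.layer_name(self.current_layer - 1)
        self.records[new_layer] = f"layer{self.current_layer}-asg.internal"
        return new_layer

    def on_replication_complete(self):
        """Once the new layer has caught up, repoint the shared CNAME at it."""
        self.records[self.cname()] = self.layer_name(self.current_layer)


onion = OnionLayers()
new_layer = onion.add_layer()
cname_before_switch = onion.records[onion.cname()]
onion.on_replication_complete()
cname_after_switch = onion.records[onion.cname()]
```

Because the slaves only ever know `repeater.solr`, adding a layer never requires reconfiguring them; the CNAME switch happens only after the new layer has finished replicating.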
57. Yes. These tools will be open-sourced soon
58. Thank you for listening Check out our blog https://tech.zalando.com Our many open source products https://github.com/zalando The STUPS stack https://stups.io Got more questions? You can reach us on twitter @ZalandoTech We’re hiring! Special thanks to Jessie Dude. No Continuum Transfunctioners were harmed during the production of these slides.
