END-TO-END LOAD TESTING AT SCALE
1. END-TO-END LOAD TESTING AT SCALE
OLIWIA ZAREMBA
CONTINUOUS TESTING MEETUP
10-09-2019
2. WHY LOAD TESTING?
3. BLACK FRIDAY
Source: giphy.com
4. BLACK FRIDAY
● Auto-scaling is not a solution for huge spikes
→ To handle a huge traffic spike, all services need to be scaled up beforehand
● Scaling into infinity costs infinite $$$
→ The scaling configuration needs to be frugal
5. BLACK FRIDAY
Problem statement:
For each service S_i, find the minimum value of the scaling parameter k_i at which the service can handle the expected load L.
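One way to read the problem statement is as a search for the smallest k_i that still passes a capacity check. The sketch below is an illustration only, not the method used in the talk: it assumes that capacity grows monotonically with k, and the `can_handle` callback, the per-replica capacity of 250 RPS, and the load of 2,000 RPS are all hypothetical.

```python
from typing import Callable


def minimal_scaling(can_handle: Callable[[int], bool], k_max: int) -> int:
    """Return the smallest k in [1, k_max] for which can_handle(k) is True.

    Assumes monotonicity: if the service copes at k, it also copes at k + 1.
    """
    if not can_handle(k_max):
        raise ValueError("the service cannot handle the load even at k_max")
    lo, hi = 1, k_max
    while lo < hi:
        mid = (lo + hi) // 2
        if can_handle(mid):
            hi = mid          # mid is enough, try something smaller
        else:
            lo = mid + 1      # mid is too small
    return lo


# Hypothetical capacity model: every replica serves 250 RPS, expected load L = 2000 RPS.
replicas = minimal_scaling(lambda k: k * 250 >= 2000, k_max=64)
```

In practice `can_handle` would be an actual load test run against the service at scaling level k, which is far more expensive than a function call, so the number of probes matters.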
6. WHY END-TO-END?
7. WHY NOT MAKE IT THIS SIMPLE?
curl --get https://www.zalando.de/
8. WHY NOT MAKE IT THIS SIMPLE?
● Real customers perform different actions: browsing, filtering, checking out, …
● These actions are served by many different services
9. MICROSERVICES
Source: https://thenewstack.io/history-service-mesh/
10. END-TO-END APPROACH WITH MICROSERVICES
Definition of the task:
Generate a realistic, steadily increasing load to identify bottlenecks in the microservices.
11. AT WHAT SCALE?
12. BLACK FRIDAY 2017 NUMBERS
● 2,000 orders per minute (1,500 in 2016)
● 100,000 new customers
Source: https://corporate.zalando.com/en/newsroom/en/stories/zalando-celebrates-successful-black-friday-2017
13. BLACK FRIDAY 2017 NUMBERS
● 2,000 orders per minute (1,500 in 2016)
● 100,000 new customers
… and expectations for bigger numbers in 2018
14. HOW DO WE WANT TO ACHIEVE IT?
15. SIMULATE REAL USERS
16. SIMULATE REAL USERS
17. SIMULATE REAL USERS - WITH PUPPETEER?
18. SIMULATE REAL USERS - WITH PUPPETEER?
19. SIMULATE REAL USERS - WITH PUPPETEER?
$
$
20. SIMULATE REAL USERS… AND MAKE IT CHEAP
team → record once → my_browser_session
21. SIMULATE REAL USERS… AND MAKE IT CHEAP
my_browser_session → TRANSFORM → load test scenario
load test scenario → replay many times → Thin agent, Thin agent, ..., Thin agent
22. IMPLEMENTATION
23. RECORDING BY THE TEAM + REPLAYING BY THE LOAD TEST RUNNER
session → load test scenario → load test runner
24. RECORDING BY THE TEAM + REPLAYING BY LOCUST
session → locustfile.py
25. TRANSFORMER + LOCUST
session.har → locustfile.py
26. TRANSFORMER + ZELT + LOCUST
session.har → locustfile.py
27. TRANSFORMER + ZELT + LOCUST
session.har → locustfile.py
28. TRANSFORMER + ZELT + LOCUST + KUBERNETES
session.har → locustfile.py
29. CHOOSING THE TECHNOLOGY
Source: locust.io
30. CHOOSING THE TECHNOLOGY
Source: locust.io
31. CHOOSING THE TECHNOLOGY
Source: github.com/locustio/locust
33. HOW IT WORKS: 1. RECORDING THE USER BEHAVIOUR
● HAR - HTTP ARchive
● File extension: .har
● Format: JSON
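Because a HAR file is plain JSON, the recorded requests can be read with nothing but the standard library. The sketch below lists the method and URL of every recorded request; the HAR text is hand-written and heavily trimmed, and the `/example-category/` URL is a made-up placeholder.

```python
import json

# Hand-written, trimmed HAR document. Real recordings carry many more fields
# (timings, headers, cookies, response bodies); the second URL is hypothetical.
HAR_TEXT = """
{
  "log": {
    "version": "1.2",
    "creator": {"name": "example-browser", "version": "1.0"},
    "entries": [
      {"request": {"method": "GET", "url": "https://www.zalando.de/", "headers": []},
       "response": {"status": 200}},
      {"request": {"method": "GET", "url": "https://www.zalando.de/example-category/", "headers": []},
       "response": {"status": 200}}
    ]
  }
}
"""


def list_requests(har_text: str) -> list[tuple[str, str]]:
    """Return (method, url) for every entry, in recorded order."""
    har = json.loads(har_text)
    return [
        (entry["request"]["method"], entry["request"]["url"])
        for entry in har["log"]["entries"]
    ]


recorded = list_requests(HAR_TEXT)
```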
34. HOW IT WORKS: 1. RECORDING THE USER BEHAVIOUR
35. HOW IT WORKS: 2. TRANSFORMING HAR INTO LOCUSTFILE
my_browser_session.har → locustfile.py
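To make the HAR-to-locustfile step concrete, here is a deliberately naive sketch that turns recorded entries into the source text of a locustfile. It is not Transformer's actual implementation or output format: the function name `har_to_locustfile` and the class name `RecordedUser` are hypothetical, and the generated code targets the current Locust API (`HttpUser`, `@task`), which may differ from what was used at the time of the talk.

```python
from urllib.parse import urlsplit


def har_to_locustfile(har: dict) -> str:
    """Generate locustfile source that replays each recorded request in order."""
    lines = [
        "from locust import HttpUser, task",
        "",
        "",
        "class RecordedUser(HttpUser):",
        "    @task",
        "    def replay(self):",
    ]
    entries = har["log"]["entries"]
    for entry in entries:
        request = entry["request"]
        parts = urlsplit(request["url"])
        # Keep only path and query; the host is supplied to Locust separately.
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        lines.append(f"        self.client.request({request['method']!r}, {path!r})")
    if not entries:
        lines.append("        pass")
    return "\n".join(lines) + "\n"


# Hypothetical two-request session.
example_har = {
    "log": {
        "entries": [
            {"request": {"method": "GET", "url": "https://www.zalando.de/"}},
            {"request": {"method": "POST", "url": "https://www.zalando.de/api/example?x=1"}},
        ]
    }
}
generated_source = har_to_locustfile(example_har)
```

The generated text would be saved as locustfile.py and run with Locust, passing the target host on the command line. A real transformer also has to deal with headers, request bodies, and think time, which this sketch ignores.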
36. HOW IT WORKS: 2. TRANSFORMING SCENARIOS WITH WEIGHTS
browsing_items_scenario.har: 83%
checkout_scenario.har: 17%
37. HOW IT WORKS: 2. TRANSFORMING SCENARIOS WITH WEIGHTS
browsing_items_scenario.har + checkout_scenario.har → locustfile.py
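The weights decide how often each scenario is picked when a simulated user starts its next run. Locust supports this natively through the weight argument of its `@task` decorator; the standalone sketch below shows the same idea with the standard library only, using the 83% / 17% split from the slides. The function name `pick_scenario` is hypothetical.

```python
import random
from collections import Counter

# Weights taken from the slides: 83% browsing, 17% checkout.
SCENARIO_WEIGHTS = {"browsing_items_scenario": 83, "checkout_scenario": 17}


def pick_scenario(rng: random.Random) -> str:
    """Pick one scenario name with probability proportional to its weight."""
    names = list(SCENARIO_WEIGHTS)
    weights = list(SCENARIO_WEIGHTS.values())
    return rng.choices(names, weights=weights, k=1)[0]


# Draw many times to check that the observed mix approaches the configured one.
rng = random.Random(0)
counts = Counter(pick_scenario(rng) for _ in range(10_000))
browsing_share = counts["browsing_items_scenario"] / 10_000
```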
38. HOW IT WORKS: 3. EXECUTING THE END-TO-END LOAD TESTS
Input: number of RPS for each major service
→ Plan and record the scenarios
→ Output: HAR files with the recorded scenarios
39. HOW IT WORKS: 3. EXECUTING THE END-TO-END LOAD TESTS
● Plan and record the scenarios
● Announce the load test
40. HOW IT WORKS: 3. EXECUTING THE END-TO-END LOAD TESTS
● Plan and record the scenarios
● Announce the load test
● Execute the test, increasing the load
41. HOW IT WORKS: 3. EXECUTING THE END-TO-END LOAD TESTS
● Plan and record the scenarios
● Announce the load test
● Execute the test, increasing the load
● Identify the first component to go down
42. HOW IT WORKS: 3. EXECUTING THE END-TO-END LOAD TESTS
● Plan and record the scenarios
● Announce the load test
● Execute the test, increasing the load
● Identify the first component to go down
● Wait some time until the issue is addressed
43. HOW IT WORKS: 3. EXECUTING THE END-TO-END LOAD TESTS
● Plan and record the scenarios
● Announce the load test
● Execute the test, increasing the load
● Identify the first component to go down
● Wait some time until the issue is addressed
● Share the journal & the next load test date
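The "execute while increasing the load, then identify the first component to go down" part of the cycle can be sketched as a simple ramp loop. Everything here is a simulation under stated assumptions: the component names, their capacities, and the `simulated_probe` function are hypothetical stand-ins for real monitoring, and no actual traffic is generated.

```python
from typing import Callable, Iterable, Optional


def ramp_until_failure(
    load_steps: Iterable[int],
    failing_component: Callable[[int], Optional[str]],
) -> Optional[tuple[int, str]]:
    """Raise the load step by step; stop at the first step where a component fails.

    Returns (rps, component_name), or None if every step was survived.
    """
    for rps in load_steps:
        component = failing_component(rps)
        if component is not None:
            return rps, component
    return None


# Hypothetical capacities in RPS, standing in for real monitoring data.
CAPACITY_RPS = {"catalog": 9000, "cart": 4000, "checkout": 6000}


def simulated_probe(rps: int) -> Optional[str]:
    """Name the weakest overloaded component at this load, if any."""
    overloaded = [name for name, cap in CAPACITY_RPS.items() if rps > cap]
    return min(overloaded, key=CAPACITY_RPS.get) if overloaded else None


first_failure = ramp_until_failure(range(1000, 10001, 1000), simulated_probe)
```

Stopping at the first failure mirrors the process on the slides: the bottleneck is reported, the owning team addresses it, and the next run continues the ramp from there.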
44. SOME OBSTACLES DOWN THE ROAD
45. OBSTACLE 1: SECURITY SYSTEM BLOCKED US
46. OBSTACLE 1: SECURITY SYSTEM BLOCKED US
● An end-to-end load test is, in reality, a well-intentioned DoS attack
47. OBSTACLE 1: SECURITY SYSTEM BLOCKED US
● Solution: mark all requests coming from Zelt so that the security system can easily identify them
● Analytics, machine learning models, and A/B tests need to filter out Zelt traffic too!
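A common way to make synthetic traffic identifiable is a dedicated request header that both the security system and downstream consumers can check. The sketch below assumes this approach; the slides do not say how the requests were actually marked, and the header name `X-Load-Test` and its value are hypothetical.

```python
# Hypothetical marker; the real header name and value are not given in the slides.
LOAD_TEST_HEADER = "X-Load-Test"
LOAD_TEST_VALUE = "zelt"


def mark_as_load_test(headers: dict[str, str]) -> dict[str, str]:
    """Return a copy of the headers with the load-test marker added."""
    marked = dict(headers)
    marked[LOAD_TEST_HEADER] = LOAD_TEST_VALUE
    return marked


def is_load_test(headers: dict[str, str]) -> bool:
    """Check for the marker; HTTP header names are case-insensitive."""
    return any(
        name.lower() == LOAD_TEST_HEADER.lower() and value == LOAD_TEST_VALUE
        for name, value in headers.items()
    )


outgoing = mark_as_load_test({"Accept": "text/html"})
```

The same predicate can be reused on the consuming side, for example to drop load-test events before they reach analytics or an A/B test evaluation.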
48. OBSTACLE 2: COOKIES RECORDED IN THE HAR FILE ARE NOT VALID WHEN REPLAYING
● Solution: don’t process the cookies as recorded. Instead, let the cookies be set by the response headers during replay
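In HAR terms, this means dropping the recorded `Cookie` request header and the recorded `cookies` array before replay, and relying on a cookie-aware HTTP client (such as `requests.Session`, which Locust's HTTP client builds on) to store whatever `Set-Cookie` headers come back. A minimal sketch, with a hypothetical function name and made-up cookie values:

```python
import copy


def strip_recorded_cookies(request: dict) -> dict:
    """Return a copy of a HAR request without its recorded cookies."""
    cleaned = copy.deepcopy(request)
    cleaned["headers"] = [
        header
        for header in cleaned.get("headers", [])
        if header["name"].lower() != "cookie"
    ]
    cleaned["cookies"] = []
    return cleaned


# Hypothetical recorded request carrying a stale session cookie.
recorded_request = {
    "method": "GET",
    "url": "https://www.zalando.de/",
    "headers": [
        {"name": "Accept", "value": "text/html"},
        {"name": "Cookie", "value": "session=stale-value"},
    ],
    "cookies": [{"name": "session", "value": "stale-value"}],
}
replayable_request = strip_recorded_cookies(recorded_request)
```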
49. OBSTACLE 3: WE CAN’T KEEP USING THE SAME TEST CUSTOMER ACCOUNT
● Solution: override the customer credentials in the registration/login step with test accounts
● Parameterize the scenarios: for each execution, choose a random account from a defined set
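Parameterizing the login step can be as small as swapping the recorded credentials for a randomly chosen test account on every execution. The sketch below assumes a login payload with `email` and `password` fields; those field names, the account list, and the function name are all hypothetical.

```python
import random

# Hypothetical pool of pre-created test accounts.
TEST_ACCOUNTS = [
    {"email": "load-test-1@example.com", "password": "secret-1"},
    {"email": "load-test-2@example.com", "password": "secret-2"},
    {"email": "load-test-3@example.com", "password": "secret-3"},
]


def override_credentials(login_payload: dict, rng: random.Random) -> dict:
    """Return a copy of the payload with credentials of a random test account."""
    account = rng.choice(TEST_ACCOUNTS)
    return {**login_payload, "email": account["email"], "password": account["password"]}


# The recorded payload holds whatever account was used during recording.
recorded_payload = {"email": "recorded@example.com", "password": "recorded", "remember": True}
replay_payload = override_credentials(recorded_payload, random.Random())
```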
50. OBSTACLE 4: WE ONLY WANT TO TARGET ZALANDO, NOT GOOGLE ANALYTICS ENDPOINTS
● Solution: provide a blacklisting mechanism for automatic filtering of the recorded requests
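Filtering by host name is one simple form such a blacklist could take. The sketch below is an assumption about the mechanism, not a description of the actual tool: the blacklist contents and the function name `drop_blacklisted` are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical blacklist of third-party hosts that must not receive load.
BLACKLISTED_HOSTS = {"www.google-analytics.com"}


def drop_blacklisted(entries: list[dict], blacklist: set[str] = BLACKLISTED_HOSTS) -> list[dict]:
    """Keep only the HAR entries whose request host is not blacklisted."""
    return [
        entry
        for entry in entries
        if urlsplit(entry["request"]["url"]).hostname not in blacklist
    ]


recorded_entries = [
    {"request": {"method": "GET", "url": "https://www.zalando.de/"}},
    {"request": {"method": "GET", "url": "https://www.google-analytics.com/collect?v=1"}},
]
kept_entries = drop_blacklisted(recorded_entries)
```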
51. OBSTACLE 5: MORE AND MORE ZALANDO-SPECIFIC MECHANISMS NEED TO BE ADDRESSED
● Solution: introduce a system of plugins for Transformer
● Implement each Zalando-specific solution as a plugin
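The plugin idea can be illustrated as a chain of small functions, each taking a request and returning a modified copy. This is a generic sketch and not Transformer's real plugin interface; the type aliases, the `apply_plugins` helper, and both example plugins (including the `X-Load-Test` header) are hypothetical.

```python
from typing import Callable

Request = dict
Plugin = Callable[[Request], Request]


def apply_plugins(request: Request, plugins: list[Plugin]) -> Request:
    """Run the request through every plugin, in order."""
    for plugin in plugins:
        request = plugin(request)
    return request


def add_marker_header(request: Request) -> Request:
    """Example plugin: tag the request as load-test traffic (hypothetical header)."""
    headers = {**request.get("headers", {}), "X-Load-Test": "zelt"}
    return {**request, "headers": headers}


def drop_cookie_header(request: Request) -> Request:
    """Example plugin: remove the recorded Cookie header."""
    headers = {k: v for k, v in request.get("headers", {}).items() if k.lower() != "cookie"}
    return {**request, "headers": headers}


original = {"method": "GET", "url": "https://www.zalando.de/", "headers": {"Cookie": "session=stale"}}
transformed = apply_plugins(original, [drop_cookie_header, add_marker_header])
```

Keeping each company-specific rule in its own plugin leaves the core transformation generic, which fits the later decision to publish the tools as open source.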
52. HOW DID WE DO?
53. FINAL CONFIGURATION
● 5 stacks
● 300 Locust workers per stack
● 130,000 RPS
54. OFFICIAL RESULTS OF THE BLACK FRIDAY CAMPAIGN
Source: corporate.zalando.com
55. ONE MORE THING...
github.com/zalando-incubator/transformer
github.com/zalando-incubator/zelt
56. OLIWIA ZAREMBA
SOFTWARE ENGINEER
oliwia.zaremba@zalando.de
twitter.com/tortilato