COPING WITH THE CHALLENGE OF SORTING LARGE PRODUCT CATALOGS
1. COPING WITH THE CHALLENGE OF SORTING LARGE PRODUCT CATALOGS
ONLINE - SHOP WINDOW ARRANGEMENT
CAGDAS SENOL
19-06-2019
2. TABLE OF CONTENTS
● Challenges
● Data Driven Sorting
● Three Improvements for faster iteration
● One Open Source Contribution
● Status Quo
● Q&A
3. ZALANDO AT A GLANCE
● ~ 5.4 billion EUR revenue 2018
● > 300 million visits per month
● > 15,500 employees in Europe
● > 80% of visits via mobile devices
● > 27 million active customers
● > 400,000 product choices
● ~ 2,000 brands
● 17 countries
4. DISCLAIMER
5. Window Dressing
6. Window dresser
“Window dressers arrange displays of goods in shop windows or within a shop itself. Such displays are themselves known as "window dressing". They may work for design companies contracted to work for clients or for department stores, independent retailers, airport or hotel shops.”
7. DATA DRIVEN SORTING
8. CHALLENGES
[Figure; surviving labels: 400k, 15, 4k ups, 17]
9.
10. Fail fast, iterate faster
11. ITERATE FAST
DESIGN
A/B TESTS
ANALYSIS
12. Three Improvements
● Steering
● Fast Index Updates
● Sorting with functions
13. First Improvement: Sort Steering
14. Sort Steering - SQL Analogy
Id    Bucket  Popularity
sku1  1       0.2332332
sku2  2       0.123233
sku3  1       0.4533
SELECT * FROM articles ORDER BY Bucket DESC, Popularity DESC
Result order: sku2, sku3, sku1
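The ordering in the SQL analogy can be reproduced in a few lines of Python. This is only a sketch of the sort semantics (bucket first, popularity as tie-breaker), not the production implementation; the `Article` type and `steer_sort` helper are names made up for illustration.

```python
from typing import List, NamedTuple


class Article(NamedTuple):
    """Hypothetical in-memory stand-in for one row of the slide's table."""
    sku: str
    bucket: int
    popularity: float


# The three rows shown on the slide.
ARTICLES = [
    Article("sku1", 1, 0.2332332),
    Article("sku2", 2, 0.123233),
    Article("sku3", 1, 0.4533),
]


def steer_sort(articles: List[Article]) -> List[Article]:
    """Equivalent of ORDER BY Bucket DESC, Popularity DESC."""
    # Tuples compare element by element, so bucket dominates and
    # popularity only breaks ties within the same bucket.
    return sorted(articles, key=lambda a: (a.bucket, a.popularity), reverse=True)


ranked = [a.sku for a in steer_sort(ARTICLES)]
print(ranked)  # ['sku2', 'sku3', 'sku1']
```

The bucket acts as a coarse steering handle: moving an article to a higher bucket lifts it above every article in lower buckets regardless of popularity.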
15. Sort Steering - SQL Analogy
Id    Bucket  Popularity  Popularity_male
sku1  1       0.2332332   0.4
sku2  2       0.123233    0.6
sku3  1       0.4533      0.1
If category_gender == "men"
    SELECT * FROM articles ORDER BY Bucket DESC, Popularity_male DESC
Else
    SELECT * FROM articles ORDER BY Bucket DESC, Popularity DESC
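Against Elasticsearch, the same steering decision amounts to choosing which field goes into the `sort` clause of the search request. The sketch below only builds the request body as a Python dict; the lower-case field names (`bucket`, `popularity`, `popularity_male`) and the helper names are assumptions for illustration, and the talk does not show the actual mapping or service code.

```python
from typing import Any, Dict, List


def build_sort(category_gender: str) -> List[Dict[str, Any]]:
    """Pick the popularity field for this request (assumed field names)."""
    popularity_field = "popularity_male" if category_gender == "men" else "popularity"
    return [
        {"bucket": {"order": "desc"}},          # steering bucket first
        {popularity_field: {"order": "desc"}},  # then the chosen popularity score
    ]


def build_search_body(category_gender: str) -> Dict[str, Any]:
    """Minimal search body; a real query would also filter by category."""
    return {"query": {"match_all": {}}, "sort": build_sort(category_gender)}


men_body = build_search_body("men")
default_body = build_search_body("women")
print(men_body["sort"])
print(default_body["sort"])
```

Because the choice is made per request, two variants of an A/B test can be served from the same index by sending different sort clauses.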
16. Pre-Sort Steering Architecture
17. Sort Steering Added
18. Sort Steering - SQL Analogy
Id    Bucket  Popularity  Popularity_male
sku1  1       0.2332332   0.4
sku2  2       0.123233    0.6
sku3  1       0.4533      0.1
If category_gender == "men"
    SELECT * FROM articles ORDER BY Bucket DESC, Popularity_male DESC
Else
    SELECT * FROM articles ORDER BY Bucket DESC, Popularity DESC
19.
20. 2nd Improvement: Decoupled Data Ingestion
21. Indexing - SQL Analogy
Id    Price  Stock  Size  Partner  Performance  Performance_new_formula
sku1  9.99   100    32    false    0.5          0.4
sku1  9.99   100    32    false    0.3          0.6
INSERT INTO articles VALUES('sku1', 9.99, 100, 32, false, 0.5, 0.4)
INSERT INTO articles VALUES('sku1', 9.99, 100, 32, false, 0.3, 0.6)
22. Intake Architecture
23. Indexing - SQL Analogy
Id    Price  Stock  Size  Partner
sku1  9.99   100    32    false

Id    Performance  Performance_new_formula
sku1  0.5          0.4
sku1  0.3          0.6

JOINS => Elasticsearch

Id    Price  Stock  Size  Partner  Performance  Performance_new_formula
sku1  9.99   100    32    false    0.3          0.6
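The join above can be read as two independent update streams that merge into one document per article. The following sketch simulates that with a plain dict standing in for the index; `apply_partial_update` is a hypothetical helper, and the lower-case field names are assumptions. In Elasticsearch itself, a partial `doc` sent through the Update API merges into the stored document in a comparable way, though the talk does not state which mechanism is used.

```python
from typing import Any, Dict

# Hypothetical in-memory stand-in for the search index: id -> document.
index: Dict[str, Dict[str, Any]] = {}


def apply_partial_update(idx: Dict[str, Dict[str, Any]], doc_id: str,
                         fields: Dict[str, Any]) -> None:
    """Merge `fields` into the stored document, creating it if missing."""
    idx.setdefault(doc_id, {}).update(fields)


# Stream 1: article master data (price, stock, size, partner flag).
apply_partial_update(index, "sku1",
                     {"price": 9.99, "stock": 100, "size": 32, "partner": False})

# Stream 2: performance scores, arriving independently and more frequently.
apply_partial_update(index, "sku1",
                     {"performance": 0.5, "performance_new_formula": 0.4})
apply_partial_update(index, "sku1",
                     {"performance": 0.3, "performance_new_formula": 0.6})

print(index["sku1"])
```

The second performance update overwrites only the two score fields; the article data is neither re-sent nor touched, which is the point of decoupling the two streams.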
24. Intake Architecture Now
25. 3rd Improvement: Sorting with Functions
26. Painless Scripts
27. Sorting with Functions - Eliminate Reindexing
Id    Price  Stock  Size  Partner  clicks  sales
sku1  9.99   100    32    false    10000   300
SELECT * FROM articles ORDER BY popularity(sales, clicks)
popularity(sales, clicks) = sales / clicks
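A script-based sort is one way to evaluate such a function at query time instead of reindexing a precomputed score. The sketch below builds an Elasticsearch `_script` sort clause with a Painless expression and keeps a Python reference of the same formula for comparison. The field names `sales` and `clicks` come from the slide; the zero-click guard, the cast to `double` (to avoid integer division on numeric doc values), and the helper names are assumptions rather than the script used in production.

```python
from typing import Any, Dict


def popularity(sales: float, clicks: float) -> float:
    """Python reference of the slide's formula, guarding against zero clicks."""
    return sales / clicks if clicks else 0.0


# Painless source (assumed): same formula, evaluated per document at query time.
POPULARITY_SCRIPT = (
    "doc['clicks'].value == 0 ? 0.0 : "
    "(double) doc['sales'].value / doc['clicks'].value"
)


def build_script_sort() -> Dict[str, Any]:
    """Sort clause that orders hits by the computed popularity, descending."""
    return {
        "sort": [
            {
                "_script": {
                    "type": "number",
                    "script": {"lang": "painless", "source": POPULARITY_SCRIPT},
                    "order": "desc",
                }
            }
        ]
    }


script_sort = build_script_sort()
sku1_popularity = popularity(sales=300, clicks=10000)
print(sku1_popularity)
```

Changing the formula then means changing the script string, not rewriting every document in the index.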
28. Sorting with Functions - Personalization
Id    Price  Stock  Size  Partner  popularity  article_features
sku1  9.99   100    32    false    1.2         [9.99, 100, 32]
If known_customer :
    SELECT * FROM articles ORDER BY dot_product(article_features, customer_features)
Else
    SELECT * FROM articles ORDER BY popularity
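The personalization rule can be sketched in Python as follows. `sku1` is the row from the slide; `sku2` and the customer vector are invented purely so the example has something to rank, and `rank` is a hypothetical helper, not the deployed scoring code.

```python
from typing import Any, Dict, List, Optional, Sequence


def dot_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain dot product; both vectors must have the same length."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have equal length")
    return sum(x * y for x, y in zip(a, b))


def rank(articles: List[Dict[str, Any]],
         customer_features: Optional[Sequence[float]] = None) -> List[str]:
    """Known customer: sort by dot product. Otherwise: sort by popularity."""
    if customer_features is not None:
        def key(a):
            return dot_product(a["article_features"], customer_features)
    else:
        def key(a):
            return a["popularity"]
    return [a["id"] for a in sorted(articles, key=key, reverse=True)]


ARTICLES = [
    # Row from the slide.
    {"id": "sku1", "popularity": 1.2, "article_features": [9.99, 100, 32]},
    # Hypothetical second article, added only for illustration.
    {"id": "sku2", "popularity": 0.8, "article_features": [49.99, 5, 40]},
]

anonymous_ranking = rank(ARTICLES)
# Hypothetical customer who only weighs the third feature.
personalized_ranking = rank(ARTICLES, customer_features=[0.0, 0.0, 1.0])
print(anonymous_ranking, personalized_ranking)
```

With these made-up numbers the anonymous ranking puts `sku1` first (popularity 1.2 vs 0.8), while the personalized ranking puts `sku2` first (score 40 vs 32).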
29. Sorting with Functions - Fulltext Search
Id    Price  Stock  Size  Partner  clicks  sales
sku1  9.99   100    32    false    10000   300
If fulltext_search :
    SELECT * FROM articles ORDER BY f(relevance_score, clicks, sales, customer_features, article_features)
Else
    SELECT * FROM articles ORDER BY g(clicks, bucket, sales, customer_features)
30. EXAMPLE SORTING RULES
31. Personalization
32. Query Relevance
33. Inline Popularity Calculation
34. Open Source Contribution
35. CODE IN CONFIG ????????
36. TGYHT - Thank God You Have Tests
37. Make Painless Script Development Painless
• Painless Lacks Tooling
• Elasticsearch Painless Execute API
• https://www.elastic.co/guide/en/elasticsearch/painless/current/painless-execute-api.html
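To give a feel for how the Painless Execute API helps with testing, the sketch below builds a request body for `POST /_scripts/painless/_execute` and pairs it with a Python reference value to compare the response against. The script source, the parameter names, and the `build_execute_request` helper are assumptions for illustration; no request is actually sent, so the example runs without a cluster.

```python
import json
from typing import Any, Dict


def build_execute_request(source: str, params: Dict[str, Any]) -> Dict[str, Any]:
    """Body for the Painless Execute API in its default test context."""
    return {"script": {"source": source, "params": params}}


# Assumed script under test: the popularity formula, reading from params.
# The cast avoids integer division when both params are integers.
SOURCE = "(double) params.sales / params.clicks"
PARAMS = {"sales": 300, "clicks": 10000}

body = build_execute_request(SOURCE, PARAMS)

# Reference value computed in Python; a test would compare it with the
# "result" field returned by Elasticsearch for this request.
expected = PARAMS["sales"] / PARAMS["clicks"]

print("POST /_scripts/painless/_execute")
print(json.dumps(body, indent=2))
```

Wrapping such request/expectation pairs in an ordinary test runner is one way to get script tests into a CI/CD pipeline.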
38. Painless Scripts Development Tool
• https://github.com/csenol/plsd
• Integrated with CI/CD Pipelines
39. Painless Script Development Environment
40. Painless Script Performance Tests
41. TESTING PAINLESS SCRIPTS / CI-CD INTEGRATION
42. Sum up
● Sort Steering => A/B tests
● Decoupled Indexing => Data Enrichment
● Sorting With Functions => Faster Implementation + Personalization
43. Notable window dressers
● Giorgio Armani, the fashion designer, once worked as a window dresser. [1]
44. Cagdas Senol
cagdassenol@gmail.com
45. Q&A