Unified Feature Store
1. Unified Feature Store
@ eBay
Yucai Yu
eBay AI Architect
2.
3. Agenda
• Feature Pain Points in Production AI
• Unified Feature Store @ eBay
• Demo
• Landing Use Cases
4. AI Project Life Cycle
Applied machine learning is basically feature engineering
- Andrew Ng
5. Major Pain Points in Feature
Long TTM (Time-to-Market)
- Lack of point-in-time online/offline feature parity
- Cross-team communication is costly
- Lack of self-service and automation tools for feature life-cycle management and integration
Feature discovery & reuse
- Need feature catalog for discovery and exploration
- Need feature data lineage information
- Lack of unified loading key management across all checkpoints
Data quality monitoring & alerting
- Need active monitoring & alerting on feature data validation and data shift
6. Agenda
• Feature Pain Points in Production AI
• Unified Feature Store @ eBay
• Demo
• Landing Use Cases
7. Architecture Visions & Capabilities
• Enable self-service feature engineering with governance
• Leverage a unified DSL to define features for both online and offline
• Manage the whole feature life cycle
• Support seamless fast feature deployment & release
• Enable online/offline data parity, support feature point-in-time & backfill
• Provide a feature catalog for discovery and reuse across eBay
8. Taichi Feature Store
• Feature versioning
• Feature lifecycle management
• Feature consistency
• Feature discovery and reuse
• Online/offline feature parity
• Fast feature backfill
• Feature access control
• Feature documentation & analysis
9. Feature Store in Industry
• Uber: Michelangelo Palette
• Airbnb: Zipline, a declarative feature engineering framework
• Google: Vertex Feature Store
• AWS: SageMaker Feature Store
• Databricks: https://www.databricks.com/product/feature-store
• Feast/Tecton: https://github.com/feast-dev/feast
• OpenMLDB: https://github.com/4paradigm/OpenMLDB
• Feathr: https://github.com/linkedin/feathr
10. Unified Feature Store @ eBay
• Architecture
• Feature Management
• Feature Engineering & Parity
• Feature Serving & Reuse
11. Architecture
12. Unified Feature Store @ eBay
• Architecture
• Feature Management
• Feature Engineering & Parity
• Feature Serving & Reuse
13. Feature Organization
eBay-specific curated and crowd-sourced features
Organized as
{domain}:{feature type}:{feature group name}#{feature template name}{^args (optional)}:{key name(s)}
"risk:nrt:fg1#page_view_duration_userid:Buyer.Id"
"ads:batch:PromotedList#page_view_duration_userid:Buyer.Id"
14. Key Management
Manage all the loading keys
Each key has a unique name and belongs to a key dimension
Provide catalog for key discovery and life-cycle management
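The two rules on this slide (unique key names, each key tied to a dimension) could be enforced by a small registry. This is only a sketch under assumptions: `LoadingKey`, `KeyCatalog` and the dimension name `user` are hypothetical and do not come from the talk.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class LoadingKey:
    name: str        # unique key name, e.g. "Buyer.Id"
    dimension: str   # key dimension the key belongs to (hypothetical values)


class KeyCatalog:
    """In-memory catalog that rejects duplicate key names."""

    def __init__(self) -> None:
        self._keys: Dict[str, LoadingKey] = {}

    def register(self, name: str, dimension: str) -> LoadingKey:
        if name in self._keys:
            raise ValueError(f"loading key already registered: {name}")
        key = LoadingKey(name, dimension)
        self._keys[name] = key
        return key

    def by_dimension(self, dimension: str) -> List[str]:
        """Discover all key names in one dimension, sorted by name."""
        return sorted(k.name for k in self._keys.values() if k.dimension == dimension)


catalog = KeyCatalog()
catalog.register("Buyer.Id", "user")    # "user" is an assumed dimension name
catalog.register("Seller.Id", "user")
print(catalog.by_dimension("user"))
```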
15. Feature Definition
• Strong data types
• UDF extension
• Interoperable with Java
• High execution efficiency
• Static validation and dependency extraction
16. Feature Definition
17. Feature Life Cycle Management
Centralized configuration and metadata-driven design for feature lifecycle management
18. Unified Feature Store @ eBay
• Architecture
• Feature Management
• Feature Engineering & Parity
• Feature Serving & Reuse
19. Batch Feature
20. NRT (Near-real-time) Feature
21. NRT (Near-real-time) Feature
Generic pipeline with built-in support for roll-up variable types:
• Sliding Window
• LastK
• Time Decay
• Event Sequencing
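To make the roll-up types concrete, here is a minimal batch-style sketch of three of them (sliding window, LastK, time decay) over a list of timestamped events. It is an illustration only: the function names, the `(time, value)` event shape and the half-life form of the decay are assumptions, and a real NRT pipeline would compute these incrementally on a stream.

```python
from collections import deque
from typing import Deque, List, Tuple

Event = Tuple[float, float]  # (event time in seconds, value), an assumed shape


def sliding_window_sum(events: List[Event], now: float, window: float) -> float:
    """Sum of values whose event time falls in (now - window, now]."""
    return sum(v for t, v in events if now - window < t <= now)


def last_k(events: List[Event], k: int) -> List[float]:
    """Values of the k most recent events, oldest first."""
    buf: Deque[float] = deque(maxlen=k)
    for _, v in sorted(events):
        buf.append(v)
    return list(buf)


def time_decayed_sum(events: List[Event], now: float, half_life: float) -> float:
    """Sum where each value is weighted by 0.5 ** (age / half_life)."""
    return sum(v * 0.5 ** ((now - t) / half_life) for t, v in events if t <= now)


events = [(0.0, 10.0), (3600.0, 20.0), (7200.0, 30.0)]
window_sum = sliding_window_sum(events, now=7200.0, window=6 * 3600.0)
recent = last_k(events, k=2)
decayed = time_decayed_sum(events, now=7200.0, half_life=3600.0)
print(window_sum, recent, decayed)
```

A feature such as `sum_lstg_price_by_slr_sw_6h` reads like a six-hour sliding-window sum, although the slides do not spell out its definition.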
22. NRT Feature - Write & Read
23. MDC Strategy
24. On-the-fly Derived Feature
25. Unified Feature Store @ eBay
• Architecture
• Feature Management
• Feature Engineering & Parity
• Feature Serving & Reuse
26. Online Serving
• Feature in production: reliability, scale, low latency
• Complex data type support
• Schema backward compatibility
27. Feature Reuse & Isolation
28. Offline Training Set
Features
Driver Set
• "risk:otf:fg1#connected_bad_slr:Buyer.Id"
• "risk:nrt:fg1#sum_lstg_price_by_slr_sw_6h:Seller.Id"
• "risk:batch:fg2#trust_mgid_txn_amt:Buyer.Id"
29. Offline Training with Feature Store
30. Agenda
• Feature Pain Points in Production AI
• Unified Feature Store @ eBay
• Demo
• Landing Use Cases
31. Feature Catalog
32. Feature Onboarding
33. Feature Point-in-time Simulation
34. Feature Release & Online Serving
35. Agenda
• Feature Pain Points in Production AI
• Unified Feature Store @ eBay
• Demo
• Landing Use Cases
36. Landing Use Cases
Domains:
• Ads, Buyer Experience, Marketing
• Knowledge Hub, Search
• Risk, Shipping, Customer Support
Use Cases:
• Promotion Display
• Guidance
• Merch
• Home Page Personalization
• Sneaker Size
• Image Embedding
• …
37.
38. Offline Feature Store
○ Point-in-time features are not the current features
39. Batch Feature Point-in-time
• Built on top of HBase/HDFS
• Assuming a transaction happened at 20211213 00:00:59,
look up the key "ship_addr_cb_txn_var:1.0#7653|20211212 01:10:10".
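The lookup in this example can be sketched as an "as of" search: for a given event time, return the most recent stored value whose timestamp is not later than the event. The sketch below is not the HBase/HDFS implementation; it uses an in-memory dictionary, assumes the key reads as `{feature}:{version}#{entity}|{snapshot timestamp}`, and uses made-up feature values. `PointInTimeStore` and its methods are hypothetical names.

```python
from bisect import bisect_right
from typing import Dict, List, Optional, Tuple

Row = Tuple[str, float]  # (snapshot timestamp "YYYYMMDD HH:MM:SS", value)


class PointInTimeStore:
    """In-memory stand-in for a timestamp-suffixed key-value store."""

    def __init__(self) -> None:
        self._rows: Dict[str, List[Row]] = {}

    def put(self, feature: str, entity: str, ts: str, value: float) -> None:
        rows = self._rows.setdefault(f"{feature}#{entity}", [])
        rows.append((ts, value))
        rows.sort()  # fixed-width timestamps sort chronologically as strings

    def get_as_of(self, feature: str, entity: str, event_ts: str) -> Optional[Row]:
        """Latest row with snapshot timestamp <= event_ts, or None."""
        rows = self._rows.get(f"{feature}#{entity}", [])
        idx = bisect_right(rows, (event_ts, float("inf")))
        return rows[idx - 1] if idx else None


store = PointInTimeStore()
# Values 3.0 and 5.0 are invented for illustration.
store.put("ship_addr_cb_txn_var:1.0", "7653", "20211212 01:10:10", 3.0)
store.put("ship_addr_cb_txn_var:1.0", "7653", "20211213 01:10:10", 5.0)

hit = store.get_as_of("ship_addr_cb_txn_var:1.0", "7653", "20211213 00:00:59")
print(hit)
```

With these two snapshots, a transaction at 20211213 00:00:59 resolves to the 20211212 01:10:10 snapshot rather than the later one, which is the behaviour that keeps offline training data free of values from the future.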
40.
41. NRT Feature Point-in-time