Unified Feature Store
        
                1. Unified Feature Store
@ eBay
Yucai Yu
eBay AI Architect            
                        
                2.             
                        
                3. Agenda
• Feature Pain Points in Production AI
• Unified Feature Store @ eBay
• Demo
• Landing Use Cases            
                        
                4. AI Project Life Cycle
Applied machine learning is basically feature engineering
- Andrew Ng            
                        
                5. Major Pain Points in Feature
Long TTM (Time-to-Market)
- Lack of point-in-time online/offline feature parity
- Cross-team communication is costly
- Lack of self-service and automation tools for feature life-cycle management and integration
Feature discovery & reuse
- Need feature catalog for discovery and exploration
- Need feature data lineage information
- Lack of unified loading key management across all checkpoints
Data quality monitoring & alerting
- Need active monitoring & alerting on feature data validation and data shift            
                        
                6. Agenda
• Feature Pain Points in Production AI
• Unified Feature Store @ eBay
• Demo
• Landing Use Cases            
                        
                7. Architecture Visions & Capabilities
• Enable self service for feature engineering with governance
• Leverage unified DSL to define features for both online and offline
• Manage the whole feature life cycle
• Support seamless fast feature deployment & release
• Enable online/offline data parity, support feature point-in-time & backfill
• Provide a feature catalog for discovery and reuse across eBay
                        
                8. Taichi Feature Store
• Feature versioning
• Online/offline feature parity
• Feature lifecycle management
• Fast feature backfill
• Feature consistency
• Feature access control
• Feature discovery and reuse
• Feature documentation & analysis
                        
                9. Feature Store in Industry
• Uber: Michelangelo Palette
• Airbnb: Zipline, a declarative feature engineering framework
• Google: Vertex Feature Store
• AWS: SageMaker Feature Store
• Databricks: https://www.databricks.com/product/feature-store
• Feast/Tecton: https://github.com/feast-dev/feast
• OpenMLDB: https://github.com/4paradigm/OpenMLDB
• Feathr: https://github.com/linkedin/feathr            
                        
                10. Unified Feature Store @ eBay
• Architecture
• Feature Management
• Feature Engineering & Parity
• Feature Serving & Reuse            
                        
                11. Architecture            
                        
                12. Unified Feature Store @ eBay
• Architecture
• Feature Management
• Feature Engineering & Parity
• Feature Serving & Reuse            
                        
                13. Feature Organization
eBay-specific curated and crowd-sourced features
Organized as
{domain}:{feature type}:{feature group name}#{feature template name}{^args (optional)}:{key name(s)}
"risk:nrt:fg1#page_view_duration_userid:Buyer.Id"
"ads:batch:PromotedList#page_view_duration_userid:Buyer.Id"            
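The naming scheme above can be split into its parts mechanically. The following is a minimal sketch, not eBay's implementation: the `FeatureName` class, the `parse_feature_name` function, and the regular expression are hypothetical, and because the slide does not show how optional args or multiple key names are delimited internally, both are kept as raw strings.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed grammar, read directly off the slide:
# {domain}:{feature type}:{feature group name}#{feature template name}{^args}:{key name(s)}
_FEATURE_NAME = re.compile(
    r"^(?P<domain>[^:#^]+)"
    r":(?P<feature_type>[^:#^]+)"
    r":(?P<group>[^:#^]+)"
    r"#(?P<template>[^:#^]+)"
    r"(?:\^(?P<args>[^:#]+))?"   # optional "^args" part
    r":(?P<keys>.+)$"
)


@dataclass(frozen=True)
class FeatureName:
    domain: str
    feature_type: str
    group: str
    template: str
    args: Optional[str]   # None when the name carries no "^args" part
    keys: str             # left unsplit: the key separator is not specified


def parse_feature_name(name: str) -> FeatureName:
    """Split a fully qualified feature name into its components."""
    match = _FEATURE_NAME.match(name)
    if match is None:
        raise ValueError(f"not a valid feature name: {name!r}")
    return FeatureName(**match.groupdict())


parsed = parse_feature_name("risk:nrt:fg1#page_view_duration_userid:Buyer.Id")
print(parsed.domain, parsed.feature_type, parsed.group, parsed.template, parsed.keys)
```

Keeping the domain and feature type as the leading components means a catalog can filter by prefix, for example all `risk:nrt:` features.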
                        
                14. Key Management
Manage all the loading keys
Each key has a unique name and belongs to a key dimension
Provide catalog for key discovery and life-cycle management            
                        
                15. Feature Definition
• Strong data types
• UDF extension
• Interoperable with Java
• High execution efficiency
• Static validation and dependency extraction
                        
                16. Feature Definition            
                        
                17. Feature Life Cycle Management
Centralized configuration and metadata-driven design
for feature lifecycle management
                        
                18. Unified Feature Store @ eBay
• Architecture
• Feature Management
• Feature Engineering & Parity
• Feature Serving & Reuse            
                        
                19. Batch Feature            
                        
                20. NRT (Near-real-time) Feature
                        
                21. NRT (Near-real-time) Feature
Generic pipeline with built-in roll-up variable types support
• Sliding Window
• LastK
• Time Decay
• Event Sequencing            
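To make the four roll-up types concrete, here is a minimal in-memory sketch of what each one computes. It is not the eBay pipeline: the function names, the `(event time, value)` event layout, the sum aggregation, and the half-life form of the decay are all assumptions chosen for illustration. A streaming implementation would maintain these incrementally rather than rescanning the events.

```python
from collections import deque
from typing import Deque, Iterable, List, Tuple

Event = Tuple[float, float]       # (event time in seconds, numeric value)
NamedEvent = Tuple[float, str]    # (event time in seconds, event name)


def sliding_window_sum(events: Iterable[Event], now: float, window_s: float) -> float:
    """Sliding Window: sum of values with event time in (now - window_s, now]."""
    return sum(v for t, v in events if now - window_s < t <= now)


def last_k(events: Iterable[Event], now: float, k: int) -> List[float]:
    """LastK: values of the k most recent events at or before `now`, oldest first."""
    recent: Deque[float] = deque(maxlen=k)
    for t, v in sorted(events):
        if t <= now:
            recent.append(v)
    return list(recent)


def time_decayed_sum(events: Iterable[Event], now: float, half_life_s: float) -> float:
    """Time Decay: each value is weighted by 0.5 ** (age / half_life_s)."""
    return sum(v * 0.5 ** ((now - t) / half_life_s) for t, v in events if t <= now)


def event_sequence(events: Iterable[NamedEvent], now: float) -> List[str]:
    """Event Sequencing: event names at or before `now`, in time order."""
    return [name for t, name in sorted(events) if t <= now]


# Three events, one hour apart (hypothetical data).
prices = [(0.0, 10.0), (3600.0, 20.0), (7200.0, 30.0)]
print(sliding_window_sum(prices, now=7200.0, window_s=6 * 3600.0))  # 60.0
print(last_k(prices, now=7200.0, k=2))                              # [20.0, 30.0]
print(time_decayed_sum(prices, now=7200.0, half_life_s=3600.0))     # 42.5
```

Passing `now` explicitly, instead of reading the wall clock, lets the same functions be evaluated as of any past moment, which is what point-in-time simulation needs.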
                        
                22. NRT Feature - Write & Read            
                        
                23. MDC Strategy            
                        
                24. On-the-fly Derived Feature            
                        
                25. Unified Feature Store @ eBay
• Architecture
• Feature Management
• Feature Engineering & Parity
• Feature Serving & Reuse            
                        
                26. Online Serving
• Feature in production: reliability, scale, low latency
• Complex data type support
• Schema backward compatibility
                        
                27. Feature Reuse & Isolation            
                        
                28. Offline Training Set
Features
Driver Set
• “risk:otf:fg1#connected_bad_slr:Buyer.Id”
• “risk:nrt:fg1#sum_lstg_price_by_slr_sw_6h:Seller.Id”
• “risk:batch:fg2#trust_mgid_txn_amt:Buyer.Id”            
                        
                29. Offline Training with Feature Store            
                        
                30. Agenda
• Feature Pain Points in Production AI
• Unified Feature Store @ eBay
• Demo
• Landing Use Cases            
                        
                31. Feature Catalog            
                        
                32. Feature Onboarding            
                        
                33. Feature Point-in-time Simulation            
                        
                34. Feature Release & Online Serving            
                        
                35. Agenda
• Feature Pain Points in Production AI
• Unified Feature Store @ eBay
• Demo
• Landing Use Cases
                        
                36. Landing Use Cases
Domains:
• Ads, Buyer Experience, Marketing
• Knowledge Hub, Search
• Risk, Shipping, Customer Support
Use Cases:
• Promotion Display
• Guidance
• Merch
• Home Page Personalization
• Sneaker Size
• Image Embedding
• ...
                        
                37.             
                        
                38. Offline Feature Store
○ Point-in-time features are not the CURRENT features; they are the feature values as of the event time
                        
                39. Batch Feature Point-in-time
• Build on top of HBase/HDFS
• Assuming a transaction happened at 20211213 00:00:59,
look up the key "ship_addr_cb_txn_var:1.0#7653|20211212 01:10:10".
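The lookup above can be sketched as a search over sorted row keys. This is a minimal illustration, not the eBay implementation, and it rests on several assumptions. The key layout is read as `{feature}:{version}#{entity id}|{snapshot time}`. The lookup is assumed to return the latest snapshot at or before the transaction time. The store is assumed to keep row keys in lexicographic order, as HBase does, and a sorted Python list stands in for the table. The function name and all snapshot times other than the one on the slide are hypothetical.

```python
from bisect import bisect_right
from typing import List, Optional


def point_in_time_key(sorted_row_keys: List[str], prefix: str, event_time: str) -> Optional[str]:
    """Return the row key under `prefix` with the latest snapshot time <= event_time.

    Timestamps use the fixed-width "YYYYMMDD HH:MM:SS" form, so lexicographic
    order of the keys matches chronological order of the snapshots.
    """
    probe = f"{prefix}|{event_time}"
    idx = bisect_right(sorted_row_keys, probe)   # first key strictly after the probe
    if idx == 0:
        return None
    candidate = sorted_row_keys[idx - 1]
    # Guard against falling back onto a different feature or entity.
    return candidate if candidate.startswith(prefix + "|") else None


row_keys = sorted([
    "ship_addr_cb_txn_var:1.0#7653|20211210 02:00:00",   # hypothetical snapshot
    "ship_addr_cb_txn_var:1.0#7653|20211212 01:10:10",   # snapshot from the slide
    "ship_addr_cb_txn_var:1.0#7653|20211213 01:12:30",   # hypothetical, after the event
    "ship_addr_cb_txn_var:1.0#7654|20211212 01:09:00",   # hypothetical, other entity
])

hit = point_in_time_key(row_keys, "ship_addr_cb_txn_var:1.0#7653", "20211213 00:00:59")
print(hit)  # ship_addr_cb_txn_var:1.0#7653|20211212 01:10:10
```

The snapshot written at 20211213 01:12:30 is skipped because it did not exist yet when the transaction happened; using it for training would leak future information into the feature.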
                        
                40.             
                        
                41. NRT Feature Point-in-time