Architecture Strategies and Practices for Building an Enterprise-Grade AI Platform
1. Strategies of Machine Learning Platform Building & Practices in eBay
eBay AIP Chief Architect, CCOE VAT Chairman / Bruce Li
2.
3. Agenda
1) AI/ML use case analysis
2) AI Platform vision, design principles and core capabilities
3) Unified data strategies
4. AI Use Cases
Aspect | Structured Data | Semi/Unstructured Data (image/video/text/3D/…)
Data Source | Online data services – OTF FE; Streaming events – NRT FE; Offline batch/ETL datasets – Batch FE | Content generation/acquisition NRT pipeline
Storage | Unified online/offline feature store | Unified online/offline content store
Data PiT Parity | Online/offline PiT data strategies | PiT data parity is not required
Feedback Loop | Short: Continuous online training; Long: Offline PiT feature simulation | Vendor/manual/auto labelling
CPU/GPU | Typically CPU training and inferencing | Typically GPU training and inferencing
Common (both) | Driver set & training set generation & management, catalog, data lineage, etc.
5. Challenges of Building Enterprise ML Platform
- Tendency to invest more in solutions than in the platform
- Lack of a clear boundary between solutions and platform
- Lack of unified data strategies and self-service support for ML platform building
- Traditional focus on training, without enough platform support for data/feature and inferencing
- Lack of E2E seamless integration strategies across feature, training and inferencing
6. ML Development Lifecycle
7. Agenda
1) AI/ML use case analysis
2) AI Platform vision, design principles and core capabilities
3) Unified data strategies
8. Our Vision
To empower eBay AI practitioners to build, train and deploy machine learning models at scale with a fully managed, efficient and self-service platform.
9. ML Platform Core Capability Map
10. ML Platform Architectural Principles
- Enable self-service based on centralized configuration and metadata-driven design, with lifecycle management and governance in place
- Enable unified metadata and definitions across online and offline, with enough flexibility and extensibility to support domain-level customizations
- Provide a group of management APIs & services for the MLP-managed lifecycle, and enable E2E seamless integration based on the APIs
- Provide unified catalogs (including data, stored variables, features, models, solutions, etc.) to promote discovery, reuse and better governance
- Provide E2E data lineage for the AI Platform domain entities
- Apply unified monitoring across the whole ML platform
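To make the catalog and lineage principles concrete, here is a minimal in-memory sketch. It is not eBay's implementation: the class names (CatalogEntry, Catalog), the fields and the example entity names are all hypothetical, chosen only to illustrate how registration, discovery and lineage lookup could fit together.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """One registered entity; all field names here are hypothetical."""
    name: str
    kind: str              # e.g. "data", "feature", "model"
    upstream: tuple = ()   # names of entities this one is derived from


class Catalog:
    """In-memory stand-in for a unified catalog with lineage lookup."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        # Governance check: every upstream dependency must already be registered.
        missing = [u for u in entry.upstream if u not in self._entries]
        if missing:
            raise ValueError(f"unregistered upstream entities: {missing}")
        self._entries[entry.name] = entry

    def find(self, kind):
        """Discovery: list entity names of a given kind, sorted."""
        return sorted(e.name for e in self._entries.values() if e.kind == kind)

    def lineage(self, name):
        """Return the set of all transitive upstream entity names of `name`."""
        seen = set()
        stack = list(self._entries[name].upstream)
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self._entries[current].upstream)
        return seen


# Hypothetical entities: a dataset, a feature derived from it, a model using it.
catalog = Catalog()
catalog.register(CatalogEntry("orders_table", "data"))
catalog.register(CatalogEntry("buyer_order_count_7d", "feature", ("orders_table",)))
catalog.register(CatalogEntry("ranking_model_v1", "model", ("buyer_order_count_7d",)))
print(sorted(catalog.lineage("ranking_model_v1")))
```

Requiring upstream entities to be registered first is one simple way to keep lineage complete by construction; a real platform would presumably persist this metadata and expose it through management APIs.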
11. ML Platform Online Integration Architecture
12. Entity Modeling in ML Platform
13. Dependency DAG & Execution Plan
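The slide gives only the title, so the following is a generic sketch of how a dependency DAG can be turned into a staged execution plan, not a description of the platform's actual planner. It uses Python's standard graphlib module; the function name execution_plan and the node names are hypothetical.

```python
from graphlib import TopologicalSorter


def execution_plan(dependencies):
    """Group DAG nodes into stages; nodes in one stage can run in parallel.

    `dependencies` maps each node to the set of nodes it depends on.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # all nodes whose inputs are done
        stages.append(ready)
        sorter.done(*ready)
    return stages


# Hypothetical dependency graph: two features computed from the same events,
# then a model score that needs both features.
deps = {
    "model_score": {"feature_a", "feature_b"},
    "feature_a": {"raw_events"},
    "feature_b": {"raw_events"},
    "raw_events": set(),
}
plan = execution_plan(deps)
for number, stage in enumerate(plan, start=1):
    print(number, stage)
```

Grouping nodes by readiness rather than emitting one flat order keeps the opportunity for parallel execution visible in the plan.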
14. Unified CPU/GPU Inferencing Platform
15. Model and Feature Monitoring
16. Agenda
1) AI/ML use case analysis
2) AI Platform vision, design principles and core capabilities
3) Unified data strategies
17. Why Data Strategies are so Important for AI/ML
Image source: Cognilytica, from https://www.ayadata.ai/blog-posts/manual-vs-automated-data-labeling
18. Batch Feature
Feature DSL
19. NRT Roll-up Abstraction
20. NRT Feature Engineering
21. NRT Feature
Schema
Derived Computation
Event processing
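The NRT slides name a roll-up abstraction and event processing without showing details. As an illustration only, a sliding-window roll-up over a stream of events could look like the sketch below. The class name SlidingWindowSum and its methods are hypothetical, and the sketch assumes events arrive in timestamp order.

```python
from collections import deque


class SlidingWindowSum:
    """Rolls up event values over the trailing `window_seconds`.

    Assumes events are added in non-decreasing timestamp order.
    """

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._events = deque()  # (timestamp, value) pairs, oldest first
        self._total = 0

    def add(self, timestamp, value):
        """Process one event and expire anything that fell out of the window."""
        self._events.append((timestamp, value))
        self._total += value
        self._evict(timestamp)

    def _evict(self, now):
        # An event is expired once it is window_seconds or more older than `now`.
        cutoff = now - self.window_seconds
        while self._events and self._events[0][0] <= cutoff:
            _, old_value = self._events.popleft()
            self._total -= old_value

    def value(self, now):
        """Return the rolled-up sum as of time `now`."""
        self._evict(now)
        return self._total


# Hypothetical usage: spend in the trailing 60 seconds.
rollup = SlidingWindowSum(window_seconds=60)
rollup.add(0, 1)
rollup.add(30, 2)
rollup.add(70, 4)
print(rollup.value(70))
```

Keeping a running total makes each read O(1) after eviction, which matters when the roll-up is served at low latency; out-of-order events would need extra handling that this sketch leaves out.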
22. On-the-fly Feature
23. Comparisons of Different Features Types
Aspect | Batch Feature | NRT Feature | On-the-fly Feature
MLP Managed | Yes | Yes | No
Self-service by End Users (DS) | Yes | Yes | No
Delay of Data Freshness | 1 Day+ | P99 < 5 sec | Real-time
Data Source | ETL/Batch data/Snapshotted Dataset | Enriched events | Request context / Online data services
Online/offline PiT Strategy | PiT Simulation / Feature Snapshotting | PiT Simulation / Feature Snapshotting | Feature Snapshotting Only
Reusability | Easy to reuse | Easy to reuse | Solution by solution support
Time-to-Market | Fast | Fast except new enriched event acquisition | Slow
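The comparison lists PiT simulation as an online/offline parity strategy. A minimal sketch of the idea, assuming a per-entity feature history and a driver set of (entity, event time, label) rows: for each driver row, look up the latest feature value known at or before the event time, so training never sees future data. The function names and sample data are hypothetical.

```python
from bisect import bisect_right


def pit_lookup(history, as_of):
    """Return the latest value with timestamp <= as_of, or None if there is none.

    `history` is a list of (timestamp, value) pairs sorted by timestamp.
    """
    timestamps = [ts for ts, _ in history]
    index = bisect_right(timestamps, as_of)
    return history[index - 1][1] if index else None


def build_training_rows(driver_set, feature_history):
    """Join each driver row with the feature value as of its event time."""
    rows = []
    for entity_id, event_time, label in driver_set:
        history = feature_history.get(entity_id, [])
        rows.append((entity_id, event_time, pit_lookup(history, event_time), label))
    return rows


# Hypothetical data: one feature's value changes for user "u1" over time.
feature_history = {"u1": [(10, 1), (20, 3), (30, 7)]}
driver_set = [("u1", 25, 1), ("u1", 5, 0), ("u2", 25, 0)]
training_rows = build_training_rows(driver_set, feature_history)
for row in training_rows:
    print(row)
```

Feature snapshotting, the other strategy in the table, avoids this lookup by logging the feature values actually served at inference time; that is why it is the only option listed for on-the-fly features, which have no stored history to replay.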
24. Embracing NRT Strategy
25. Integrated Data Strategies
Feature Platform: Unified Feature Store; Feature Lifecycle Mgmt.; Feature PiT Simulation
Training Platform: Training Set Generation; Driver/Training Set Mgmt.; High-throughput Data Access
Inferencing Platform: Feature/Model Snapshotting; Unified Model Spec; API Spec Auto-Gen
26.