Architecture Strategies and Practices for Building an Enterprise-Grade AI Platform

1. Strategies of Machine Learning Platform Building & Practices in eBay
   Bruce Li, eBay AIP Chief Architect, CCOE VAT Chairman
2.
3. Agenda
   1. AI/ML use case analysis
   2. AI Platform vision, design principles and core capabilities
   3. Unified data strategies
4. AI Use Cases

| | Structured Data | Semi/Unstructured Data (image/video/text/3D/…) |
| --- | --- | --- |
| Data Source | Online data services (OTF FE); streaming events (NRT FE); offline batch/ETL datasets (Batch FE) | Content generation/acquisition NRT pipeline |
| Storage | Unified online/offline feature store | Unified online/offline content store |
| Data PiT Parity | Online/offline PiT data strategies | PiT data parity is not required |
| Feedback Loop | Short: continuous online training; Long: offline PiT feature simulation | Vendor/manual/auto labelling |
| CPU/GPU | CPU training and inferencing typically | GPU training and inferencing typically |

   Common to both: driver set & training set generation & management, catalog, data lineage, etc.
5. Challenges of Building Enterprise ML Platform
   - Tendency to invest more in solutions than in the platform
   - Lack of a clear boundary between solutions and platform
   - Lack of unified data strategies and self-service support for ML platform building
   - Traditional focus on training, with insufficient platform support for data/features and inferencing
   - Lack of E2E seamless integration strategies across features, training and inferencing
6. ML Development Lifecycle
7. Agenda
   1. AI/ML use case analysis
   2. AI Platform vision, design principles and core capabilities
   3. Unified data strategies
8. Our Vision
   To empower eBay AI practitioners to build, train and deploy machine learning models at scale with a fully managed, efficient, self-service platform.
9. ML Platform Core Capability Map
10. ML Platform Architectural Principles
   - Enable self-service based on centralized configuration and metadata-driven design, with lifecycle management and governance in place
   - Enable unified metadata and definitions across online and offline, with enough flexibility and extensibility to support domain-level customizations
   - Provide a group of management APIs & services for the MLP-managed lifecycle, and enable E2E seamless integration based on those APIs
   - Provide unified catalogs (including data, stored variables, features, models, solutions, etc.) to promote discovery, reuse and better governance
   - Provide E2E data lineage for the AI Platform domain entities
   - Apply unified monitoring across the whole ML platform
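The catalog and lineage principles above can be illustrated with a minimal sketch. It is not eBay's implementation: `CatalogEntry`, `Catalog`, and all entry names are hypothetical, and the only assumption is that each catalog entry records the entries it is derived from, so lineage becomes a graph traversal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical catalog record: a dataset, feature, or model."""
    name: str
    kind: str             # e.g. "dataset", "feature", "model"
    upstream: tuple = ()  # names of entries this one is derived from


class Catalog:
    """Hypothetical unified catalog keyed by entry name."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        # Reject duplicates and dangling references so lineage stays consistent.
        if entry.name in self._entries:
            raise ValueError(f"duplicate entry: {entry.name}")
        missing = [u for u in entry.upstream if u not in self._entries]
        if missing:
            raise KeyError(f"unknown upstream entries: {missing}")
        self._entries[entry.name] = entry

    def find(self, kind):
        """Discovery: list the names of all entries of one kind."""
        return sorted(e.name for e in self._entries.values() if e.kind == kind)

    def lineage(self, name):
        """Return the names of all transitive upstream entries of `name`."""
        seen = set()
        stack = list(self._entries[name].upstream)
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self._entries[current].upstream)
        return seen


catalog = Catalog()
catalog.register(CatalogEntry("orders_daily", "dataset"))
catalog.register(CatalogEntry("order_count_30d", "feature", ("orders_daily",)))
catalog.register(CatalogEntry("churn_model", "model", ("order_count_30d",)))
```

Here `catalog.lineage("churn_model")` walks from the model back to the feature and the dataset it depends on, which is the kind of E2E lineage question a unified catalog is meant to answer.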
11. ML Platform Online Integration Architecture
12. Entity Modeling in ML Platform
13. Dependency DAG & Execution Plan
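The slide text carries only the title, so the following is a sketch of the general idea, not of the platform's actual planner. Assuming the dependencies among features and models form a DAG, an execution plan can be derived by topological sorting into stages whose members can run in parallel. The function name and entity names are hypothetical; `graphlib` is in the Python standard library (3.9+).

```python
from graphlib import TopologicalSorter


def build_execution_plan(dependencies):
    """Group nodes into stages; each node depends only on nodes in earlier stages.

    `dependencies` maps each node to the set of nodes it depends on.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    plan = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for deterministic output
        plan.append(ready)
        sorter.done(*ready)
    return plan


# Hypothetical entities: a model that needs one raw and one derived feature.
deps = {
    "risk_model": {"txn_count_7d", "avg_price_ratio"},
    "avg_price_ratio": {"item_price", "category_avg_price"},
    "txn_count_7d": set(),
    "item_price": set(),
    "category_avg_price": set(),
}
plan = build_execution_plan(deps)
```

With these inputs the plan has three stages: the three raw features first, then `avg_price_ratio`, then `risk_model`.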
14. Unified CPU/GPU Inferencing Platform
15. Model and Feature Monitoring
16. Agenda
   1. AI/ML use case analysis
   2. AI Platform vision, design principles and core capabilities
   3. Unified data strategies
17. Why Data Strategies are so Important for AI/ML
   Image source: Cognilytica, from https://www.ayadata.ai/blog-posts/manual-vs-automated-data-labeling
18. Batch Feature: Feature DSL
19. NRT Roll-up Abstraction
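The slide's diagram is not preserved, so this sketch shows only one common way to express a near-real-time roll-up: events are counted into fixed-size time buckets, and a windowed feature is the sum of the buckets that fall inside the window. `RollupCounter`, the 60-second bucket size, and integer epoch-second timestamps are all assumptions for illustration.

```python
from collections import defaultdict


class RollupCounter:
    """Hypothetical roll-up: per-key counts in fixed-size time buckets."""

    def __init__(self, bucket_seconds=60):
        self.bucket_seconds = bucket_seconds
        self._buckets = defaultdict(lambda: defaultdict(int))  # key -> bucket start -> count

    def add(self, key, event_ts, value=1):
        # Align the event to the start of its bucket (integer seconds assumed).
        bucket_start = event_ts - event_ts % self.bucket_seconds
        self._buckets[key][bucket_start] += value

    def rollup(self, key, now_ts, window_seconds):
        """Sum the buckets whose start lies in (now_ts - window_seconds, now_ts].

        The result is exact only to bucket granularity.
        """
        cutoff = now_ts - window_seconds
        per_key = self._buckets.get(key, {})
        return sum(v for start, v in per_key.items() if cutoff < start <= now_ts)


counter = RollupCounter(bucket_seconds=60)
for ts in (0, 61, 125, 130):
    counter.add("user_1", ts)
```

Keeping buckets rather than raw events lets several window lengths be served from the same stored state, at the cost of bucket-level precision.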
20. NRT Feature Engineering
21. NRT Feature Schema Derived Computation Event processing
22. On-the-fly Feature
23. Comparison of Different Feature Types

| | Batch Feature | NRT Feature | On-the-fly Feature |
| --- | --- | --- | --- |
| MLP Managed | Yes | Yes | No |
| Self-service by End Users (DS) | Yes | Yes | No |
| Delay of Data Freshness | 1 day+ | P99 < 5 sec | Real-time |
| Data Source | ETL/batch data/snapshotted dataset | Enriched events | Request context / online data services |
| Online/offline PiT Strategy | PiT simulation / feature snapshotting | PiT simulation / feature snapshotting | Feature snapshotting only |
| Reusability | Easy to reuse | Easy to reuse | Solution-by-solution support |
| Time-to-Market | Fast | Fast, except for new enriched event acquisition | Slow |
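Reading PiT as point-in-time, offline PiT simulation can be sketched as a point-in-time join: each driver-set row receives the feature value that was current at the row's timestamp, never a later one, so training data matches what inference would have seen. This is a minimal sketch, not the platform's implementation; the function names and sample data are hypothetical.

```python
from bisect import bisect_right


def pit_lookup(feature_history, entity_id, as_of_ts):
    """Return the latest feature value recorded at or before `as_of_ts`, else None.

    `feature_history` maps entity id -> list of (timestamp, value), sorted by timestamp.
    """
    history = feature_history.get(entity_id, [])
    timestamps = [ts for ts, _ in history]
    idx = bisect_right(timestamps, as_of_ts)  # count of records with ts <= as_of_ts
    return history[idx - 1][1] if idx else None


def build_training_set(driver_set, feature_history):
    """Attach the point-in-time feature value to each (entity_id, event_ts, label) row."""
    return [
        (entity_id, event_ts, pit_lookup(feature_history, entity_id, event_ts), label)
        for entity_id, event_ts, label in driver_set
    ]


# Hypothetical data: one buyer whose feature value changes at t=100, 200, 300.
feature_history = {"buyer_1": [(100, 2), (200, 5), (300, 9)]}
driver_set = [("buyer_1", 50, 0), ("buyer_1", 200, 1), ("buyer_1", 250, 0)]
training_set = build_training_set(driver_set, feature_history)
```

The row at t=50 gets no value because nothing had been recorded yet, and the row at t=250 gets the value from t=200, not the later one from t=300. Feature snapshotting reaches the same parity by logging the served value at inference time, which is why it is the only option listed for on-the-fly features.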
24. Embracing NRT Strategy
25. Integrated Data Strategies
   - Platforms: Feature Platform, Training Platform, Inferencing Platform
   - Components: Unified Feature Store; Training Set Generation; Feature/Model Snapshotting; Feature Lifecycle Mgmt.; Driver/Training Set Mgmt.; Unified Model Spec; Feature PiT Simulation; High-throughput Data Access; API Spec Auto-Gen
26.
