Metadata- and Configuration-Driven AI Model Management and Deployment Practices for eBay Transaction Risk Control

1. Model Spec Driven AI Model Management & Deployment at eBay Payments Risk. Bing Wang, eBay Payments & Risk
3. Agenda: (1) eBay Payments Risk AI Model Lifecycle and Model Spec; (2) Unified Context for Model Training & Model Serving; (3) Model Integration & Deployment by Model Spec; (4) Model Serving Observability & Monitoring
4. eBay Payments Risk AI Model Lifecycle and Model Spec
5. Payments Risk AI Model Lifecycle. Offline: Feature Engineering, Model Training. Online: Model Deployment, Performance Validation, Business Usage. Model Refresh (Refit) closes the loop from online back to offline.
6. Metadata in AI Model Lifecycle. Offline (Feature Engineering, Model Training): Raw Features Metadata, Training Dataset Metadata, Pipeline Metadata, Model Object Metadata, … Online (Model Deployment, Performance Validation, Business Usage): Model Service API Metadata, Features Fetching Metadata, Feature Preprocessing Metadata, Model Prediction Metadata, Model Output Post-processing Metadata. Model Refresh (Refit) closes the loop.
7. Metadata in AI Model Lifecycle. The offline metadata (Raw Features Metadata, Training Dataset Metadata, Pipeline Metadata, Model Object Metadata, …) and the online metadata (Model Service API Metadata, Features Fetching Metadata, Feature Preprocessing Metadata, Model Prediction Metadata, Model Output Post-processing Metadata) are grouped into the Model Spec (Model Specification).
8. Metadata Group - Model Spec. The Model Spec (Model Specification) covers: Basic Model Information (owners, model type, scenario, refresh frequency, ...); Feature Fetching (feature name, data source, value type, default value, ...); Inference Preprocessing & Post-processing (dependent raw features, feature preprocessing expression, model output mapping logic, …); Model Object (model type, framework and version, parameters, target SLA, …); Multi-model Inference (pipeline definition, model routing definition); Monitoring and Logging (schema definition, metrics, event/table information, …)
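To make the metadata groups above concrete, here is a minimal sketch of how such a spec could be modeled and round-tripped through JSON. The class names, field names, and JSON layout are assumptions made for illustration; the slides do not show eBay's actual Model Spec schema.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Any, Dict, List


@dataclass
class FeatureSpec:
    """Feature Fetching group (hypothetical field names)."""
    name: str
    data_source: str
    value_type: str
    default_value: Any = None


@dataclass
class ModelSpec:
    """One record per model; each block mirrors a metadata group from the slide."""
    # Basic Model Information
    model_name: str
    owners: List[str]
    model_type: str
    scenario: str
    refresh_frequency: str
    # Feature Fetching
    features: List[FeatureSpec] = field(default_factory=list)
    # Inference Preprocessing & Post-processing (derived feature -> expression)
    preprocessing: Dict[str, str] = field(default_factory=dict)
    output_mapping: Dict[str, Any] = field(default_factory=dict)
    # Model Object (framework, version, parameters, target SLA)
    model_object: Dict[str, Any] = field(default_factory=dict)
    # Multi-model Inference (pipeline and routing definitions)
    routing: Dict[str, Any] = field(default_factory=dict)
    # Monitoring and Logging (schema, metrics, event/table information)
    monitoring: Dict[str, Any] = field(default_factory=dict)

    def to_json(self) -> str:
        """Serialize the whole spec to deterministic JSON text."""
        return json.dumps(asdict(self), sort_keys=True, indent=2)

    @classmethod
    def from_json(cls, text: str) -> "ModelSpec":
        """Rebuild a spec, including nested FeatureSpec records, from JSON text."""
        raw = json.loads(text)
        raw["features"] = [FeatureSpec(**f) for f in raw["features"]]
        return cls(**raw)
```

Because the spec is plain data, the training pipeline, the domain service, and the model service can all read the same record without sharing code.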
9. Unified Context for Model Training & Model Serving
10. Model Deployment. The Model Training Pipeline produces Training Outputs: training code and model files (pkl, txt, json, bin). The training code is translated into Model Application Code, the model files are copied, and both are deployed to the Model Service.
11. Model Deployment. The Model Training Pipeline produces Training Outputs: training code and model files (pkl, txt, json, bin). The training code is translated into Model Application Code, the model files are copied, and both are deployed to the Model Service. Drawbacks: • Much manual effort • Vulnerable to discrepancy between model training and inference
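12. Model Integration. Integration logic is translated and deployed into the Business Domain Service, which receives the request and handles Features Fetching, the Model API Call, and Features & Inference Result Monitoring; it sends the inference request to the Model Service.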
13. Model Integration. Integration logic is translated and deployed into the Business Domain Service, which receives the request and handles Features Fetching, the Model API Call, and Features & Inference Result Monitoring; it sends the inference request to the Model Service. Drawbacks: • Much manual effort • Vulnerable to discrepancy between model training and inference
14. Different Context: Model Training (Data Scientists), Model Integrating (Domain Engineers), and Model Deploying (Data Engineers) each work in their own context.
15. Unified Context: Model Training (Data Scientists), Model Integrating (Domain Engineers), and Model Deploying (Data Engineers) all share one context, the Model Spec.
16. Model Integration & Deployment by Model Spec
17. Feature Preprocessing: The traditional way to move feature preprocessing logic from model training to inference is to serialize and deserialize the preprocessing object with pickle.
18. Feature Preprocessing, Problem I: Pickle forces a dependency on the same libraries across different environments. Solution in Model Spec: reproduce preprocessing through a Logic Representation. Instead of dumping & saving a *.pkl object and loading it back as an object, the training side dumps & saves a Logic Representation, and the serving side loads & parses it.
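The dump-and-parse idea can be sketched as follows: the training side writes its preprocessing steps as plain JSON, and the serving side rebuilds an equivalent function using only the standard library, so no training-time library has to be installed at inference. The operation names (`fill_null`, `clip`, `log1p`, `scale`) and the step layout are assumptions for illustration, not the format used in the talk.

```python
import json
import math

# Registry of operations both sides agree on (hypothetical names).
# Each operation takes the current value and a parameter dict.
OPS = {
    "fill_null": lambda v, p: p["value"] if v is None else v,
    "clip": lambda v, p: max(p["min"], min(p["max"], v)),
    "log1p": lambda v, p: math.log1p(v),
    "scale": lambda v, p: (v - p["mean"]) / p["std"],
}


def dump_logic(steps):
    """Training side: validate the steps and save them as JSON text."""
    for step in steps:
        if step["op"] not in OPS:
            raise ValueError(f"unsupported op: {step['op']}")
    return json.dumps(steps)


def load_logic(text):
    """Serving side: parse the JSON text into a function over one record."""
    steps = json.loads(text)

    def apply(record):
        out = dict(record)  # never mutate the caller's record
        for step in steps:
            value = out.get(step["input"])
            out[step["output"]] = OPS[step["op"]](value, step.get("params", {}))
        return out

    return apply


steps = [
    {"op": "fill_null", "input": "amount", "output": "amount", "params": {"value": 0.0}},
    {"op": "clip", "input": "amount", "output": "amount", "params": {"min": 0.0, "max": 1000.0}},
    {"op": "log1p", "input": "amount", "output": "amount_log"},
]
preprocess = load_logic(dump_logic(steps))
```

Calling `preprocess({"amount": None})` fills the null with 0.0 and then derives `amount_log`; the same JSON text could be parsed by a non-Python runtime as long as it implements the same operation registry.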
19. Feature Preprocessing, Problem II: Data processing performance is not optimal for singleton inference (Batch Processing vs. Singleton Processing). Optimization in Model Spec: concurrency through multi-processing (multiprocessing) and multi-threading (std::thread).
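As a rough illustration of concurrency for a single record, the sketch below fans the per-feature transforms of one request out to a thread pool. The slide names `multiprocessing` and `std::thread`; this example swaps in Python's `concurrent.futures.ThreadPoolExecutor` as a stand-in, and the feature names and transforms are hypothetical. In CPython, threads only give a real speed-up when the transforms release the GIL (I/O or native code), so treat this as a shape, not a benchmark.

```python
import math
from concurrent.futures import ThreadPoolExecutor

# Independent per-feature transforms for one record (hypothetical features).
TRANSFORMS = {
    "amount_log": lambda r: math.log1p(r["amount"]),
    "is_new_account": lambda r: int(r["account_age_days"] < 30),
    "items_per_order": lambda r: r["item_count"] / max(r["order_count"], 1),
}

# One long-lived pool shared by all requests, so no threads are created per call.
_POOL = ThreadPoolExecutor(max_workers=4)


def preprocess_one(record):
    """Run every transform for a single record concurrently and collect results."""
    futures = {name: _POOL.submit(fn, record) for name, fn in TRANSFORMS.items()}
    return {name: future.result() for name, future in futures.items()}
```

The transforms must be independent of each other for this to be safe; transforms that depend on other derived features would need to be ordered into stages first.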
20. Feature Preprocessing: Move feature preprocessing logic from model training to inference by Model Spec
21. Codes to Representations in Model Spec. Code is dumped into a Logic Representation and parsed back from it for: Feature Fetching, Model Feature Preprocessing, Model Object, Model Output Post-processing, Model Inference Routing.
22. Configurations Snapshotting from Model Spec. Through snapshotting and versioning, the Model Spec Store yields: Feature Fetching Configuration, Feature Preprocessing Configuration, Model Object Configuration, Model Output Post-processing Configuration, ……
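One way to picture snapshotting and versioning is a store that freezes the configuration sections of a spec into immutable, content-addressed versions. The sketch below is an in-memory stand-in under that assumption; the `SnapshotStore` class, the section names, and the hash-based version id are all hypothetical and not taken from the talk.

```python
import hashlib
import json

# Configuration sections derived from a spec (hypothetical key names).
SECTIONS = (
    "feature_fetching",
    "feature_preprocessing",
    "model_object",
    "output_postprocessing",
)


class SnapshotStore:
    """In-memory store of versioned configuration snapshots per model."""

    def __init__(self):
        self._history = {}  # model name -> list of (version id, JSON text)

    def snapshot(self, model_name, spec):
        """Freeze the config sections of a spec and return the version id."""
        configs = {section: spec.get(section, {}) for section in SECTIONS}
        text = json.dumps(configs, sort_keys=True)
        # Content hash as version id: identical configs map to the same id.
        version_id = hashlib.sha256(text.encode("utf-8")).hexdigest()[:12]
        history = self._history.setdefault(model_name, [])
        if not history or history[-1][0] != version_id:
            history.append((version_id, text))
        return version_id

    def load(self, model_name, version_id=None):
        """Return the latest snapshot, or a specific one when an id is given."""
        history = self._history[model_name]
        if version_id is None:
            return json.loads(history[-1][1])
        for vid, text in history:
            if vid == version_id:
                return json.loads(text)
        raise KeyError(version_id)
```

Storing the snapshot as text rather than as a live object means an old version can always be reloaded for rollback, even after the spec itself has moved on.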
23. Configuration Deployment. In the Business Domain Service: Configuration Sync, Configuration Validation, Feature Fetcher Object Building, Canary, Change Event Dropping. In the Model Service: Configuration Sync, Configuration Validation, Model Inference Session Building, Canary, Change Event Dropping.
24. Configuration Deployment. In the Business Domain Service: Configuration Sync, Configuration Validation, Feature Fetcher Object Building, Canary, Change Event Dropping. In the Model Service: Configuration Sync, Configuration Validation, Model Inference Session Building, Canary, Change Event Dropping. Metadata and configuration driven; few code changes needed.
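The validate, build, and canary steps of configuration deployment might look roughly like the sketch below on the domain-service side. Everything here is an assumption for illustration: the required keys, the `ConfigDeployer` class, and the canary check are hypothetical, and the "Change Event Dropping" step is left out because the slides do not describe what it does.

```python
REQUIRED_FEATURE_KEYS = ("name", "data_source", "value_type", "default_value")


def validate_config(config):
    """Configuration Validation: return a list of problems (empty means valid)."""
    features = config.get("features")
    if not features:
        return ["config has no features"]
    errors = []
    for i, feature in enumerate(features):
        for key in REQUIRED_FEATURE_KEYS:
            if key not in feature:
                errors.append(f"feature #{i} is missing '{key}'")
    return errors


def build_feature_fetcher(config):
    """Object Building: turn a config into a callable that fills in defaults."""
    defaults = {f["name"]: f["default_value"] for f in config["features"]}

    def fetch(raw):
        out = {}
        for name, default in defaults.items():
            value = raw.get(name)
            out[name] = default if value is None else value
        return out

    return fetch


class ConfigDeployer:
    """Swaps in a new config only after validation, build, and canary pass."""

    def __init__(self, canary_check):
        self.canary_check = canary_check  # callable(candidate) -> bool
        self.active_version = None
        self.active_fetcher = None

    def deploy(self, version, config):
        if validate_config(config):
            return False  # invalid config: keep serving the current version
        candidate = build_feature_fetcher(config)
        if not self.canary_check(candidate):
            return False  # canary failed: keep serving the current version
        self.active_version, self.active_fetcher = version, candidate
        return True
```

The key property is that a bad configuration never replaces the active one: the service keeps serving the last good version until a candidate passes every step.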
25. Model Integration & Deployment by Model Spec. The Model Training Pipeline, the Domain Service, and the Model Service each embed the Model Spec Library, which reads/updates the Model Spec in the Model Spec Store. A request reaches the Domain Service, which sends a model inference request to the Model Service.
26. Model Serving Observability & Monitoring
27. ML System Monitoring. Reference: Eric Breck, Shanqing Cai, Eric Nielsen, Michael Salib, D. Sculley. "The ML Test Score: A Rubric for ML Production Readiness and Technical Debt Reduction." Proceedings of IEEE Big Data (2017).
28. ML System Observability & Monitoring. Model Output Monitoring: default result rate, score distribution, … Model Features Monitoring: null / empty rate, value distribution, … Model System Monitoring: latency, error rate, …
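A few of the metrics listed above can be computed from logged events with very little code. The sketch below assumes scores lie in [0, 1] and that a failed inference is logged with a sentinel default score; both are assumptions for illustration, as are the function names.

```python
def null_rate(values):
    """Share of feature values that are null or empty."""
    if not values:
        return 0.0
    missing = sum(1 for v in values if v is None or v == "")
    return missing / len(values)


def default_result_rate(scores, default_score):
    """Share of inferences that fell back to the default score."""
    if not scores:
        return 0.0
    return sum(1 for s in scores if s == default_score) / len(scores)


def score_histogram(scores, bins=10):
    """Equal-width histogram of scores in [0, 1]; out-of-range scores are skipped."""
    counts = [0] * bins
    for s in scores:
        if 0.0 <= s <= 1.0:
            # A score of exactly 1.0 belongs in the last bin.
            counts[min(int(s * bins), bins - 1)] += 1
    return counts
```

Comparing these values against a training-time baseline is what turns them from statistics into alerts, for example flagging a feature whose null rate suddenly doubles.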
29. Model Outputs & Features Observability & Monitoring. Near-real-time path: Log Events flow into the Kafka Cluster, Apache Flink produces Aggregated Events, and the Events Consumer & Processor turns them into NRT Metrics. Offline path: Log Events land in Hadoop HDFS, and Hadoop produces Offline Metrics. The Model Spec Store supplies Monitoring Metadata to each stage.
30. Thank You
