想要一个简单易用的实时ML平台?你必须知道的一些复杂细节
如果无法正常显示,请先停止浏览器的去广告插件。
1. Invisible Interfaces
Considerations for Abstracting Complexities of
a Real-time ML Platform
Zhenzhong Xu
Cofounder & CTO @ claypot.ai
July, 2023
2. The discovery of
something invisible
3. The Invisible Interface
Ubiquitous
Easy and responsive
Just works!
The endeavor to make things useful
4. Real-time Decisions
that powers your business
Fraud prevention Personalization
Trending products
ETA
Customer support Dynamic pricing/discounting
Risk Assessment
Account Take Over
Ads
Network analysis
Sentiment analysis
Object
detection
…
5. The world is moving towards real-time
●
●
●
●
●
Instacart: The Journey to Real-Time Machine Learning (2022)
○ Directly reduces millions of fraud-related costs annually.
LinkedIn’s Real-time Anti-abuse (2022)
○ LinkedIn moved from an offline pipeline (hours) to real-time pipeline (minutes), and saw
30% increase in bad actors caught online and 21% improvement in fake account detection.
How WhatsApp catches and fights abuse (2022 | slides)
○ A few 100ms delay can increase the spam by 20-30%.
How Pinterest Leverages Realtime User Actions in Recommendation to Boost Engagement (2022)
○ According to Pinterest, this “has been one of our most impactful innovations recently,
increasing Home feed engagement by 11% while reducing Pinner hide volume by 10%.”
Airbnb: Real-time Personalization using Embeddings for Search Ranking (2018)
○ Moving from offline scoring to online scoring grows bookings by +5.1%
5
6. Real-time Decisions
Exploration &
Research
Model Architecture
& Turning
Model Analysis
& Selection
LLM Prompt
Engineering
Data Fabric for Real-time AI
Data Infrastructure
Data Sources Ingestion &
Transport Storage Query & Compute
Workflow
Orchestration Analytics /
Visualization Multi-tenancy
Isolation Security &
Governance
7. Prediction Input
Data Flow
Model Serving
Product
Ecosystem
Data
Model Evaluation
Model Flow
Model Training
Training Input
Model
Monitoring
Data
Monitoring
Analytics
ecosystem
8. The hard things towards
real-time decisions
●
●
●
Data silo and staleness
Collaboration overhead
Tech complexity
9.
10. Challenge 1 : From
Experimentation to Production
●
●
●
Slow prototyping
Local vs. remote execution
Divergent language & runtime
11. Local Experimentation with Traditional Models
12. Local Experimentation with LLMs
13.
14. Feature API
Data
scientists
Central repo
Create, experiment, &
deploy features
Prediction
service
Feature catalog
Computation engines
Feature store
online + offline
Sources
Training
service
15. Need an invisible interface to plug into compute ecosystems
Local/Single Machine
Remote/Distributed
16. Declare features with familiar APIs
@transformation
def average_transaction_amount_by_merchant(
tx: Transactions,
wspec: WindowSpec):
return tx.groupby(["cc_num", "merchant"])["amt"].window(wspec).mean()
17. Data Science Friendly: Python <> SQL
@transformation
def transaction_count(tx: Transactions, wspec: WindowSpec):
return tx[tx.status == "failed"].groupby("account_id").window(wspec).count()
Relational
Expression
Workload Compiler /
Optimizer
Deployment
17
18. Same code can run on different computation engines
@transformation
def transaction_count(tx: Transactions, wspec: WindowSpec):
return tx[tx.status == "failed"].groupby("account_id").window(wspec).count()
Intermediate
Representation
Relational
Expression
Workload
Compiler/Optimizer
Deployment
Compile into a relational expression (RE), which
is SQL equivalent
Compile & optimize RE into the computation
engine
(e.g., Panda, DuckDb, Flink, Spark) best suited for
the job
Spin up and manage computation jobs
19. Solution 1 : Relational
Expression based Compilation
●
●
●
●
Unified yet familiar API
Pluggable to many compute engines
Minimize human error
Prototype in minutes
20. Challenge 2: Streaming and
Batch Divided
●
●
●
Evolving architecture
Difficult to backfill
Train-predict inconsistencies
21. Lambda Architecture
Online Query
(serving)
In-motion Compute
Online
Storage
Mixed Query (backfill)
Data Source
At-rest Compute
Offline
Storage
Offline Query
(training)
22. Kappa (Streaming) Architecture
Online Query
(serving)
streaming transformation
Data Source
In-motion Compute
(Backfill from historical log)
Materialized
Views
batch transformation
Offline Query
(training)
23. Unified Architecture
Online Query
(serving)
streaming transformation
In-motion Compute
(intelligent backfill from dual
sources)
Data Source
Materialized
Views
Backing
batch transformation
DWH backed
logs
Offline Query
(training)
24. Batch and streaming source unified to simplify backfill
Dual source
cutover
Stream
DWH
Time
25. Need an invisible interface to plug into storage ecosystems
Streaming Leaning
Batch Leaning
26. Data Fabric for a Streaming Pipeline
27. Data Fabric for a Unified Backfill Pipeline
28. Training dataset backfill requires point-in-time correctness
Prediction
events
Feature data
Feature data
Feature data
Feature data
Time
29. Point-in-time joins to generate training data
Given a spine (entity keys + timestamp + label), join features to generate training data
cc_num_tx_max_1h
spine_df
user_unique_id_30d
inference_ts tid cc_num user_id is_fraud ts cc_num tx_max_1h ts user_id unique_ip_30d
21:30 0122 2 1 0 9:20 2 … 6:00 1 …
21:40 0298 4 1 0 10:24 2 … 6:00 3 …
21:55 7539 6 3 1 20:00 4 … 6:00 5 …
train_df = pitc_join_features(
spine_df,
features=[
"tx_max_1h",
"user_unique_ip_30d",
],
)
inference_ts tid cc_num user_id is_fraud tx_max_1h user_unique_ip_30d
21:30 0122 2 1 1 … …
21:40 0298 4 1 1 … …
21:55 7539 6 3 3 … …
Proprietary & Confidential
29
30. Solution 2: Abstract streaming
and batch data storage
●
●
●
Unified streaming & batch source
Unified online & offline feature stores
Pluggable to most storage technologies
31. Challenge 3: It should just work!
● Cost, latency, correctness surprises!
● Lack optimizations knobs
32. Stream processing without
consistency
(fast and cheap)
Cost
Batch processing
(cheap and correct)
Latency
Correctness
Stream processing with
consistency enforced
(fast and correct)
33. Optimization
@transformation
def transaction_count(tx: Transactions, wspec: WindowSpec):
return tx[tx.status == "failed"].groupby("account_id").window(wspec).count()
Relational
Expression
Workload Compilation
Optimization
Deployment
Various intelligent optimization can be done to
make appropriate tradeoff across storage and
compute systems.
34. Claypot Feature
SDK (Python)
Feature Catalog
Guardrail for schema changes
Tunable workload optimization
Unified Processing
Filter
Customer managed in your own cloud
Feature
Serving
Offline
store
Union
Join
Filter
Online
store
Scan
Scan
35. Solution 3: Optimization knobs
●
●
●
Abstract optimization complexity
User controls with high level knobs
Trust, no surprises!
36. Make invisible interface
possible!
●
●
●
Ubiquitous
Easy and responsive
Just works!
https://zhenzhongxu.com/
zhenzhong@claypot.ai
the invisible interface