D3: An Automated System to Detect Data Drifts

摘要

Data powers almost all critical, customer-facing flows at Uber. Bad data quality impacts our ML models, leading to a bad user experience (incorrect fares, ETAs, products, etc.) and revenue loss.

Still, many data issues are manually detected by users weeks or even months after they start. Data regressions are hard to catch because the most impactful ones are generally silent. They do not impact metrics and ML models in an obvious way until someone notices something is off, which finally unearths the data issue. But by that time, bad decisions are already made, and ML models have already underperformed.

This makes it critical to monitor data quality thoroughly so that issues are caught proactively.

欢迎在评论区写下你对这篇文章的看法。

评论

首页 - Wiki
Copyright © 2011-2024 iteam. Current version is 2.123.4. UTC+08:00, 2024-04-20 13:27
浙ICP备14020137号-1 $访客地图$