How Uber Achieves Operational Excellence in the Data Quality Experience
摘要
Uber delivers efficient and reliable transportation across the global marketplace, which is powered by hundreds of services, machine learning models, and tens of thousands of datasets. While growing rapidly, we’re also committed to maintaining data quality, as it can greatly impact business operations and decisions. Without data quality guarantees, downstream service computation or machine learning model performance quickly degrade, which requires a lot of laborious manual efforts to investigate and backfill poor data. In the worst cases, degradations could go unnoticed, silently resulting in inconsistent behaviors.
This led us to build a consolidated data quality platform (UDQ), with the purpose of monitoring, automatically detecting, and handling data quality issues. With the goal of building and achieving data quality standards across Uber, we have supported over 2,000 critical datasets on this platform, and detected around 90% of data quality incidents. In this blog, we describe how we created data quality standards at Uber and built the integrated workflow to achieve operational excellence.
欢迎在评论区写下你对这篇文章的看法。