Building a Large-Scale Unsupervised Model Anomaly Detection System: Part 2

By Rajeev Prabhakar, Han Wang, Anindya Saha

A camera lens looking at a city downtown

Photo by Octavian Rosca on Unsplash

In our previous post, we discussed the challenges we faced in model monitoring and our strategy for addressing some of them. We briefly mentioned using z-scores to identify anomalies. In this post, we dive deeper into anomaly detection and building a culture of observability.

Model observability is often neglected, yet it is critical in the machine learning model lifecycle. A good observability strategy helps trace problems quickly to their root causes and take appropriate action, such as retraining the model, improving feature selection, or troubleshooting feature drift.

The example below shows what our finished product looks like. The highlighted regions are the timeframes in which anomalies were detected. With a dashboard that contains the corresponding features, the root cause of an anomaly can be diagnosed quickly. This blog post discusses our approach to building a fully automated solution that finds and explains anomalies.

Utilizing Data Profiling

In Part 1 of this series, we talked about the importance of data profiling.

Although it is common practice to monitor anomalies using specific aggregated metrics computed on raw data, the question remains: which metrics are helpful? For outliers, the minimum, maximum, and 99th percentile are very useful. For numerical distribution drift, the mean and median are effective. For categorical data, frequent items are useful for detecting categorical drift, and cardinality is useful for assessing overall data quality. It is evident that various metrics are needed, and these requirements ...
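
As a rough illustration of the metrics listed above, the sketch below computes them for one numeric column and one categorical column using only the Python standard library. The function names `profile_numeric` and `profile_categorical`, the `top_k` parameter, and the sample data are hypothetical and are not taken from the authors' system; a production profiler would typically compute these over batches of data at scale.

```python
from collections import Counter
from statistics import mean, median, quantiles


def profile_numeric(values):
    """Summarize a numeric column (hypothetical helper, not the authors' code)."""
    # quantiles(..., n=100) returns 99 cut points; index 98 is the 99th percentile.
    p99 = quantiles(values, n=100, method="inclusive")[98]
    return {
        "min": min(values),        # outlier signal
        "max": max(values),        # outlier signal
        "p99": p99,                # outlier signal
        "mean": mean(values),      # distribution drift signal
        "median": median(values),  # distribution drift signal
    }


def profile_categorical(values, top_k=3):
    """Summarize a categorical column (hypothetical helper)."""
    counts = Counter(values)
    return {
        "cardinality": len(counts),                   # data quality signal
        "frequent_items": counts.most_common(top_k),  # categorical drift signal
    }


if __name__ == "__main__":
    # Illustrative data only.
    print(profile_numeric([1.0, 2.0, 2.5, 3.0, 250.0]))
    print(profile_categorical(["US", "US", "DE", "FR", "US", "DE"]))
```

Tracking each of these values per time window yields one time series per metric, which is the kind of input an anomaly detector such as a z-score check could then consume.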
