Bring Your Own Anomaly Detection Algorithm
Charles Wu | Software Engineer; Isabel Tallam | Software Engineer; Kapil Bajaj | Engineering Manager
Overview
In this blog post, we present a pragmatic way to integrate analytics written in Python with our distributed anomaly detection platform, which is written in Java. The approach can be generalized to integrating processing done in one language or paradigm into a platform built in another.
Background
Warden is the distributed anomaly detection platform at Pinterest. It aims to be fast, scalable, and end-to-end: a job starts by fetching the data to be analyzed from various data sources and ends by pushing result notifications to tools like Slack.
Warden started off as a Java Thrift service built around the open-source EGADS library, which contains Java implementations of various time-series anomaly detection algorithms.
Figure: The execution flow of a single anomaly detection job, which is defined by one JSON job spec. Each job is load-balanced to a node in the Warden cluster.
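To make the flow above concrete, here is a minimal Python sketch of a job driven by a JSON spec: the spec is parsed, the job is assigned to a node, and the data passes through fetch, detect, and notify stages. Everything in it is an assumption made for illustration: the spec fields, the node names, the hash-based node assignment, and the median-based detector are hypothetical and are not Warden's actual schema, load balancer, or algorithms.

```python
import hashlib
import json
from statistics import median

# Hypothetical job spec; field names are illustrative, not Warden's real schema.
JOB_SPEC_JSON = """
{
  "job_id": "signup-spike-check",
  "source": {"type": "inline", "values": [10, 11, 9, 10, 12, 48, 11]},
  "detector": {"type": "median_deviation", "max_deviation": 3.0},
  "notify": {"channel": "#alerts"}
}
"""

# Hypothetical cluster membership.
NODES = ["warden-node-0", "warden-node-1", "warden-node-2"]


def pick_node(job_id, nodes):
    """Assign a job to a node by hashing its id (one simple balancing scheme)."""
    digest = hashlib.sha256(job_id.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]


def fetch(source):
    """Stage 1: fetch the data to analyze (here, values embedded in the spec)."""
    return list(source["values"])


def detect(values, detector):
    """Stage 2: flag points far from the median.

    A point is flagged when its distance from the median exceeds
    max_deviation times the median absolute deviation (MAD).
    """
    center = median(values)
    mad = median(abs(v - center) for v in values)
    limit = detector["max_deviation"] * mad
    return [(i, v) for i, v in enumerate(values) if abs(v - center) > limit]


def notify(spec, anomalies):
    """Stage 3: build the notification text (a real system would send it)."""
    channel = spec["notify"]["channel"]
    return f"[{channel}] job {spec['job_id']}: {len(anomalies)} anomalous point(s)"


def run_job(spec_json, nodes=NODES):
    """Run one job end to end, from spec parsing to notification text."""
    spec = json.loads(spec_json)
    node = pick_node(spec["job_id"], nodes)
    values = fetch(spec["source"])
    anomalies = detect(values, spec["detector"])
    return {"node": node, "anomalies": anomalies, "message": notify(spec, anomalies)}


if __name__ == "__main__":
    print(run_job(JOB_SPEC_JSON))
```

Because each job is fully described by its spec, any node can run it, which is what makes it possible to distribute jobs across a cluster in a scheme like this one.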
Warden has played an important role at Pinterest; for example, it was used to catch spammers. Over time, we have built more features and optimizations into the platform, such as interactive data visualizations, query pagination, and customized notification messages. We have also found it useful to run Warden as a separate Thrift service: we can scale it by adding or removing nodes in its cluster, call it through a Thrift client from a variety of places, and add instrumentation for better monitoring.