AI debugging at Meta with HawkEye

HawkEye is the powerful toolkit used internally at Meta for monitoring, observability, and debuggability of the end-to-end machine learning (ML) workflow that powers ML-based products.
HawkEye supports recommendation and ranking models across several products at Meta. Over the past two years, it has reduced the time spent debugging production issues by an order of magnitude.
In this post, we will provide an overview of the end-to-end debugging workflows supported by HawkEye, components of the system, and the product surface for Meta product and monetization teams to debug AI model and feature issues.

Many of Meta’s products and services leverage ML for tasks such as recommendations, understanding content, and generating content. Workflows to productionize ML models include data pipelines to get the information needed to train the models, training workflows to build and improve the models over time, evaluation systems to test the models, and inference workflows to actually use the models in Meta’s products. At any point in time, multiple versions (snapshots) of a model could be hosted as A/B experiments to test their performance and accuracy.

Ensuring the robustness of predictions made by models is crucial for providing engaging user experiences and effective monetization. However, several factors can affect the accuracy of these predictions, such as the distribution of data used for training, inference-time inputs, the (hyper)parameters of the model, and the systems configuration. Identifying the root cause of any issue is a complex problem, especially given the scale of Meta’s models and data.
