Introducing AutoPatchBench: A Benchmark for AI-Powered Security Fixes

  • We are introducing AutoPatchBench, a benchmark for the automated repair of vulnerabilities identified through fuzzing.
  • By providing a standardized benchmark, AutoPatchBench enables researchers and practitioners to objectively evaluate and compare the effectiveness of various AI program repair systems. 
  • This initiative facilitates the development of more robust security solutions, and also encourages collaboration within the community to address the critical challenge of software vulnerability repair.
  • AutoPatchBench is available now on GitHub.

AI is increasingly being applied to solve security challenges, including repairing vulnerabilities identified through fuzzing. However, the lack of a standardized benchmark for objectively assessing AI-driven bug repair agents specific to fuzzing has impeded progress in academia and the broader community. Today, we are publicly releasing AutoPatchBench, a benchmark designed to evaluate AI program repair systems. AutoPatchBench sits within CyberSecEval 4, Meta’s new benchmark suite for evaluating AI capabilities to support defensive use cases. It features 136 fuzzing-identified C/C++ vulnerabilities in real-world code repos along with verified fixes sourced from the ARVO dataset.


AutoPatchBench provides a standardized evaluation framework for assessing the effectiveness of AI-assisted vulnerability repair tools. This benchmark aims to facilitate a comprehensive understanding of the capabilities and limitations of various AI-driven approaches to repairing fuzzing-found bugs. By offering a consistent set of evaluation criteria, AutoPatchBench makes it possible to compare these approaches objectively.
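To make the evaluation framework concrete, the sketch below shows the basic loop any fuzzing-repair benchmark has to run: for each case, ask the repair agent for a candidate patch, then accept it only if the original crashing input no longer reproduces the crash on the patched target. This is an illustrative simulation, not the real AutoPatchBench API; the names (`BenchmarkCase`, `evaluate`, `reproduces_crash`) and the string-based crash stand-in are assumptions, and a real harness would rebuild the C/C++ target and re-run the fuzzing reproducer instead.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class BenchmarkCase:
    case_id: str
    vulnerable_code: str   # buggy source (C/C++ in the real benchmark)
    crash_input: bytes     # fuzzing input that triggers the crash

def reproduces_crash(code: str, crash_input: bytes) -> bool:
    """Stand-in for 'rebuild the target and run it on the crashing input'.
    Here we simulate: any source still containing the marker 'BUG' crashes."""
    return "BUG" in code

def evaluate(cases: List[BenchmarkCase],
             repair_agent: Callable[[str], str]) -> float:
    """Run the agent on each case; a candidate patch counts as a fix only
    if the original crash no longer reproduces on the patched code."""
    fixed = 0
    for case in cases:
        patched = repair_agent(case.vulnerable_code)
        if not reproduces_crash(patched, case.crash_input):
            fixed += 1
    return fixed / len(cases)

cases = [
    BenchmarkCase("case-1", "int f() { BUG }", b"\x00"),
    BenchmarkCase("case-2", "int g() { BUG }", b"\xff"),
]

# A toy "agent" that replaces the crashing construct.
naive_agent = lambda code: code.replace("BUG", "return 0;")
print(evaluate(cases, naive_agent))  # 1.0: neither crash reproduces
```

Note that crash-reproducer checks like this establish only that the specific crash is gone; a stronger verification (as with the benchmark's verified fixes) also needs to confirm the patch preserves intended behavior rather than, say, deleting the crashing code path.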
