why GNU grep is fast

出处：lists.freebsd.org

存档：存档

译文：中文

摘要

Here's a quick summary of where GNU grep gets its speed. Hopefully you can carry these ideas over to BSD grep.

#1 trick: GNU grep is fast because it AVOIDS LOOKING AT EVERY INPUT BYTE.

#2 trick: GNU grep is fast because it EXECUTES VERY FEW INSTRUCTIONS FOR EACH BYTE that it *does* look at.

GNU grep uses the well-known Boyer-Moore algorithm, which looks first for the final letter of the target string, and uses a lookup table to tell it how far ahead it can skip in the input whenever it finds a non-matching character.

GNU grep also unrolls the inner loop of Boyer-Moore, and sets up the Boyer-Moore delta table entries in such a way that it doesn't need to do the loop exit test at every unrolled step. The result of this is that, in the limit, GNU grep averages fewer than 3 x86 instructions executed for each input byte it actually looks at (and it skips many bytes entirely).

阅读原文

xiaozi 于 2022-06-27 分享

7084

欢迎在评论区写下你对这篇文章的看法。

why GNU grep is fast

why GNU grep is fast

摘要

评论

文库