why GNU grep is fast

摘要

Here's a quick summary of where GNU grep gets its speed. Hopefully you can carry these ideas over to BSD grep.

#1 trick: GNU grep is fast because it AVOIDS LOOKING AT EVERY INPUT BYTE.

#2 trick: GNU grep is fast because it EXECUTES VERY FEW INSTRUCTIONS FOR EACH BYTE that it *does* look at.

GNU grep uses the well-known Boyer-Moore algorithm, which looks first for the final letter of the target string, and uses a lookup table to tell it how far ahead it can skip in the input whenever it finds a non-matching character.

GNU grep also unrolls the inner loop of Boyer-Moore, and sets up the Boyer-Moore delta table entries in such a way that it doesn't need to do the loop exit test at every unrolled step. The result of this is that, in the limit, GNU grep averages fewer than 3 x86 instructions executed for each input byte it actually looks at (and it skips many bytes entirely).

欢迎在评论区写下你对这篇文章的看法。

评论

- 위키
Copyright © 2011-2024 iteam. Current version is 2.137.1. UTC+08:00, 2024-11-08 22:36
浙ICP备14020137号-1 $방문자$