Threats of Using Regular Expressions in JavaScript
Regular expressions (regex) are widely used in web development for pattern matching and validation. In practice, however, they carry security and performance risks that attackers can exploit.
So, in this article, I will discuss two fundamental issues you need to be aware of before using regular expressions in JavaScript.
Regex engines are generally built on one of two matching algorithms:
- Deterministic Finite Automaton (DFA): checks each character in a string only once.
- Nondeterministic Finite Automaton (NFA): may check the same character multiple times, backtracking through alternative ways to match until it finds a match or runs out of options.
JavaScript uses the NFA approach in its regex engine, implemented with backtracking, and for certain patterns this behavior leads to catastrophic backtracking.
To get a better understanding, let's consider the following regex: (g|i+)+t
This regex seems simple, but don't underestimate it: it can cost you a lot. So first, let's understand the meaning behind this regex.
- (g|i+): a group that matches either a single 'g' or one or more occurrences of 'i'.
- The next '+' requires one or more repetitions of the previous group.
- The match must then end with the letter 't'.
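To make the breakdown concrete, here is a minimal sketch that builds the pattern from the three parts described above. The anchored variant and the variable names are my own assumptions for comparison, not part of the original example. Note that without the ^ and $ anchors, the pattern can match anywhere inside a longer string.

```javascript
// Pattern assembled from the breakdown above: group, '+', then 't'.
const pattern = /(g|i+)+t/;

// Hypothetical anchored variant (an assumption, for comparison only):
// it forces the whole string to fit the pattern.
const anchored = /^(g|i+)+t$/;

console.log(pattern.test("git"));         // true
console.log(pattern.test("xx-git-yy"));   // true: the unanchored pattern finds "git" inside
console.log(anchored.test("xx-git-yy"));  // false: extra characters around the match
console.log(anchored.test("giiigt"));     // true: g, iii, g, then t
```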
The following texts will evaluate as valid under the above regex.
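As a rough illustration, the snippet below checks a few sample strings against the pattern. The sample inputs are my own illustrative choices, picked to fit the description of the regex.

```javascript
const pattern = /(g|i+)+t/;

// Illustrative inputs (assumptions): each is a run of 'g' and 'i' followed by 't'.
const samples = ["git", "giit", "iiit", "gggiiigt"];

// Map every sample to whether the pattern accepts it.
const results = samples.map((text) => pattern.test(text));

samples.forEach((text, index) => {
  console.log(text, results[index]);
});
```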
Now let's check the time taken to execute the above regex on a valid string. I will use the console.time() method.
Valid text
Here we can see that the execution is pretty fast, even though the string is a bit long.
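The measurement described above can be sketched roughly as follows. The input string and the timer label are assumptions of mine. The exact timing depends on your machine and runtime, so I am not quoting a number here.

```javascript
const pattern = /(g|i+)+t/;

// Illustrative valid input (an assumption): "gigigi...gi" followed by 't'.
const validText = "gi".repeat(20) + "t";

// console.time / console.timeEnd print the elapsed time under a shared label.
console.time("valid text");
const matched = pattern.test(validText);
console.timeEnd("valid text");

console.log(matched); // true
```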
But you will be surprised when you see the time taken to validate an invalid text.
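To see the effect for yourself, here is a minimal sketch that times the same regex on strings that cannot match: a run of 'i' characters followed by a letter other than 't'. The inputs, lengths, and the isValid helper are assumptions of mine. In a backtracking engine, each extra 'i' roughly doubles the number of ways the nested quantifiers can split the run, so the lengths are kept small to stop the script from hanging.

```javascript
const pattern = /(g|i+)+t/;

// Hypothetical helper: true when the pattern finds a match in the text.
function isValid(text) {
  return pattern.test(text);
}

// Illustrative lengths (assumptions). Increase them with care:
// the work grows roughly exponentially with the number of 'i' characters.
const lengths = [14, 18, 22];

for (const length of lengths) {
  const invalidText = "i".repeat(length) + "v"; // never reaches a 't'
  const label = `invalid text, ${length} i's`;

  console.time(label);
  const matched = isValid(invalidText);
  console.timeEnd(label);

  console.log(matched); // false
}
```

If you run this, compare how the reported time changes from one length to the next rather than looking at any single value.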
In the be...