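Regex Lookaround Assertions

Published on 12/17/2019, 6:14:14 AM

Terms {#terms}

All of the following are lookaround assertions. Like the (?:) non-capturing group, they do not capture what they match. Unlike it, they also consume no characters, so the asserted text never becomes part of the overall match.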

(?=pattern) Positive Lookahead Assert
(?<=pattern) Positive Lookbehind Assert
(?!pattern) Negative Lookahead Assert
(?<!pattern) Negative Lookbehind Assert

Examples

1. Ordinary capturing regex

/windows(95|NT|xp)/.exec("windows95OtherString");

console:

[screenshot: console output for the ordinary capturing regex]

regex101:

[screenshot: regex101 breakdown of the ordinary capturing regex]
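Because the screenshots are not reproduced here, the following minimal sketch shows what the call returns in a JavaScript engine. The variable name captureMatch is only illustrative and does not come from the original post.

```javascript
// Capturing group: the alternative that matched is stored as group 1.
const captureMatch = /windows(95|NT|xp)/.exec("windows95OtherString");

console.log(captureMatch[0]);    // "windows95" (the full match includes the version)
console.log(captureMatch[1]);    // "95" (captured by the parentheses)
console.log(captureMatch.index); // 0
```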

2. Positive Lookahead

/windows(?=95|NT|xp)/.exec("windows95hahahah");

console:

[screenshot: Positive Lookahead console output]

regex101:

[screenshot: Positive Lookahead regex101 breakdown]
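As a rough sketch of the result (variable names are illustrative), the match stops at windows even though the 95 had to be present for the match to succeed:

```javascript
// Positive lookahead: "windows" matches only if 95, NT, or xp follows.
const aheadRe = /windows(?=95|NT|xp)/;
const aheadMatch = aheadRe.exec("windows95hahahah");

console.log(aheadMatch[0]);     // "windows" (the "95" is checked but not consumed)
console.log(aheadMatch.length); // 1 (no capture group was created)

// Without one of the three suffixes the assertion fails.
console.log(aheadRe.exec("windows10")); // null
```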

3. Positive Lookbehind

/(?<=95|NT|xp)windows/.exec("NTwindows");

console:

[screenshot: Positive Lookbehind console output]

regex101:

[screenshot: Positive Lookbehind regex101 breakdown]
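A minimal sketch of the same call, assuming an engine that supports lookbehind (it was added to JavaScript in ES2018); the variable names are illustrative:

```javascript
// Positive lookbehind: "windows" matches only if 95, NT, or xp precedes it.
const behindRe = /(?<=95|NT|xp)windows/;
const behindMatch = behindRe.exec("NTwindows");

console.log(behindMatch[0]);    // "windows" (the "NT" is not part of the match)
console.log(behindMatch.index); // 2 (the match starts right after "NT")

// With nothing suitable in front, there is no match.
console.log(behindRe.exec("windows")); // null
```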

4. Negative Lookahead

/windows(?!95|NT|xp)/.exec("windows10heihei");

console:

[screenshot: Negative Lookahead console output]

regex101:

[screenshot: Negative Lookahead regex101 breakdown]
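The sketch below (illustrative names only) shows the negative form accepting anything except the three listed suffixes, including the end of the string:

```javascript
// Negative lookahead: "windows" matches only if 95, NT, or xp does NOT follow.
const notAheadRe = /windows(?!95|NT|xp)/;

console.log(notAheadRe.exec("windows10heihei")[0]); // "windows"
console.log(notAheadRe.exec("windowsNT"));          // null
console.log(notAheadRe.test("windows"));            // true (nothing follows at all)
```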

5. Negative Lookbehind

/(?<!95|NT|xp)windows/.exec("haha10windows");

console:

[screenshot: Negative Lookbehind console output]

regex101:

[screenshot: Negative Lookbehind regex101 breakdown]
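A short sketch of the outcome, again assuming an ES2018-capable engine and using illustrative variable names:

```javascript
// Negative lookbehind: "windows" matches only if 95, NT, or xp does NOT precede it.
const notBehindRe = /(?<!95|NT|xp)windows/;
const notBehindMatch = notBehindRe.exec("haha10windows");

console.log(notBehindMatch[0]);    // "windows"
console.log(notBehindMatch.index); // 6 (right after "haha10")
console.log(notBehindRe.exec("NTwindows")); // null
```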

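Summary

The ? right after the opening parenthesis marks the group as non-capturing.

Taking left-to-right as the forward direction, lookahead means the pattern inside the parentheses sits ahead of (to the right of) the matched string. Lookbehind means the pattern inside the parentheses sits behind (to the left of) the matched string, and < is the marker for "behind".

Positive and Negative state whether the pattern inside the parentheses must match or must not match. Their symbols are = and ! respectively.

Because all of them are non-capturing groups that consume no characters, the resulting match never includes the text checked inside the parentheses. In the examples above, the match is just windows.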
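Two consequences of this can be sketched briefly. The first call shows that the asserted text stays in the string; the second shows a nuance: the lookaround itself does not capture, but a capturing group written inside it still does. The strings and variable names here are illustrative, not taken from the original examples.

```javascript
// The lookahead checks "95" but leaves it in place, so only "windows" is replaced.
const replaced = "windows95".replace(/windows(?=95|NT|xp)/, "win");
console.log(replaced); // "win95"

// A capturing group inside the lookahead still records what it saw.
const innerMatch = /windows(?=(95|NT|xp))/.exec("windowsNT");
console.log(innerMatch[0]); // "windows"
console.log(innerMatch[1]); // "NT"
```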

Thousands separator: Positive Lookahead Assert {#example2}

"12345678.32423432".replace(/(\d)(?=(\d{3})+\.)/g, "$1,");

Explanation:

[screenshot: regex101 breakdown of the regex above]
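In place of the screenshot, here is a rough walk-through of the thousands-separator regex. Each match is a single digit, captured as $1, and the lookahead only verifies that whole groups of three digits followed by a decimal point come next. The variable names are illustrative.

```javascript
const amount = "12345678.32423432";
const separatorRe = /(\d)(?=(\d{3})+\.)/g;

// List every matched digit together with its index in the string.
const hits = [...amount.matchAll(separatorRe)].map((m) => `${m[1]}@${m.index}`);
console.log(hits); // ["2@1", "5@4"]

// Each matched digit is written back followed by a comma.
const formatted = amount.replace(separatorRe, "$1,");
console.log(formatted); // "12,345,678.32423432"
```

Digits after the decimal point are never matched, because no further "." follows them.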

The thousands-separator regex above has no effect on strings without a decimal point. A more complete approach is the following:

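export function toThousands(num) {
  // Only the first run of digits (the integer part) is rewritten.
  return num.toString().replace(/\d+/, (n) => {
    // Insert a comma at every inner position followed by whole groups of three digits.
    return n.replace(/\B(?=((\d{3})+\b))/g, ",");
  });
}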

toThousands(12345.489101);
// returns '12,345.489101'
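To compare the two approaches, the sketch below restates toThousands without the export keyword so it runs as a plain script, then feeds both versions an input with no decimal point. The extra inputs are illustrative and not from the original post.

```javascript
function toThousands(num) {
  return num.toString().replace(/\d+/, (n) => {
    return n.replace(/\B(?=((\d{3})+\b))/g, ",");
  });
}

// The first regex depends on a "." and leaves an integer string untouched.
const dotOnly = "12345678".replace(/(\d)(?=(\d{3})+\.)/g, "$1,");
console.log(dotOnly); // "12345678"

// toThousands handles integers, negatives, and small values alike.
console.log(toThousands(12345678));    // "12,345,678"
console.log(toThousands(-9876543.21)); // "-9,876,543.21"
console.log(toThousands(0.5));         // "0.5"
```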

Copilot explains it as follows:

Here’s a breakdown of how it works:

num.toString(): This converts the input number to a string. This is necessary because replace is a string method and doesn’t work directly on numbers.
.replace(/\d+/, (n) => {...}): This uses a regular expression to match the first run of one or more digits in the string, which is the integer part. It calls a function that takes the matched string n as an argument and returns the replacement string.
Inside the function, n.replace(/\B(?=((\d{3})+\b))/g, ",") is used to add commas in the appropriate places.
- \B matches a position that is not a word boundary, i.e. one where the previous and next characters are either both word characters or both non-word characters. This ensures no comma is inserted at the beginning of the string.
- (?=((\d{3})+\b)) is a positive lookahead that checks what lies ahead in the string without including it in the result. It requires one or more groups of three digits (\d{3}) that are followed by a word boundary (\b). It therefore succeeds only at positions where the number of remaining digits is a multiple of three.
- /g is the global flag for the regular expression, which means the replacement will happen for all matches in the string, not just the first one.
- "," is the replacement string, which is inserted at each matched position (i.e., each place where a comma should go).

So, if you call toThousands(1234567), the function will return the string "1,234,567". This function can be useful in a variety of applications where you need to display large numbers in a more readable format.
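The role of \B is easiest to see on a digit count that is an exact multiple of three. This is a small sketch with illustrative names, using the same lookahead with and without \B:

```javascript
const withB = /\B(?=(\d{3})+\b)/g;
const withoutB = /(?=(\d{3})+\b)/g;

// Without \B, position 0 also satisfies the lookahead, so a leading comma appears.
const noGuard = "123456".replace(withoutB, ",");
console.log(noGuard); // ",123,456"

// With \B, the start of the string is a word boundary and is skipped.
const guarded = "123456".replace(withB, ",");
console.log(guarded); // "123,456"

const sevenDigits = "1234567".replace(withB, ",");
console.log(sevenDigits); // "1,234,567"
```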

A more typical example of lookahead vs. lookbehind

"hello world".replace(/(?=hello)/g, ",");
// returns ',hello world'

"hello world".replace(/(?<=hello)/g, ",");
// returns 'hello, world'

As you can see, (?=hello) matches the position just before hello, while (?<=hello) matches the position just after hello. That position is where the , is inserted.
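That these matches are positions and not characters can be checked directly: each match is the empty string, and only its index differs. A minimal sketch with illustrative variable names:

```javascript
const text = "hello world";
const beforeHello = /(?=hello)/.exec(text);
const afterHello = /(?<=hello)/.exec(text);

// Both matches are zero-width: the matched text is the empty string.
console.log(beforeHello.index, beforeHello[0].length); // 0 0
console.log(afterHello.index, afterHello[0].length);   // 5 0
```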

Reference

https://segmentfault.com/q/1010000004651380
Complete PCRE expression reference
Regex101: the best regex-explaining tool site I have come across so far. Another option is RegExr.
