
JavaScript has two ways to create a regular expression:

 var regexConst = new RegExp('abc'); // constructor form
var regexLiteral = /abc/; // literal form

Note that a literal is written between two slashes, while the constructor takes a string. Because the string is parsed before the pattern is compiled, escape sequences such as \n or \d have to be written with a doubled backslash (\\n, \\d) in the constructor form.
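A minimal sketch of the double-escaping rule, using a made-up decimal-number pattern (the variable names are only illustrative). It compares the literal and constructor forms and shows what happens when the backslashes are not doubled:

```javascript
// The same pattern written as a literal and as a constructor string.
var literal = /\d+\.\d+/;                    // single backslashes in a literal
var fromString = new RegExp('\\d+\\.\\d+');  // doubled inside the string

console.log(literal.source === fromString.source); // true
console.log(fromString.test('3.14'));              // true

// With single backslashes the string parser drops them first,
// so the pattern that reaches RegExp is d+.d+ and no longer means "digits".
var broken = new RegExp('\d+\.\d+');
console.log(broken.source);       // d+.d+
console.log(broken.test('3.14')); // false
```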

Regular expression methods

test(): a method of RegExp objects. It searches the string for the pattern and returns true if a match is found, otherwise false.

 var regex = /hello/;
var str = 'hello world';
var result = regex.test(str);
console.log(result); // true

exec(): a method of RegExp objects. It searches the string for the pattern and returns an array describing the match (the matched text first, followed by any captured groups), or null if there is no match.

 var reg=/hello/;
console.log(reg.exec('hellojs'));//['hello']
console.log(reg.exec('javascript'));//null
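When the pattern carries the g modifier, exec() resumes from the regex's lastIndex property on each call, so it can be called in a loop to visit every match. A small sketch, reusing the sample sentence from this article (the variable names are illustrative):

```javascript
var re = /javascript/ig;
var text = 'hello Javascript Javascript Javascript';
var positions = [];
var m;

// Each call continues where the previous match ended;
// exec() returns null (and resets lastIndex to 0) when nothing is left.
while ((m = re.exec(text)) !== null) {
  positions.push(m.index);
}

console.log(positions);    // [6, 17, 28]
console.log(re.lastIndex); // 0
```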

match(): a method of String objects. It searches the string for the pattern and returns an array of the results, or null if nothing matches. Note that without the global modifier g, the returned array holds only the first successful match.

 var reg1=/javascript/i;
var reg2=/javascript/ig;
console.log('hello Javascript Javascript Javascript'.match(reg1));
//['Javascript']
console.log('hello Javascript Javascript Javascript'.match(reg2));
//['Javascript','Javascript','Javascript']

search(): a method of String objects. It returns the index at which the first match starts, or -1 if there is no match.

 var reg=/javascript/i;
console.log('hello Javascript Javascript Javascript'.search(reg));//6

replace(): a method of String objects. It replaces the substrings that match the regular expression and returns the new string. Without the global modifier g, only the first match is replaced.

 var reg1=/javascript/i;
var reg2=/javascript/ig;
console.log('hello Javascript Javascript Javascript'.replace(reg1,'js'));
//hello js Javascript Javascript
console.log('hello Javascript Javascript Javascript'.replace(reg2,'js'));
//hello js js js
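The replacement does not have to be a fixed string. As a brief sketch (the sample strings are made up): $1, $2, ... in the replacement refer to parenthesized groups, and a function can be passed to compute the replacement from each match.

```javascript
var date = '2016-03-25';

// $1, $2 and $3 refer to the three parenthesized groups.
var swapped = date.replace(/(\d{4})-(\d{2})-(\d{2})/, '$2/$3/$1');
console.log(swapped); // 03/25/2016

// A function replacement receives the matched text and returns the new text.
var capitalized = 'hello javascript'.replace(/\b[a-z]/g, function (ch) {
  return ch.toUpperCase();
});
console.log(capitalized); // Hello Javascript
```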

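split(): a method of String objects. It splits a string into an array of substrings, using the matches of the regular expression as separators.

 var reg=/1[23]8/; // [23] matches 2 or 3; a comma inside [] would be matched literally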
console.log('hello128Javascript138Javascript178Javascript'.split(reg));
//['hello','Javascript','Javascript178Javascript']

Simplest regular expression

 var regex = /hello/;
console.log(regex.test('hello world'));
// true

Modifiers

The three most commonly used modifiers (flags) are i, g and m. They can be combined in any order (gi is the same as ig):
i Case-insensitive matching
g Global matching: after one match is found, keep searching until the end of the string
m Multi-line matching: ^ and $ match at the start and end of every line, not only at the start and end of the whole string

Example:

 'abc'.match(/abc/); //['abc']
'abcd'.match(/abc/); //['abc'] without /g only the first successful match is returned
'abcdabc'.match(/abc/g); //['abc', 'abc'] 
'abCabc'.match(/abc/gi); //['abC', 'abc']
'abC\nabc'.match(/abc/gim); //['abC', 'abc']  \n is a newline
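The effect of m is easiest to see with the anchors ^ and $. A short sketch with a made-up three-line string:

```javascript
var text = 'first line\nsecond line\nthird line';

// Without m, ^ and $ only match at the start and end of the whole string.
var startsPlain = text.match(/^\w+/g);
var endsPlain = text.match(/line$/g);

// With m, they also match at the start and end of every line.
var startsMulti = text.match(/^\w+/gm);
var endsMulti = text.match(/line$/gm);

console.log(startsPlain); // ["first"]
console.log(startsMulti); // ["first", "second", "third"]
console.log(endsPlain);   // ["line"]
console.log(endsMulti);   // ["line", "line", "line"]
```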

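Square brackets [] usage

  • [abc] matches any one of the characters a, b or c
  • [^abc] matches any one character that is not a, b or c
  • [a-z], [A-Z], [0-9] match any one character in the given range

Some metacharacter descriptions

  • ^ matches the start of the string (or of each line with the m modifier)
  • $ matches the end of the string (or of each line with the m modifier)
  • | separates alternatives: a|b matches a or b
  • + after an element means "one or more of it"; \d matches a digit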

Example:

 'aBcd efg'.match(/[a-z]+/); // ["a"] 
'aBcd efg'.match(/[a-z]+/i); // ["aBcd"] 
'aBcd efg'.match(/[a-z]+/g); // ["a", "cd", "efg"] 
'aBcd efg'.match(/[a-z]+/gi); // ["aBcd", "efg"] 
'aB\ncd\n efg'.match(/^[a-z]+/m); // ["a"] 
'aB\ncd\n efg'.match(/^[a-z]+/g); // ["a"] 
'aB\ncd\n efg'.match(/^[a-z]+/gm); // ["a", "cd"] // note: not ["a", "cd", "efg"], because the third line starts with a space
'adobe 2016'.match(/\d+|[a-z]+$/g); // ["2016"]
'adobe'.match(/\d+|[a-z]+$/g); // ["adobe"]
'adobe2016ps'.match(/\d+|^[a-z]+/g); // ["adobe", "2016"]

To summarize the usage of ^:

  • At the very start of the pattern, it anchors the match to the start of the string:

     'adobe 2016'.match(/^[a-zA-Z]+/); // ["adobe"]
  • As the first character inside [], it negates the character class:

     'adobe'.match(/[^abc]/g); // ["d", "o", "e"]
  • Used with |, it applies only to the alternative it appears in:

     'adobe2016ps'.match(/\d+|^[a-z]+/g); // ["adobe", "2016"]
  • To match a literal ^ character, escape it as \^:

     '12a^eee'.match(/a\^/g); // ['a^']

Two ways to match all letters regardless of case:

 'adobe-PS'.match(/[a-z]/gi); // ["a", "d", "o", "b", "e", "P", "S"]
'adobe-PS'.match(/[a-zA-Z]/g); // ["a", "d", "o", "b", "e", "P", "S"]

Some hidden concepts

Most pattern elements match a single character:

 'adobe 2016'.match(/[a-z]/g) //["a", "d", "o", "b", "e"]

With a quantifier, a match can be a multi-character string instead of a single character:

 'aBcd efg'.match(/[a-z]+/i); // ["aBcd"]

Two elements written next to each other must match two consecutive characters:

 'adobe-2016'.match(/[a-g\-]/g); // ["a", "d", "b", "e", "-"] 
// to match the hyphen - itself, escape it with a backslash
'addo2-ado12os3'.match(/o\d/g); //['o2', 'o1']

The concept of consuming characters: the result below is not ["adobe12ps", "ps15test"], because the characters ps have already been consumed by the first match

 'adobe12ps15test'.match(/[a-z]+\d+[a-z]+/g); // ["adobe12ps"]

Quantifiers are greedy by default: they match as much of the searched string as possible:

 'aBcd efg'.match(/[a-z]+/gi);// ["aBcd", "efg"]
'a3 aaa12bb aaaaaaa34'.match(/a{2,4}\d+/g); 
// ["aaa12", "aaaa34"]  a{2,4}: 2 to 4 a's; braces are covered in detail later
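For contrast, appending ? to a quantifier makes it lazy, so it matches as little as possible. A small sketch with a made-up HTML fragment (the sample string and variable names are assumptions for illustration):

```javascript
var html = '<b>bold</b> and <i>italic</i>';

// Greedy: .+ runs to the last > in the string.
var greedy = html.match(/<.+>/)[0];

// Lazy: .+? stops at the first > it can.
var lazy = html.match(/<.+?>/)[0];
var allTags = html.match(/<.+?>/g);

console.log(greedy);  // <b>bold</b> and <i>italic</i>
console.log(lazy);    // <b>
console.log(allTags); // ["<b>", "</b>", "<i>", "</i>"]
```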

Characters with special meaning

  • . matches any single character except line terminators (such as \n and \r)

     '1+0.2*2=1.4'.match(/.{2}/g);
      // ["1+", "0.", "2*", "2=", "1."]
  • \w matches any word character (number, letter, underscore), equivalent to [A-Za-z0-9_]

     'ad34~!@$ps'.match(/\w/g);
      // ["a", "d", "3", "4", "p", "s"]
  • \W matches any non-word character, the opposite of \w, equivalent to [^A-Za-z0-9_]

     'ad34~!@$ps'.match(/\W/g);
      // ["~", "!", "@", "$"]
  • \d matches digits, equivalent to [0-9]

     'ps6'.match(/\d/g);
     // ["6"]
  • \D matches non-digits, equivalent to [^0-9]

     'ps6'.match(/\D/g);
      // ["p", "s"]
  • \s matches whitespace characters (space, \n, \f, \r, \t, \v and similar). Note that inside the string literal 'a\sb' the \s is just the letter s, so 'a\sb'.match(/\s/g) returns null

     'adobe ps'.match(/\s/g);
      // [" "]
  • \S matches non-whitespace characters, the opposite of \s

     'adobe ps'.match(/\S/g);
      // ["a", "d", "o", "b", "e", "p", "s"]
  • \b matches a word boundary; note that a run of consecutive digits, letters or underscores counts as one word

     'adobe(2016) ps6.4'.match(/\b(\w+)/g);
      // ["adobe", "2016", "ps6", "4"]
  • \B matches a non-word boundary; compare the result below carefully with the \b example

     'adobe(2016) ps6.4'.match(/\B(\w+)/g);
      // ["dobe", "016", "s6"]
  • \0 matches the NUL character (character code 0)

     '\0'.match(/\0/);
     // ["\0"]
  • \n matches a newline (character code 10)

     'adobe\nps'.match(/\n/).index;
      // 5
  • \f matches a form feed

     'adobe\fps'.match(/\f/).index;
      // 5
  • \r matches a carriage return (character code 13)

     'adobe\rps'.match(/\r/).index;
      // 5
  • \t matches the tab character, the character corresponding to the keyboard tab

     'adobe\tps'.match(/\t/).index;
      // 5
  • \v matches vertical tabs

     'adobe\vps'.match(/\v/).index;
      // 5
  • \xxx matches the character specified by the octal number xxx

     'a'.charCodeAt(0).toString(8);
      // "141"
      'adobe ps'.match(/\141/g);
      // ["a"]
  • \xdd matches the character specified by the hexadecimal number dd

     'a'.charCodeAt(0).toString(16);
      // "61"
      'adobe ps'.match(/\x61/g);
      // ["a"]
  • \uxxxx matches the Unicode character specified by the four-digit hexadecimal number xxxx. If the code has fewer than four digits, pad it with leading zeros

     'a'.charCodeAt(0).toString(16);
      // "61"
      'adobe ps'.match(/\u0061/g);
      // ["a"]
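A common use of \b is matching whole words only. The following sketch uses a made-up sentence to compare a plain search with a word-bounded one:

```javascript
var s = 'cat concat category cat.';

// Plain pattern: also matches inside "concat" and "category".
var anywhere = s.match(/cat/g);

// With \b on both sides, only the standalone word matches.
var wholeWords = s.match(/\bcat\b/g);
var replaced = s.replace(/\bcat\b/g, 'dog');

console.log(anywhere.length);   // 4
console.log(wholeWords.length); // 2
console.log(replaced);          // dog concat category dog.
```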

Quantifiers

  • n+ matches one or more consecutive n

     'adobe paas'.match(/a+\w+/g);
      // ["adobe", "aas"]
  • n* matches zero or more consecutive n

     'ab3 aa12bb'.match(/a*\d+/g);
      // ["3", "aa12"]
  • n? matches zero or one n

     'ab3 aa12bb'.match(/a?\d+/g);
      // ["3", "a12"]
  • n{x} matches exactly x consecutive n

     'ab3 aa12bb aaa34'.match(/a{2}\d+/g);
      // ["aa12", "aa34"]
  • n{x,y} matches at least x and at most y consecutive n

     'a3 aaa12bb aaaaaaa34'.match(/a{2,4}\d+/g);
      // ["aaa12", "aaaa34"]
  • n{x,} matches at least x consecutive n

     'a3 aaa12bbaa4'.match(/a{2,}\d+/g);
      // ["aaa12", "aa4"]

It follows from the above that these expressions are equivalent:

  • n+ is equivalent to n{1,}
  • n* is equivalent to n{0,}
  • n? is equivalent to n{0,1}
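The shorthand quantifiers +, * and ? can each be written with braces ({1,}, {0,} and {0,1}). As a quick, non-exhaustive check, the sketch below compares both spellings on a few sample strings; sameResults is a hypothetical helper written only for this example.

```javascript
var samples = ['', 'a', 'aa', 'aaa', 'b', 'ab'];

// Returns true when both patterns produce identical match() results
// for every sample string.
function sameResults(re1, re2) {
  return samples.every(function (s) {
    return JSON.stringify(s.match(re1)) === JSON.stringify(s.match(re2));
  });
}

console.log(sameResults(/a+/g, /a{1,}/g));  // true
console.log(sameResults(/a*/g, /a{0,}/g));  // true
console.log(sameResults(/a?/g, /a{0,1}/g)); // true
```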

Parentheses () usage

  • Grouping, which also captures the matched text:
 'https://baidu.com'.match(/https:\/{2}\w+\.com$/g); //['https://baidu.com']
'https://baidu.com'.match(/(https):\/{2}\w+\.(com)$/g);//['https://baidu.com']

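Capturing means that the text matched by a group is remembered and can be read back afterwards, for example through the legacy (now deprecated) static property RegExp.$1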

  • Used with | to group alternatives:
 'https://baidu.com'.match(/(http|https):\/{2}\w+\.(com|cn)$/g);
//['https://baidu.com']
  • A non-capturing group (?:...) groups without capturing, so its text is not available through RegExp.$1

     'https://baidu.com'.match(/(?:http|https):\/{2}\w+\.(com|cn)$/g);
    //['https://baidu.com']

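Captured text can also be read directly from the array that match() returns when the g modifier is not used, which avoids the deprecated RegExp.$1. A minimal sketch based on the URL example above (an extra group around \w+ is added here for illustration):

```javascript
// Index 0 is the whole match; indexes 1, 2, ... are the capturing groups.
var m = 'https://baidu.com'.match(/(http|https):\/{2}(\w+)\.(com|cn)$/);
console.log(m[0]); // https://baidu.com
console.log(m[1]); // https
console.log(m[2]); // baidu
console.log(m[3]); // com

// A non-capturing group does not take up a slot in the result.
var n = 'https://baidu.com'.match(/(?:http|https):\/{2}\w+\.(com|cn)$/);
console.log(n.length); // 2
console.log(n[1]);     // com
```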

Usage of backslashes

A slash / must be escaped as \/ inside a regex literal, because the literal itself is delimited by two slashes. The backslash \ is the escape character, so a literal backslash is written \\, both in a pattern and in an ordinary string literal.

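 // Escaping special characters:
'11+2=13'.match(/\d+\+/g); 
//["11+"]
'(11+2)*2=26'.match(/\(\d+\+\d+\)/g); // ["(11+2)"]
// A literal backslash is written \\ in both the pattern and the string:
'path C:\Windows\System32'.match(/([a-zA-Z]:\\\w+)/g); 
// null (in the string literal, \W and \S collapse to W and S, so there is no backslash to match)
'path C:\\Windows\\System32'.match(/([a-zA-Z]:\\\w+)/g); 
// ["C:\\Windows"]
// A slash is written \/ in a regex literal:
'https://baidu.com'.match(/(http|https):\/\/\w+\.(com|cn)$/g);
//['https://baidu.com']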
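To make the two escaping rules concrete, here is a short sketch (sample strings are made up) that splits a Windows path on backslashes and a URL on slashes:

```javascript
// In the string literal, \\ produces one real backslash.
var path = 'C:\\Windows\\System32';
console.log(path);        // C:\Windows\System32
console.log(path.length); // 19

// In the pattern, \\ matches one real backslash.
var pathParts = path.split(/\\/);
console.log(pathParts);   // ["C:", "Windows", "System32"]

// In a literal, / must be escaped; in a constructor string it need not be.
var url = 'https://baidu.com/a/b';
var urlParts = url.split(/\//);
var urlParts2 = url.split(new RegExp('/'));
console.log(urlParts);    // ["https:", "", "baidu.com", "a", "b"]
```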

Question mark (?) usage

  • (?:n) is a non-capturing group

     // Without parentheses
      'adobe12ps15test'.match(/[a-z]+\d+[a-z]+/);
      // ["adobe12ps"]
      // With capturing groups
      'adobe12ps15test'.match(/[a-z]+(\d+)([a-z]+)/);
      // ["adobe12ps", "12", "ps"]
      'adobe12ps15test'.match(/[a-z]+(?:\d+)([a-z]+)/);
      // ["adobe12ps", "ps"]
      // The statement above seems to give the same result without (?:), i.e.:
      'adobe12ps15test'.match(/[a-z]+\d+([a-z]+)/);
      // ["adobe12ps", "ps"]
    
      // But when the rule between the letter runs is more complex, e.g. the letters may be separated by any run of 1s and 3s, the group is needed, and its 1 or 3 ends up in the result
      'adobe11ps15test'.match(/[a-z]+(1|3)+([a-z]+)/);
      // ["adobe11ps", "1", "ps"]
      // To keep the digits out of the result, use a non-capturing group
      'adobe11ps15test'.match(/[a-z]+(?:1|3)+([a-z]+)/);
      // ["adobe11ps", "ps"]
  • (?=n) is a positive lookahead: it matches text that is immediately followed by n, without including n in the result

     'adobe12ps15test'.match(/[a-z]+(?=\d)/g);
      // ["adobe", "ps"]
  • (?!n) is a negative lookahead: it matches text that is not immediately followed by n

     'adobe12ps15test'.match(/[a-z]+(?!\d)/g);
      // ["adob", "p", "test"]
  • (?<=n) is a positive lookbehind: it matches text that is immediately preceded by n, without including n in the result

     'adobe12ps15test'.match(/(?<=\d)[a-z]+/g);
      // ["ps", "test"]
  • (?<!n) is a negative lookbehind: it matches text that is not immediately preceded by n

     'adobe12ps15test'.match(/(?<!\d)[a-z]+/g);
      // ["adobe", "s", "est"]
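Because lookaheads match a position without consuming characters, they are handy for inserting text between characters. One well-known illustration is adding thousands separators to a string of digits; addThousandsSeparators is a hypothetical helper name, and the sketch only handles plain unsigned integers:

```javascript
// \B            : a position between two digits (not at the start of the number)
// (?=(\d{3})+   : followed by one or more groups of exactly three digits
// (?!\d))       : after which there are no further digits
function addThousandsSeparators(digits) {
  return digits.replace(/\B(?=(\d{3})+(?!\d))/g, ',');
}

console.log(addThousandsSeparators('1234567')); // 1,234,567
console.log(addThousandsSeparators('1000'));    // 1,000
console.log(addThousandsSeparators('999'));     // 999
```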

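Operator precedence

From highest to lowest:

  • \ (escape)
  • (), (?:), (?=), [] (parentheses and brackets)
  • *, +, ?, {n}, {n,}, {n,m} (quantifiers)
  • ^, $, escaped metacharacters and ordinary characters (anchors and sequences)
  • | (alternation)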

Common Regular Expressions

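 1. A string of digits, the 26 English letters, or underscores:
    ^[0-9a-zA-Z_]{1,}$
2. Non-negative integers (positive integers and 0):
    ^\d+$
3. Positive integers:
    ^[0-9]*[1-9][0-9]*$
4. Non-positive integers (negative integers and 0):
    ^((-\d+)|(0+))$
5. Negative integers:
    ^-[0-9]*[1-9][0-9]*$
6. Integers:
    ^-?\d+$
7. Non-negative floating-point numbers (positive floats and 0):
    ^\d+(\.\d+)?$
8. Positive floating-point numbers:
    ^(([0-9]+\.[0-9]*[1-9][0-9]*)|([0-9]*[1-9][0-9]*\.[0-9]+)|([0-9]*[1-9][0-9]*))$
9. Non-positive floating-point numbers (negative floats and 0):
    ^((-\d+(\.\d+)?)|(0+(\.0+)?))$
10. Negative floating-point numbers:
    ^(-(([0-9]+\.[0-9]*[1-9][0-9]*)|([0-9]*[1-9][0-9]*\.[0-9]+)|([0-9]*[1-9][0-9]*)))$
11. Floating-point numbers:
    ^(-?\d+)(\.\d+)?$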
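12. A string of the 26 English letters:
    ^[A-Za-z]+$
13. A string of the 26 uppercase English letters:
    ^[A-Z]+$
14. A string of the 26 lowercase English letters:
    ^[a-z]+$
15. A string of digits and the 26 English letters:
    ^[A-Za-z0-9]+$
16. A string of digits, the 26 English letters, or underscores:
    ^\w+$
17. Email address:
    ^[\w-]+(\.[\w-]+)*@[\w-]+(\.[\w-]+)+$
18. URL:
    ^[a-zA-Z]+://(\w+(-\w+)*)(\.(\w+(-\w+)*))*(\?\S*)?$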
19. Year-month-day:
    /^(\d{2}|\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])$/
20. Month/day/year:
    /^(0[1-9]|1[0-2])\/(0[1-9]|[12]\d|3[01])\/(\d{2}|\d{4})$/
21. Email (variant that also accepts a bracketed IP address as the domain):
    ^([\w.-]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([\w-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})(\]?)$
22. Telephone number:
    (\d+-)?(\d{4}-?\d{7}|\d{3}-?\d{8}|^\d{7,8})(-\d+)?
23. IP address:
    ^(\d{1,2}|1\d\d|2[0-4]\d|25[0-5])\.(\d{1,2}|1\d\d|2[0-4]\d|25[0-5])\.(\d{1,2}|1\d\d|2[0-4]\d|25[0-5])\.(\d{1,2}|1\d\d|2[0-4]\d|25[0-5])$
24. Chinese characters:
    [\u4e00-\u9fa5]
25. Double-byte characters (including Chinese characters):
    [^\x00-\xff]
26. Blank lines:
    \n\s*\r
27. HTML tags:
    /<(.*)>.*<\/\1>|<(.*) \/>/
28. Leading and trailing whitespace:
    (^\s*)|(\s*$)
29. Email address:
    \w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*
30. URL (backslashes doubled, for use inside a string passed to new RegExp):
    ^[a-zA-Z]+://(\\w+(-\\w+)*)(\\.(\\w+(-\\w+)*))*(\\?\\S*)?$
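31. Valid account name (starts with a letter, 5 to 16 characters, only letters, digits and underscores):
    ^[a-zA-Z][a-zA-Z0-9_]{4,15}$
32. Telephone numbers in mainland China:
    (\d{3}-|\d{4}-)?(\d{8}|\d{7})?
33. Tencent QQ numbers:
    ^[1-9]*[1-9][0-9]*$
34. Digits only:
    ^[0-9]*$
35. Exactly n digits:
    ^\d{n}$
36. At least n digits:
    ^\d{n,}$
37. Between m and n digits:
    ^\d{m,n}$
38. Zero, or a number that does not start with zero:
    ^(0|[1-9][0-9]*)$
39. Positive real numbers with two decimal places:
    ^[0-9]+(\.[0-9]{2})?$
40. Positive real numbers with 1 to 3 decimal places:
    ^[0-9]+(\.[0-9]{1,3})?$
41. Non-zero positive integers:
    ^\+?[1-9][0-9]*$
42. Non-zero negative integers:
    ^-[1-9][0-9]*$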
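43. Exactly 3 characters:
    ^.{3}$
44. Only the 26 English letters:
    ^[A-Za-z]+$
45. Only the 26 uppercase English letters:
    ^[A-Z]+$
46. Only the 26 lowercase English letters:
    ^[a-z]+$
47. Only digits and the 26 English letters:
    ^[A-Za-z0-9]+$
48. Only digits, the 26 English letters, or underscores:
    ^\w+$
49. Validate a user password (must start with a letter, be 6 to 18 characters long, and contain only letters, digits and underscores):
    ^[a-zA-Z]\w{5,17}$
50. Check for characters such as ^%&',;=?$\" (the class below matches runs of characters that are none of these):
    [^%&',;=?$\x22]+
51. Chinese characters only:
    ^[\u4e00-\u9fa5]{0,}$
52. Only Chinese characters, digits, letters and underscores, not starting or ending with an underscore:
    ^(?!_)(?!.*?_$)[a-zA-Z0-9_\u4e00-\u9fa5]+$
53. Only Chinese characters, digits, letters and underscores, with underscores allowed anywhere:
    ^[a-zA-Z0-9_\u4e00-\u9fa5]+$
54. 2 to 4 Chinese characters:
    ^[\u4E00-\u9FA5]{2,4}$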
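As a sketch of how such patterns are typically used, the snippet below wraps a few of them (integers, floating-point numbers, account names, Chinese-only text) in regex literals with the backslashes written out, and checks input with test(). The patterns object and the isValid helper are hypothetical names chosen for this example.

```javascript
var patterns = {
  integer: /^-?\d+$/,
  float: /^(-?\d+)(\.\d+)?$/,
  account: /^[a-zA-Z][a-zA-Z0-9_]{4,15}$/,
  chinese: /^[\u4e00-\u9fa5]{0,}$/
};

// Returns true when the value matches the named pattern.
function isValid(kind, value) {
  return patterns[kind].test(value);
}

console.log(isValid('integer', '-42'));     // true
console.log(isValid('integer', '4.2'));     // false
console.log(isValid('float', '3.14'));      // true
console.log(isValid('account', 'user_01')); // true
console.log(isValid('account', '1user'));   // false
console.log(isValid('chinese', '你好'));     // true
console.log(isValid('chinese', 'hello'));   // false
```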

Reference study materials:
https://cloud.tencent.com/developer/article/1498442
https://www.runoob.com/regexp/regexp-metachar.html


