A Concise Regular Expression Tutorial

Introduction and Examples

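A regular expression describes a pattern for matching strings. It can be used to extract substrings of a specific format from a larger string. A regular expression is a text pattern made up of ordinary characters and special characters (metacharacters).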

1. Extracting the digits

// Extract the digits from the string "abc123def"
var str = "abc123def";
var patt1 = /[0-9]+/;
document.write(str.match(patt1));

// Output: 123

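2. Finding adjacent repeated words

// Is is the cost of of gasoline going up up?
// Find every pair of adjacent identical words in the string above (case-insensitive)

var str = "Is is the cost of of gasoline going up up";
var patt1 = /\b([a-z]+) \1\b/ig;
document.write(str.match(patt1));

// Result (document.write prints the array joined by commas)
Is is,of of,up up

Notes:
The two \b each mark a word boundary;
[a-z]+ matches one word;
([a-z]+) captures that word and stores it;
\1 refers back to the first stored capture, so the same word must appear again.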
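
To make the role of the capture group more concrete, the sketch below lists each match together with the word that group 1 stored. It uses `String.prototype.matchAll` and `console.log` instead of the `document.write` calls above, and the variable names are illustrative only.

```javascript
// Sketch: list each repeated pair together with the word stored by group 1.
const sentence = "Is is the cost of of gasoline going up up";
const repeatedWord = /\b([a-z]+) \1\b/ig;

// matchAll requires the g flag and yields one match object per hit.
const pairs = [...sentence.matchAll(repeatedWord)].map((m) => ({
  full: m[0], // the whole matched text, e.g. "Is is"
  word: m[1], // what ([a-z]+) captured, e.g. "Is"
}));

for (const pair of pairs) {
  console.log(pair.full + " -> group 1: " + pair.word);
}
```

Because of the `i` flag, the backreference `\1` is also compared case-insensitively, which is why "Is is" counts as a repeat.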

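3. Parsing a URL

var str = "http://www.runoob.com:80/html/html-tutorial.html";
// Groups: 1 = protocol, 2 = host, 3 = optional port (including the colon), 4 = path
var patt1 = /(\w+):\/\/([^/:]+)(:\d*)?([^# ]*)/;
var arr = str.match(patt1);
// arr[0] is the whole match; arr[1] to arr[4] are the captured groups
for (var i = 0; i < arr.length ; i++) {
    document.write(arr[i]);
    document.write("<br>");
}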
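
The loop above prints the groups without saying which is which. As a rough sketch, the same match can be destructured into named variables; the names `protocol`, `host`, `port` and `path` are labels chosen here for illustration, not part of the original example.

```javascript
// Sketch: give names to the pieces captured by the URL pattern.
const url = "http://www.runoob.com:80/html/html-tutorial.html";
const urlPattern = /(\w+):\/\/([^/:]+)(:\d*)?([^# ]*)/;

// Without the g flag, match() returns [wholeMatch, group1, group2, ...].
// The leading comma skips the whole match.
const [, protocol, host, port, path] = url.match(urlPattern);

console.log(protocol); // "http"
console.log(host);     // "www.runoob.com"
console.log(port);     // ":80" (the colon is inside the group)
console.log(path);     // "/html/html-tutorial.html"
```

If the URL has no port, group 3 does not participate in the match and `port` is `undefined`.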

4. Two ways to create a regular expression

<!DOCTYPE html>
<html>

<head>
    <meta charset="utf-8">
    <title>smallpdf.cn</title>
</head>

<body>

    <script>
        // Two ways to create a regular expression (patt1 is equivalent to patt2)
        var str = "Is is the cost of of gasoline going up up";
        var patt1 = /\b([a-z]+) \1\b/ig;
        document.write("Example 1: ", str.match(patt1));

        document.write("<br><br>");
        // In the string form, every backslash must be doubled
        var patt2 = new RegExp("\\b([a-z]+) \\1\\b", "ig");
        document.write("Example 2: " + str.match(patt2));

    </script>

</body>

</html>
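
The page above relies on the two forms being equivalent. A small sketch, run outside the browser with `console.log`, can check that claim and show what goes wrong when the backslashes in the string form are not doubled. The `forgotten` pattern is a deliberately broken example added here for illustration.

```javascript
// Sketch: a regex literal and the RegExp constructor produce the same pattern
// as long as each backslash is doubled inside the string.
const literal = /\b([a-z]+) \1\b/ig;
const constructed = new RegExp("\\b([a-z]+) \\1\\b", "ig");

const sameSource = literal.source === constructed.source; // true
const sameFlags = literal.flags === constructed.flags;    // true, both "gi"

// With single backslashes, "\b" inside a string is a backspace character
// (U+0008), so the pattern no longer contains a word-boundary assertion.
const forgotten = new RegExp("\bup\b");
const forgottenHit = forgotten.test("going up");              // false
const doubledHit = new RegExp("\\bup\\b").test("going up");   // true

console.log(sameSource, sameFlags, forgottenHit, doubledHit);
```

The constructor form is mainly useful when the pattern has to be assembled from strings at run time; for fixed patterns the literal form avoids the double escaping.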

5. Global and non-global matching

<!DOCTYPE html>
<html>

<head>
    <meta charset="utf-8">
    <title>smallpdf.cn</title>
</head>

<body>

    <script>

        var str = "Google smallpdf.cn taobao smallpdf.cn";
        // The dot is escaped so that it matches only a literal "."
        var n1 = str.match(/smallpdf\.cn/);   // find the first match only
        var n2 = str.match(/smallpdf\.cn/g);  // find all matches

        document.write("Example 1: ", n1);
        document.write("<br><br>");
        document.write("Example 2: ", n2);

    </script>

</body>

</html>
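
The two calls return differently shaped results, which the page output does not make obvious. The following sketch, using `console.log` in place of `document.write`, shows the difference and also why the dot in the pattern is worth escaping; the strings "no hit" and "smallpdfXcn" are made-up inputs for illustration.

```javascript
// Sketch: what match() returns with and without the g flag.
const text = "Google smallpdf.cn taobao smallpdf.cn";

const first = text.match(/smallpdf\.cn/);  // first match, with extra details
const all = text.match(/smallpdf\.cn/g);   // every match, plain strings only

console.log(first[0], first.index); // "smallpdf.cn" 7
console.log(all);                   // ["smallpdf.cn", "smallpdf.cn"]

// When nothing matches, both forms return null rather than an empty array.
const none = "no hit".match(/smallpdf\.cn/g); // null

// An unescaped dot matches any character, so it accepts more than intended.
const looseHit = /smallpdf.cn/.test("smallpdfXcn");   // true
const strictHit = /smallpdf\.cn/.test("smallpdfXcn"); // false
console.log(none, looseHit, strictHit);
```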

6. Matching e-mail addresses

<!DOCTYPE html>
<html>

<head>
    <meta charset="utf-8">
    <title>smallpdf.cn</title>
</head>

<body>

    <script>
        var str = "abcd test@runoob.com 1234";
        var patt1 = /\b[\w.%+-]+@[\w.-]+\.[a-zA-Z]{2,6}\b/g;
        document.write(str.match(patt1));
    </script>

</body>

</html>
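
As a quick check of what this pattern accepts and rejects, the sketch below runs it over a sentence containing several candidates. The sample addresses are invented for illustration, and the pattern is a pragmatic filter, not a full validator for every legal address.

```javascript
// Sketch: extract e-mail-like substrings with the pattern from the example above.
const emailPattern = /\b[\w.%+-]+@[\w.-]+\.[a-zA-Z]{2,6}\b/g;

// Hypothetical input: two plausible addresses and one without a top-level domain.
const sample =
  "Contact test@runoob.com or first.last+tag@mail.example.org, not user@localhost.";

const found = sample.match(emailPattern);
console.log(found);
// ["test@runoob.com", "first.last+tag@mail.example.org"]
```

`user@localhost` is skipped because the pattern requires a dot followed by two to six letters after the host part.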

Regular Expression Syntax

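1. Anchors

Anchors pin a regular expression to a position: the start of a line, the end of a line, inside a word, the start of a word, or the end of a word. Anchors cannot be combined with quantifiers. For example, `^*` is an error: a string has exactly one start, so "zero or more starts" has no meaning.

Pattern	Meaning	String	Regex	Result
`^`	Start of the string (or of each line with the `m` flag)	"An E"	`/^A/`	'A'
`$`	End of the string (or of each line with the `m` flag)	"eat"	`/t$/`	't'
`\b`	Word boundary (start or end of a word)	"moon"	`/\bm/`	'm' (finds a word starting with m)
`\B`	Non-boundary position	"noonday"	`/\Boo/`	'oo' (oo inside a word, not at a word boundary)
`/`	Delimiter that opens and closes a regex literal
`\`	Escape character; escapes the character that follows it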
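
The rows above can be checked directly. The sketch below reruns the table's examples, marks every word boundary in a made-up string ("full moon") to show that `\b` matches a position and not a character, and confirms that a quantified anchor is rejected when the pattern is compiled.

```javascript
// Sketch: the anchor examples from the table, plus two illustrative checks.
const startMatch = "An E".match(/^A/)[0];       // "A"
const endMatch = "eat".match(/t$/)[0];          // "t"
const wordStart = "moon".match(/\bm/)[0];       // "m"
const insideWord = "noonday".match(/\Boo/)[0];  // "oo"

// Inserting a marker at each \b shows where word boundaries are.
const boundaries = "full moon".replace(/\b/g, "|"); // "|full| |moon|"

// Quantifying an anchor such as ^ fails when the pattern is built.
let anchorError = null;
try {
  new RegExp("^*");
} catch (err) {
  anchorError = err.name; // "SyntaxError"
}

console.log(startMatch, endMatch, wordStart, insideWord, boundaries, anchorError);
```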

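2. Character classes

Pattern	Meaning	String	Regex	Result
`\d`	Matches one digit; same as `[0-9]`	"B2 is the suite number."	`/\d/`	'2'
`\D`	Matches one non-digit character; same as `[^0-9]`	"B2 is the suite number."	`/\D/`	'B'
`\w`	Matches one word character (letter, digit, or underscore); same as `[A-Za-z0-9_]`	"apple,"	`/\w/`	'a'
`\W`	Matches one non-word character; same as `[^A-Za-z0-9_]`	"50%."	`/\W/`	'%'
`\s`	Matches one whitespace character (space, tab, form feed, line feed, and so on)	"foo bar."	`/\s\w*/`	' bar'
`\S`	Matches one non-whitespace character	"foo bar."	`/\S\w*/`	'foo'
`.`	Matches any single character except line terminators; same as `[^\n\r\u2028\u2029]` in JavaScript	"nay, an apple is on the tree"	`/.n/`	'an', 'on'
`[abc]`	Matches any one of a, b, or c; inside brackets, `*` and `.` stand for themselves and have no special meaning	"asdfiobab"	`/[abc]/`	'a', 'b', 'a', 'b'
`[^abc]`	Matches any single character other than a, b, or c
`[A-Z]`	Matches any one character from A to Z
`[a-z]`	Matches any one character from a to z
`[0-9]`	Matches any one digit from 0 to 9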
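
Several results in the table list more than one match, which only happens with the `g` flag. The sketch below reruns a few rows with that flag made explicit; the inputs "a.b*c" and "abcd" are extra strings added here for illustration.

```javascript
// Sketch: character-class examples, with the g flag where the table lists
// several results.
const suite = "B2 is the suite number.";
const firstDigit = suite.match(/\d/)[0];    // "2"
const firstNonDigit = suite.match(/\D/)[0]; // "B"

const anyOfAbc = "asdfiobab".match(/[abc]/g);               // ["a", "b", "a", "b"]
const beforeN = "nay, an apple is on the tree".match(/.n/g); // ["an", "on"]

// Inside brackets, "." and "*" are literal characters.
const literals = "a.b*c".match(/[.*]/g); // [".", "*"]

// A leading ^ inside brackets negates the set.
const notAbc = "abcd".match(/[^abc]/g);  // ["d"]

console.log(firstDigit, firstNonDigit, anyOfAbc, beforeN, literals, notAbc);
```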

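3. Quantifiers

Pattern	Meaning	String	Regex	Result
`?`	Matches 0 or 1 times; same as `{0,1}`	"angel"	`/e?le?/`	'el'
`*`	Matches 0 or more times; same as `{0,}`	"<p>smallpdf.cn</p>"	`/<.*>/`	'<p>smallpdf.cn</p>'
`*?`	Lazy (non-greedy) version of `*`; matches as few characters as possible	"<p>smallpdf.cn</p>"	`/<.*?>/`	'<p>' and '</p>'
`+`	Matches 1 or more times; same as `{1,}`	"<p>smallpdf.cn</p>"	`/<.+>/`	'<p>smallpdf.cn</p>'
`+?`	Lazy (non-greedy) version of `+`; matches as few characters as possible	"<p>smallpdf.cn</p>"	`/<.+?>/`	'<p>' and '</p>'
`{n}`	n is a non-negative integer; matches exactly n times
`{n,}`	n is a non-negative integer; matches at least n times
`{n,m}`	n and m are non-negative integers with n ≤ m; matches at least n and at most m times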
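
The greedy and lazy rows are easiest to compare side by side. The sketch below does that and adds two counted-repetition examples; the date string and the "aaaa" input are made up for illustration.

```javascript
// Sketch: greedy versus lazy quantifiers, plus counted repetition.
const html = "<p>smallpdf.cn</p>";

const greedy = html.match(/<.*>/)[0]; // "<p>smallpdf.cn</p>" (as much as possible)
const lazy = html.match(/<.*?>/g);    // ["<p>", "</p>"]      (as little as possible)

const optional = "angel".match(/e?le?/)[0]; // "el"

// {n} and {n,m} count repetitions of the preceding item.
const isDate = /^\d{4}-\d{2}-\d{2}$/.test("2024-01-15"); // true
const upToThree = "aaaa".match(/a{2,3}/)[0];             // "aaa"

console.log(greedy, lazy, optional, isDate, upToThree);
```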

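4. Alternation, groups, and lookaround

Pattern	Meaning	String	Regex	Result
`x|y`	Matches x or y	"red apple"	`/green|red/`	'red'
`(x)`	Matches x and stores the match (capture group); `\number` refers back to a stored value, with `\1` being the first	See the next row
`\num`	Refers back to the num-th stored capture; num is an integer starting at 1	"apple, orange, cherry, peach."	`/apple(,)\sorange\1/`	'apple, orange,'
`(?:x)`	Matches x without storing the match (non-capturing group)	`industry|industries` is equivalent to `industr(?:y|ies)`
`x(?=y)`	Matches x only if it is followed by y; y is not stored	"JackSpa"	`/Jack(?=Spa)/`	'Jack'
`x(?!y)`	Matches x only if it is not followed by y; y is not stored	"JackSp"	`/Jack(?!Spa)/`	'Jack'
`(?<=y)x`	Matches x only if it is preceded by y; y is not stored	"JackSpa"	`/(?<=Jack)Spa/`	'Spa'
`(?<!y)x`	Matches x only if it is not preceded by y; y is not stored	"JacSpa"	`/(?<!Jack)Spa/`	'Spa'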
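
The difference between capturing and non-capturing groups shows up in the length of the match array, and lookarounds show up in what is left out of the match. The sketch below checks both; it assumes an engine with lookbehind support (added in ES2018), and "cost $42" is a made-up input.

```javascript
// Sketch: alternation, capturing versus non-capturing groups, and lookarounds.
const alternation = "red apple".match(/green|red/)[0]; // "red"

// A capturing group adds an entry to the result; (?:...) does not.
const capturing = "industries".match(/industr(y|ies)/);      // ["industries", "ies"]
const nonCapturing = "industries".match(/industr(?:y|ies)/); // ["industries"]

// Lookarounds test the surrounding text without including it in the match.
const lookahead = "JackSpa".match(/Jack(?=Spa)/)[0];     // "Jack"
const negLookahead = "JackSp".match(/Jack(?!Spa)/)[0];   // "Jack"
const lookbehind = "JackSpa".match(/(?<=Jack)Spa/)[0];   // "Spa"
const negLookbehind = "JacSpa".match(/(?<!Jack)Spa/)[0]; // "Spa"

// Illustrative use: take the digits after a dollar sign, without the sign.
const amount = "cost $42".match(/(?<=\$)\d+/)[0]; // "42"

console.log(alternation, capturing.length, nonCapturing.length, amount);
```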

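5. Non-printing characters

Pattern	Meaning
`[\b]`	Matches a backspace (U+0008)
`\f`	Matches a form feed (U+000C)
`\n`	Matches a line feed (U+000A)
`\r`	Matches a carriage return (U+000D)
`\t`	Matches a horizontal tab (U+0009)
`\v`	Matches a vertical tab (U+000B)
`\0`	Matches the NUL character (U+0000). Do not follow it with another digit, because `\0<digits>` is an octal escape sequence.
`\xhh`	Matches the character with the two-digit hexadecimal code hh (\x00-\xFF)
`\uhhhh`	Matches the UTF-16 code unit with the four-digit hexadecimal value hhhh
`\u{hhhh}`	Matches the Unicode character with the hexadecimal code point hhhh (requires the `u` flag)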
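
A few of these escapes are common in everyday string handling. The sketch below uses them to split tab-separated and line-separated text, and checks the hexadecimal and Unicode forms; all input strings are made up for illustration.

```javascript
// Sketch: non-printing and numeric escapes in use.
const fields = "name\tage\tcity".split(/\t/);          // ["name", "age", "city"]
const rows = "line1\r\nline2\nline3".split(/\r?\n/);   // ["line1", "line2", "line3"]

const hexA = /\x41/.test("A");    // true, U+0041 is "A"
const unitA = /\u0041/.test("A"); // true

// \u{...} takes a full code point and needs the u flag.
const emoji = /\u{1F600}/u.test("\u{1F600}"); // true

// [\b] is a backspace; a bare \b is still the word-boundary assertion.
const backspace = /[\b]/.test("\b");   // true
const boundaryOnly = /\b/.test("\b");  // false, no word characters in the input

console.log(fields, rows, hexA, unitA, emoji, backspace, boundaryOnly);
```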

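6. Flags

Flag	Meaning
`g`	Global search: finds all matches in the string instead of stopping at the first one.
`i`	Case-insensitive matching.
`m`	Multiline search: `^` and `$` also match at the start and end of each line.
`s`	Allows `.` to match line terminators.
`u`	Unicode mode: the pattern and input are treated as sequences of Unicode code points.
`y`	Sticky search: a match is attempted only at the position given by the regex's `lastIndex` in the target string.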
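
The effect of `m`, `s`, `u` and `y` is easier to see with and without each flag on the same input. The sketch below does this with made-up strings; `"\u{1F600}"` is an emoji that occupies two UTF-16 code units.

```javascript
// Sketch: the same pattern with and without a flag.
const twoLines = "first\nsecond";
const withoutM = twoLines.match(/^\w+$/g); // null, ^ and $ only match at the string ends
const withM = twoLines.match(/^\w+$/gm);   // ["first", "second"]

const dotDefault = /a.b/.test("a\nb"); // false, "." skips line terminators
const dotAll = /a.b/s.test("a\nb");    // true

const smiley = "\u{1F600}";
const unitMode = /^.$/.test(smiley);     // false, "." sees one of two code units
const unicodeMode = /^.$/u.test(smiley); // true, "." sees one code point

// Sticky matching only tries the position stored in lastIndex.
const sticky = /foo/y;
sticky.lastIndex = 3;
const stickyHit = sticky.test("barfoo");  // true, "foo" starts at index 3
sticky.lastIndex = 1;
const stickyMiss = sticky.test("barfoo"); // false, no "foo" at index 1

console.log(withoutM, withM, dotDefault, dotAll, unitMode, unicodeMode, stickyHit, stickyMiss);
```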

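7. Operator precedence

A regular expression is evaluated from left to right, with higher-precedence operators applied first; operators of equal precedence are applied left to right. The table below lists operators from highest to lowest precedence, and operators on the same row have equal precedence:

Operator
`\`
`()` `[]`
`*` `+` `?` `{n}` `{n,}` `{n,m}`
`^` `$` `\metacharacter` and ordinary characters (anchors and sequences)
`|`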
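
Two consequences of this ordering cause most surprises: alternation binds more loosely than anchors and sequences, and a quantifier applies only to the item directly before it. The sketch below shows both with made-up inputs.

```javascript
// Sketch: how precedence changes what a pattern means.

// "|" has the lowest precedence, so this reads as (^cat) or (dog$).
const loose = /^cat|dog$/;
// Grouping makes the anchors apply to both alternatives.
const strict = /^(?:cat|dog)$/;

const looseHit = loose.test("catalog");   // true, the string starts with "cat"
const strictHit = strict.test("catalog"); // false
const strictDog = strict.test("dog");     // true

// A quantifier binds tighter than a sequence: in /ab+/ only "b" repeats.
const lastCharRepeated = "ababab".match(/ab+/)[0];   // "ab"
const groupRepeated = "ababab".match(/(?:ab)+/)[0];  // "ababab"

console.log(looseHit, strictHit, strictDog, lastCharRepeated, groupRepeated);
```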