Summary

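Regular expressions are an excellent technology and, beyond that, a great science. In this post I briefly introduce their history, discuss their syntax in detail, and finally cover how they are used in JavaScript and Linux.
Quick reading path: Definition --> PATTERN --> RegExp in JavaScript / the grep command in Linux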

Keyword

Regular Expression | Perl | JavaScript | Linux

History

1940s: The original idea behind regular expressions came from two neuroscientists, Warren McCulloch and Walter Pitts, who developed a mathematical model to describe how neural networks work.

1956: The mathematician Stephen Kleene published a paper titled "Representation of Events in Nerve Nets and Finite Automata". He used a mathematical notation called regular sets to describe this model and introduced the concept of the regular expression. Regular expressions were the expressions used to describe what he called the "algebra of regular sets", which is where the term "regular expression" comes from.

1968: Ken Thompson, one of the creators of UNIX, applied the theoretical results on regular expressions to search algorithms. He described a regular expression compiler, so what is probably the earliest regular expression compiler appeared in the qed editor (this work later led to the grep tool).

After Unix adopted regular expressions, they continued to develop and grow, and were then applied on a large scale in many fields. Different versions were developed to suit different conditions and needs, and many branches emerged. We call these branches "flavors".

1987: The Perl language was born. Drawing on other languages and building on regular expressions, it created a new flavor, the Perl flavor. After that, many programming languages such as Python, Java, Ruby, .NET, and PHP referred to Perl's regular expressions when designing their own.

Definition and principle

Definition

A regular expression is a pattern written with special characters and literal text characters. Some of the characters do not stand for their literal meaning but instead act as control or wildcard characters.

Principle - regular expression engine

For details on how a regular expression engine works, you can refer to this blog post.

PATTERN

A regular expression, commonly referred to as a pattern, is used to describe or match a series of strings that conform to a given syntactic rule.

Metacharacter

Regular expressions consist of two basic character types: literal text characters and metacharacters. Metacharacters are what give regular expressions their processing power. Metacharacters are the characters that have a special meaning in regular expressions; they can be used to specify how the leading character (the character before the metacharacter) appears in the target object.

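Character matching

 .   : matches any single character;
 []  : matches any single character within the specified range;
 [^] : matches any single character outside the specified range;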
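
The three character matchers above can be tried out in JavaScript. This is only a minimal sketch: the sample strings and variable names are made up for illustration. Note that in JavaScript the dot does not match line terminators unless the s flag is set.

```javascript
// Hypothetical sample patterns, one per matcher in the list above.
const anyChar = /c.t/;       // "." matches any single character
const inSet = /[bc]at/;      // "[]" matches one character from the set
const notInSet = /[^bc]at/;  // "[^]" matches one character outside the set

const charMatchResults = {
  dotMatchesCat: anyChar.test("cat"),      // true
  dotMatchesCut: anyChar.test("cut"),      // true
  dotNeedsOneChar: anyChar.test("ct"),     // false: nothing between c and t
  setMatchesBat: inSet.test("bat"),        // true
  setRejectsRat: inSet.test("rat"),        // false: r is not in [bc]
  negatedMatchesRat: notInSet.test("rat"), // true
  negatedRejectsCat: notInSet.test("cat"), // false: c is excluded
};

console.log(charMatchResults);
```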

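Match counts

A quantifier is placed after the character whose number of occurrences is to be specified, and limits how many times that preceding character may appear.
*     : matches the preceding character any number of times (0, 1, or more);
.*    : matches any characters of any length;
?     : matches the preceding character 0 or 1 times, that is, the preceding character is optional;
+     : matches the preceding character 1 or more times, that is, it must appear at least once;
{m}   : matches the preceding character exactly m times;
{m,n} : matches the preceding character at least m times and at most n times;
{0,n} : matches the preceding character at most n times;
{m,}  : matches the preceding character at least m times;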
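
To see how the quantifiers differ, the sketch below runs each one against the same four sample strings. The samples and the `matching` helper are hypothetical names introduced only for this illustration. The `^` and `$` anchors pin each pattern to the whole string so that the count of the letter a is the only thing being tested.

```javascript
// Sample strings containing zero, one, two, and three "a" characters.
const samples = ["ct", "cat", "caat", "caaat"];

// Hypothetical helper: returns the samples that a pattern accepts.
function matching(pattern) {
  return samples.filter((s) => pattern.test(s));
}

const quantifierResults = {
  star: matching(/^ca*t$/),         // ["ct", "cat", "caat", "caaat"]
  optional: matching(/^ca?t$/),     // ["ct", "cat"]
  plus: matching(/^ca+t$/),         // ["cat", "caat", "caaat"]
  exactlyTwo: matching(/^ca{2}t$/), // ["caat"]
  oneToTwo: matching(/^ca{1,2}t$/), // ["cat", "caat"]
  atMostOne: matching(/^ca{0,1}t$/),// ["ct", "cat"]
  atLeastTwo: matching(/^ca{2,}t$/),// ["caat", "caaat"]
};

console.log(quantifierResults);
```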

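Position anchoring

^              : start-of-line anchor, placed at the far left of the pattern;
$              : end-of-line anchor, placed at the far right of the pattern;
^PATTERN$      : PATTERN must match the entire line;
^$             : matches an empty line;
^[[:space:]]*$ : matches an empty line or a line that contains only whitespace characters;
word           : a run of consecutive non-special characters is called a word;
\< or \b       : start-of-word anchor, placed on the left side of a word pattern;
\> or \b       : end-of-word anchor, placed on the right side of a word pattern;
\<PATTERN\>    : matches a complete word;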
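
The anchors above can be sketched in JavaScript with a few adjustments: JavaScript does not support `\<`, `\>`, or the POSIX class `[[:space:]]` (those belong to grep), so this example uses `\b` and `\s` instead. The sample text and variable names are assumptions made for illustration.

```javascript
// A hypothetical four-line text: a normal line, an empty line,
// a whitespace-only line, and another normal line.
const lines = "first line\n\n   \nlast line";

// Without the m flag, ^ anchors to the start of the whole string.
const startWithoutM = /^last/.test(lines); // false
// With the m flag, ^ also matches right after each line break.
const startWithM = /^last/m.test(lines);   // true

// Count empty lines and whitespace-only lines, one row at a time.
const rows = lines.split("\n");
const emptyRows = rows.filter((r) => /^$/.test(r)).length;    // 1
const blankRows = rows.filter((r) => /^\s*$/.test(r)).length; // 2

// \b on both sides matches "cat" only as a complete word.
const wholeWord = /\bcat\b/.test("a cat sat");    // true
const insideWord = /\bcat\b/.test("concatenate"); // false

console.log({ startWithoutM, startWithM, emptyRows, blankRows, wholeWord, insideWord });
```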

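Grouping and references

()             : bundles one or more characters together so they are treated as a single unit;
\1             : the text matched by the pattern between the first left parenthesis (counting from the left) and its matching right parenthesis;
\2             : the text matched by the pattern between the second left parenthesis (counting from the left) and its matching right parenthesis;
back reference : refers to the text matched by an earlier parenthesized group;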
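
A back reference matches the text a group actually captured, not the group's pattern. The following sketch, with made-up sample strings and variable names, uses `\1` to find a word that is immediately repeated and uses `\2\1` to match a mirrored pair.

```javascript
// (\w+) captures a word; \1 then requires the same text again.
const repeatedWord = /\b(\w+) \1\b/;
const repeat = repeatedWord.exec("this is is a test");

console.log(repeat[0]);    // "is is"
console.log(repeat[1]);    // "is"
console.log(repeat.index); // 5

// Two groups: \2 refers to the second group, \1 to the first.
const mirrored = /(a)(b)\2\1/;
const matchesAbba = mirrored.test("abba"); // true
const matchesAbab = mirrored.test("abab"); // false

console.log(matchesAbba, matchesAbab);
```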

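Special metacharacters

\b : matches a word boundary;
\d : matches a digit, equivalent to [0-9];
\D : matches a non-digit, equivalent to [^0-9];
\n : matches a line feed;
\r : matches a carriage return;
\s : matches a whitespace character;
\t : matches a tab;
\w : matches a word character, equivalent to [A-Za-z0-9_];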
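
A short sketch of a few of these shorthand classes in JavaScript follows. The sample strings and variable names are hypothetical; the first check illustrates that `\w` accepts the underscore as well as letters and digits.

```javascript
// \w covers letters, digits, and the underscore.
const underscoreIsWordChar = /^\w+$/.test("snake_case_1"); // true
const hyphenIsWordChar = /^\w+$/.test("kebab-case");       // false

// \d+ with the g flag collects every run of digits.
const digitRuns = "tel: 555-1234".match(/\d+/g); // ["555", "1234"]

// \D removes everything that is not a digit.
const digitsOnly = "a1b2".replace(/\D/g, ""); // "12"

// \s splits on a tab, a line feed, and a space alike.
const parts = "a\tb\nc d".split(/\s/); // ["a", "b", "c", "d"]

console.log(underscoreIsWordChar, hyphenIsWordChar, digitRuns, digitsOnly, parts);
```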

RegExp in JavaScript

ECMAScript supports regular expressions through the RegExp type.

Definition method

JavaScript has two ways to define regular expressions.

Literal form

var expression = / pattern / flags ;

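The pattern part has been covered above, so here are the flags:
g : global mode; the pattern is applied to all matches in the string instead of stopping at the first one;
i : case-insensitive mode; case is ignored when matching;
m : multiline mode; matching continues on the next line after the end of a line is reached;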

RegExp constructor

var expression = new RegExp("pattern", "flags") ;

Note: both arguments passed to the RegExp constructor are strings;
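
Because the constructor takes strings, every backslash in the pattern has to be doubled. The sketch below compares the two forms and shows one case where the constructor is the natural choice: building a pattern from text known only at run time. The `escapeRegExp` helper and the `userInput` value are assumptions made for illustration, not part of the RegExp API.

```javascript
// The same pattern written both ways; note the doubled backslashes.
const literal = /\[bc\]at/gi;
const constructed = new RegExp("\\[bc\\]at", "gi");

const sameSource = literal.source === constructed.source; // true
const sameFlags = literal.flags === constructed.flags;    // true

// Hypothetical helper: escapes metacharacters so that text
// is matched literally when placed inside a pattern.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

const userInput = "1+1"; // made-up run-time value
const escaped = new RegExp(escapeRegExp(userInput));
const unescaped = new RegExp(userInput);

const escapedMatches = escaped.test("1+1=2");     // true: literal "1+1"
const unescapedMatches = unescaped.test("1+1=2"); // false: "+" acts as a quantifier

console.log(sameSource, sameFlags, escapedMatches, unescapedMatches);
```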

RegExp instance attributes

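var pattern1 = /\[bc\]at/i;
alert(pattern1.global);      // false
alert(pattern1.ignoreCase);  // true
alert(pattern1.multiline);   // false
alert(pattern1.lastIndex);   // 0
alert(pattern1.source);      // \[bc\]at

// The same pattern built with the constructor; backslashes are doubled inside the string
var pattern2 = new RegExp("\\[bc\\]at", "i");
alert(pattern2.global);      // false
alert(pattern2.ignoreCase);  // true
alert(pattern2.multiline);   // false
alert(pattern2.lastIndex);   // 0
alert(pattern2.source);      // \[bc\]at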

RegExp instance method

exec()

This method receives the string the pattern is to be applied to as its parameter, and returns an array containing the information about the first match, or null if there is no match.
var text = "mom and dad and baby";
var pattern1 = /mom( and dad( and baby)?)?/gi;

var matches = pattern1.exec(text);
// ["mom and dad and baby", " and dad and baby", " and baby", index: 0, input: "mom and dad and baby", groups: undefined]
alert(matches.index)   // 0
alert(matches.input)   // mom and dad and baby
alert(matches[0])      // mom and dad and baby
alert(matches[1])      //  and dad and baby
alert(matches[2])      //  and baby

For the exec() method, even if the global flag is set on the pattern, it returns only one match per call.
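
The difference the global flag makes can be sketched as follows: without g, every call to exec() returns the first match again; with g, each call resumes from the pattern's lastIndex, so a loop can walk through all matches. The sample sentence and variable names are made up for illustration.

```javascript
const sentence = "cat, bat, sat, fat";

// Without the g flag, repeated calls keep returning the first match.
const firstOnly = /.at/;
const firstCall = firstOnly.exec(sentence).index;  // 0
const secondCall = firstOnly.exec(sentence).index; // 0

// With the g flag, each call continues from lastIndex
// and returns null once no further match exists.
const everyMatch = /.at/g;
const found = [];
let match;
while ((match = everyMatch.exec(sentence)) !== null) {
  found.push({ text: match[0], index: match.index, lastIndex: everyMatch.lastIndex });
}

console.log(firstCall, secondCall);
console.log(found);
```

Each entry in `found` records where a match started and where the next search will begin.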

test()

This method receives a string parameter and returns true if the pattern matches it; otherwise it returns false.
var text = "000-00-0000";
var pattern = /\d{3}-\d{2}-\d{4}/;

if (pattern.test(text)) {
  alert("The pattern was matched.")
}

RegExp constructor properties

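These are the static properties of RegExp. They apply to all regular expressions in the current scope, and they change based on the last regular expression operation performed.
input        $_  : the string most recently matched against;
lastMatch    $&  : the most recent match;
lastParen    $+  : the most recently matched capturing group;
leftContext  $`  : the text before lastMatch in the input string;
multiline    $*  : a Boolean value indicating whether all expressions use multiline mode;
rightContext $'  : the text after lastMatch in the input string;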
var text = "this has been a short summer";
var pattern = /(.)hort/g;

if (pattern.test(text)) {
  alert(RegExp.input);           // this has been a short summer
  alert(RegExp.leftContext);     // this has been a
  alert(RegExp.rightContext);    // summer
  alert(RegExp.lastMatch);       // short
  alert(RegExp.lastParen);       // s
  alert(RegExp.multiline);       // false in older engines; undefined where this non-standard property has been removed
}

The grep command in Linux

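grep stands for "Global search Regular Expression and Print out the line".
grep [OPTIONS] [-e PATTERN | -f FILE] [FILE...]

OPTIONS:
  --color=auto : highlight the matched text;
  -i   : ignore case;
  -o   : print only the matched part of each line;
  -v   : print the lines that the pattern does not match;
  -E   : enable extended regular expression metacharacters;
  -q   : quiet mode, print nothing;
  -A # : also print the # lines after each match;
  -B # : also print the # lines before each match;
  -C # : also print the # lines before and after each match;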

Thanks

  • Wikipedia
  • Baidu Encyclopedia
  • Linux Master Marco
  • Red Book
  • Pig brother

贤儒