I have always claimed to be proficient in regular expressions (after all, I have implemented a regex engine), but a few of their special uses have never been familiar to me, partly because I rarely use them and partly because I never bothered to learn them. In the past two days I needed lookahead, so I learned it properly and wrote up what I found as today's blog post.
Let me introduce today's four protagonists: ?=, ?<=, ?! and ?<!. I suspect most readers will look baffled at first sight. As we all learned in the second grade of elementary school, regular expressions are used to match strings, and the key word is match. The regular expressions we have seen so far match some content directly, whereas ?=, ?<=, ?! and ?<! only assist the match: they do not match any content themselves. Constructs like these are called zero-width assertions, and their only purpose is positioning.
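To make "zero-width" concrete, here is a minimal sketch using Python's `re` module. The sample string and variable names are my own assumptions, chosen only for illustration. It compares an ordinary pattern with a lookahead and shows that the assertion on its own matches an empty span.

```python
import re

# Sample text invented for this illustration.
text = "price: 100USD"

# Ordinary pattern: "USD" is consumed and becomes part of the match.
plain = re.search(r"\d+USD", text)

# Lookahead: "USD" must follow the digits, but it is not part of the match.
ahead = re.search(r"\d+(?=USD)", text)

# The assertion alone matches a position, not characters.
position_only = re.search(r"(?=USD)", text)

plain_text = plain.group()          # "100USD"
ahead_text = ahead.group()          # "100"
empty_span = position_only.span()   # (10, 10): start == end, zero width

print(plain_text, ahead_text, empty_span)
```

The last match has equal start and end offsets, which is exactly what "they only position" means in practice.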
An example will make this clear. Suppose I ask you to find a novel on the bookshelf with a white cover and more than 200 pages. That sentence is the pattern used to match the book. Now I want to narrow the scope further: I want the white-cover novel of more than 200 pages that sits to the left of "Compiler Theory". I mention "Compiler Theory" to help find the book, but I do not want it; it only plays a positioning role. The constructs that play a positioning role in a regular expression without being matched are the ?=, ?<=, ?! and ?<! that I will talk about today.
?=
Let's look at the usage of these constructs one by one. The first is ?=, used as exp1(?=exp2). It finds an exp1 that appears immediately before exp2; exp2 itself does not appear in the result, as shown in the figure below.
Here I deliberately use the words fiction and compiler, standing in for the novel and "Compiler Theory". There are two occurrences of fiction in the string, one to the left of compiler and one to the right. fiction(?=compiler) matches only the first fiction, because (?=compiler) restricts where it may appear. This corresponds to the earlier example of finding the novel to the left of "Compiler Theory".
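The same behaviour can be sketched in Python. The exact string from the figure is not reproduced here, so `"fictioncompilerfiction"` is an assumed stand-in that satisfies the description (one fiction on each side of compiler).

```python
import re

# Assumed sample string: fiction on both sides of compiler.
text = "fictioncompilerfiction"

# Positive lookahead: fiction only when compiler follows immediately.
lookahead_hits = [(m.start(), m.group())
                  for m in re.finditer(r"fiction(?=compiler)", text)]

print(lookahead_hits)  # [(0, 'fiction')]
```

Only the occurrence at offset 0 is reported, and the matched text is just `fiction`; `compiler` is checked but not included.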
?!
?! and ?= are a pair: ?! is the negation of ?=. The usage is exp1(?!exp2), which means an exp1 that is not immediately followed by exp2. If we simply change the ?= in the figure above to ?!, it matches only the fiction on the right, which corresponds to the novel that is not to the left of "Compiler Theory".
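A small Python sketch of the negative lookahead, again using an assumed sample string with fiction on both sides of compiler:

```python
import re

# Assumed sample string: fiction on both sides of compiler.
text = "fictioncompilerfiction"

# Negative lookahead: fiction only when compiler does NOT follow immediately.
negative_lookahead_hits = [(m.start(), m.group())
                           for m in re.finditer(r"fiction(?!compiler)", text)]

print(negative_lookahead_hits)  # [(15, 'fiction')]
```

The first fiction is rejected because compiler follows it; the second, at offset 15, is followed by the end of the string and is therefore accepted.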
?! and ?= both position the match by a pattern on its right. Regular expressions are a mature, well-designed tool, so there must also be counterparts that position by a pattern on the left. Those are ?<= and ?<!, which likewise form a pair.
?<=
?<= is the mirror image of ?=. The content used for positioning is placed in front, as in (?<=exp2)exp1, and it matches an exp1 that comes immediately after exp2. Taking compiler and fiction as the example again, this time we swap the positions of compiler and fiction in the string and change the regular expression to use ?<=; its role becomes finding the novel to the right of "Compiler Theory".
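Here is a hedged Python sketch of the positive lookbehind. The sample string `"compilerfiction fiction"` is my assumption, not the string from the figure. One caveat that applies to Python's `re` module specifically: the pattern inside a lookbehind must have a fixed width, which `compiler` does.

```python
import re

# Assumed sample string: the first fiction directly follows compiler,
# the second one follows a space.
text = "compilerfiction fiction"

# Positive lookbehind: fiction only when compiler comes immediately before it.
lookbehind_hits = [(m.start(), m.group())
                   for m in re.finditer(r"(?<=compiler)fiction", text)]

print(lookbehind_hits)  # [(8, 'fiction')]
```

As with lookahead, `compiler` is only used for positioning and is not part of the matched text.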
?<!
?<! is the negation of ?<= and is used the same way: (?<!exp2)exp1 matches an exp1 that does not come immediately after exp2. I will not repeat the explanation; look directly at the figure: the regular expression does not match the first fiction but matches the second.
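The negative lookbehind can be sketched the same way in Python. The sample string is an assumption chosen so that, as described, the first fiction is rejected and the second is accepted.

```python
import re

# Assumed sample string: the first fiction directly follows compiler,
# the second one follows a space.
text = "compilerfiction fiction"

# Negative lookbehind: fiction only when compiler does NOT come right before it.
negative_lookbehind_hits = [(m.start(), m.group())
                            for m in re.finditer(r"(?<!compiler)fiction", text)]

print(negative_lookbehind_hits)  # [(16, 'fiction')]
```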
Concluding remarks
Regular expressions are an extremely useful tool. In my experience, being proficient with them makes a lot of daily work easier, such as cleaning logs or doing simple statistics. Combined with other Linux command-line tools, they can improve efficiency considerably. To give a less serious example: suppose I want to download an American TV series, and the dozens of episodes on the video site are all separate links. Most people would copy and paste each one into the downloader, repeating this more than 20 times, which is not only tedious but also prone to omissions or duplicates. What I do is open the page source, match all the links with one regular expression, then copy and paste them in a batch, and it is done.
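As a rough illustration of that workflow, the sketch below pulls episode links out of a page source using a lookbehind and a lookahead together. Everything about the page is hypothetical: the markup, the URLs, and the `.mp4` suffix are invented for the example, and a real site would need its own pattern.

```python
import re

# Hypothetical page source; the markup and URLs are invented for illustration.
page_source = '''
<a href="https://example.com/show/s01e01.mp4">Episode 1</a>
<a href="https://example.com/show/s01e02.mp4">Episode 2</a>
<a href="https://example.com/about.html">About</a>
'''

# (?<=href=") positions the start just after the opening quote and
# (?=") requires a closing quote right after ".mp4"; neither is captured.
episode_links = re.findall(r'(?<=href=")[^"]+\.mp4(?=")', page_source)

for link in episode_links:
    print(link)
```

Because both assertions are zero-width, the result contains only the bare URLs, ready to paste into a downloader. For anything beyond a quick one-off, a proper HTML parser is the more robust choice.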
Regular expressions are also a lot of fun. If you don't believe me, you can read some related posts I wrote earlier:
Use regular expressions to detect whether a number is prime
Use regular expressions to match any multiple of 3
hand-