Preface
Thousands separators for numbers, 3-4-4 formatting of mobile phone numbers, a trim implementation, HTML escaping, reading URL query parameters... Do you run into these often in interviews and at work? Let's see how regular expressions can handle them all in one go!
1. Thousands separators for numeric prices
Turn 123456789 into 123,456,789
This question comes up frequently in both interviews and day-to-day work.
Regex result
'123456789'.replace(/(?!^)(?=(\d{3})+$)/g, ',') // 123,456,789
Note that this pattern only handles integers; numbers with a decimal part need extra handling.
Analysis process
The task boils down to:
- Working from the end toward the front, add a comma before every three digits
- Do not add a comma at the very beginning (for example, 123 must not become ,123)
Doesn't this fit the (?=p) lookahead? p can stand for every three digits, and the position where the comma belongs is exactly the position matched by (?=p).
First step: try to get the first comma in
let price = '123456789'
let priceReg = /(?=\d{3}$)/
console.log(price.replace(priceReg, ',')) // 123456,789
Second step: get all the commas in
To place every comma, the main problem is how to express "a group of three digits, repeated one or more times", that is, a digit count that is a multiple of 3. Parentheses in a regex turn a pattern p into a single unit, so we can use grouping and write:
let price = '123456789'
let priceReg = /(?=(\d{3})+$)/g
console.log(price.replace(priceReg, ',')) // ,123,456,789
Third step: remove the leading comma
This almost meets the requirement, but a comma now appears at the very start. How do we remove it? Is there a construct that fits this situation? Yes: the negative lookahead (?!p). Combining the two means: add a comma before every three digits, working from the end toward the front, but never at the start position ^.
let price = '123456789'
let priceReg = /(?!^)(?=(\d{3})+$)/g
console.log(price.replace(priceReg, ',')) // 123,456,789
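The pattern above assumes the input consists of digits only. As a minimal sketch of one way to support a decimal part, the helper below (the name `formatThousands` is hypothetical and not from the original article) splits off the fractional digits and applies the same regex to the integer part only. It assumes a non-negative number written in plain decimal notation.

```javascript
// Hypothetical helper: group the integer part, leave the fractional part untouched.
// Assumes a non-negative value in plain decimal notation (no sign, no exponent).
const formatThousands = (value) => {
  const [integerPart, decimalPart] = String(value).split('.')
  // Same lookahead pattern as in the article, applied to the integer digits only
  const grouped = integerPart.replace(/(?!^)(?=(\d{3})+$)/g, ',')
  return decimalPart === undefined ? grouped : `${grouped}.${decimalPart}`
}

console.log(formatThousands('123456789'))    // 123,456,789
console.log(formatThousands('1234567.8912')) // 1,234,567.8912
console.log(formatThousands('123'))          // 123
```

Splitting first keeps the regex itself unchanged, which may be easier to read than a single pattern that also has to reason about the decimal point.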
2. Mobile phone number 3-4-4 split
Convert the phone number 18379836654 to 183-7983-6654
This formatting is frequently needed when collecting phone numbers in forms.
Regex result
let mobile = '18379836654'
let mobileReg = /(?=(\d{4})+$)/g
console.log(mobile.replace(mobileReg, '-')) // 183-7983-6654
Analysis process
With the thousands-separator technique in hand, this one is much easier. Working from the end toward the front, we look for:
The position before every four digits, and insert - at that position
let mobile = '18379836654'
let mobileReg = /(?=(\d{4})+$)/g
console.log(mobile.replace(mobileReg, '-')) // 183-7983-6654
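Unlike the thousands-separator regex, this pattern has no (?!^) guard. That is fine for an 11-digit number, but if the digit count happens to be a multiple of 4, a - would also be inserted at the very start. The sketch below (the function name `splitByFour` is an assumption made for illustration) adds the guard so the behaviour is the same for any length.

```javascript
// Hypothetical helper: same lookahead as above, plus (?!^) so that no
// dash is inserted at the start when the length is a multiple of 4.
const splitByFour = (digits) =>
  String(digits).replace(/(?!^)(?=(\d{4})+$)/g, '-')

console.log(splitByFour('18379836654'))  // 183-7983-6654
console.log(splitByFour('123456789012')) // 1234-5678-9012

// Without the guard, the 12-digit input gets a leading dash
console.log('123456789012'.replace(/(?=(\d{4})+$)/g, '-')) // -1234-5678-9012
```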
3. Mobile phone number 3-4-4 split, extended
Convert the mobile phone number 18379836654 to 183-7983-6654 while also handling partial input:
- 123 => 123
- 1234 => 123-4
- 12345 => 123-45
- 123456 => 123-456
- 1234567 => 123-4567
- 12345678 => 123-4567-8
- 123456789 => 123-4567-89
- 12345678911 => 123-4567-8911
This is exactly what happens when a user types a phone number into a field and the value has to be reformatted on every keystroke.
Regex result
const formatMobile = (mobile) => {
return String(mobile).slice(0,11)
.replace(/(?<=\d{3})\d+/, ($0) => '-' + $0)
.replace(/(?<=[\d-]{8})\d{1,4}/, ($0) => '-' + $0)
}
console.log(formatMobile(18379836654))
Analysis process
(?=p) is not suitable here: for example, 1234 would become -1234. We need another approach.
Is there another regex construct that fits this scenario? Yes: the lookbehind (?<=p).
First step: insert the first -
const formatMobile = (mobile) => {
return String(mobile).replace(/(?<=\d{3})\d+/, ($0) => '-' + $0)
}
console.log(formatMobile(123)) // 123
console.log(formatMobile(1234)) // 123-4
Second step: insert the second -
Next comes the second -, which sits right after the 8th character (for example, after 123-4567).
const formatMobile = (mobile) => {
return String(mobile).slice(0,11)
.replace(/(?<=\d{3})\d+/, ($0) => '-' + $0)
.replace(/(?<=[\d-]{8})\d{1,4}/, ($0) => '-' + $0)
}
console.log(formatMobile(123)) // 123
console.log(formatMobile(1234)) // 123-4
console.log(formatMobile(12345)) // 123-45
console.log(formatMobile(123456)) // 123-456
console.log(formatMobile(1234567)) // 123-4567
console.log(formatMobile(12345678)) // 123-4567-8
console.log(formatMobile(123456789)) // 123-4567-89
console.log(formatMobile(12345678911)) // 123-4567-8911
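Lookbehind (?<=p) is not available in every JavaScript engine, older ones in particular. As a hedged alternative, the sketch below gets the same results with plain capture groups; the name `formatMobileWithGroups` is made up for this example, and it assumes the input contains digits only.

```javascript
// Hypothetical lookbehind-free variant: capture the 3-4-4 pieces and join
// the non-empty ones with dashes. Non-digit input is returned unchanged.
const formatMobileWithGroups = (mobile) => {
  return String(mobile)
    .slice(0, 11)
    .replace(/^(\d{3})(\d{1,4})(\d{0,4})$/, (match, head, middle, tail) =>
      [head, middle, tail].filter(Boolean).join('-'))
}

console.log(formatMobileWithGroups(123))         // 123
console.log(formatMobileWithGroups(1234))        // 123-4
console.log(formatMobileWithGroups(1234567))     // 123-4567
console.log(formatMobileWithGroups(12345678))    // 123-4567-8
console.log(formatMobileWithGroups(12345678911)) // 123-4567-8911
```

Because the middle group is greedy, it always takes up to four digits before the last group receives anything, which is what produces the 3-4-4 shape.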
4. Validate a password
The password must be 6-12 characters long and consist only of digits, lowercase letters, and uppercase letters, and it must contain at least two of these three character types
Regex result
let reg = /(((?=.*\d)((?=.*[a-z])|(?=.*[A-Z])))|(?=.*[a-z])(?=.*[A-Z]))^[a-zA-Z\d]{6,12}$/
console.log(reg.test('123456')) // false
console.log(reg.test('aaaaaa')) // false
console.log(reg.test('AAAAAAA')) // false
console.log(reg.test('1a1a1a')) // true
console.log(reg.test('1A1A1A')) // true
console.log(reg.test('aAaAaA')) // true
console.log(reg.test('1aA1aA1aA')) // true
Analysis process
The task consists of three conditions
- The password length is 6-12 characters
- It consists only of digits, lowercase letters, and uppercase letters
- It must contain at least two of these character types
First step: write the regex for conditions 1 and 2
let reg = /^[a-zA-Z\d]{6,12}$/
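Second step: require that a certain character type is present (digits, lowercase letters, or uppercase letters)
let reg = /(?=.*\d)/
// This regex matches a position,
// namely a position that is followed by `any number of characters and then a digit`.
// Note that what it matches is a position, not any characters.
// (?=.*\d) is often used as a condition check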
console.log(reg.test('hello')) // false
console.log(reg.test('hello1')) // true
console.log(reg.test('hel2lo')) // true
// The other character types work the same way
Third step: write the complete regex
Containing at least two character types gives the following four combinations
- Digits and lowercase letters
- Digits and uppercase letters
- Lowercase and uppercase letters
- Digits, lowercase letters, and uppercase letters together (in fact, the first three already cover this one)
// Combinations 1 and 2
// let reg = /((?=.*\d)((?=.*[a-z])|(?=.*[A-Z])))/
// Combination 3
// let reg = /(?=.*[a-z])(?=.*[A-Z])/
// Combinations 1, 2, and 3
// let reg = /((?=.*\d)((?=.*[a-z])|(?=.*[A-Z])))|(?=.*[a-z])(?=.*[A-Z])/
// All conditions of the task
let reg = /(((?=.*\d)((?=.*[a-z])|(?=.*[A-Z])))|(?=.*[a-z])(?=.*[A-Z]))^[a-zA-Z\d]{6,12}$/
console.log(reg.test('123456')) // false
console.log(reg.test('aaaaaa')) // false
console.log(reg.test('AAAAAAA')) // false
console.log(reg.test('1a1a1a')) // true
console.log(reg.test('1A1A1A')) // true
console.log(reg.test('aAaAaA')) // true
console.log(reg.test('1aA1aA1aA')) // true
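The single regex is compact but dense. For comparison, here is a sketch that checks the same three conditions with several small regexes and a count; the function name `isValidPassword` is an assumption for illustration, not something from the original article.

```javascript
// Hypothetical alternative: one regex for length and alphabet,
// then count how many of the three character types are present.
const isValidPassword = (password) => {
  if (!/^[a-zA-Z\d]{6,12}$/.test(password)) return false
  const typeChecks = [/\d/, /[a-z]/, /[A-Z]/]
  const presentTypes = typeChecks.filter((re) => re.test(password)).length
  return presentTypes >= 2
}

console.log(isValidPassword('123456'))    // false
console.log(isValidPassword('aaaaaa'))    // false
console.log(isValidPassword('1a1a1a'))    // true
console.log(isValidPassword('aAaAaA'))    // true
console.log(isValidPassword('1aA1aA1aA')) // true
```

This version trades brevity for readability, and changing the rule to "all three types" would only mean changing the threshold.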
5. Extract consecutive repeated characters
Extract the repeated substrings: for example, from 12323454545666 extract ['23', '45', '6']
Regex result
const collectRepeatStr = (str) => {
let repeatStrs = []
const repeatRe = /(.+)\1+/g
str.replace(repeatRe, ($0, $1) => {
$1 && repeatStrs.push($1)
})
return repeatStrs
}
Analysis process
The task contains two key pieces of information
- The characters repeat consecutively
- The length of the repeated unit is not fixed (for example, 23 and 45 are two characters long, and 6 is one character long)
So what counts as a consecutive repeat?
11 is a consecutive repeat, so is 22, and of course so is 111. In other words, some character X must be immediately followed by X again. If we knew that X was 1, then /11+/ would do the job, but the point is that X is not known in advance. What then?
A backreference solves this easily.
First step: write a regex that detects one repeated character
// X can be written as ., meaning any character; wrap it in parentheses to capture it and follow it with the backreference \1, which expresses "repeated consecutively"
let repeatRe = /(.)\1/
console.log(repeatRe.test('11')) // true
console.log(repeatRe.test('22')) // true
console.log(repeatRe.test('333')) // true
console.log(repeatRe.test('123')) // false
Second step: write a regex that detects a repeated run of n characters
Because we don't know in advance whether we are matching 11 or 4545, a + is needed inside the parentheses to allow n characters, and the backreference itself may occur more than once, as in 454545
let repeatRe = /(.+)\1+/
console.log(repeatRe.test('11')) // true
console.log(repeatRe.test('22')) // true
console.log(repeatRe.test('333')) // true
console.log(repeatRe.test('454545')) // true
console.log(repeatRe.test('124')) // false
Third step: extract all consecutively repeated substrings
const collectRepeatStr = (str) => {
let repeatStrs = []
const repeatRe = /(.+)\1+/g
// replace is often used not for substitution but for extracting data
str.replace(repeatRe, ($0, $1) => {
$1 && repeatStrs.push($1)
})
return repeatStrs
}
console.log(collectRepeatStr('11')) // ["1"]
console.log(collectRepeatStr('12323')) // ["23"]
console.log(collectRepeatStr('12323454545666')) // ["23", "45", "6"]
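Using replace purely for its side effects works, but String.prototype.matchAll (ES2020) expresses the same extraction more directly. The sketch below is an alternative under that assumption; the name `collectRepeatStrWithMatchAll` is invented for this example.

```javascript
// Hypothetical variant: iterate over all matches and keep capture group 1.
// matchAll requires the g flag on the regex.
const collectRepeatStrWithMatchAll = (str) =>
  Array.from(str.matchAll(/(.+)\1+/g), (match) => match[1])

console.log(collectRepeatStrWithMatchAll('11'))             // ["1"]
console.log(collectRepeatStrWithMatchAll('12323'))          // ["23"]
console.log(collectRepeatStrWithMatchAll('12323454545666')) // ["23", "45", "6"]
```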
6. Implement a trim function
Remove the leading and trailing whitespace of a string
Regex result
// Approach 1: remove the whitespace
const trim = (str) => {
return str.replace(/^\s*|\s*$/g, '')
}
// Approach 2: extract the non-whitespace part
const trim = (str) => {
return str.replace(/^\s*(.*?)\s*$/g, '$1')
}
Analysis process
The first idea that comes to mind is to delete the whitespace and keep the rest. But we can also turn it around: extract the non-whitespace part and ignore the whitespace. Let's implement trim both ways
Method one: remove the whitespace
const trim = (str) => {
return str.replace(/^\s*|\s*$/g, '')
}
console.log(trim(' 前端胖头鱼')) // 前端胖头鱼
console.log(trim('前端胖头鱼 ')) // 前端胖头鱼
console.log(trim(' 前端胖头鱼 ')) // 前端胖头鱼
console.log(trim(' 前端 胖头鱼 ')) // 前端 胖头鱼
Method two: extract the non-whitespace part
const trim = (str) => {
return str.replace(/^\s*(.*?)\s*$/g, '$1')
}
console.log(trim(' 前端胖头鱼')) // 前端胖头鱼
console.log(trim('前端胖头鱼 ')) // 前端胖头鱼
console.log(trim(' 前端胖头鱼 ')) // 前端胖头鱼
console.log(trim(' 前端 胖头鱼 ')) // 前端 胖头鱼
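One caveat with method two: . does not match line breaks, so a string that contains a newline in the middle is not matched by the pattern and comes back untrimmed. A possible adjustment, shown as a sketch with the hypothetical name `trimAnyWhitespace`, is to use [\s\S] in place of the dot. The built-in String.prototype.trim is included for comparison.

```javascript
// Hypothetical variant of method two: [\s\S] matches any character,
// including line breaks, so multi-line strings are handled as well.
const trimAnyWhitespace = (str) => str.replace(/^\s*([\s\S]*?)\s*$/, '$1')

const multiLine = ' a\nb '
console.log(JSON.stringify(trimAnyWhitespace(multiLine))) // "a\nb"
console.log(JSON.stringify(multiLine.trim()))             // "a\nb"
console.log(JSON.stringify(trimAnyWhitespace(' 前端 胖头鱼 '))) // "前端 胖头鱼"
```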
7. HTML Escaping
One way to prevent XSS attacks is HTML escaping. The rules are listed below: each character must be converted to its equivalent entity. Unescaping is the reverse: converting an entity back to its character
Character | Escaped entity
---|---
& | &amp;
< | &lt;
> | &gt;
" | &quot;
' | &#39;
Regex result
const escape = (string) => {
const escapeMaps = {
'&': 'amp',
'<': 'lt',
'>': 'gt',
'"': 'quot',
"'": '#39'
}
const escapeRegexp = new RegExp(`[${Object.keys(escapeMaps).join('')}]`, 'g')
return string.replace(escapeRegexp, (match) => `&${escapeMaps[match]};`)
}
Analysis process
Globally match &, <, >, ", and ', then replace each according to the table above. When a character may be any one of several candidates, a character class is the usual tool, that is, [&<>"']
const escape = (string) => {
const escapeMaps = {
'&': 'amp',
'<': 'lt',
'>': 'gt',
'"': 'quot',
"'": '#39'
}
// This has the same effect as /[&<>"']/g
const escapeRegexp = new RegExp(`[${Object.keys(escapeMaps).join('')}]`, 'g')
return string.replace(escapeRegexp, (match) => `&${escapeMaps[match]};`)
}
console.log(escape(`
<div>
<p>hello world</p>
</div>
`))
/*
&lt;div&gt;
&lt;p&gt;hello world&lt;/p&gt;
&lt;/div&gt;
*/
8. HTML unescaping
Regex result
Unescaping is just the reverse process, so it is easy to write
const unescape = (string) => {
const unescapeMaps = {
'amp': '&',
'lt': '<',
'gt': '>',
'quot': '"',
'#39': "'"
}
const unescapeRegexp = /&([^;]+);/g
return string.replace(unescapeRegexp, (match, unescapeKey) => {
return unescapeMaps[ unescapeKey ] || match
})
}
console.log(unescape(`
&lt;div&gt;
&lt;p&gt;hello world&lt;/p&gt;
&lt;/div&gt;
`))
/*
<div>
<p>hello world</p>
</div>
*/
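A quick way to check that the two functions are inverses is a round trip. The sketch below restates both in compact form under the hypothetical names `escapeHtml` and `unescapeHtml` (chosen to avoid shadowing the legacy global escape and unescape functions) and feeds a sample string through both.

```javascript
// Compact restatements of the two functions above, under hypothetical names.
const escapeHtml = (string) => {
  const escapeMaps = { '&': 'amp', '<': 'lt', '>': 'gt', '"': 'quot', "'": '#39' }
  return string.replace(/[&<>"']/g, (match) => `&${escapeMaps[match]};`)
}

const unescapeHtml = (string) => {
  const unescapeMaps = { amp: '&', lt: '<', gt: '>', quot: '"', '#39': "'" }
  // Unknown entities are left as they are
  return string.replace(/&([^;]+);/g, (match, key) => unescapeMaps[key] || match)
}

const source = `<p class="greeting">Tom & Jerry's</p>`
const escaped = escapeHtml(source)
console.log(escaped)
// &lt;p class=&quot;greeting&quot;&gt;Tom &amp; Jerry&#39;s&lt;/p&gt;
console.log(unescapeHtml(escaped) === source) // true
```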
9. Camelize strings
Convert strings to camel case according to the following rules
1. foo Bar => fooBar
2. foo-bar---- => fooBar
3. foo_bar__ => fooBar
Regex result
const camelCase = (string) => {
const camelCaseRegex = /[-_\s]+(.)?/g
return string.replace(camelCaseRegex, (match, char) => {
return char ? char.toUpperCase() : ''
})
}
Analysis process
Analyze the rules of the task
- Each word may be preceded by zero or more -, spaces, or _ (for example Foo, --foo, __FOO, _BAR, Bar)
- The -, space, or _ may be followed by nothing (for example __, --)
const camelCase = (string) => {
// Note: the ? in (.)? is there to satisfy the second rule
const camelCaseRegex = /[-_\s]+(.)?/g
return string.replace(camelCaseRegex, (match, char) => {
return char ? char.toUpperCase() : ''
})
}
console.log(camelCase('foo Bar')) // fooBar
console.log(camelCase('foo-bar--')) // fooBar
console.log(camelCase('foo_bar__')) // fooBar
10. Capitalize the first letter of each word and lowercase the rest
For example, hello world is converted to Hello World
Regex result
const capitalize = (string) => {
const capitalizeRegex = /(?:^|\s+)\w/g
return string.toLowerCase().replace(capitalizeRegex, (match) => match.toUpperCase())
}
Analysis process
Find the first letter of each word and convert it to uppercase. A word either starts at the beginning of the string or follows one or more whitespace characters.
const capitalize = (string) => {
const capitalizeRegex = /(?:^|\s+)\w/g
return string.toLowerCase().replace(capitalizeRegex, (match) => match.toUpperCase())
}
console.log(capitalize('hello world')) // Hello World
console.log(capitalize('hello WORLD')) // Hello World
11. Get the image addresses of all img tags in a web page
The address must be an online link such as https://xxx.juejin.com/a.jpg, http://xxx.juejin.com/a.jpg, or //xxx.juejin.com/a.jpg
Analysis process
Anyone who writes crawlers will be familiar with matching the URL of an img tag.
The task imposes two restrictions
- Only img tags count
- The address must be an online link, so base64 images have to be filtered out
Let's look at the result directly and then go through what the regex means.
const matchImgs = (sHtml) => {
const imgUrlRegex = /<img[^>]+src="((?:https?:)?\/\/[^"]+)"[^>]*?>/gi
let matchImgUrls = []
sHtml.replace(imgUrlRegex, (match, $1) => {
$1 && matchImgUrls.push($1)
})
return matchImgUrls
}
Let's look at the regex part by part
1. [^>]+ : the part between the img tag name and src; anything other than > is allowed
2. The part in parentheses is the URL we want to extract; it is a capture group, which makes it easy to read out directly
2.1 (?:https?:)? means the protocol prefix is http: or https:
2.2 The ? after the group means the protocol prefix may be absent, so links of the form //xxx.juejin.com/a.jpg are supported
2.3 Two slashes follow
2.4 The part inside the src="" double quotes is the link, so [^"]+ means everything except "
3. [^>]*? : the part between the closing " and the > that ends the img tag; anything other than > is allowed
Trying it out
Running this in the browser console on a real page gives the expected result.
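The regex above only recognises src attributes in double quotes. As a sketch of one possible extension, the variant below (the name `matchImgsAnyQuote` and the sample HTML are made up for illustration) captures the opening quote and uses a backreference so that single-quoted attributes are accepted too.

```javascript
// Hypothetical variant: group 1 is the quote character, group 2 is the URL,
// and \1 requires the closing quote to be the same as the opening one.
const matchImgsAnyQuote = (sHtml) => {
  const imgUrlRegex = /<img[^>]+src=(["'])((?:https?:)?\/\/[^"']+)\1[^>]*>/gi
  return Array.from(sHtml.matchAll(imgUrlRegex), (match) => match[2])
}

const sampleHtml = `
  <img src="https://xxx.juejin.com/a.jpg">
  <img class='avatar' src='//xxx.juejin.com/b.png' alt='b'>
  <img src="data:image/png;base64,AAAA">
`
console.log(matchImgsAnyQuote(sampleHtml))
// [ 'https://xxx.juejin.com/a.jpg', '//xxx.juejin.com/b.png' ]
```

The base64 image is skipped because its src does not start with // or an http(s) prefix.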
12. Get URL query parameters by name
Regex result
const getQueryByName = (name) => {
const queryNameRegex = new RegExp(`[?&]${name}=([^&]*)(?:&|$)`)
const queryNameMatch = window.location.search.match(queryNameRegex)
// The value is usually decoded with decodeURIComponent
return queryNameMatch ? decodeURIComponent(queryNameMatch[1]) : ''
}
Analysis process
In the URL query string, name= may appear
1. right after the question mark: ?name=前端胖头鱼&sex=boy
2. in the last position: ?sex=boy&name=前端胖头鱼
3. somewhere in between: ?sex=boy&name=前端胖头鱼&age=100
So handling three things is enough to extract the value with a regex
- name can only be preceded by ? or &
- The value can be anything except &
- The value can only be followed by & or the end of the string
const getQueryByName = (name) => {
const queryNameRegex = new RegExp(`[?&]${name}=([^&]*)(?:&|$)`)
const queryNameMatch = window.location.search.match(queryNameRegex)
// The value is usually decoded with decodeURIComponent
return queryNameMatch ? decodeURIComponent(queryNameMatch[1]) : ''
}
// 1. name comes first
// https://juejin.cn/?name=前端胖头鱼&sex=boy
console.log(getQueryByName('name')) // 前端胖头鱼
// 2. name comes last
// https://juejin.cn/?sex=boy&name=前端胖头鱼
console.log(getQueryByName('name')) // 前端胖头鱼
// 3. name is in the middle
// https://juejin.cn/?sex=boy&name=前端胖头鱼&age=100
console.log(getQueryByName('name')) // 前端胖头鱼
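Outside of a regex exercise, the standard URLSearchParams API covers the same three cases without building a pattern from the parameter name. The sketch below is an alternative, not the article's method; the function name `getQueryFromSearch` is hypothetical, and it takes the query string as an argument so it can also run outside a browser.

```javascript
// Hypothetical alternative based on the standard URLSearchParams API.
// `search` is a query string such as window.location.search.
const getQueryFromSearch = (search, name) => {
  const value = new URLSearchParams(search).get(name)
  // get() returns null for a missing key; mirror the regex version's ''
  return value === null ? '' : value
}

const search = '?sex=boy&name=%E5%89%8D%E7%AB%AF&age=100'
console.log(getQueryFromSearch(search, 'name'))    // 前端
console.log(getQueryFromSearch(search, 'age'))     // 100
console.log(getQueryFromSearch(search, 'missing')) // (empty string)
```

One difference to be aware of: URLSearchParams decodes + as a space, whereas decodeURIComponent leaves + untouched.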
13. Match 24-hour times
Determine whether a time is valid in the 24-hour system. The following should all match
01:14
1:14
1:1
23:59
Regex result
const check24TimeRegexp = /^(?:(?:0?|1)\d|2[0-3]):(?:0?|[1-5])\d$/
Analysis process
The hours and minutes of a 24-hour time have to satisfy the following rules
Hours
1. The first digit can be 0, 1, or 2
2. The second digit
2.1 When the first digit is 0 or 1, the second digit can be any digit
2.2 When the first digit is 2, the second digit can only be 0, 1, 2, or 3
Minutes
1. The first digit can be 0, 1, 2, 3, 4, or 5
2. The second digit can be any digit
First step: write a regex that matches the two-digit forms (examples 1 and 4 above)
const check24TimeRegexp = /^(?:[01]\d|2[0-3]):[0-5]\d$/
console.log(check24TimeRegexp.test('01:14')) // true
console.log(check24TimeRegexp.test('23:59')) // true
console.log(check24TimeRegexp.test('23:60')) // false
console.log(check24TimeRegexp.test('1:14')) // false, but should be supported
console.log(check24TimeRegexp.test('1:1')) // false, but should be supported
Second step: also allow single-digit hours and minutes
const check24TimeRegexp = /^(?:(?:0?|1)\d|2[0-3]):(?:0?|[1-5])\d$/
console.log(check24TimeRegexp.test('01:14')) // true
console.log(check24TimeRegexp.test('23:59')) // true
console.log(check24TimeRegexp.test('23:60')) // false
console.log(check24TimeRegexp.test('1:14')) // true
console.log(check24TimeRegexp.test('1:1')) // true
14. Match date formats
Match yyyy-mm-dd, yyyy.mm.dd, and yyyy/mm/dd, for example 2021-08-22, 2021.08.22, and 2021/08/22
Leap years need not be considered
Regex result
const checkDateRegexp = /^\d{4}([-\.\/])(?:0[1-9]|1[0-2])\1(?:0[1-9]|[12]\d|3[01])$/
Analysis process
The date format has four parts
1. yyyy, the year: any four digits, \d{4}
2. mm, the month
2.1 A year has only 12 months; January through September are 0[1-9]
2.2 October through December are 1[0-2]
3. dd, the day
3.1 The largest day of a month is 31
3.2 The smallest is 01
4. The separator
Note that both separators must be the same; a mixed form such as 2021.08-22 is not allowed
Based on this analysis, we can write
const checkDateRegexp = /^\d{4}([-\.\/])(?:0[1-9]|1[0-2])\1(?:0[1-9]|[12]\d|3[01])$/
console.log(checkDateRegexp.test('2021-08-22')) // true
console.log(checkDateRegexp.test('2021/08/22')) // true
console.log(checkDateRegexp.test('2021.08.22')) // true
console.log(checkDateRegexp.test('2021.08/22')) // false
console.log(checkDateRegexp.test('2021/08-22')) // false
\1 is a backreference to the first capture group ([-\.\/]), which ensures that both separators are the same
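Since the regex deliberately ignores month lengths and leap years, a string such as 2021-02-31 passes. If real calendar validity matters, one option is to combine the regex with a Date round trip. The following is a sketch under that assumption; the name `isRealDate` is hypothetical.

```javascript
// Hypothetical helper: the regex checks the shape, the Date object checks the calendar.
// Note: the Date constructor maps years 0-99 to 1900-1999, so those years are rejected here.
const isRealDate = (text) => {
  const match = /^(\d{4})([-./])(0[1-9]|1[0-2])\2(0[1-9]|[12]\d|3[01])$/.exec(text)
  if (!match) return false
  const [year, month, day] = [match[1], match[3], match[4]].map(Number)
  const date = new Date(year, month - 1, day)
  // An impossible date such as February 31 rolls over into March, so the parts no longer agree
  return date.getFullYear() === year &&
    date.getMonth() === month - 1 &&
    date.getDate() === day
}

console.log(isRealDate('2021-08-22')) // true
console.log(isRealDate('2021-02-31')) // false
console.log(isRealDate('2020/02/29')) // true
console.log(isRealDate('2021/02/29')) // false
console.log(isRealDate('2021.08/22')) // false
```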
15. Match hexadecimal color values
Match hexadecimal color values such as #ffbbad and #FFF
Regex result
const matchColorRegex = /#(?:[\da-fA-F]{6}|[\da-fA-F]{3})/g
Analysis process
A hexadecimal color value consists of two parts
- #
- 6 or 3 hexadecimal digits: the digits 0-9 and the letters a-f in either case
const matchColorRegex = /#(?:[\da-fA-F]{6}|[\da-fA-F]{3})/g
const colorString = '#12f3a1 #ffBabd #FFF #123 #586'
console.log(colorString.match(matchColorRegex))
// [ '#12f3a1', '#ffBabd', '#FFF', '#123', '#586' ]
We can't write the regex as /#(?:[\da-fA-F]{3}|[\da-fA-F]{6})/g, because alternation with | tries the branches from left to right and stops at the first one that matches. Matching '#12f3a1 #ffBabd #FFF #123 #586' with it gives [ '#12f', '#ffB', '#FFF', '#123', '#586' ]
16. Detect the URL protocol
Check whether a URL starts with the http or https protocol
This one is relatively simple, but it comes up often in daily work.
Regex result
const checkProtocol = /^https?:/
console.log(checkProtocol.test('https://juejin.cn/')) // true
console.log(checkProtocol.test('http://juejin.cn/')) // true
console.log(checkProtocol.test('//juejin.cn/')) // false
17. Detect Chinese
Check whether the string str consists entirely of Chinese characters
The key is knowing the Unicode range of Chinese characters. The regex below covers the basic range; to match characters outside it, add more ranges as alternatives.
Analysis process
const checkChineseRegex = /^[\u4E00-\u9FA5]+$/
console.log(checkChineseRegex.test('前端胖头鱼')) // true
console.log(checkChineseRegex.test('1前端胖头鱼')) // false
console.log(checkChineseRegex.test('前端胖头鱼2')) // false
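Instead of listing code point ranges by hand, engines that support Unicode property escapes (ES2018) can match by script. The sketch below is an alternative under that assumption; the name `checkHanRegex` is made up here. Note that \p{Script=Han} covers more than the basic range, including the extension blocks.

```javascript
// Hypothetical alternative: match by Unicode script. Requires the u flag.
const checkHanRegex = /^\p{Script=Han}+$/u

console.log(checkHanRegex.test('前端胖头鱼'))  // true
console.log(checkHanRegex.test('1前端胖头鱼')) // false

// U+20000 lies in CJK Extension B, outside the basic \u4E00-\u9FA5 range
const extensionChar = '\u{20000}'
console.log(checkHanRegex.test(extensionChar))           // true
console.log(/^[\u4E00-\u9FA5]+$/.test(extensionChar))    // false
```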
18. Match mobile phone numbers
Check whether a string follows the rules for mobile phone numbers
Timeliness
Mobile phone number rules change over time: carriers introduce new number ranges now and then, so the regex has to be kept up to date
Rules
For the specific rules, see the numbering plan for mobile terminals in mainland China
Analysis process
The regex is taken from ChinaMobilePhoneNumberRegex
const mobileRegex = /^(?:\+?86)?1(?:3\d{3}|5[^4\D]\d{2}|8\d{3}|7(?:[235-8]\d{2}|4(?:0\d|1[0-2]|9\d))|9[0-35-9]\d{2}|66\d{2})\d{6}$/
console.log(mobileRegex.test('18379867725')) // true
console.log(mobileRegex.test('123456789101')) // false
console.log(mobileRegex.test('+8618379867725')) // true
console.log(mobileRegex.test('8618379867725')) // true
Is there a good way to understand a long, complicated-looking regex like this?
A visualization tool can help take it apart.
mobileRegex can be divided into the following parts
- (?:\+?86)?: the country-code prefix; ?: marks a non-capturing group
- 1: all mobile phone numbers start with 1
- (a|b|c|...): the possible values of digits 2 through 5, enumerated with alternation |
- \d{6}: any 6 digits
Once taken apart, it turns out not to be complicated. The third part only looks long because there are so many possibilities, which it enumerates as alternatives. Once the number rules are clear, each group is easy to follow.
19. Add spaces before and after English words
Given a string that mixes letters and Chinese characters, use a regex to add a space before and after each English word.
For example: you说来是come,去是go => ` you 说来是 come ,去是 go `
Analysis process
All that is needed here is the position concept of \b
\b means a word boundary. Specifically, there are three rules
- The position between \w and \W
- The position between ^ and \w
- The position between \w and $
So:
The first word, you, falls under rule 2,
the second word, come, falls under rule 1,
the third word, go, falls under rule 3
const wordRegex = /\b/g
console.log('you说来是come,去是go'.replace(wordRegex, ' ')) // ` you 说来是 come ,去是 go `
20. Reverse the case of a string
Invert the case of each letter, for example, hello WORLD => HELLO world
Analysis process
The obvious approach is to decide the case from the character code and convert accordingly, but since this is a regex roundup, let's do it with a regex.
How can we tell whether a character is uppercase without looking at its character code? Convert it to uppercase and compare it with the original character: if they are equal, the original was already uppercase. For example
For the string x = `A`
'A'.toUpperCase() gives y = A
y === x
so x is an uppercase character
So the task can be solved like this
const stringCaseReverseReg = /[a-z]/ig
const string = 'hello WORLD'
const string2 = string.replace(stringCaseReverseReg, (char) => {
const upperStr = char.toUpperCase()
// Uppercase becomes lowercase, lowercase becomes uppercase
return upperStr === char ? char.toLowerCase() : upperStr
})
console.log(string2) // HELLO world
21. Match folder and file paths on Windows
The following paths must match
- C:\Documents\Newsletters\Summer2018.pdf
- C:\Documents\Newsletters\
- C:\Documents\Newsletters
- C:\
Regex result
const windowsPathRegex = /^[a-zA-Z]:\\(?:[^\\:*<>|"?\r\n/]+\\?)*(?:(?:[^\\:*<>|"?\r\n/]+)\.\w+)?$/;
Analysis process
A Windows path roughly consists of these parts
Drive letter:\folder\folder\file
- Drive letter: an English letter only, [a-zA-Z]:\\
- Folder name: must not contain certain special characters, may occur any number of times, and the trailing \ is optional, ([^\\:*<>|"?\r\n/]+\\?)*
- File name: ([^\\:*<>|"?\r\n/]+)\.\w+, but the file may be absent
const windowsPathRegex = /^[a-zA-Z]:\\(?:[^\\:*<>|"?\r\n/]+\\?)*(?:(?:[^\\:*<>|"?\r\n/]+)\.\w+)?$/;
console.log( windowsPathRegex.test("C:\\Documents\\Newsletters\\Summer2018.pdf") ); // true
console.log( windowsPathRegex.test("C:\\Documents\\Newsletters\\") ); // true
console.log( windowsPathRegex.test("C:\\Documents\\Newsletters") ); // true
console.log( windowsPathRegex.test("C:\\") ); // true
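Backslashes in ordinary JavaScript string literals must be doubled, and a forgotten one silently disappears (for example, "\N" is just "N"). One way to sidestep this in test data is the built-in String.raw tag, as in the sketch below. The variable names are made up for the example, and a raw template cannot end with a backslash, so only paths without a trailing \ are shown.

```javascript
const windowsPathRegex = /^[a-zA-Z]:\\(?:[^\\:*<>|"?\r\n/]+\\?)*(?:(?:[^\\:*<>|"?\r\n/]+)\.\w+)?$/

// String.raw keeps every backslash exactly as typed
const filePath = String.raw`C:\Documents\Newsletters\Summer2018.pdf`
const folderPath = String.raw`C:\Documents\Newsletters`
const badPath = String.raw`C:\Docu*ments\Newsletters`

console.log(filePath === 'C:\\Documents\\Newsletters\\Summer2018.pdf') // true
console.log(windowsPathRegex.test(filePath))   // true
console.log(windowsPathRegex.test(folderPath)) // true
console.log(windowsPathRegex.test(badPath))    // false, * is not allowed in a name
```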
22. Match an id (often needed when a crawler processes HTML)
Extract the id box from <div id="box">hello world</div>
Regex result
const matchIdRegexp = /id="([^"]*)"/
console.log(`
<div id="box">
hello world
</div>
`.match(matchIdRegexp)[1])
Analysis process
When writing a crawler, we often need to match DOM elements that meet certain conditions and then act on them. So how do we get box out of
<div id="box">
hello world
</div>
The first regex that comes to mind is probably id="(.*)"
const matchIdRegexp = /id="(.*)"/
console.log(`
<div id="box">
hello world
</div>
`.match(matchIdRegexp)[1])
But id="(.*)" easily leads to backtracking, which costs matching time. Can it be optimized?
Yes: just change . to [^"]. As soon as the regex meets a ", it considers that part of the match finished, and no backtracking occurs.
const matchIdRegexp = /id="([^"]*)"/
console.log(`
<div id="box">
hello world
</div>
`.match(matchIdRegexp)[1])
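Besides the extra backtracking, the greedy .* can also capture the wrong text when the tag carries more than one quoted attribute on the same line. The sample markup below is invented for illustration.

```javascript
// Invented sample: a second attribute follows id on the same line
const html = '<div id="box" class="container">hello world</div>'

// Greedy .* runs to the end of the line and backtracks to the LAST quote
console.log(html.match(/id="(.*)"/)[1])    // box" class="container

// [^"]* stops at the first closing quote
console.log(html.match(/id="([^"]*)"/)[1]) // box
```

So the negated character class is not only a performance tweak here; it also makes the capture correct for this kind of input.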
23. Match id, extended (get all the ids in the HTML of the Juejin home page)
Let's try to get the ids in bulk
Regex result
const idRegexp = /id="([^"]+)"/g
document.body.innerHTML
.match(idRegexp)
.map((idStr) => idStr.replace(idRegexp, '$1'))
24. A number from 0 to 150 that may end in .5, such as 145.5 (used to validate exam scores)
Regex result
const pointRegex = /^150$|^(?:[1-9]?\d|1[0-4]\d)(?:\.5)?$/
Analysis process
We can split this task into two parts
- Integer part: a one-digit integer, a two-digit integer, or a three-digit integer up to 150
- Decimal part: either .5 or nothing
First, write the integer part
// 1. How to match a one-digit number? /\d/
// 2. How to match a two-digit number? /[1-9]\d/
// 3. How to match one-digit and two-digit numbers together? /[1-9]?\d/
// 4. What about three-digit numbers below 150? /1[0-4]\d/
// Putting these together, the integer part can be written as
const pointRegex = /^150$|^(?:[1-9]?\d|1[0-4]\d)$/
console.log(pointRegex.test(0)) // true
console.log(pointRegex.test(10)) // true
console.log(pointRegex.test(100)) // true
console.log(pointRegex.test(110.5)) // false
console.log(pointRegex.test(150)) // true
Then add the decimal part
// The decimal part is relatively simple: /(?:\.5)?/. Combined, the whole regex is
const pointRegex = /^150$|^(?:[1-9]?\d|1[0-4]\d)(?:\.5)?$/
console.log(pointRegex.test(-1)) // false
console.log(pointRegex.test(0)) // true
console.log(pointRegex.test(10)) // true
console.log(pointRegex.test(100)) // true
console.log(pointRegex.test(110.5)) // true
console.log(pointRegex.test(150)) // true
console.log(pointRegex.test(151)) // false
25. Validate a version number
The version number must be in X.Y.Z format, where X, Y, and Z each consist of at least one digit
Regex result
// x.y.z
const versionRegexp = /^(?:\d+\.){2}\d+$/
console.log(versionRegexp.test('1.1.1')) // true
console.log(versionRegexp.test('1.000.1')) // true
console.log(versionRegexp.test('1.000.1.1')) // false
Farewell
There is still a long way to go before mastering regular expressions. I hope these analyses are helpful! If you find any errors in the article, or know a better way to write one of these regexes, you are welcome to point it out.