
Preface

Thousands separators for numbers, 3-4-4 formatting of mobile phone numbers, implementing a trim function, HTML escaping, reading URL query parameters... Do you often run into these in interviews and at work? Let's see how regular expressions can handle them all in one go!

1. Thousands separators for numeric prices

Turn 123456789 into 123,456,789

This question comes up frequently in both interviews and day-to-day work.

Regex result

'123456789'.replace(/(?!^)(?=(\d{3})+$)/g, ',') // 123,456,789


Supplement: thousands separators that also support decimals

[Image: carbon.png, code screenshot for the decimal case]
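
The decimal-support code in the original was a screenshot (carbon.png) and is not available as text. As a rough sketch of one possible approach (not necessarily the author's version), the integer part can be split off and grouped with the same regex. The helper name formatPrice is hypothetical, and the sketch assumes a non-negative number with at most one decimal point.

```javascript
// Hypothetical helper (not from the original article): groups only the
// integer part and leaves any decimal part untouched.
// Assumes a non-negative number with at most one decimal point.
const formatPrice = (value) => {
  const [integerPart, decimalPart] = String(value).split('.')
  const grouped = integerPart.replace(/(?!^)(?=(\d{3})+$)/g, ',')
  return decimalPart === undefined ? grouped : `${grouped}.${decimalPart}`
}

console.log(formatPrice('123456789.123')) // 123,456,789.123
console.log(formatPrice('123.456789')) // 123.456789
console.log(formatPrice(1234)) // 1,234
```

Splitting first keeps the $ anchor meaningful: the lookahead only ever sees the integer digits, so digits after the decimal point are never grouped.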

Analysis process

The problem boils down to two requirements:

  1. Working from the end of the string, add a comma before every three digits
  2. Do not add a comma at the very beginning (for example, 123 must not become ,123)

Does this fit (?=p)? Here p stands for a group of three digits, and the position where the comma goes is exactly the position that (?=p) matches.

Step 1: produce the first comma


let price = '123456789'
let priceReg = /(?=\d{3}$)/

console.log(price.replace(priceReg, ',')) // 123456,789

Step 2: produce all the commas

To produce all the commas, the main problem is how to express "a group of three digits, repeated", that is, a multiple of 3 digits. Parentheses in a regex turn a pattern p into a single unit, so with a quantifier on the group we can write


let price = '123456789'
let priceReg = /(?=(\d{3})+$)/g

console.log(price.replace(priceReg, ',')) // ,123,456,789

Step 3: remove the leading comma

This almost meets the requirements, but a comma also appears at the very start. How do we get rid of it? The negative lookahead (?!p) fits this case. Combining the two: add a comma before every three digits counting from the end, but not at the start position ^.


let price = '123456789'
let priceReg = /(?!^)(?=(\d{3})+$)/g

console.log(price.replace(priceReg, ',')) // 123,456,789

2. Mobile phone number 3-4-4 split

Convert the phone number 18379836654 to 183-7983-6654

Formatting phone numbers comes up often when collecting form input.

Regex result

let mobile = '18379836654' 
let mobileReg = /(?=(\d{4})+$)/g 

console.log(mobile.replace(mobileReg, '-')) // 183-7983-6654

Analysis process

With the thousands-separator technique above, this one is much easier. Working from the end of the string, find this position:

The position before every four digits, and replace it with -. Since the number has 11 digits, no match falls at the start, so (?!^) is not needed here.


let mobile = '18379836654'
let mobileReg = /(?=(\d{4})+$)/g

console.log(mobile.replace(mobileReg, '-')) // 183-7983-6654

3. Mobile phone number 3-4-4 split extension

Convert the mobile phone number 18379836654 to 183-7983-6654 while also handling partial input, so that the following conversions hold
  1. 123 => 123
  2. 1234 => 123-4
  3. 12345 => 123-45
  4. 123456 => 123-456
  5. 1234567 => 123-4567
  6. 12345678 => 123-4567-8
  7. 123456789 => 123-4567-89
  8. 12345678911 => 123-4567-8911

This is the situation we run into while a user is typing a phone number: the input has to be reformatted on every keystroke.

Regex result


const formatMobile = (mobile) => {
  return String(mobile).slice(0,11)
      .replace(/(?<=\d{3})\d+/, ($0) => '-' + $0)
      .replace(/(?<=[\d-]{8})\d{1,4}/, ($0) => '-' + $0)
}

console.log(formatMobile(18379836654))

Analysis process

(?=p) is not suitable here: for example, 1234 would become -1234. We need another approach.
Is there another regex feature that handles this case well? Yes: the lookbehind (?<=p).

Step 1: insert the first -

const formatMobile = (mobile) => {
  // Prefix the digits that follow the first three with "-"
  return String(mobile).replace(/(?<=\d{3})\d+/, ($0) => '-' + $0)
}

console.log(formatMobile(123)) // 123
console.log(formatMobile(1234)) // 123-4

Step 2: insert the second -

Next comes the second -. After the first step the string starts with 123-4567, so the second - goes right after those first 8 characters, which is what (?<=[\d-]{8}) expresses.

const formatMobile = (mobile) => {
  return String(mobile).slice(0,11)
      .replace(/(?<=\d{3})\d+/, ($0) => '-' + $0)
      .replace(/(?<=[\d-]{8})\d{1,4}/, ($0) => '-' + $0)
}

console.log(formatMobile(123)) // 123
console.log(formatMobile(1234)) // 123-4
console.log(formatMobile(12345)) // 123-45
console.log(formatMobile(123456)) // 123-456
console.log(formatMobile(1234567)) // 123-4567
console.log(formatMobile(12345678)) // 123-4567-8
console.log(formatMobile(123456789)) // 123-4567-89
console.log(formatMobile(12345678911)) // 123-4567-8911
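
Lookbehind (?<=p) is an ES2018 feature, so older JavaScript engines may not support it. As a hedged alternative sketch, the same 3-4-4 formatting can be done with plain capture groups. The name formatMobileNoLookbehind is hypothetical, and stripping non-digit characters first is an added assumption (useful when the input already contains dashes from a previous formatting pass).

```javascript
// Hypothetical alternative (not from the original article): 3-4-4 formatting
// with capture groups instead of lookbehind.
const formatMobileNoLookbehind = (mobile) => {
  // Assumption: drop anything that is not a digit, then keep at most 11 digits
  const digits = String(mobile).replace(/\D/g, '').slice(0, 11)
  // Groups 2 and 3 are optional, so partial input still matches
  return digits.replace(
    /^(\d{3})(\d{1,4})?(\d{1,4})?$/,
    (match, first, second, third) => [first, second, third].filter(Boolean).join('-')
  )
}

console.log(formatMobileNoLookbehind(123)) // 123
console.log(formatMobileNoLookbehind(1234)) // 123-4
console.log(formatMobileNoLookbehind(12345678)) // 123-4567-8
console.log(formatMobileNoLookbehind('183-7983-6654')) // 183-7983-6654
```

Inputs shorter than three digits do not match the regex and are returned unchanged.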

4. Validate a password

The password is 6-12 characters long and consists of digits, lowercase letters and uppercase letters, and it must contain at least two of these character types

Regex result

let reg = /(((?=.*\d)((?=.*[a-z])|(?=.*[A-Z])))|(?=.*[a-z])(?=.*[A-Z]))^[a-zA-Z\d]{6,12}$/

console.log(reg.test('123456')) // false
console.log(reg.test('aaaaaa')) // false
console.log(reg.test('AAAAAAA')) // false
console.log(reg.test('1a1a1a')) // true
console.log(reg.test('1A1A1A')) // true
console.log(reg.test('aAaAaA')) // true
console.log(reg.test('1aA1aA1aA')) // true

Analysis process

The problem has three conditions

  1. The password is 6-12 characters long
  2. It consists of digits, lowercase letters and uppercase letters
  3. It must contain at least two of these character types

Step 1: write the regex for conditions 1 and 2

let reg = /^[a-zA-Z\d]{6,12}$/

Step 2: require a certain kind of character (digit, lowercase letter, uppercase letter)

let reg = /(?=.*\d)/
// This regex matches a position.
// The position must be followed by `any number of characters, then a digit`.
// Note that what it matches is a position and nothing else.
// (?=.*\d) is often used as a constraint.

console.log(reg.test('hello')) // false
console.log(reg.test('hello1')) // true
console.log(reg.test('hel2lo')) // true

// The other character types work the same way

Step 3: write the complete regex

Requiring two character types gives the following four combinations

  1. Digits and lowercase letters
  2. Digits and uppercase letters
  3. Lowercase and uppercase letters
  4. Digits, lowercase letters and uppercase letters together (already covered by the first three)

// Combinations 1 and 2
// let reg = /((?=.*\d)((?=.*[a-z])|(?=.*[A-Z])))/
// Combination 3
// let reg = /(?=.*[a-z])(?=.*[A-Z])/
// Combinations 1, 2 and 3
// let reg = /((?=.*\d)((?=.*[a-z])|(?=.*[A-Z])))|(?=.*[a-z])(?=.*[A-Z])/
// All conditions of the problem
let reg = /(((?=.*\d)((?=.*[a-z])|(?=.*[A-Z])))|(?=.*[a-z])(?=.*[A-Z]))^[a-zA-Z\d]{6,12}$/


console.log(reg.test('123456')) // false
console.log(reg.test('aaaaaa')) // false
console.log(reg.test('AAAAAAA')) // false
console.log(reg.test('1a1a1a')) // true
console.log(reg.test('1A1A1A')) // true
console.log(reg.test('aAaAaA')) // true
console.log(reg.test('1aA1aA1aA')) // true
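
For comparison, here is a sketch of the same rule written as one simple regex plus a count of character types. It trades the single large pattern for readability. The name isValidPassword is hypothetical and this is not the article's solution, just an alternative under the same three conditions.

```javascript
// Hypothetical alternative: check length and allowed characters with one
// regex, then count how many character types actually occur.
const isValidPassword = (password) => {
  if (!/^[a-zA-Z\d]{6,12}$/.test(password)) return false
  const typeCount = [/\d/, /[a-z]/, /[A-Z]/].filter((re) => re.test(password)).length
  return typeCount >= 2
}

console.log(isValidPassword('123456')) // false
console.log(isValidPassword('1a1a1a')) // true
console.log(isValidPassword('aAaAaA')) // true
console.log(isValidPassword('1aA1aA1aA')) // true
```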

5. Extract consecutive repeated characters

Extract the repeated substrings, for example from 12323454545666 extract ['23', '45', '6']

Regex result

const collectRepeatStr = (str) => {
  let repeatStrs = []
  const repeatRe = /(.+)\1+/g
  
  str.replace(repeatRe, ($0, $1) => {
    $1 && repeatStrs.push($1)
  })
  
  return repeatStrs
}

Analysis process

The problem contains two key pieces of information

  1. The characters repeat consecutively
  2. The length of the repeated unit is not fixed (for example, 23 and 45 are two characters long, and 6 is one)

So what is consecutive repetition?

11 is a consecutive repeat, so is 22, and of course 111. In other words, some character X must be immediately followed by X again. If we knew X was 1, then /11+/ would match, but the point is that X is not known in advance. What can we do?

Backreferences solve this easily.

Step 1: one character repeated

// X can be written as . (any character) and captured with parentheses; the backreference \1 right after it expresses the consecutive repeat
let repeatRe = /(.)\1/

console.log(repeatRe.test('11')) // true
console.log(repeatRe.test('22')) // true
console.log(repeatRe.test('333')) // true
console.log(repeatRe.test('123')) // false

Step 2: n characters repeated

Because we do not know whether the unit is one character (11) or several (45 45), the group needs + inside the parentheses to allow n characters, and the backreference itself can occur more than once, as in 45 45 45


let repeatRe = /(.+)\1+/

console.log(repeatRe.test('11')) // true
console.log(repeatRe.test('22')) // true
console.log(repeatRe.test('333')) // true
console.log(repeatRe.test('454545')) // true
console.log(repeatRe.test('124')) // false

Step 3: extract all consecutively repeated substrings


const collectRepeatStr = (str) => {
  let repeatStrs = []
  const repeatRe = /(.+)\1+/g
  // replace is often used to extract data rather than to substitute
  str.replace(repeatRe, ($0, $1) => {
    $1 && repeatStrs.push($1)
  })
  
  return repeatStrs
}


console.log(collectRepeatStr('11')) // ["1"]
console.log(collectRepeatStr('12323')) // ["23"]
console.log(collectRepeatStr('12323454545666')) // ["23", "45", "6"]
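
Using replace purely for its side effect works, but String.prototype.matchAll (ES2020) expresses the same extraction more directly. The sketch below assumes an engine that supports matchAll; the function name is hypothetical.

```javascript
// Same regex as above, but collecting capture group 1 with matchAll
// instead of using replace for its callback.
const collectRepeatStrWithMatchAll = (str) =>
  Array.from(str.matchAll(/(.+)\1+/g), (match) => match[1])

console.log(collectRepeatStrWithMatchAll('11')) // [ '1' ]
console.log(collectRepeatStrWithMatchAll('12323')) // [ '23' ]
console.log(collectRepeatStrWithMatchAll('12323454545666')) // [ '23', '45', '6' ]
```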

6. Implement a trim function

Remove the leading and trailing whitespace of a string

Regex result

// Approach 1: remove the whitespace (the two definitions below are alternatives; use one or the other)
const trim = (str) => {
  return str.replace(/^\s*|\s*$/g, '')    
}
// Approach 2: extract the non-whitespace part
const trim = (str) => {
  return str.replace(/^\s*(.*?)\s*$/g, '$1')    
}

Analysis process

The first idea that comes to mind is to delete the whitespace and keep the rest, but we can also turn it around and extract the non-whitespace part, ignoring the whitespace. Let's implement trim both ways

Method 1: remove the whitespace


const trim = (str) => {
  return str.replace(/^\s*|\s*$/g, '')    
}

console.log(trim('  前端胖头鱼')) // 前端胖头鱼
console.log(trim('前端胖头鱼  ')) // 前端胖头鱼 
console.log(trim('  前端胖头鱼  ')) // 前端胖头鱼
console.log(trim('  前端 胖头鱼  ')) // 前端 胖头鱼

Method 2: extract the non-whitespace part


const trim = (str) => {
  return str.replace(/^\s*(.*?)\s*$/g, '$1')    
}

console.log(trim('  前端胖头鱼')) // 前端胖头鱼
console.log(trim('前端胖头鱼  ')) // 前端胖头鱼 
console.log(trim('  前端胖头鱼  ')) // 前端胖头鱼
console.log(trim('  前端 胖头鱼  ')) // 前端 胖头鱼
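
One difference between the two methods shows up on strings that contain a line break: . does not match line terminators by default, so the extraction regex cannot span two lines and leaves the string untouched. The sketch below compares both methods with the built-in String.prototype.trim and shows a variant using the s (dotAll, ES2018) flag. The three function names are hypothetical.

```javascript
const trimByRemoving = (str) => str.replace(/^\s*|\s*$/g, '')
const trimByExtracting = (str) => str.replace(/^\s*(.*?)\s*$/g, '$1')
// Variant with the s (dotAll) flag so that . also matches line breaks
const trimByExtractingDotAll = (str) => str.replace(/^\s*(.*?)\s*$/s, '$1')

const text = '  line one\nline two  '

console.log(trimByRemoving(text) === text.trim()) // true
console.log(trimByExtracting(text) === text.trim()) // false: no match, string unchanged
console.log(trimByExtractingDotAll(text) === text.trim()) // true
```

For single-line input all three behave the same, so the examples above are unaffected.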

7. HTML Escaping

One way to prevent XSS attacks is HTML escaping: the characters below are converted into their equivalent entities. Unescaping is the reverse: converting the escaped entities back into the corresponding characters.

Character    Escaped entity
&            &amp;
<            &lt;
>            &gt;
"            &quot;
'            &#x27; (the code below uses the equivalent decimal form &#39;)

Regex result


const escape = (string) => {
  const escapeMaps = {
    '&': 'amp',
    '<': 'lt',
    '>': 'gt',
    '"': 'quot',
    "'": '#39'
  }
  const escapeRegexp = new RegExp(`[${Object.keys(escapeMaps).join('')}]`, 'g')

  return string.replace(escapeRegexp, (match) => `&${escapeMaps[match]};`)
}

Analysis process

Globally match &, <, >, " and ', then replace each one according to the table above. When a position can be any one of several characters, a character class is the usual tool, here [&<>"']

const escape = (string) => {
  const escapeMaps = {
    '&': 'amp',
    '<': 'lt',
    '>': 'gt',
    '"': 'quot',
    "'": '#39'
  }
  // Same effect as /[&<>"']/g here
  const escapeRegexp = new RegExp(`[${Object.keys(escapeMaps).join('')}]`, 'g')

  return string.replace(escapeRegexp, (match) => `&${escapeMaps[match]};`)
}


console.log(escape(`
  <div>
    <p>hello world</p>
  </div>
`))

/*
&lt;div&gt;
  &lt;p&gt;hello world&lt;/p&gt;
&lt;/div&gt;

*/

8. HTML unescaping

Regex result

Unescaping is just the reverse process, so it is easy to write

const unescape = (string) => {
  const unescapeMaps = {
    'amp': '&',
    'lt': '<',
    'gt': '>',
    'quot': '"',
    '#39': "'"
  }

  const unescapeRegexp = /&([^;]+);/g

  return string.replace(unescapeRegexp, (match, unescapeKey) => {
    return unescapeMaps[ unescapeKey ] || match
  })
}


console.log(unescape(`
  &lt;div&gt;
    &lt;p&gt;hello world&lt;/p&gt;
  &lt;/div&gt;
`))

/*
<div>
  <p>hello world</p>
</div>
*/
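
A quick way to sanity-check the pair is a round trip: unescaping an escaped string should give back the original. The sketch below restates both functions compactly under the hypothetical names escapeHtml and unescapeHtml (the globals escape and unescape already exist in JavaScript, so different names avoid confusion) and derives the reverse map from the forward one.

```javascript
const entityByChar = { '&': 'amp', '<': 'lt', '>': 'gt', '"': 'quot', "'": '#39' }
// Build the reverse lookup table from the forward one
const charByEntity = Object.fromEntries(
  Object.entries(entityByChar).map(([char, entity]) => [entity, char])
)

const escapeHtml = (string) =>
  string.replace(/[&<>"']/g, (char) => `&${entityByChar[char]};`)
const unescapeHtml = (string) =>
  string.replace(/&([^;]+);/g, (match, entity) => charByEntity[entity] || match)

const source = `<a href="/?a=1&b=2">it's</a>`
const escaped = escapeHtml(source)

console.log(escaped) // &lt;a href=&quot;/?a=1&amp;b=2&quot;&gt;it&#39;s&lt;/a&gt;
console.log(unescapeHtml(escaped) === source) // true
```

Because each function makes a single pass, text that already contains an entity such as &lt; survives the round trip as well: it becomes &amp;lt; and is restored to &lt;.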

9. Camelize strings

Convert strings to camel case according to the following rules
1. foo Bar => fooBar

2. foo-bar---- => fooBar

3. foo_bar__ => fooBar

Regex result

const camelCase = (string) => {
  const camelCaseRegex = /[-_\s]+(.)?/g

  return string.replace(camelCaseRegex, (match, char) => {
    return char ? char.toUpperCase() : ''
  })
}

Analysis process

The rules of the problem are

  1. Each word may be preceded by zero or more of -, space or _, for example ( Foo , --foo , __FOO , _BAR , Bar )
  2. The -, space or _ may be followed by nothing at all, for example ( __ , -- )

const camelCase = (string) => {
  // Note: the ? in (.)? is there to satisfy rule 2
  const camelCaseRegex = /[-_\s]+(.)?/g

  return string.replace(camelCaseRegex, (match, char) => {
    return char ? char.toUpperCase() : ''
  })
}

console.log(camelCase('foo Bar')) // fooBar
console.log(camelCase('foo-bar--')) // fooBar
console.log(camelCase('foo_bar__')) // fooBar

10. Capitalize the first letter of each word and lowercase the rest

For example, hello world is converted to Hello World

Regex result


const capitalize = (string) => {
  const capitalizeRegex = /(?:^|\s+)\w/g

  return string.toLowerCase().replace(capitalizeRegex, (match) => match.toUpperCase())
}

Analysis process

Find the first letter of each word and convert it to uppercase. A word is preceded either by the start of the string or by one or more whitespace characters.


const capitalize = (string) => {
  const capitalizeRegex = /(?:^|\s+)\w/g

  return string.toLowerCase().replace(capitalizeRegex, (match) => match.toUpperCase())
}

console.log(capitalize('hello world')) // Hello World
console.log(capitalize('hello WORLD')) // Hello World

11. Get the image addresses of all img tags in the webpage

The address must be an online link such as https://xxx.juejin.com/a.jpg , http://xxx.juejin.com/a.jpg , //xxx.juejin.com/a.jpg

Analysis process

Anyone who writes crawlers will be familiar with matching the URLs of img tags.

The problem imposes two restrictions

  1. The tag must be an img tag
  2. The address must be an online link, so base64 images need to be filtered out

Let's look at the result directly and then walk through what the regex means.


const matchImgs = (sHtml) => {
  const imgUrlRegex = /<img[^>]+src="((?:https?:)?\/\/[^"]+)"[^>]*?>/gi
  let matchImgUrls = []
  
  sHtml.replace(imgUrlRegex, (match, $1) => {
    $1 && matchImgUrls.push($1)
  })

  return matchImgUrls
}

Let's look at the regex part by part

  1. The part between <img and src: anything is fine as long as it is not >, hence [^>]+
  2. The part in parentheses is the URL we want to extract; it is a capture group, so it is easy to read directly

    2.1 (?:https?:) matches the protocol prefix http: or https:

    2.2 The ? after that group makes the protocol optional, so links of the form //xxx.juejin.com/a.jpg are supported

    2.3 Two slashes follow

    2.4 The link sits inside the double quotes of src="", so [^"]+ matches everything except "

  3. Then comes the part between the closing " and the > that ends the img tag: anything except >, hence [^>]*?
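
To see the parts working together, here is the same function run against a small HTML string. The sample markup is made up for illustration; the third tag carries a base64 data URI and should be skipped.

```javascript
const matchImgs = (sHtml) => {
  const imgUrlRegex = /<img[^>]+src="((?:https?:)?\/\/[^"]+)"[^>]*?>/gi
  let matchImgUrls = []

  sHtml.replace(imgUrlRegex, (match, $1) => {
    $1 && matchImgUrls.push($1)
  })

  return matchImgUrls
}

// Made-up sample markup (an assumption for this example)
const sampleHtml = `
  <img src="https://xxx.juejin.com/a.jpg">
  <img class="cover" src="//xxx.juejin.com/b.png" alt="cover">
  <img src="data:image/png;base64,iVBORw0KGgo=">
`

console.log(matchImgs(sampleHtml))
// [ 'https://xxx.juejin.com/a.jpg', '//xxx.juejin.com/b.png' ]
```

Note that the regex only handles double-quoted src attributes; single-quoted or unquoted values would need an extended pattern.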

Trying it out

Go to Zhihu, open the console and run it; the result matches expectations.

12. Get URL query parameters by name

Regex result

const getQueryByName = (name) => {
  const queryNameRegex = new RegExp(`[?&]${name}=([^&]*)(?:&|$)`)
  const queryNameMatch = window.location.search.match(queryNameRegex)
  // Values are usually decoded with decodeURIComponent
  return queryNameMatch ? decodeURIComponent(queryNameMatch[1]) : ''
}

Analysis process

On the URL query string, name= may appear

  1. right after the question mark: ?name=前端胖头鱼&sex=boy
  2. in the last position: ?sex=boy&name=前端胖头鱼
  3. somewhere in between: ?sex=boy&name=前端胖头鱼&age=100

So handling three things is enough to extract the value with a regex

  1. name can only be preceded by ? or &
  2. The value can be anything except &
  3. The value can only be followed by & or the end of the string

const getQueryByName = (name) => {
  const queryNameRegex = new RegExp(`[?&]${name}=([^&]*)(?:&|$)`)
  const queryNameMatch = window.location.search.match(queryNameRegex)
  // Values are usually decoded with decodeURIComponent
  return queryNameMatch ? decodeURIComponent(queryNameMatch[1]) : ''
}
// 1. name comes first
// https://juejin.cn/?name=前端胖头鱼&sex=boy
console.log(getQueryByName('name')) // 前端胖头鱼

// 2. name comes last
// https://juejin.cn/?sex=boy&name=前端胖头鱼
console.log(getQueryByName('name')) // 前端胖头鱼


// 3. name is in the middle
// https://juejin.cn/?sex=boy&name=前端胖头鱼&age=100
console.log(getQueryByName('name')) // 前端胖头鱼
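
Since the function above reads window.location.search, it only runs in a browser. The sketch below takes the query string as a parameter so it can be tried anywhere, and compares the result with the built-in URLSearchParams. The name getQueryFromSearch and the sample query string are assumptions for this example; the sketch also assumes the parameter name contains no regex metacharacters, because it is interpolated into the pattern unescaped.

```javascript
// Hypothetical variant: same regex, but the query string is passed in
const getQueryFromSearch = (search, name) => {
  const match = search.match(new RegExp(`[?&]${name}=([^&]*)(?:&|$)`))
  return match ? decodeURIComponent(match[1]) : ''
}

const search = '?sex=boy&name=fat%20fish&age=100'

console.log(getQueryFromSearch(search, 'name')) // fat fish
console.log(getQueryFromSearch(search, 'age')) // 100
console.log(new URLSearchParams(search).get('name')) // fat fish
```

In practice URLSearchParams is usually the simpler choice; the regex version remains a good exercise in anchoring a match with [?&] on the left and (?:&|$) on the right.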

13. Match the 24-hour system time

Determine whether a time is valid in the 24-hour system. The following inputs must match
  1. 01:14
  2. 1:14
  3. 1:1
  4. 23:59

Regex result

const check24TimeRegexp = /^(?:(?:0?|1)\d|2[0-3]):(?:0?|[1-5])\d$/

Analysis process

The hours and the minutes of a 24-hour time each have their own constraints

Hours

  1. The first digit can be 0, 1 or 2
  2. The second digit

    2.1 When the first digit is 0 or 1, the second digit can be any digit

    2.2 When the first digit is 2, the second digit can only be 0, 1, 2 or 3

Minutes

  1. The first digit can be 0, 1, 2, 3, 4 or 5
  2. The second digit can be any digit

Step 1: write a regex that satisfies rules 1 and 4

const check24TimeRegexp = /^(?:[01]\d|2[0-3]):[0-5]\d$/

console.log(check24TimeRegexp.test('01:14')) // true
console.log(check24TimeRegexp.test('23:59')) // true
console.log(check24TimeRegexp.test('23:60')) // false

console.log(check24TimeRegexp.test('1:14')) // false, but this should be supported
console.log(check24TimeRegexp.test('1:1')) // false, but this should be supported

Step 2: allow single-digit hours and minutes

const check24TimeRegexp = /^(?:(?:0?|1)\d|2[0-3]):(?:0?|[1-5])\d$/

console.log(check24TimeRegexp.test('01:14')) // true
console.log(check24TimeRegexp.test('23:59')) // true
console.log(check24TimeRegexp.test('23:60')) // false

console.log(check24TimeRegexp.test('1:14')) // true
console.log(check24TimeRegexp.test('1:1')) // true
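
The groups (?:0?|1) and (?:0?|[1-5]) each mean "an optional leading digit", so they can also be written as the optional character classes [01]? and [0-5]?. The sketch below checks that this shorter spelling agrees with the regex above on a handful of inputs; compactTimeRegexp is a hypothetical name, not part of the article.

```javascript
const check24TimeRegexp = /^(?:(?:0?|1)\d|2[0-3]):(?:0?|[1-5])\d$/
// Hypothetical shorter spelling of the same pattern
const compactTimeRegexp = /^(?:[01]?\d|2[0-3]):[0-5]?\d$/

const samples = ['01:14', '1:14', '1:1', '23:59', '23:60', '24:00', '7:5', '12:345']
const disagreements = samples.filter(
  (sample) => check24TimeRegexp.test(sample) !== compactTimeRegexp.test(sample)
)

console.log(disagreements) // []
```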

14. Match Date Format

The format must be yyyy-mm-dd, yyyy.mm.dd or yyyy/mm/dd, for example 2021-08-22 , 2021.08.22 , 2021/08/22 . Leap years do not need to be considered

Regex result

const checkDateRegexp = /^\d{4}([-\.\/])(?:0[1-9]|1[0-2])\1(?:0[1-9]|[12]\d|3[01])$/

Analysis process

The date format has three parts

  1. yyyy, the year: any four digits, \d{4}
  2. mm, the month

    2.1 A year has only 12 months; January to September are 0[1-9]

    2.2 October to December are 1[0-2]

  3. dd, the day

    3.1 The largest day of a month is the 31st

    3.2 The smallest is the 1st

Separator

Note that the two separators must be the same; a mixed form such as 2021.08-22 is not allowed

Based on the above analysis, we can write

const checkDateRegexp = /^\d{4}([-\.\/])(?:0[1-9]|1[0-2])\1(?:0[1-9]|[12]\d|3[01])$/

console.log(checkDateRegexp.test('2021-08-22')) // true
console.log(checkDateRegexp.test('2021/08/22')) // true
console.log(checkDateRegexp.test('2021.08.22')) // true
console.log(checkDateRegexp.test('2021.08/22')) // false
console.log(checkDateRegexp.test('2021/08-22')) // false

The \1 in the regex is a backreference to the first group ([-\.\/]), which ensures that both separators are the same
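
The problem explicitly leaves leap years out, so the regex accepts dates such as 2021-02-31. If real calendar validation is ever needed, one possible extension (an assumption beyond the original problem) is to let the regex check the shape and then let Date check the day. The name isRealDate is hypothetical; note that the separator becomes group 2 here because the year is captured as group 1.

```javascript
// Hypothetical extension: regex for the format, Date for the calendar
const isRealDate = (text) => {
  const match = text.match(/^(\d{4})([-./])(0[1-9]|1[0-2])\2(0[1-9]|[12]\d|3[01])$/)
  if (!match) return false

  const year = Number(match[1])
  const month = Number(match[3])
  const day = Number(match[4])

  // setUTCFullYear avoids the two-digit-year special case of Date.UTC
  const date = new Date(0)
  date.setUTCFullYear(year, month - 1, day)

  // An impossible day (such as February 31) rolls over into the next month
  return date.getUTCMonth() === month - 1 && date.getUTCDate() === day
}

console.log(isRealDate('2021.08.22')) // true
console.log(isRealDate('2021-02-31')) // false
console.log(isRealDate('2020-02-29')) // true
console.log(isRealDate('2021.08/22')) // false
```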


15. Match the hexadecimal color value

Match the hexadecimal color values #ffbbad and #FFF

Regex result

const matchColorRegex = /#(?:[\da-fA-F]{6}|[\da-fA-F]{3})/g

Analysis process

A hexadecimal color value consists of two parts

  1. #
  2. Six or three hexadecimal digits: 0-9 and a-f in either case

const matchColorRegex = /#(?:[\da-fA-F]{6}|[\da-fA-F]{3})/g
const colorString = '#12f3a1 #ffBabd #FFF #123 #586'

console.log(colorString.match(matchColorRegex))
// [ '#12f3a1', '#ffBabd', '#FFF', '#123', '#586' ]

We cannot write the regex as /#(?:[\da-fA-F]{3}|[\da-fA-F]{6})/g , because alternation with | tries the branches from left to right and stops at the first one that matches. With that ordering, matching '#12f3a1 #ffBabd #FFF #123 #586' gives [ '#12f', '#ffB', '#FFF', '#123', '#586' ]
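
The effect of branch order can be seen side by side in the sketch below, which uses the hex digit class [\da-fA-F]. The third regex is an optional extra and an assumption beyond the article: a negative lookahead that refuses a match followed by another hex digit, which makes the result independent of branch order.

```javascript
const sixFirst = /#(?:[\da-fA-F]{6}|[\da-fA-F]{3})/g
const threeFirst = /#(?:[\da-fA-F]{3}|[\da-fA-F]{6})/g
// (?![\da-fA-F]) rejects a match that is followed by another hex digit,
// so the three-digit branch cannot cut a six-digit value short.
const threeFirstBounded = /#(?:[\da-fA-F]{3}|[\da-fA-F]{6})(?![\da-fA-F])/g

const colors = '#12f3a1 #FFF'

console.log(colors.match(sixFirst)) // [ '#12f3a1', '#FFF' ]
console.log(colors.match(threeFirst)) // [ '#12f', '#FFF' ]
console.log(colors.match(threeFirstBounded)) // [ '#12f3a1', '#FFF' ]
```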

16. Detect URL prefix

Check whether a URL starts with the http or https protocol

This one is simple, but it comes up often in daily work.

Regex result


const checkProtocol = /^https?:/

console.log(checkProtocol.test('https://juejin.cn/')) // true
console.log(checkProtocol.test('http://juejin.cn/')) // true
console.log(checkProtocol.test('//juejin.cn/')) // false


17. Test for Chinese characters

Check whether the string str consists entirely of Chinese characters

The key is knowing the Unicode range of Chinese characters (see a reference on the Unicode ranges of Chinese characters). The basic range is \u4E00-\u9FA5; to match characters outside the basic set, add more ranges or alternation branches.

Analysis process



const checkChineseRegex = /^[\u4E00-\u9FA5]+$/

console.log(checkChineseRegex.test('前端胖头鱼')) // true
console.log(checkChineseRegex.test('1前端胖头鱼')) // false
console.log(checkChineseRegex.test('前端胖头鱼2')) // false

18. Match mobile phone number

Check whether a string is a valid mobile phone number

Timeliness

Phone number rules change over time: carriers occasionally introduce new number segments, so the regex is time-sensitive too and needs to be updated accordingly

Rules

For the specific rules, see the reference on mobile phone numbers in mainland China

Analysis process

The regex is taken from ChinaMobilePhoneNumberRegex


const mobileRegex = /^(?:\+?86)?1(?:3\d{3}|5[^4\D]\d{2}|8\d{3}|7(?:[235-8]\d{2}|4(?:0\d|1[0-2]|9\d))|9[0-35-9]\d{2}|66\d{2})\d{6}$/

console.log(mobileRegex.test('18379867725')) // true
console.log(mobileRegex.test('123456789101')) // false
console.log(mobileRegex.test('+8618379867725')) // true
console.log(mobileRegex.test('8618379867725')) // true

When we meet a long, complicated-looking regex, is there a good way to understand it?

Visualization tools can help us take a regex apart.

mobileRegex can be divided into the following parts

  1. (?:\+?86)? : the country-code prefix; ?: marks a non-capturing group
  2. 1: all mobile phone numbers start with 1
  3. (a|b|c|...): the possible values of digits 2 to 5, listed one by one as alternation branches
  4. \d{6}: any 6 digits

Once it is taken apart it turns out not to be complicated. Only the third part has many possibilities, so it uses many alternation branches. As long as the phone number rules are clear, the pattern inside each group is not hard.

19. Add spaces before and after English words

In a string made up of letters and Chinese characters, use a regex to add a space before and after each English word.
For example: you说来是come,去是go => ` you 说来是 come ,去是 go `

Analysis process

All that is needed is the position concept \b. \b means a word boundary, and specifically it covers three cases

  1. The position between \w and \W
  2. The position between ^ and \w
  3. The position between \w and $

So:

The first word, you, matches case 2 (at its start) and case 1 (at its end),

The second word, come, matches case 1,

The third word, go, matches case 1 (at its start) and case 3 (at its end)


const wordRegex = /\b/g

console.log('you说来是come,去是go'.replace(wordRegex, ' ')) // ` you 说来是 come ,去是 go `

20. Reverse the case of a string

Invert the case of a string, for example hello WORLD => HELLO world

Analysis process

The obvious approach is to decide the case from the ASCII code and convert accordingly, but since this is a regex roundup, let's do it with a regex.

How can we tell whether a character is uppercase without looking at its ASCII code? Convert it to uppercase and compare it with the original character: if they are equal, the original character was already uppercase. For example

For the string x = `A`
    
'A'.toUpperCase() gives y = A

y === x

so x is an uppercase character

So the solution can be written like this


const stringCaseReverseReg = /[a-z]/ig
const string = 'hello WORLD'

const string2 = string.replace(stringCaseReverseReg, (char) => {
  const upperStr = char.toUpperCase()
  // Uppercase becomes lowercase, lowercase becomes uppercase
  return upperStr === char ? char.toLowerCase() : upperStr
})

console.log(string2) // HELLO world

21. Match folder and file paths on Windows

The following paths must match
  1. C:\Documents\Newsletters\Summer2018.pdf
  2. C:\Documents\Newsletters\
  3. C:\Documents\Newsletters
  4. C:\

Regex result


const windowsPathRegex = /^[a-zA-Z]:\\(?:[^\\:*<>|"?\r\n/]+\\?)*(?:(?:[^\\:*<>|"?\r\n/]+)\.\w+)?$/;

Analysis process

A Windows path is roughly made up of these parts

Drive letter:\folder\folder\file

  1. Drive letter: a single English letter, [a-zA-Z]:\\
  2. Folder name: contains none of the reserved characters and may appear any number of times; the trailing \ is optional, ([^\\:*<>|"?\r\n/]+\\?)*
  3. File name: ([^\\:*<>|"?\r\n/]+)\.\w+ , but the file part may be absent

const windowsPathRegex = /^[a-zA-Z]:\\(?:[^\\:*<>|"?\r\n/]+\\?)*(?:(?:[^\\:*<>|"?\r\n/]+)\.\w+)?$/;

// Every backslash in a JavaScript string literal must be written as \\
console.log( windowsPathRegex.test("C:\\Documents\\Newsletters\\Summer2018.pdf") ); // true
console.log( windowsPathRegex.test("C:\\Documents\\Newsletters\\") ); // true
console.log( windowsPathRegex.test("C:\\Documents\\Newsletters") ); // true
console.log( windowsPathRegex.test("C:\\") ); // true

22. Match an id (often used when writing crawlers that process HTML)

Extract the id box from <div id="box">hello world</div>

Regex result


const matchIdRegexp = /id="([^"]*)"/

console.log(`
  <div id="box">
    hello world
  </div>
`.match(matchIdRegexp)[1])

Analysis process

When writing a crawler we often need to match DOM elements that meet some condition and then act on them. So how do we get box here?


<div id="box">
  hello world
</div>

The first regex that comes to mind is probably id="(.*)"

const matchIdRegexp = /id="(.*)"/

console.log(`
  <div id="box">
    hello world
  </div>
`.match(matchIdRegexp)[1])

But id="(.*)" easily causes backtracking, which costs extra matching time: .* is greedy, so it first runs to the end of the line and then has to back up until it finds a ". Is there a way to optimize it?

Yes: just change . to [^"] . The group then stops as soon as it reaches a ", and no backtracking is needed.


const matchIdRegexp = /id="([^"]*)"/

console.log(`
  <div id="box">
    hello world
  </div>
`.match(matchIdRegexp)[1])
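
Besides the cost of backtracking, the greedy version can also return the wrong value when more attributes follow on the same line. The sketch below uses a made-up tag with an extra class attribute to show the difference; the lazy form (.*?) is included for comparison.

```javascript
// Made-up sample with a second attribute on the same line
const html = '<div id="box" class="card">hello world</div>'

// Greedy .* runs to the end of the line and backs up to the LAST quote
console.log(html.match(/id="(.*)"/)[1]) // box" class="card

// [^"]* stops at the first quote
console.log(html.match(/id="([^"]*)"/)[1]) // box

// Lazy .*? also stops at the first quote, expanding one character at a time
console.log(html.match(/id="(.*?)"/)[1]) // box
```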

23. Match id, extended (get all ids in the HTML of the Juejin homepage)

Let's try to collect the ids in bulk

Regex result

const idRegexp = /id="([^"]+)"/g

document.body.innerHTML
  .match(idRegexp)
  .map((idStr) => idStr.replace(idRegexp, '$1'))
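
The match-then-replace combination can also be written in one step with String.prototype.matchAll (ES2020), which exposes the capture group directly. The sketch below takes the HTML as a parameter so it can run outside the browser; collectIds and the sample markup are assumptions for this example.

```javascript
// Hypothetical helper: collect capture group 1 of every id="..." match
const collectIds = (html) =>
  Array.from(html.matchAll(/id="([^"]+)"/g), (match) => match[1])

const html = '<div id="app"><header id="header"></header><main id="main"></main></div>'

console.log(collectIds(html)) // [ 'app', 'header', 'main' ]
console.log(collectIds('<p>no ids here</p>')) // []
```

Unlike match, matchAll never returns null, so the empty case needs no special handling. One caveat applies to both versions: the pattern also matches attributes whose name merely ends in id, such as data-id="...".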


24. Greater than or equal to 0 and less than or equal to 150, with an optional .5 decimal part such as 145.5 (used for validating exam scores)

Regex result


const pointRegex = /^150$|^(?:[1-9]?\d|1[0-4]\d)(?:\.5)?$/

Analysis process

We can split the problem into two parts

  1. Integer part

    1. One-digit integers
    2. Two-digit integers
    3. Three-digit integers below 150
  2. Decimal part: either .5 or nothing

First, let's write the integer part


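// 1. How do we express a one-digit number? /\d/
// 2. How do we express a two-digit number? /[1-9]\d/
// 3. How do we express one-digit and two-digit numbers together? /[1-9]?\d/
// 4. What about three-digit numbers below 150? /1[0-4]\d/

// Putting these together, the integer part can be written as

const pointRegex = /^150$|^(?:[1-9]?\d|1[0-4]\d)$/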

console.log(pointRegex.test(0)) // true
console.log(pointRegex.test(10)) // true
console.log(pointRegex.test(100)) // true
console.log(pointRegex.test(110.5)) // false
console.log(pointRegex.test(150)) // true

Now add the decimal part


// The decimal part is simpler: /(?:\.5)?/ , so the whole regex is

const pointRegex = /^150$|^(?:[1-9]?\d|1[0-4]\d)(?:\.5)?$/

console.log(pointRegex.test(-1)) // false
console.log(pointRegex.test(0)) // true 
console.log(pointRegex.test(10)) // true
console.log(pointRegex.test(100)) // true
console.log(pointRegex.test(110.5)) // true
console.log(pointRegex.test(150)) // true
console.log(pointRegex.test(151)) // false
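
One way to gain confidence in a numeric-range regex is to compare it with a plain arithmetic check over the whole range. The sketch below walks from -1 to 151 in steps of 0.5 and collects every value on which the two disagree; isValidScore is a hypothetical helper written for this comparison.

```javascript
const pointRegex = /^150$|^(?:[1-9]?\d|1[0-4]\d)(?:\.5)?$/
// Hypothetical numeric check: in range and a multiple of 0.5
const isValidScore = (n) => n >= 0 && n <= 150 && Number.isInteger(n * 2)

const mismatches = []
for (let i = -2; i <= 302; i++) {
  const n = i / 2 // -1, -0.5, 0, 0.5, ..., 151
  if (pointRegex.test(String(n)) !== isValidScore(n)) mismatches.push(n)
}

console.log(mismatches) // []
```

The regex has the extra advantage of rejecting string forms such as 010 or 1.50, which a numeric check would accept after conversion.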

25. Determine the version number

The version number must be in x.y.z format, where x, y and z each have at least one digit

Regex result


// x.y.z
const versionRegexp = /^(?:\d+\.){2}\d+$/

console.log(versionRegexp.test('1.1.1')) // true
console.log(versionRegexp.test('1.000.1')) // true
console.log(versionRegexp.test('1.000.1.1')) // false

Farewell

There is still a long way to go before mastering regular expressions. I hope these analyses are helpful! If you spot any errors in the article or know a better way to write one of these regexes, you are welcome to point it out.

前端胖头鱼