Extracting the image src from a rich-text string

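I want to get the src of the images in a rich-text string. Can this be done with a regex? The image is an inline data URI with base64 content, and the markup contains both single and double quotes, so the usual regex cuts the src short.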

const htmlstr=`<p>&nbsp;&nbsp;<img align="middle" alt="U 等於 左小括號 I 下標 S 加 I 右小括號 R 下標 0" class="Wirisformula" data-mathml="«math xmlns=¨http://www.w3.org/1998/Math/MathML¨»«mi»U«/mi»«mo»=«/mo»«mfenced»«mrow»«msub»«mi»I«/mi»«mi»S«/mi»«/msub»«mo»+«/mo»«mi»I«/mi»«/mrow»«/mfenced»«msub»«mi»R«/mi»«mn»0«/mn»«/msub»«/math»" height="25" role="math" src="data:image/svg+xml;charset=utf8,%3Csvg%20xmlns%3D%22http%3A%2F%2Fwww.w3.org%2F2000%2Fsvg%22%20xmlns%3Awrs%3D%22http%3A%2F%2Fwww.wiris.com%2Fxml%2Fmathml-extension%22%20height%3D%2225%22%20width%3D%2297%22%20wrs%3Abaseline%3D%2216%22%3E%3C!--MathML%3A%20%3Cmath%20xmlns%3D%22http%3A%2F%2Fwww.w3.org%2F1998%2FMath%2FMathML%22%3E%3Cmi%3EU%3C%2Fmi%3E%3Cmo%3E%3D%3C%2Fmo%3E%3Cmfenced%3E%3Cmrow%3E%3Cmsub%3E%3Cmi%3EI%3C%2Fmi%3E%3Cmi%3ES%3C%2Fmi%3E%3C%2Fmsub%3E%3Cmo%3E%2B%3C%2Fmo%3E%3Cmi%3EI%3C%2Fmi%3E%3C%2Fmrow%3E%3C%2Fmfenced%3E%3Cmsub%3E%3Cmi%3ER%3C%2Fmi%3E%3Cmn%3E0%3C%2Fmn%3E%3C%2Fmsub%3E%3C%2Fmath%3E--%3E%3Cdefs%3E%3Cstyle%20type%3D%22text%2Fcss%22%3E%40font-face%7Bfont-family%3A'math1564b4c0e54101ac57a0cb68c16'%3Bsrc%3Aurl(data%3Afont%2Ftruetype%3Bcharset%3Dutf-8%3Bbase64%2CAAEAAAAMAIAAAwBAT1MvMi7iBBMAAADMAAAATmNtYXDEvmKUAAABHAAAADxjdnQgDVUNBwAAAVgAAAA6Z2x5ZoPi2VsAAAGUAAABK2hlYWQQC2qxAAACwAAAADZoaGVhCGsXSAAAAvgAAAAkaG10eE2rRkcAAAMcAAAADGxvY2EAHTwYAAADKAAAABBtYXhwBT0FPgAAAzgAAAAgbmFtZaBxlY4AAANYAAABn3Bvc3QB9wD6AAAE%2BAAAACBwcmVwa1uragAABRgAAAAUAAADSwGQAAUAAAQABAAAAAAABAAEAAAAAAAAAQEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACAgICAAAAAg1UADev96AAAD6ACWAAAAAAACAAEAAQAAABQAAwABAAAAFAAEACgAAAAGAAQAAQACACsAPf%2F%2FAAAAKwA9%2F%2F%2F%2F1v%2FFAAEAAAAAAAAAAAFUAywAgAEAAFYAKgJYAh4BDgEsAiwAWgGAAoAAoADUAIAAAAAAAAAAKwBVAIAAqwDVAQABKwAHAAAAAgBVAAADAAOrAAMABwAAMxEhESUhESFVAqv9qwIA%2FgADq%2FxVVQMAAAEAgABVAtUCqwALAEkBGLIMAQEUExCxAAP2sQEE9bAKPLEDBfWwCDyxBQT1sAY8sQ0D5gCxAAATELEBBuSxAQETELAFPLEDBOWxCwX1sAc8sQkE5TEwEyERMxEhFSERIxEhgAEAVQEA%2FwBV%2FwABqwEA%2FwBW%2FwABAAACAIAA6wLVAhUAAwAHAGUYAbAIELAG1LAGELAF1LAIELAB1LABELAA1LAGELAHPLAFELAEPLABELACPLAAELADPACw
CBCwBtSwBhCwB9SwBxCwAdSwARCwAtSwBhCwBTywBxCwBDywARCwADywAhCwAzwxMBMhNSEdASE1gAJV%2FasCVQHAVdVVVQAAAQAAAAEAANV4zkFfDzz1AAMEAP%2F%2F%2F%2F%2FWOhNz%2F%2F%2F%2F%2F9Y6E3MAAP8gBIADqwAAAAoAAgABAAAAAAABAAAD6P9qAAAXcAAA%2F7YEgAABAAAAAAAAAAAAAAAAAAAAAwNSAFUDVgCAA1YAgAAAAAAAAAAoAAAAoQAAASsAAQAAAAMAXgAFAAAAAAACAIAEAAAAAAAEAADeAAAAAAAAABUBAgAAAAAAAAABABIAAAAAAAAAAAACAA4AEgAAAAAAAAADADAAIAAAAAAAAAAEABIAUAAAAAAAAAAFABYAYgAAAAAAAAAGAAkAeAAAAAAAAAAIABwAgQABAAAAAAABABIAAAABAAAAAAACAA4AEgABAAAAAAADADAAIAABAAAAAAAEABIAUAABAAAAAAAFABYAYgABAAAAAAAGAAkAeAABAAAAAAAIABwAgQADAAEECQABABIAAAADAAEECQACAA4AEgADAAEECQADADAAIAADAAEECQAEABIAUAADAAEECQAFABYAYgADAAEECQAGAAkAeAADAAEECQAIABwAgQBNAGEAdABoACAARgBvAG4AdABSAGUAZwB1AGwAYQByAE0AYQB0AGgAcwAgAEYAbwByACAATQBvAHIAZQAgAE0AYQB0AGgAIABGAG8AbgB0AE0AYQB0AGgAIABGAG8AbgB0AFYAZQByAHMAaQBvAG4AIAAxAC4AME1hdGhfRm9udABNAGEAdABoAHMAIABGAG8AcgAgAE0AbwByAGUAAAMAAAAAAAAB9AD6AAAAAAAAAAAAAAAAAAAAAAAAAAC5BxEAAI2FGACyAAAAFRQTsQABPw%3D%3D)format('truetype')%3Bfont-weight%3Anormal%3Bfont-style%3Anormal%3B%7D%40font-face%7Bfont-family%3A'round_brackets22549f92a457f2409'%3Bsrc%3Aurl(data%3Afont%2Ftruetype%3Bcharset%3Dutf-8%3Bbase64%2CAAEAAAAMAIAAAwBAT1MvMjxkLiAAAADMAAAATmNtYXDf7xCrAAABHAAAADxjdnQgBAkDLgAAAVgAAAASZ2x5Zr5a4R4AAAFsAAABKWhlYWQO9ymoAAACmAAAADZoaGVhDVUVZQAAAtAAAAAkaG10eCFnAAIAAAL0AAAADGxvY2EAAARdAAADAAAAABBtYXhwBIgEWQAAAxAAAAAgbmFtZXHO2TgAAAMwAAACOXBvc3QEagIzAAAFbAAAACBwcmVwupWEAAAABYwAAAAHAAAGrgGQAAUAAAgACAAAAAAACAAIAAAAAAAAAQIAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACAgICAAAAAo8AMGe%2F57AAAIzgIPAAAAAAACAAEAAQAAABQAAwABAAAAFAAEACgAAAAGAAQAAQACACgAKf%2F%2FAAAAKAAp%2F%2F%2F%2F2f%2FZAAEAAAAAAAAAAAFUAFYBAAAsAKgDgAAyAAcAAAACAAAAKgDVA1UAAwAHAAA1MxEjEyMRM9XVq4CAKgMr%2FQAC1QABAAD%2BYAJQCGAACQBNGAGwChCwA9SwAxCwAtSwChCwBdSwBRCwANSwAxCwBzywAhCwCDwAsAoQsAPUsAMQsAfUsAoQsAXUsAoQsADUsAMQsAI8sAcQsAg8MTATAgEzABMQASMABAQBwJD%2BPAQBwJD%2BRANg%2FHD%2BkAFwA5ADoAFg%2FqAAAQAA%2FmACUAhgAAkATRgBsAoQsAPUsAMQsALUsAoQsAXUsAUQsADUsAMQsAc8sAIQsAg8ALAKELAD1LADELAH1LAKELAF1LAKELAA1LADELACPLA
HELAIPDEwARIBIwADEAEzAAJMBP5AkAHEBP5AkAG8A2D8cP6QAXADkAOgAWD%2BoAAAAAABAAAAAQAALiwXwl8PPPUAAwgA%2F%2F%2F%2F%2F9Wt7vT%2F%2F%2F%2F%2F1a3u9AAA%2FmAEhAhgAAAACgACAAEAAAAAAAEAAAjO%2FfEAABdwAAD%2F%2FgSEAAEAAAAAAAAAAAAAAAAAAAADANUAAAJQAAACUAAAAAAAAAAAACQAAACmAAABKQABAAAAAwAKAAIAAAAAAAIAgAQAAAAAAAQAAE0AAAAAAAAAFQECAAAAAAAAAAEAPgAAAAAAAAAAAAIADgA%2BAAAAAAAAAAMAXABMAAAAAAAAAAQAPgCoAAAAAAAAAAUAFgDmAAAAAAAAAAYAHwD8AAAAAAAAAAgAHAEbAAEAAAAAAAEAPgAAAAEAAAAAAAIADgA%2BAAEAAAAAAAMAXABMAAEAAAAAAAQAPgCoAAEAAAAAAAUAFgDmAAEAAAAAAAYAHwD8AAEAAAAAAAgAHAEbAAMAAQQJAAEAPgAAAAMAAQQJAAIADgA%2BAAMAAQQJAAMAXABMAAMAAQQJAAQAPgCoAAMAAQQJAAUAFgDmAAMAAQQJAAYAHwD8AAMAAQQJAAgAHAEbAFIAbwB1AG4AZAAgAGIAcgBhAGMAawBlAHQAcwAgAHcAaQB0AGgAIABhAHMAYwBlAG4AdAAgADIAMgA1ADQAUgBlAGcAdQBsAGEAcgBNAGEAdABoAHMAIABGAG8AcgAgAE0AbwByAGUAIABSAG8AdQBuAGQAIABiAHIAYQBjAGsAZQB0AHMAIAB3AGkAdABoACAAYQBzAGMAZQBuAHQAIAAyADIANQA0AFIAbwB1AG4AZAAgAGIAcgBhAGMAawBlAHQAcwAgAHcAaQB0AGgAIABhAHMAYwBlAG4AdAAgADIAMgA1ADQAVgBlAHIAcwBpAG8AbgAgADIALgAwUm91bmRfYnJhY2tldHNfd2l0aF9hc2NlbnRfMjI1NABNAGEAdABoAHMAIABGAG8AcgAgAE0AbwByAGUAAAAAAwAAAAAAAARnAjMAAAAAAAAAAAAAAAAAAAAAAAAAALkH%2FwABjYUA)format('truetype')%3Bfont-weight%3Anormal%3Bfont-style%3Anormal%3B%7D%3C%2Fstyle%3E%3C%2Fdefs%3E%3Ctext%20font-family%3D%22Arial%22%20font-size%3D%2216%22%20font-style%3D%22italic%22%20text-anchor%3D%22middle%22%20x%3D%226.5%22%20y%3D%2216%22%3EU%3C%2Ftext%3E%3Ctext%20font-family%3D%22math1564b4c0e54101ac57a0cb68c16%22%20font-size%3D%2216%22%20text-anchor%3D%22middle%22%20x%3D%2221.5%22%20y%3D%2216%22%3E%3D%3C%2Ftext%3E%3Ctext%20font-family%3D%22round_brackets22549f92a457f2409%22%20font-size%3D%2216%22%20text-anchor%3D%22middle%22%20x%3D%2233.5%22%20y%3D%2219%22%3E(%3C%2Ftext%3E%3Ctext%20font-family%3D%22round_brackets22549f92a457f2409%22%20font-size%3D%2216%22%20text-anchor%3D%22middle%22%20x%3D%2272.5%22%20y%3D%2219%22%3E)%3C%2Ftext%3E%3Ctext%20font-family%3D%22Arial%22%20font-size%3D%2216%22%20font-style%3D%22italic%22%20text-anchor%3D%22middle%22%20x%3D%2238.5
%22%20y%3D%2216%22%3EI%3C%2Ftext%3E%3Ctext%20font-family%3D%22Arial%22%20font-size%3D%2212%22%20font-style%3D%22italic%22%20text-anchor%3D%22middle%22%20x%3D%2245.5%22%20y%3D%2221%22%3ES%3C%2Ftext%3E%3Ctext%20font-family%3D%22math1564b4c0e54101ac57a0cb68c16%22%20font-size%3D%2216%22%20text-anchor%3D%22middle%22%20x%3D%2257.5%22%20y%3D%2216%22%3E%2B%3C%2Ftext%3E%3Ctext%20font-family%3D%22Arial%22%20font-size%3D%2216%22%20font-style%3D%22italic%22%20text-anchor%3D%22middle%22%20x%3D%2267.5%22%20y%3D%2216%22%3EI%3C%2Ftext%3E%3Ctext%20font-family%3D%22Arial%22%20font-size%3D%2216%22%20font-style%3D%22italic%22%20text-anchor%3D%22middle%22%20x%3D%2282.5%22%20y%3D%2216%22%3ER%3C%2Ftext%3E%3Ctext%20font-family%3D%22Arial%22%20font-size%3D%2212%22%20text-anchor%3D%22middle%22%20x%3D%2292.5%22%20y%3D%2221%22%3E0%3C%2Ftext%3E%3C%2Fsvg%3E" style="max-width: none; vertical-align: -9px;" width="97" /></p>`
let imgList = []
function getImgSrc(htmlstr) {
  if (!htmlstr) return []
  const imgList = []
  console.log(htmlstr)
  htmlstr.replace(/<img [^>]*src=['"]([^'"]+)[^>]*>/g, (match, capture) => {
    // imgList.push(StringUtil.htmlDecodeByRegExp(capture).trim())
    imgList.push(capture)
  })
  console.log(imgList)
  return imgList
}
2 answers
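// Group 1 captures the opening quote; \1 only closes on that same quote.
// [\s\S] replaces . so that a src spanning several lines still matches.
htmlstr.replace(/<img\b[^>]*?\ssrc=(['"])([\s\S]+?)\1/gi, (match, quote, src) => {
  imgList.push(src)
})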

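Base64 itself never contains quotation marks (its alphabet is A-Z, a-z, 0-9, +, / and =). The single quotes in your src come from the SVG and CSS text inside the data URI, such as format('truetype'). The way around this is to capture the opening quote of the attribute first and then require the same quote to close the value.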
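
To make the difference concrete, here is a minimal sketch that runs the pattern from the question and a backreference pattern on the same input. The sample string and the function names naiveSrc and quotedSrc are made up for illustration; the sample only mimics the shape of the real data (a double-quoted src that contains single quotes and a line break).

```javascript
// Hypothetical sample: a double-quoted src with single quotes and a newline inside.
const sample = `<p><img alt="x" src="data:image/svg+xml,%3Csvg%3Eurl('a')\nformat('b')%3C%2Fsvg%3E" width="9" /></p>`;

// Pattern from the question: the capture stops at the first quote of either kind.
function naiveSrc(html) {
  const re = /<img [^>]*src=['"]([^'"]+)[^>]*>/g;
  return Array.from(html.matchAll(re), (m) => m[1]);
}

// Backreference pattern: group 1 remembers the opening quote and \1 only
// accepts that same quote as the end. [\s\S] also matches line breaks.
function quotedSrc(html) {
  const re = /<img\b[^>]*?\ssrc=(['"])([\s\S]*?)\1/gi;
  return Array.from(html.matchAll(re), (m) => m[2]);
}

console.log(naiveSrc(sample));  // truncated at the first single quote
console.log(quotedSrc(sample)); // full value up to the closing double quote
```

This is still a regex, so it has limits: an earlier attribute whose value contains ">" or the text " src=" can mislead it.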

So in the normal case the src can be extracted. You can also look at how HTML parsing libraries handle attribute values.
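
As a rough illustration of the parser-style approach, the sketch below walks each img tag attribute by attribute and tracks the quote that opened each value, so quotes of the other kind and ">" inside a value cause no trouble. It is not a real HTML parser (it ignores comments, script content and entity decoding), and scanImgSrc is a hypothetical name. In a browser, parsing the string with DOMParser and reading getAttribute('src') from each img element is likely the more robust route.

```javascript
// Minimal attribute scanner, assuming reasonably well-formed markup.
function scanImgSrc(html) {
  const result = [];
  // Find "<img" followed by whitespace, "/" or ">" (so "<imgx" is skipped).
  const tagRe = /<img(?=[\s/>])/gi;
  let m;
  while ((m = tagRe.exec(html)) !== null) {
    let pos = m.index + 4;
    while (pos < html.length && html[pos] !== ">") {
      // Skip whitespace and slashes between attributes.
      if (/[\s/]/.test(html[pos])) { pos++; continue; }
      // Read the attribute name.
      let start = pos;
      while (pos < html.length && !/[\s=>/]/.test(html[pos])) pos++;
      const name = html.slice(start, pos).toLowerCase();
      while (pos < html.length && /\s/.test(html[pos])) pos++;
      let value = "";
      if (html[pos] === "=") {
        pos++;
        while (pos < html.length && /\s/.test(html[pos])) pos++;
        const quote = html[pos];
        if (quote === '"' || quote === "'") {
          // Quoted value: runs to the next occurrence of the same quote.
          const end = html.indexOf(quote, pos + 1);
          const stop = end === -1 ? html.length : end;
          value = html.slice(pos + 1, stop);
          pos = Math.min(stop + 1, html.length);
        } else {
          // Unquoted value: runs to the next whitespace or ">".
          start = pos;
          while (pos < html.length && !/[\s>]/.test(html[pos])) pos++;
          value = html.slice(start, pos);
        }
      }
      if (name === "src") result.push(value);
    }
    // Continue searching after the tag that was just scanned.
    tagRe.lastIndex = pos;
  }
  return result;
}
```

Because the name is compared exactly, attributes such as data-src are not mistaken for src.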
