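Problem description
I need to cache the externally hosted images in our old articles and then replace their links with our own internal addresses. The HTML of a single record contains many images. I tried to match them with a regex and got nowhere, so I'm asking for help.
There are three main kinds of image references that the regex needs to match: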
①<img src='https://image2.135editor.com/cache/remote/aHR0cHM6Ly9tbWJpei5xbG9nby5jbi9tbWJpel9qcGcvY1pWMmhScHVBUGpYZG56aWJxV1hGOUFIbHRxMU1BODhPblY2ZldrWjltZGJWQ0V2QlYxWWFKS1JGYU9TUTc1STV6SUlDdGIycnFIUG1EbHJIZ3BCMmZBLzA/d3hfZm10PWpwZWc=' />
②<section background-image: url("https://image2.135editor.com/cache/remote/aHR0cHM6Ly9tbWJpei5xbG9nby5jbi9tbWJpel9qcGcvY1pWMmhScHVBUGpYZG56aWJxV1hGOUFIbHRxMU1BODhPblY2ZldrWjltZGJWQ0V2QlYxWWFKS1JGYU9TUTc1STV6SUlDdGIycnFIUG1EbHJIZ3BCMmZBLzA/d3hfZm10PWpwZWc=") />
③<section -webkit-border-image: url("https://image2.135editor.com/cache/remote/aHR0cHM6Ly9tbWJpei5xbG9nby5jbi9tbWJpel9qcGcvY1pWMmhScHVBUGpYZG56aWJxV1hGOUFIbHRxMU1BODhPblY2ZldrWjltZGJWQ0V2QlYxWWFKS1JGYU9TUTc1STV6SUlDdGIycnFIUG1EbHJIZ3BCMmZBLzA/d3hfZm10PWpwZWc=") />
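I don't know Java, so I wrote this with Python's re module; the regex itself should carry over more or less unchanged.
If the data really looks like the samples above, you can try something like this: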
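import re
html = open('a.txt', encoding='utf-8').read()
# Stop at ; ' " ) or whitespace so the url("...") forms do not run past the closing quote
urls = re.findall(r"""https://image[^;'"\s)]+""", html)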
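Since the goal is to replace the links and not just list them, here is a minimal sketch of the replacement step using re.sub with a callback. It uses a pattern along the lines of the one in this answer, tightened to stop at double quotes, closing parentheses and whitespace as well. The names to_internal and replace_images and the /uploads/cache/ prefix are made up for illustration; the real download-and-store step is not shown and would depend on your own storage setup.

```python
import hashlib
import re

# An https://image... URL, ending at the first quote, semicolon, ")" or whitespace.
IMAGE_URL = re.compile(r"""https://image[^;'"\s)]+""")


def to_internal(url):
    # Hypothetical placeholder: a real version would download `url`, save it,
    # and return the address it was saved under. Here we only derive a stable
    # name from the URL so the same image always maps to the same address.
    name = hashlib.md5(url.encode("utf-8")).hexdigest()
    return "/uploads/cache/" + name


def replace_images(html):
    # Rewrite every matched external URL in place, leaving the rest of the HTML untouched.
    return IMAGE_URL.sub(lambda m: to_internal(m.group(0)), html)
```

Calling replace_images on one record's HTML returns the same markup with every matched external URL swapped for its internal address, which covers the src='...' form and both url("...") forms in one pass.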