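Given an input path, this script reads every file under that path and supports three operations: replacing matched strings, inserting content, and deleting content.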

import os
from fileinput import FileInput

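# Delete content: drop every line that starts with a given prefix
def match_then_delete(inputpath):
    for root,dirs,files in os.walk(inputpath):
        for file in files:
            path = os.path.join(root,file)
            # "" is a placeholder for the output directory prefix
            output_file_path = ""+file
            print(output_file_path)
            with open(path,'r',encoding='gbk') as infile:
                input_stream=infile.read()
                output_stream=""
                # Split the content into lines
                input_stream_lines=input_stream.split("\n")
                for line in input_stream_lines:
                    # "" is a placeholder for the prefix to delete;
                    # note that startswith("") is True for every line
                    if line.startswith(""):
                        pass
                    else:
                        output_stream=output_stream+line+'\n'
                # Write the content that remains to the output file
                with open(output_file_path,'w',encoding='gbk') as g:
                    g.write(output_stream)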

# Insert content: add content on the line above each line that contains match
def match_then_insert(filename,match,content):
    for line in FileInput(filename,inplace=True):
        if match in line:
            line = content+'\n'+line
        print(line,end='')

# Replace matching strings
def match_then_replace(filename,oldtext,newtext):
    for line in FileInput(filename, inplace=True):
        if oldtext in line:
            line = line.replace(oldtext,newtext)
        print(line,end='')

if __name__=='__main__':
    inputpath = ""  # placeholder: the directory to process
    for root,dirs,files in os.walk(inputpath):
        for file in files:
            # Use the full path so that files in subdirectories are found too
            path = os.path.join(root,file)
            match_then_replace(path,"oldtext","newtext")
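
The empty strings in match_then_delete are placeholders for the output directory and the prefix to match. As a minimal sketch of the same idea with those values passed in as arguments, the function below copies each file into a separate output directory and drops the lines that start with the prefix. The name delete_lines_with_prefix and its parameters are hypothetical, not part of the original script, and UTF-8 is assumed as the default encoding.

```python
import os


def delete_lines_with_prefix(src_dir, dst_dir, prefix, encoding="utf-8"):
    """Copy every file under src_dir into dst_dir, dropping lines that start with prefix.

    Hypothetical helper for illustration. Files with the same name in different
    subdirectories overwrite each other, as in the original script. dst_dir
    should not be located inside src_dir, or os.walk may visit the output too.
    """
    if not prefix:
        # str.startswith("") is True for every line, which would delete everything.
        raise ValueError("prefix must be a non-empty string")

    os.makedirs(dst_dir, exist_ok=True)
    written = []
    for root, _dirs, files in os.walk(src_dir):
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.join(dst_dir, name)
            # Iterating over the file keeps each line's own line ending.
            with open(src, "r", encoding=encoding) as infile:
                kept = [line for line in infile if not line.startswith(prefix)]
            with open(dst, "w", encoding=encoding) as outfile:
                outfile.writelines(kept)
            written.append(dst)
    return written
```

Rejecting an empty prefix is a deliberate choice here: it turns the "delete every line" case into an explicit error instead of silently producing empty files.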

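Points to note:
When the files to process are UTF-8 encoded but Python 3 decodes files as gbk by default (the default follows the system locale, which is gbk on Chinese-language Windows), using FileInput directly raises this error:
UnicodeDecodeError: 'gbk' codec can't decode byte 0x89 in position 116: illegal multibyte sequence
Suppose we use the following form instead:

for line in fileinput.input(filename,openhook=fileinput.hook_encoded('utf-8')):

When openhook is used to set the encoding to utf-8, inplace=True can no longer be set, so the file cannot be written back.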
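
The conflict between the two options can be observed directly. The sketch below tries to build a FileInput with both inplace=True and an openhook and returns the message of the ValueError it raises. The function name try_inplace_with_openhook is hypothetical. The check happens when the object is constructed, so the file does not need to exist.

```python
import fileinput


def try_inplace_with_openhook(filename):
    """Return the error message raised when inplace and openhook are combined.

    Hypothetical helper for illustration; returns None if no error is raised.
    """
    try:
        fileinput.FileInput(
            filename,
            inplace=True,
            openhook=fileinput.hook_encoded("utf-8"),
        )
    except ValueError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(try_inplace_with_openhook("example.txt"))
```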
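The workaround used here is to modify the fileinput source code: near lines 340 and 360 (the exact line numbers depend on the Python version), add encoding="utf-8" to the code.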
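
Editing the standard library is hard to maintain across machines and upgrades, so an alternative may be worth considering. Since Python 3.10, fileinput.input() and FileInput accept a keyword-only encoding parameter that also works with inplace=True. The following is a minimal sketch, not the approach used in this post: the name replace_in_utf8_file is hypothetical, and the branch for older versions simply reads the whole file into memory and rewrites it.

```python
import fileinput
import sys


def replace_in_utf8_file(filename, oldtext, newtext):
    """Replace oldtext with newtext in a UTF-8 file, editing it in place.

    Hypothetical helper for illustration.
    """
    if sys.version_info >= (3, 10):
        # Python 3.10+: encoding applies to both reading and the in-place output.
        with fileinput.input(filename, inplace=True, encoding="utf-8") as stream:
            for line in stream:
                # With inplace=True, print() writes into the file being edited.
                print(line.replace(oldtext, newtext), end="")
    else:
        # Older versions: read everything, then rewrite the file explicitly.
        with open(filename, "r", encoding="utf-8") as infile:
            text = infile.read()
        with open(filename, "w", encoding="utf-8") as outfile:
            outfile.write(text.replace(oldtext, newtext))
```

The fallback branch holds the entire file in memory, which is fine for small text files but may not suit very large ones.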

Reference article: https://www.cnblogs.com/bj-xy/p/6340256.html


chloe