The difflib module

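difflib is a module in Python's standard library, so nothing needs to be installed. It compares texts and reports the differences between them, and it can also produce an easy-to-read HTML report. It works much like the diff command on Linux and is very useful for version control.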
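To make the comparison with the Linux diff command concrete, here is a minimal sketch using difflib.unified_diff, which yields lines in the same layout as `diff -u`. The sample lines and the file names old.txt and new.txt are made up for illustration.

```python
import difflib

# Two small versions of the same text (illustrative sample data)
old = ["alpha\n", "beta\n", "gamma\n"]
new = ["alpha\n", "BETA\n", "gamma\n"]

# unified_diff yields header lines, a hunk marker, then the
# unchanged (" "), removed ("-") and added ("+") lines
diff_lines = list(
    difflib.unified_diff(old, new, fromfile="old.txt", tofile="new.txt")
)
print("".join(diff_lines), end="")
```

The removed line is prefixed with "-" and its replacement with "+", which is the form most version-control tools display.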

The codecs module

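In Python 2, a file opened with the built-in open() writes str (byte string) values as they are, whatever encoding those bytes happen to be in.
Data collected by a crawler or from other sources often arrives in mixed encodings, so it is common to convert everything to unicode first. Writing a unicode value that contains non-ASCII characters to a file opened with the built-in open() then raises UnicodeEncodeError, because Python 2 tries to encode it implicitly as ASCII.
With codecs.open(), a unicode argument is encoded with the encoding passed to codecs.open() and then written. A str argument is first decoded to unicode implicitly (Python 2 uses its default encoding for this step, normally ASCII, not the encoding declared in the source file) and then goes through the same step. Compared with the built-in open(), this approach is less prone to encoding problems. In Python 3, the built-in open() accepts an encoding argument and covers the same need.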
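As a rough illustration of the point above, the following sketch (Python 3) writes the same text once through codecs.open() and once through the built-in open() with an explicit encoding, then checks that both files contain identical bytes. The sample text and the temporary file names are assumptions made for this example.

```python
import codecs
import os
import tempfile

# Sample text with non-ASCII characters (illustrative only)
text = "difflib 对比: café"

tmp_dir = tempfile.mkdtemp()
path_codecs = os.path.join(tmp_dir, "via_codecs.txt")
path_open = os.path.join(tmp_dir, "via_open.txt")

# codecs.open() encodes the text with the encoding given here
with codecs.open(path_codecs, "w", "utf-8") as f:
    f.write(text)

# In Python 3 the built-in open() can do the same via encoding=
with open(path_open, "w", encoding="utf-8") as f:
    f.write(text)

# Read both files back as raw bytes and compare
with open(path_codecs, "rb") as f1, open(path_open, "rb") as f2:
    bytes_codecs = f1.read()
    bytes_open = f2.read()

print(bytes_codecs == bytes_open)
```

Either call works in Python 3; the rest of this post keeps codecs.open() for the HTML output.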

Comparing text with difflib


import difflib
import codecs

# splitlines(keepends=True) turns the string into a list of lines, each keeping its trailing newline
text1 = '''  
    1. Beautiful is better than ugly.
    2. Explicit is better than implicit.
    3. Simple is better than complex.
    4. Complex is better than complicated.
'''.splitlines(keepends=True)


text2 = '''  
    1. Beautifu  is better than ugly.
    2. Explicit is better than implicit.
    3. Simple is better than complex.
    4. Complex is better than complicated.
'''.splitlines(keepends=True)



# 1. Show the differences between the two texts as a string; the output looks like this:
d = difflib.Differ()
result = list(d.compare(text1, text2))
# Every line already ends with a newline, so join with an empty string
result = "".join(result)
print(result)
"""
-     1. Beautiful is better than ugly.
?                ^
+     1. Beautifu  is better than ugly.
?                ^
      2. Explicit is better than implicit.
      3. Simple is better than complex.
      4. Complex is better than complicated.
"""

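Each line that Differ produces starts with a two-character code: "- " for a line only in the first text, "+ " for a line only in the second, "  " for a line common to both, and "? " for a hint line that marks the changed characters. A small sketch that keeps only the added and removed lines, using made-up sample data:

```python
import difflib

# Illustrative sample data
before = ["one\n", "two\n", "three\n"]
after = ["one\n", "2\n", "three\n"]

delta = list(difflib.Differ().compare(before, after))

# Keep only the lines that were removed ("- ") or added ("+ ")
changed = [line for line in delta if line.startswith(("- ", "+ "))]
print("".join(changed), end="")
```

Filtering on the prefix like this is handy when the inputs are long and only the changes matter.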

# 2. Show the differences as an HTML page; open diff.html in a browser:
d = difflib.HtmlDiff()
with codecs.open("diff.html", 'w','utf-8') as f:
    f.write(d.make_file(text1, text2))
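For long inputs the full side-by-side table gets large. make_file() also accepts fromdesc and todesc for column titles, and context=True with numlines to show only the lines around each change. The sketch below assumes 30 generated sample lines and a temporary output path, neither of which comes from the original example.

```python
import difflib
import os
import tempfile

# Thirty sample lines with a single edit on line 15 (illustrative data)
old_lines = ["line %d\n" % i for i in range(1, 31)]
new_lines = list(old_lines)
new_lines[14] = "line 15 (edited)\n"

page = difflib.HtmlDiff().make_file(
    old_lines,
    new_lines,
    fromdesc="original",   # title of the left column
    todesc="modified",     # title of the right column
    context=True,          # show only changed lines plus context
    numlines=2,            # number of context lines around each change
)

# Write the page to a temporary directory as UTF-8
out_path = os.path.join(tempfile.mkdtemp(), "diff_context.html")
with open(out_path, "w", encoding="utf-8") as f:
    f.write(page)
print(out_path)
```

With these settings the page contains the edited line and two lines on either side of it, not all thirty.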


A difflib example

import difflib
import codecs

# Raw strings keep the backslashes in Windows paths from being read as escape sequences
file1 = r"D:\python_need\data.txt"
file2 = r"D:\python_need\cp.txt"


with open(file1) as f1, open(file2) as f2:
    text1 = f1.readlines()
    text2 = f2.readlines()

d = difflib.HtmlDiff()
with codecs.open("passwd.html", 'w','utf-8') as f:
    f.write(d.make_file(text1, text2))


Wrapping the difflib module

Goal: make the command 'mydiff file1 file2' available.
It generates an HTML file that shows, in a browser, where the two files differ.

#!/home/kiosk/anaconda2/envs/mysql3/bin/python3

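# The interpreter is the one named in the shebang line above
"""To call mydiff directly, copy the file into /usr/local/bin
(it must also be executable, e.g. chmod +x mydiff).
This script wraps the difflib module; save it under the name mydiff.
Terminal command: sudo cp mydiff /usr/local/bin"""
import difflib
import os
import sys
# Example: mydiff /etc/passwd /tmp/passwd  (the result is written to differ.html)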


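if len(sys.argv) == 3:
    # Arguments that follow the command on the command line
    file1 = sys.argv[1]
    file2 = sys.argv[2]
    with open(file1) as f1, open(file2) as f2:
        text1 = f1.readlines()
        text2 = f2.readlines()
    d = difflib.HtmlDiff()
    # make_file() declares a utf-8 charset by default, so write the file as UTF-8
    with open('differ.html', 'w', encoding='utf-8') as f:
        f.write(d.make_file(text1, text2))
else:
    print("""
    Usage: %s file1 file2 - writes an HTML page (differ.html) showing the differences
    """ % (os.path.basename(sys.argv[0])))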

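One possible variation on the script above, sketched with argparse so that usage errors and help text are handled automatically. The function names build_html_diff and main, and the -o/--output option, are hypothetical additions for this sketch and are not part of the original mydiff.

```python
import argparse
import difflib


def build_html_diff(path_a, path_b, encoding="utf-8"):
    """Return an HTML page comparing two text files line by line."""
    with open(path_a, encoding=encoding) as fa, open(path_b, encoding=encoding) as fb:
        lines_a = fa.readlines()
        lines_b = fb.readlines()
    # Use the file paths as column titles
    return difflib.HtmlDiff().make_file(
        lines_a, lines_b, fromdesc=path_a, todesc=path_b
    )


def main(argv=None):
    parser = argparse.ArgumentParser(
        description="Write an HTML diff of two text files."
    )
    parser.add_argument("file1")
    parser.add_argument("file2")
    # Hypothetical option: the original script always writes differ.html
    parser.add_argument("-o", "--output", default="differ.html",
                        help="path of the HTML file to write")
    args = parser.parse_args(argv)
    with open(args.output, "w", encoding="utf-8") as out:
        out.write(build_html_diff(args.file1, args.file2))
    return 0

# In a script installed as mydiff, finish with: raise SystemExit(main())
```

Called as main(["a.txt", "b.txt", "-o", "out.html"]), it writes out.html; with no -o it falls back to differ.html like the original script.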


SheenStar