Python - Getting the difference between strings


What is the best way to get the difference between two multi-line strings?

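import difflib

a = 'testing this is working \n testing this is working 1 \n'
b = 'testing this is working \n testing this is working 1 \n testing this is working 2'

# ndiff receives two plain strings here, so it compares them character by character
diff = difflib.ndiff(a, b)
print(''.join(diff))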

This produces:

   t  e  s  t  i  n  g     t  h  i  s     i  s     w  o  r  k  i  n  g
     t  e  s  t  i  n  g     t  h  i  s     i  s     w  o  r  k  i  n  g     1
+  + t+ e+ s+ t+ i+ n+ g+  + t+ h+ i+ s+  + i+ s+  + w+ o+ r+ k+ i+ n+ g+  + 2

What is the best way to get exactly this:

testing this is working 2

Would a regular expression be the solution here?

Originally posted by Rekovni; translation licensed under CC BY-SA 4.0

2 answers
a = 'testing this is working \n testing this is working 1 \n'
b = 'testing this is working \n testing this is working 1 \n testing this is working 2'

splitA = set(a.split("\n"))
splitB = set(b.split("\n"))

diff = splitB.difference(splitA)
diff = ", ".join(diff)  # ' testing this is working 2, more things if there were...'

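Essentially, this turns each string into a set of lines and takes the set difference, that is, everything in B that is not in A. The result is then joined into a single string. Note that a set keeps neither the original line order nor duplicate lines.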

Edit: This is a generalized way of expressing what @ShreyasG said - [x for x if x not in y]…
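If the order of the differing lines matters, the same idea can be written as a list comprehension. The following is only a minimal sketch, assuming the inputs from the question; the names `lines_a` and `diff` are illustrative choices.

```python
a = 'testing this is working \n testing this is working 1 \n'
b = 'testing this is working \n testing this is working 1 \n testing this is working 2'

# Lines of a, kept in a set for fast membership tests.
lines_a = set(a.split("\n"))

# Keep every line of b that never appears in a, in b's original order.
diff = [line for line in b.split("\n") if line not in lines_a]

print(diff)  # [' testing this is working 2']
```

Unlike the pure set difference, this keeps lines in the order they appear in b and keeps repeated lines.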

Originally posted by Godron629; translation licensed under CC BY-SA 3.0

The simplest hack, credit to @Chris, uses split()

Note: You need to determine which string is longer and split that one on the shorter string.

if len(a) > len(b):
    res = ''.join(a.split(b))   # get diff
else:
    res = ''.join(b.split(a))   # get diff

print(res.strip())              # remove whitespace on either side

Sample values

IN : a = 'testing this is working \n testing this is working 1 \n'
IN : b = 'testing this is working \n testing this is working 1 \n testing this is working 2'

OUT : testing this is working 2

Edit: Thanks to @ekhumoro for another hack using replace, which does not need any join call.

if len(a) > len(b):
    res = a.replace(b, '')      # get diff
else:
    res = b.replace(a, '')      # get diff
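Both hacks rely on the shorter string appearing verbatim inside the longer one; if it does not, split() and replace() leave the longer string unchanged. A minimal sketch of a guarded version follows. The helper name `substring_diff` and the choice to raise `ValueError` are assumptions made for illustration, not part of the original answer.

```python
def substring_diff(a, b):
    """Return what is left of the longer string after removing the shorter one.

    Hypothetical helper: raises ValueError when the shorter string is not
    contained verbatim in the longer one, because replace() would otherwise
    return the longer string unchanged.
    """
    longer, shorter = (a, b) if len(a) > len(b) else (b, a)
    if shorter not in longer:
        raise ValueError("shorter string does not occur in the longer one")
    return longer.replace(shorter, '').strip()


a = 'testing this is working \n testing this is working 1 \n'
b = 'testing this is working \n testing this is working 1 \n testing this is working 2'
print(substring_diff(a, b))  # testing this is working 2
```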

Originally posted by Kaushik NP; translation licensed under CC BY-SA 3.0
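For completeness, the difflib attempt in the question yields per-character output because difflib.ndiff expects two sequences of lines, and iterating over a plain string yields single characters. A minimal sketch of a line-level comparison, assuming the question's inputs (the name `added` is an illustrative choice):

```python
import difflib

a = 'testing this is working \n testing this is working 1 \n'
b = 'testing this is working \n testing this is working 1 \n testing this is working 2'

# splitlines() turns each string into a list of lines, so ndiff compares
# line by line instead of character by character.
diff = difflib.ndiff(a.splitlines(), b.splitlines())

# Lines present only in b are prefixed with '+ '; drop that 2-character prefix.
added = [line[2:] for line in diff if line.startswith('+ ')]

print(added)  # [' testing this is working 2']
```

This keeps the original difflib approach and only changes what is passed to it, so no regular expression is needed.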
