Using Python's re module to match repeated letters in a string

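(1) I want to collapse each run of the same consecutive letter in a string into a single letter, e.g. "abbbcccbba" -> "abcba". How can this be done elegantly with the re module?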


(2) Once (1) is done, is it also possible to count the length of each run along the way, e.g. "abbbcccbba" -> "a1b3c3b2a1"?


2 answers

(1)

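>>> import re
>>> # Group 1 is the letter; group 2 is one or more further copies of it.
>>> p = re.compile(r"([a-zA-Z])(\1+)")
>>> s = "abbbcccbba"
>>> # Replace the whole run with just the first letter.
>>> p.sub(r"\1", s)
'abcba'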

(2)

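>>> import re
>>> # \1* (zero or more) so that single letters also match and get a count of 1.
>>> p = re.compile(r"([a-zA-Z])(\1*)")
>>> s = "abbbcccbba"
>>> # Run length = 1 (the captured letter) + number of repeats in group 2.
>>> p.sub(lambda m: m.group(1) + str(1 + len(m.group(2))), s)
'a1b3c3b2a1'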
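
For reuse, the two one-liners above can be wrapped into functions. This is only a minimal sketch: the names `RUN`, `squeeze` and `run_lengths` are made up for illustration, and it assumes that "letter" means an ASCII letter, as in the pattern above. It uses a plain `r"..."` raw string because the `ur"..."` prefix is a syntax error in Python 3.

```python
import re

# Assumption: only ASCII letters count as "letters", as in the pattern above.
# One letter followed by zero or more copies of itself, i.e. one whole run.
RUN = re.compile(r"([a-zA-Z])\1*")


def squeeze(s):
    """Collapse each run of one repeated letter to a single letter."""
    return RUN.sub(r"\1", s)


def run_lengths(s):
    """Replace each run with the letter followed by the run's length."""
    # m.group(0) is the whole run, so its length is the count.
    return RUN.sub(lambda m: m.group(1) + str(len(m.group(0))), s)


if __name__ == "__main__":
    print(squeeze("abbbcccbba"))
    print(run_lengths("abbbcccbba"))
```

Characters that are not letters are left untouched by both functions, since the pattern never matches them.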

(1)

>>> import re
>>> # Note: \w also matches digits and underscores, not only letters.
>>> p = re.compile(r"(\w)(\1+)")
>>> s = "abbbcccbba"
>>> p.sub(r"\1", s)
'abcba'

(2)

import re


def count(s):
    # Collapse each run of a repeated word character to one character.
    p = re.compile(r"(\w)(\1+)")
    keys = list(p.sub(r"\1", s))
    words = list(s)
    result = []
    for k in keys:
        # Count how many leading characters of the remaining input equal k.
        n = 0
        while len(words) > n and k == words[n]:
            n = n + 1
        # Drop the run that was just counted.
        words = words[n:]
        result.append((k, n))
    return result

if __name__ == '__main__':
    s = "abbbcccbba"
    result = count(s)
    # Prints a1b3c3b2a1
    print(''.join(["%s%s" % x for x in result]))
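
The same (character, count) pairs can probably be obtained more directly by letting the regex match each whole run and taking the length of the match. A rough sketch, where `count_runs` is a hypothetical name and `\w` is kept from the answer above; unlike `count`, it simply skips characters that `\w` does not match:

```python
import re


def count_runs(s):
    """Return (character, run length) pairs for each run of word characters."""
    # (\w)\1* matches one character plus all immediate repeats of it,
    # so len(m.group(0)) is the length of the run.
    return [(m.group(1), len(m.group(0))) for m in re.finditer(r"(\w)\1*", s)]


if __name__ == "__main__":
    pairs = count_runs("abbbcccbba")
    print("".join("%s%d" % pair for pair in pairs))
```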