Python: counting repeated words


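I have a problem where I have to count the repeated words in a sentence in Python (v3.4.1) and print them. I used a counter, but I don't know how to get the output in the following order. The input is: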

mysentence = "As far as the laws of mathematics refer to reality they are not certain as far as they are certain they do not refer to reality"

I made this into a list and sorted it.

The output is supposed to look like this:

"As" is repeated 1 time.
"are" is repeated 2 times.
"as" is repeated 3 times.
"certain" is repeated 2 times.
"do" is repeated 1 time.
"far" is repeated 2 times.
"laws" is repeated 1 time.
"mathematics" is repeated 1 time.
"not" is repeated 2 times.
"of" is repeated 1 time.
"reality" is repeated 2 times.
"refer" is repeated 2 times.
"the" is repeated 1 time.
"they" is repeated 3 times.
"to" is repeated 2 times.

This is as far as I have got:

x = input('Enter your sentence: ')
words = x.split()
# sorted() returns a new sorted list, so no separate sort() call is needed
for word in sorted(words):
    print(word)

Originally posted by Erwy Lionel; translation licensed under CC BY-SA 4.0

2 answers

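I can see where you are going with sorting: once the words are sorted, you can reliably tell when you have hit a new word and keep a count for each unique word. However, what you really want is a hash (a dictionary) to keep track of the counts, since dictionary keys are unique. For example: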

words = sentence.split()
counts = {}
for word in words:
    if word not in counts:
        counts[word] = 0
    counts[word] += 1
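A self-contained sketch of the same idea, for anyone who wants to run it directly. The sentence is the one from the question; `count_words` is a hypothetical helper name, and `dict.get` stands in for the explicit `if word not in counts` check.

```python
# Sentence taken from the question.
sentence = ("As far as the laws of mathematics refer to reality they are not "
            "certain as far as they are certain they do not refer to reality")

def count_words(text):
    """Return a plain dict mapping each whitespace-separated word to its count.

    Hypothetical helper for illustration only.
    """
    counts = {}
    for word in text.split():
        # dict.get returns 0 for a word that has not been seen yet
        counts[word] = counts.get(word, 0) + 1
    return counts

counts = count_words(sentence)
print(counts["as"], counts["As"], counts["they"])
```

Note that the count is case-sensitive, so "As" and "as" end up under separate keys.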

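This gives you a dictionary whose keys are the words and whose values are the number of times each word appears. You can simplify this with collections.defaultdict(int), which lets you just add to the value without checking for the key first: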

import collections

counts = collections.defaultdict(int)
for word in words:
    counts[word] += 1

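But there is something even better than that: collections.Counter, which takes your list of words and turns it into a dictionary (an extension of dict, actually) containing the counts.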

counts = collections.Counter(words)
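A quick sketch of what the Counter gives you, using a shortened sample sentence (an assumption for brevity, not the question's full input). It shows that the result still behaves like a dict, and that looking up a word that never occurred returns 0 instead of raising KeyError.

```python
import collections

# Shortened sample input, chosen only for illustration.
words = "as far as they are certain they do not".split()
counts = collections.Counter(words)

print(isinstance(counts, dict))  # Counter is a dict subclass
print(counts["as"])              # count for a word that occurs twice
print(counts["missing"])         # 0 for an unseen word, no KeyError
print("missing" in counts)       # the lookup above did not add the key
```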

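From there you need the words in sorted order so you can print them. items() gives you (word, count) tuples, and sorted will (by default) sort by the first item of each tuple (the word in this case), which is exactly what you want.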

import collections
sentence = """As far as the laws of mathematics refer to reality they are not certain as far as they are certain they do not refer to reality"""
words = sentence.split()
word_counts = collections.Counter(words)
for word, count in sorted(word_counts.items()):
    print('"%s" is repeated %d time%s.' % (word, count, "s" if count > 1 else ""))

Output

"As" is repeated 1 time.
"are" is repeated 2 times.
"as" is repeated 3 times.
"certain" is repeated 2 times.
"do" is repeated 1 time.
"far" is repeated 2 times.
"laws" is repeated 1 time.
"mathematics" is repeated 1 time.
"not" is repeated 2 times.
"of" is repeated 1 time.
"reality" is repeated 2 times.
"refer" is repeated 2 times.
"the" is repeated 1 time.
"they" is repeated 3 times.
"to" is repeated 2 times.

Originally posted by sberry; translation licensed under CC BY-SA 3.0
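One detail worth noting about the answer above: sorted compares strings by code point, so the capitalized "As" sorts ahead of every lowercase word and is counted separately from "as". That matches the output the question asks for. If you instead wanted to treat them as the same word (an assumption, the question does not ask for this), a minimal sketch would lower-case each word before counting:

```python
import collections

# Sentence taken from the question.
sentence = ("As far as the laws of mathematics refer to reality they are not "
            "certain as far as they are certain they do not refer to reality")
words = sentence.split()

# Default ordering: uppercase letters sort before lowercase ones.
print(sorted(set(words))[:3])  # ['As', 'are', 'as']

# Assumption: fold case so "As" and "as" share one entry.
folded = collections.Counter(word.lower() for word in words)
for word, count in sorted(folded.items()):
    print('"%s" is repeated %d time%s.' % (word, count, "s" if count > 1 else ""))
```

With case folding, "as" is counted 4 times and the separate "As" line disappears.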

Hey, I tried this on Python 2.7 (Mac) because that is the version I have, so try to get a grasp of the logic:

from collections import Counter

mysentence = """As far as the laws of mathematics refer to reality they are not certain as far as they are certain they do not refer to reality"""

mysentence = dict(Counter(mysentence.split()))
for i in sorted(mysentence.keys()):
    print ('"'+i+'" is repeated '+str(mysentence[i])+' time.')

I hope this is what you are looking for. If not, let me know; I am happy to learn something new.

"As" is repeated 1 time.
"are" is repeated 2 time.
"as" is repeated 3 time.
"certain" is repeated 2 time.
"do" is repeated 1 time.
"far" is repeated 2 time.
"laws" is repeated 1 time.
"mathematics" is repeated 1 time.
"not" is repeated 2 time.
"of" is repeated 1 time.
"reality" is repeated 2 time.
"refer" is repeated 2 time.
"the" is repeated 1 time.
"they" is repeated 3 time.
"to" is repeated 2 time.

Originally posted by HimanshuGahlot; translation licensed under CC BY-SA 3.0
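The output above prints "time" even when the count is greater than 1, whereas the expected output in the question uses "times". A minimal sketch of one way to handle that follows; `describe` is a hypothetical helper name, the sample sentence is shortened for brevity, and `str.format` is used so the snippet should run on both Python 2.7 and Python 3:

```python
from collections import Counter

def describe(word, count):
    """Format one report line, adding "s" when count is not 1.

    Hypothetical helper, not part of the answer above.
    """
    suffix = "" if count == 1 else "s"
    return '"{}" is repeated {} time{}.'.format(word, count, suffix)

# Shortened sample input, chosen only for illustration.
counts = Counter("they are not certain as far as they are certain".split())
for word in sorted(counts):
    print(describe(word, counts[word]))
```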
