Word frequency in JavaScript

How can I implement a JavaScript function that counts the frequency of each word in a given sentence?

This is my code:

function search () {
  var data = document.getElementById('txt').value;
  var temp = data;
  var words = new Array();
  words = temp.split(" ");
  var uniqueWords = new Array();
  var count = new Array();

  for (var i = 0; i < words.length; i++) {
    //var count=0;
    var f = 0;
    for (j = 0; j < uniqueWords.length; j++) {
      if (words[i] == uniqueWords[j]) {
        count[j] = count[j] + 1;
        //uniqueWords[j]=words[i];
        f = 1;
      }
    }
    if (f == 0) {
      count[i] = 1;
      uniqueWords[i] = words[i];
    }
    console.log("count of " + uniqueWords[i] + " - " + count[i]);
  }
}

I can't figure out what the problem is. Any help is greatly appreciated. The output should be in this format: count of is - 1 count of the - 2..

Input: this is anil is kum the anil

Originally posted by Anil; translation licensed under CC BY-SA 4.0

2 answers

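I think you have overcomplicated things with several arrays and strings, and with frequent (and hard-to-follow) context switching between loops and nested loops.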
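For reference, the nested-loop version from the question can probably be repaired with two small changes. As written, it stores each new word at index `i` (leaving holes in `uniqueWords` and `count`) and logs `uniqueWords[i]` inside the main loop, so a repeated word prints `count of undefined - undefined` and the other counts are printed before they are final. The sketch below appends with `push` and reports only after counting has finished. The function name `countWithLoops` and its `text` parameter are illustrative assumptions; the original read its input from a form field.

```javascript
// Sketch of a repaired nested-loop version. `countWithLoops` and `text`
// are hypothetical names; the question's code read from a form field.
function countWithLoops(text) {
  var words = text.split(" ");
  var uniqueWords = [];
  var count = [];

  for (var i = 0; i < words.length; i++) {
    var found = false;
    for (var j = 0; j < uniqueWords.length; j++) {
      if (words[i] === uniqueWords[j]) {
        count[j] += 1;
        found = true;
        break;
      }
    }
    if (!found) {
      // Append so that uniqueWords and count stay dense and share indexes.
      uniqueWords.push(words[i]);
      count.push(1);
    }
  }

  // Build the report only after the whole input has been counted.
  var report = [];
  for (var k = 0; k < uniqueWords.length; k++) {
    report.push("count of " + uniqueWords[k] + " - " + count[k]);
  }
  return report;
}

console.log(countWithLoops("this is anil is kum the anil").join("\n"));
```

This keeps the original structure but still needs two parallel arrays and a quadratic scan, which is part of why an object-based count is easier to follow.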

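Below is the approach I would encourage you to consider. I have added inline comments that explain each step of the process. If anything is unclear, let me know in the comments and I will revisit it to improve clarity.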

(function () {

    /* Below is a regular expression that matches runs of word characters
       (letters, digits, and underscores).
       Next is a string that could easily be replaced with a reference to a form control.
       Lastly, we have an array that will hold any words matching our pattern. */
    var pattern = /\w+/g,
        string = "I I am am am yes yes.",
        matchedWords = string.match( pattern );

    /* The Array.prototype.reduce method assists us in producing a single value from an
       array. In this case, we're going to use it to output an object with results. */
    var counts = matchedWords.reduce(function ( stats, word ) {

        /* `stats` is the object that we'll be building up over time.
           `word` is each individual entry in the `matchedWords` array */
        if ( stats.hasOwnProperty( word ) ) {
            /* `stats` already has an entry for the current `word`.
               As a result, let's increment the count for that `word`. */
            stats[ word ] = stats[ word ] + 1;
        } else {
            /* `stats` does not yet have an entry for the current `word`.
               As a result, let's add a new entry, and set count to 1. */
            stats[ word ] = 1;
        }

        /* Because we are building up `stats` over numerous iterations,
           we need to return it for the next pass to modify it. */
        return stats;

    }, {} );

    /* Now that `counts` has our object, we can log it. */
    console.log( counts );

}());
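To connect this with the output format requested in the question, here is a minimal sketch that wraps the same `match` + `reduce` steps in a function and prints one `count of <word> - <n>` line per word. The helper name `countWords` is an assumption made for illustration and is not part of the answer above. The sketch also guards against `match` returning `null` for input with no words, and calls `hasOwnProperty` through `Object.prototype` so that a word such as "hasOwnProperty" cannot shadow the method.

```javascript
// Hypothetical helper (not from the answer above): wraps the same
// match + reduce steps so they can be reused with any input string.
function countWords(text) {
  // match() returns null when nothing matches, so fall back to an empty array.
  var matchedWords = text.match(/\w+/g) || [];
  return matchedWords.reduce(function (stats, word) {
    // Calling through Object.prototype keeps this check working even if
    // `stats` later gains a key named "hasOwnProperty".
    if (Object.prototype.hasOwnProperty.call(stats, word)) {
      stats[word] += 1;
    } else {
      stats[word] = 1;
    }
    return stats;
  }, {});
}

var counts = countWords("this is anil is kum the anil");
var lines = Object.keys(counts).map(function (word) {
  return "count of " + word + " - " + counts[word];
});
console.log(lines.join("\n"));
```

The lines appear in first-seen order here because `Object.keys` lists non-numeric string keys in insertion order.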

Originally posted by Sampson; translation licensed under CC BY-SA 3.0

Here is a JavaScript function that gets the frequency of each word in a sentence:

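function wordFreq(string) {
    // Strip periods, then split on single whitespace characters.
    var words = string.replace(/[.]/g, '').split(/\s/);
    var freqMap = {};
    words.forEach(function(w) {
        // First time we see this word: start its count at zero.
        if (!freqMap[w]) {
            freqMap[w] = 0;
        }
        freqMap[w] += 1;
    });

    return freqMap;
}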

It returns a hash (a plain object) that maps each word to its count. So, for example, if we run it like this:

console.log(wordFreq("I am the big the big bull."));
> Object {I: 1, am: 1, the: 2, big: 2, bull: 1}

You can iterate over the words in sorted order with Object.keys(result).sort().forEach(function(word) {...}). So we can wire it up like this:

var freq = wordFreq("I am the big the big bull.");
Object.keys(freq).sort().forEach(function(word) {
    console.log("count of " + word + " is " + freq[word]);
});

Which would output:

count of I is 1
count of am is 1
count of big is 2
count of bull is 1
count of the is 2

JSFiddle: http://jsfiddle.net/ah6wsbs6/

Here is the wordFreq function in ES6:

function wordFreq(string) {
  return string.replace(/[.]/g, '')
    .split(/\s/)
    .reduce((map, word) =>
      Object.assign(map, {
        [word]: (map[word])
          ? map[word] + 1
          : 1,
      }),
      {}
    );
}

JSFiddle: http://jsfiddle.net/r1Lo79us/
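One caveat with counting into a plain object: a word such as "constructor" already resolves to an inherited property, so the truthiness checks above would misbehave for it, and splitting on a single whitespace character yields empty strings when words are separated by several spaces. A `Map` sidesteps both issues. The sketch below is a hypothetical variant (`wordFreqMap` is not from the answer) and assumes that folding to lowercase and treating only ASCII letters, digits, and apostrophes as word characters is acceptable for the input.

```javascript
// Hypothetical variant: wordFreqMap is not part of the answer above.
// Assumes lowercase folding and ASCII-only tokens are acceptable.
function wordFreqMap(text) {
  const freq = new Map();
  // Lowercase first, then pull out runs of letters, digits, or apostrophes.
  // match() returns null when nothing matches, hence the fallback.
  const words = text.toLowerCase().match(/[a-z0-9']+/g) || [];
  for (const word of words) {
    // Map keys never collide with inherited object properties.
    freq.set(word, (freq.get(word) || 0) + 1);
  }
  return freq;
}

const freq = wordFreqMap("The constructor is the  big constructor.");
for (const [word, count] of freq) {
  console.log(`count of ${word} is ${count}`);
}
```

A `Map` iterates in insertion order, so the words are reported in the order they first appear; sort `[...freq.keys()]` first if alphabetical output is wanted.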

Originally posted by Cymen; translation licensed under CC BY-SA 3.0