Preface
This article shows how to count word occurrences with two ready-made data structures: the Bag from Apache Commons Collections4 and the Multiset from Guava.
maven
<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>22.0</version>
</dependency>
<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-collections4</artifactId>
    <version>4.1</version>
</dependency>
bag
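import java.util.Arrays;
import java.util.Set;
import org.apache.commons.collections4.Bag;
import org.apache.commons.collections4.bag.HashBag;
import org.junit.Test;

@Test
public void testBag(){
    Bag<String> bag = new HashBag<>();
    String content = "She is beautiful and she is my angel";
    // add each word exactly once per occurrence, so getCount() is the real frequency
    Arrays.stream(content.split(" ")).forEach(bag::add);
    // get unique keys
    Set<String> set = bag.uniqueSet();
    set.forEach(word -> {
        System.out.println(word + "-->" + bag.getCount(word));
    });
}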
multiset
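import java.util.Arrays;
import com.google.common.collect.HashMultiset;
import com.google.common.collect.Multiset;
import org.junit.Test;

@Test
public void testMultiSet(){
    String content = "She is beautiful and she is my angel";
    Multiset<String> multiset = HashMultiset.create();
    Arrays.stream(content.split(" ")).forEach(multiset::add);
    // elementSet() returns the distinct elements directly, no stream().distinct() needed
    multiset.elementSet().forEach(e -> {
        System.out.println(e + "-->" + multiset.count(e));
    });
}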
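Note that the sample sentence contains both "She" and "she", and both examples count them as different words because the comparison is case-sensitive. If that is not what you want, one option is to normalize each word before adding it to the Bag or Multiset. The following is a minimal sketch of that idea using only the JDK, so that it runs without either library; the class name CaseInsensitiveWordCount and the method countIgnoringCase are hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class CaseInsensitiveWordCount {

    // Hypothetical helper: lower-cases every space-separated word, then counts it.
    public static Map<String, Long> countIgnoringCase(String content) {
        return Arrays.stream(content.split(" "))
                // Locale.ROOT keeps the lower-casing independent of the default locale
                .map(word -> word.toLowerCase(Locale.ROOT))
                // group identical words and count the members of each group
                .collect(Collectors.groupingBy(word -> word, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        String content = "She is beautiful and she is my angel";
        countIgnoringCase(content)
                .forEach((word, count) -> System.out.println(word + "-->" + count));
    }
}
```

The same toLowerCase call could be applied to each word just before it is added in the Bag or Multiset examples.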
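Summary
Because Bag and Multiset encapsulate the per-element counting, the calling code stays very concise: add each word, then read its count.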
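For comparison, here is a minimal sketch of the same count written with a plain JDK Map, to show the bookkeeping that these data structures take care of. It is not part of either library; the class name PlainMapWordCount and the method countWords are hypothetical names chosen for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class PlainMapWordCount {

    // Hypothetical helper: counts each space-separated word, case-sensitively,
    // which mirrors what the Bag and Multiset examples do.
    public static Map<String, Integer> countWords(String content) {
        // LinkedHashMap keeps the words in first-seen order
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String word : content.split(" ")) {
            // merge() stores 1 for a new word, otherwise adds 1 to the existing count
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String content = "She is beautiful and she is my angel";
        countWords(content)
                .forEach((word, count) -> System.out.println(word + "-->" + count));
    }
}
```

With a plain Map the caller has to manage the count values itself; Bag.getCount and Multiset.count hide that detail and return 0 for elements that were never added.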