Problem

Use MapReduce to count the frequency of each word across the input chunks.

https://hadoop.apache.org/doc...

Example

chunk1: "Google Bye GoodBye Hadoop code"
chunk2: "lintcode code Bye"

The MapReduce result is:

Bye: 2
GoodBye: 1
Google: 1
Hadoop: 1
code: 2
lintcode: 1

Solution

import java.util.Iterator;
import java.util.StringTokenizer;

/**
 * Definition of OutputCollector:
 * class OutputCollector<K, V> {
 *     public void collect(K key, V value);
 *         // Adds a key/value pair to the output buffer
 * }
 */
public class WordCount {

    public static class Map {
        public void map(String key, String value, OutputCollector<String, Integer> output) {
            // Split the chunk on whitespace and emit (word, 1) for every token.
            StringTokenizer it = new StringTokenizer(value);
            while (it.hasMoreTokens()) {
                String str = it.nextToken();
                output.collect(str, 1);
            }
        }
    }

    public static class Reduce {
        public void reduce(String key, Iterator<Integer> values,
                           OutputCollector<String, Integer> output) {
            // Add up all counts emitted for this word and output the total.
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next();
            }
            output.collect(key, sum);
        }
    }
}
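
The solution above only runs inside a framework that supplies `OutputCollector` and groups the mapper output by key. To try the same logic locally, one possible approach is the minimal sketch below. It is not the judge's or Hadoop's implementation: the `LocalWordCount` class, its list-backed `OutputCollector`, and the `run` helper are hypothetical stand-ins, and the shuffle step is simulated with a sorted `TreeMap`. The `map` and `reduce` methods repeat the logic of the solution as static methods.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

public class LocalWordCount {

    // Hypothetical stand-in for the framework's OutputCollector:
    // it simply buffers every collected pair in a list.
    static class OutputCollector<K, V> {
        final List<Map.Entry<K, V>> buffer = new ArrayList<>();

        public void collect(K key, V value) {
            buffer.add(new AbstractMap.SimpleEntry<>(key, value));
        }
    }

    // Map phase: emit (word, 1) for every whitespace-separated token.
    static void map(String key, String value, OutputCollector<String, Integer> output) {
        StringTokenizer it = new StringTokenizer(value);
        while (it.hasMoreTokens()) {
            output.collect(it.nextToken(), 1);
        }
    }

    // Reduce phase: sum all counts that were grouped under one word.
    static void reduce(String key, Iterator<Integer> values,
                       OutputCollector<String, Integer> output) {
        int sum = 0;
        while (values.hasNext()) {
            sum += values.next();
        }
        output.collect(key, sum);
    }

    // Simulated driver: map every chunk, group by key, then reduce each group.
    static Map<String, Integer> run(String... chunks) {
        OutputCollector<String, Integer> mapped = new OutputCollector<>();
        for (int i = 0; i < chunks.length; i++) {
            map("chunk" + (i + 1), chunks[i], mapped);
        }

        // Shuffle: collect the values of each key; TreeMap keeps keys sorted.
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> e : mapped.buffer) {
            grouped.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
        }

        OutputCollector<String, Integer> reduced = new OutputCollector<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            reduce(e.getKey(), e.getValue().iterator(), reduced);
        }

        // Copy the reducer output into a map, preserving emission order.
        Map<String, Integer> result = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : reduced.buffer) {
            result.put(e.getKey(), e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints {Bye=2, GoodBye=1, Google=1, Hadoop=1, code=2, lintcode=1}
        System.out.println(run("Google Bye GoodBye Hadoop code", "lintcode code Bye"));
    }
}
```

Running `main` on the two example chunks reproduces the counts listed in the Example section. The grouping step is the part a real framework performs between the two phases, which is why `reduce` receives an iterator of values rather than a single value.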

linspiration