I have a JavaPairRDD<String, List<Tuple2<Integer, Integer>>> named rddA. For example, collecting rddA gives: [(word1,[(187,267), (224,311), (187,110)]), (word2,[(187,200), (10,90)])]. Here word1 is a key and its value is [(187,267), (224,311), (187,110)].
How can I define the corresponding JavaPairRDD<Integer, List<Integer>> to get the following output:
[(187, [267, 110, 200]), (224,[311]), (10,[90])]
So the resulting JavaPairRDD includes three keys: 187, 224 and 10. For example, the key 187 has [267, 110, 200] as its list value.
You simply need to flatten the lists of tuples (the value of each pair in rddA), dropping the word key, and then group the resulting tuples by their first element, collecting the second elements into a list.
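To make the flatten-then-group logic concrete, here is a minimal sketch in plain Java streams, so it runs without a Spark cluster. It is an illustration under assumptions, not the Spark code itself: the class name FlattenAndGroup and the helpers pair and regroup are hypothetical, and Map.Entry stands in for scala.Tuple2. On the real RDD the same two steps would most likely be a flatMapToPair over the value lists followed by groupByKey. Note that groupByKey yields Iterable<Integer> values rather than List<Integer>, and Spark does not guarantee the order of the grouped values.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class FlattenAndGroup {

    // Hypothetical helper: a Map.Entry stands in for scala.Tuple2<Integer, Integer>.
    static Map.Entry<Integer, Integer> pair(int first, int second) {
        return new SimpleImmutableEntry<>(first, second);
    }

    // Hypothetical helper mirroring the two steps from the answer.
    static Map<Integer, List<Integer>> regroup(
            Map<String, List<Map.Entry<Integer, Integer>>> byWord) {
        return byWord.values().stream()
                // Step 1: flatten every per-word list into one stream of tuples
                // (the role flatMapToPair would play on the RDD).
                .flatMap(List::stream)
                // Step 2: group by the first element and collect the second
                // elements into a list (the role groupByKey would play).
                .collect(Collectors.groupingBy(
                        Map.Entry::getKey,
                        LinkedHashMap::new, // keeps keys in first-seen order
                        Collectors.mapping(Map.Entry::getValue, Collectors.toList())));
    }

    public static void main(String[] args) {
        // Same sample data as in the question.
        Map<String, List<Map.Entry<Integer, Integer>>> rddA = new LinkedHashMap<>();
        rddA.put("word1", Arrays.asList(pair(187, 267), pair(224, 311), pair(187, 110)));
        rddA.put("word2", Arrays.asList(pair(187, 200), pair(10, 90)));

        System.out.println(regroup(rddA));
        // prints {187=[267, 110, 200], 224=[311], 10=[90]}
    }
}
```

The word keys are discarded in step 1 because the desired output does not use them; only the tuples inside the lists matter for the regrouping.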