Java Spark: how to save a JavaPairRDD<HashSet<String>, HashMap<String, Double>> to a file?

I ended up with a JavaPairRDD<HashSet<String>, HashMap<String, Double>> after some complicated aggregations and want to save the result to a file. I believe saveAsHadoopFile is a good API for this, but I am having trouble filling in the parameters for saveAsHadoopFile(path, keyClass, valueClass, outputFormatClass, CompressionCodec). Can anyone help?

There is 1 solution below

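The simplest option is saveAsTextFile, which writes one line per element using its toString() representation into part files under the given directory. You can parse those lines back into the desired structure later.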

rdd.saveAsTextFile("hdfs:///complete_path_to_hdfs_file/");
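Parsing the default toString() output later can be fragile, so one option is to turn each pair into an explicitly delimited line before saving. The following is only a minimal sketch in plain Java: PairLineCodec and its methods are hypothetical names (not part of Spark), and it assumes that none of the strings contain a tab, a comma, or an '=' character.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper (not part of Spark): converts one (key, value) pair
// to a single tab-separated line and back again.
// Assumption: no string contains a tab, a comma, or '='.
class PairLineCodec {

    // Builds a line of the form "k1,k2<TAB>name1=1.5,name2=2.0".
    static String format(HashSet<String> key, HashMap<String, Double> value) {
        // Sorted copies make the output order deterministic.
        String keyPart = String.join(",", new TreeSet<String>(key));
        StringBuilder valuePart = new StringBuilder();
        for (Map.Entry<String, Double> e : new TreeMap<String, Double>(value).entrySet()) {
            if (valuePart.length() > 0) {
                valuePart.append(',');
            }
            valuePart.append(e.getKey()).append('=').append(e.getValue());
        }
        return keyPart + "\t" + valuePart;
    }

    // Recovers the key set from the part before the tab.
    static HashSet<String> parseKey(String line) {
        HashSet<String> key = new HashSet<>();
        String keyPart = line.split("\t", -1)[0];
        for (String s : keyPart.split(",")) {
            if (!s.isEmpty()) {
                key.add(s);
            }
        }
        return key;
    }

    // Recovers the value map from the part after the tab.
    static HashMap<String, Double> parseValue(String line) {
        HashMap<String, Double> value = new HashMap<>();
        String valuePart = line.split("\t", -1)[1];
        for (String entry : valuePart.split(",")) {
            if (entry.isEmpty()) {
                continue;
            }
            String[] kv = entry.split("=", 2);
            value.put(kv[0], Double.parseDouble(kv[1]));
        }
        return value;
    }
}
```

With Spark, this could be applied as rdd.map(t -> PairLineCodec.format(t._1(), t._2())).saveAsTextFile(...), and the parse methods can be used on each line when the file is read back. If the strings may contain the delimiter characters, a proper serialization format would be a safer choice.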

If you want to use the saveAsHadoopFile API instead, you can call it like this:

// TextOutputFormat here is the old-API class org.apache.hadoop.mapred.TextOutputFormat
rdd.saveAsHadoopFile(complete_path_to_file, HashSet.class, HashMap.class, TextOutputFormat.class);

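You can also pass a different Hadoop OutputFormat implementation (from the old org.apache.hadoop.mapred API) as the last parameter. TextOutputFormat works here because it writes keys and values using toString(); formats such as SequenceFileOutputFormat expect Writable types by default, which HashSet and HashMap are not. To compress the output, use the five-argument overload and pass a codec class such as GzipCodec.class as the final argument.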

For more information, see the linked documentation: HadoopFile