Spark: saving RDD[(Int, Array[Double])] to a text file gives strange results

I am trying to save the userFeatures of a MatrixFactorizationModel to a text file. According to the docs, userFeatures is an RDD[(Int, Array[Double])], so I just called

model.userFeatures.saveAsTextFile("feature")

However, the results I got are something like:

(1,[D@4b7707f1)
(5,[D@513e9aca)
(9,[D@7d09bcab)
(13,[D@31058458)
(17,[D@2a5df2a7)
(21,[D@5372efd7)
(25,[D@59d1c59a)
(29,[D@53ee5e25)
(33,[D@498f5a34)
(37,[D@4f9967eb)
(41,[D@5560afb)
(45,[D@2dc7f659)
(49,[D@b46fcc)
(53,[D@38098dd1)
(57,[D@77090fb5)
(61,[D@64769e18)

What I am expecting is something like:

(1, [1.1, 2.3, 0.4, ...])
(2, [0.1, 0.3, 0.4, ...])
...

So what's wrong?

Best answer

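saveAsTextFile writes each record by calling its toString method. A JVM array does not override toString, so you get the default Object.toString output: the type tag ([D for an array of doubles), then @, then a hash code in hex. The contents are never printed. You have two options if you stick with saveAsTextFile: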

// Turn the array into a string yourself, for example with mkString
model.userFeatures.mapValues(_.mkString("[", ", ", "]")).saveAsTextFile("feature")

Alternatively, use map to wrap the data in a custom object with its own toString. In this case converting the array to a List is enough, because List's toString prints its elements:

model.userFeatures.mapValues(_.toList).saveAsTextFile("feature")
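
To see the difference without a cluster, here is a minimal plain-Scala sketch of what happens to a single record. It is not taken from the original post: the object name SaveFeaturesSketch and the helper formatRecord are hypothetical, and the sample values are made up to match the expected output in the question.

```scala
object SaveFeaturesSketch {
  // Hypothetical helper: renders one (id, features) record as "(id, [f1, f2, ...])".
  def formatRecord(record: (Int, Array[Double])): String = {
    val (id, features) = record
    val body = features.mkString("[", ", ", "]")
    s"($id, $body)"
  }

  def main(args: Array[String]): Unit = {
    // Made-up sample record standing in for one element of the RDD.
    val record: (Int, Array[Double]) = (1, Array(1.1, 2.3, 0.4))

    // Default behaviour: the tuple calls toString on the array, which falls
    // back to Object.toString, i.e. "[D@" plus a hex hash that varies per run.
    println(record)

    // Option 1: build the line yourself.
    println(formatRecord(record)) // (1, [1.1, 2.3, 0.4])

    // Option 2: convert to a List, whose toString prints the elements.
    println((record._1, record._2.toList)) // (1,List(1.1, 2.3, 0.4))
  }
}
```

With Spark, the same helper could presumably be applied as model.userFeatures.map(SaveFeaturesSketch.formatRecord).saveAsTextFile("feature"), which gives full control over the line format. The toList variant is shorter but leaves the List(...) wrapper in the output.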