Here is my data: an RDD[Array[String]] in Spark. I want to compute the sum of the lengths of all the elements in data. For example, if data is (Array(1,2), Array(1,2,3)), I want to get the sum 2 + 3 = 5.
At first I used data.flatMap(_).count() and got this error:

error: missing parameter type for expanded function
((x$1) => data.flatMap(x$1))
But when I replace _ with x => x and write data.flatMap(x => x).count(), it works. So I am confused by _. I thought that in Scala _ could stand in for the actual parameter. Is that right?
Refer to the question here.
Essentially, _ by itself does not define a function. It can be used as a placeholder for a parameter name inside the anonymous-function shorthand (for example data.map(_.length)), but a bare _ passed as the whole argument means something different: the compiler expands the enclosing call data.flatMap(_) into the function literal x$1 => data.flatMap(x$1), exactly as the error message shows, and it then cannot infer a type for x$1.
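
A minimal sketch of the difference, assuming a SparkContext named sc is already in scope (e.g. in spark-shell) and using a small hard-coded RDD that is not in the original post:

    // The example data from the question: two arrays of length 2 and 3.
    val data: org.apache.spark.rdd.RDD[Array[String]] =
      sc.parallelize(Seq(Array("1", "2"), Array("1", "2", "3")))

    // Works: an explicit anonymous function flattens each array,
    // so count() returns 2 + 3 = 5.
    val flattenedCount = data.flatMap(x => x).count()

    // Also works: here _ sits inside a larger expression, so it expands
    // to x => x.length within the call to map.
    val summedLengths = data.map(_.length).sum()   // 5.0

    // Does not compile: a bare _ as the whole argument turns the
    // *enclosing call* into the function literal
    //   x$1 => data.flatMap(x$1)
    // and the compiler cannot infer a type for x$1.
    // val broken = data.flatMap(_).count()

In other words, _ only acts as a parameter placeholder when it appears inside a larger expression that can be turned into a function; passed on its own as an argument, it turns the surrounding call into the function instead.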