PySpark reduceByKey on multiple values

If I have key-value pairs whose values are themselves pairs, like:

(K, (v1, v2))
(K, (v3, v4))

How can I sum the values element-wise so that I get (K, (v1 + v3, v2 + v4))?

Best answer

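reduceByKey takes a function that merges two values belonging to the same key. Let's say A is the RDD of key-value pairs. The function has to return a tuple, so the two sums must be wrapped in parentheses; without them the lambda returns only x[0]+y[0], and x[1]+y[1] is parsed as a second argument to reduceByKey.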

output = A.reduceByKey(lambda x, y: (x[0] + y[0], x[1] + y[1]))
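
For anyone who wants to try this end to end, here is a minimal sketch of the same idea. The helper names (add_pairs, add_tuples, sum_by_key, demo) and the sample data are made up for illustration and are not from the question; it also assumes PySpark is installed and that a local SparkContext can be created.

```python
def add_pairs(x, y):
    """Merge two (a, b) values for the same key by summing element-wise."""
    return (x[0] + y[0], x[1] + y[1])


def add_tuples(x, y):
    """Same idea for value tuples of any (equal) length."""
    return tuple(a + b for a, b in zip(x, y))


def sum_by_key(rdd, merge=add_pairs):
    """Apply reduceByKey with a merge function that returns a tuple."""
    return rdd.reduceByKey(merge)


def demo():
    # Imported here so the helpers above can be used without Spark installed.
    from pyspark import SparkContext

    sc = SparkContext.getOrCreate()
    try:
        # Hypothetical sample data in the shape (K, (v1, v2)).
        A = sc.parallelize([("K", (1, 2)), ("K", (3, 4)), ("J", (5, 6))])
        output = sum_by_key(A)
        # collect() order is not guaranteed, so sort before printing.
        print(sorted(output.collect()))  # [('J', (5, 6)), ('K', (4, 6))]
    finally:
        sc.stop()
```

Calling demo() runs the reduction on a small local RDD. Note that the merge function should be associative and commutative, since Spark may combine values in any order within and across partitions; element-wise addition satisfies both.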