I have (key, value) pairs as input, like: (1,2),(3,4),(5,6),(7,8),(9,10),(2,11),(4,12),(6,13),(8,14),(14,19)
Here I need to chain the relations, e.g. 1 --> 2 and 2 --> 11, so the output for key 1 is (1,11).
That is, in the first tuple the key is 1 and the value is 2, and in one of the other tuples 2 is the key and 11 is the value. This is a parent, child, grandchild relation, and I want the output in the form (parent, grandchild).
My final output should be: (1,11),(3,12),(5,13),(7,19),(9,10)
Suppose I have a DataFrame like the one below:
key value
1 2
3 4
5 6
7 8
9 10
2 11
4 12
6 13
8 14
14 19
19 23
13 17
My expected output is a new DataFrame:
key value
1 11
3 12
5 17
7 19
9 10
How can I implement this in Python / PySpark?
Not tested, but something like this should do the trick:
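As a minimal sketch in plain Python (not necessarily the code this answer originally showed), one way to do it is to treat the pairs as a key -> value lookup and follow each chain from its starting key until the current value is no longer a key. The function name `resolve_chains` is hypothetical, and the sketch assumes that every key appears at most once and that the pairs fit in memory.

```python
def resolve_chains(pairs):
    """Return (start_key, last_value) for every chain of key -> value links."""
    child_of = dict(pairs)            # key -> value lookup (assumes unique keys)
    values = set(child_of.values())
    # A chain starts at a key that never appears as a value of another pair.
    roots = [k for k in child_of if k not in values]

    result = []
    for root in roots:
        node = root
        seen = {root}                 # guard against accidental cycles
        # Follow the links until the current node has no further child.
        while node in child_of and child_of[node] not in seen:
            node = child_of[node]
            seen.add(node)
        result.append((root, node))
    return result


pairs = [(1, 2), (3, 4), (5, 6), (7, 8), (9, 10),
         (2, 11), (4, 12), (6, 13), (8, 14), (14, 19)]
print(resolve_chains(pairs))
# [(1, 11), (3, 12), (5, 13), (7, 19), (9, 10)]
```

For the DataFrame in the question, which also contains the rows 19 -> 23 and 13 -> 17, the same function gives (5, 17) and (7, 23), because it always walks to the end of the chain; the listed (7, 19) would need the walk to stop one step earlier. With a PySpark DataFrame small enough to bring to the driver, the pairs could be built with something like `[(r["key"], r["value"]) for r in df.collect()]`.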
OUTPUT: