Connecting the first two nodes with an edge from two RDDs in GraphX


I am using GraphX for the first time and want to build a graph incrementally. To start, I need to connect the first two nodes with an edge, given two RDDs that each hold a single element:

firstRDD: RDD[((Int, Array[Int]), ((VertexId, Array[Int]), Int))]
secondRDD: RDD[((Int, Array[Int]), ((VertexId, Array[Int]), Int))]  

I want to connect the first VertexId to the second one. I appreciate your help.

Best answer

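Use map with a case pattern on each RDD to pick out the VertexId, stitch the two results together with RDD.zip, then map each pair to an Edge. The result is an RDD[Edge[Int]] rather than an EdgeRDD proper, but it can be passed to Graph.fromEdges. Note that zip requires both RDDs to have the same number of partitions and the same number of elements in each partition: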

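import org.apache.spark.graphx.Edge

// Extract the VertexId from each nested tuple; the other fields are ignored.
val edges = firstRDD.map {
  case (_, ((vertex1, _), _)) => vertex1
}.zip(
  secondRDD.map {
    case (_, ((vertex2, _), _)) => vertex2
  }
  // Pair the ids positionally and build an edge with a placeholder attribute of 0.
).map { case (vertex1, vertex2) => Edge(vertex1, vertex2, 0) }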
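
If the two RDDs are not guaranteed to be partitioned identically, one possible variation is to swap zip for cartesian, which pairs every element of one RDD with every element of the other; with a single element on each side that gives exactly one edge. The following is a minimal end-to-end sketch under that assumption, not the answer's original method. The object name, the Row alias, the helper names, and the sample rows are all made up for illustration; the two RDDs are deliberately given different partition counts.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

object ConnectFirstTwoNodes {
  // Hypothetical alias for the element type given in the question.
  type Row = ((Int, Array[Int]), ((VertexId, Array[Int]), Int))

  // Keep only the VertexId from each nested tuple.
  def vertexIds(rdd: RDD[Row]): RDD[VertexId] =
    rdd.map { case (_, ((id, _), _)) => id }

  // Pair every id in `first` with every id in `second`. This assumes each RDD
  // holds one element, so the cross product contains exactly one pair.
  def connect(first: RDD[Row], second: RDD[Row]): RDD[Edge[Int]] =
    vertexIds(first)
      .cartesian(vertexIds(second))
      .map { case (src, dst) => Edge(src, dst, 0) }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("connect-two-nodes").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Made-up sample rows; only the VertexIds (1L and 2L) matter here.
      val firstRDD: RDD[Row] =
        sc.parallelize[Row](Seq(((0, Array(1)), ((1L, Array(1)), 0))), 1)
      val secondRDD: RDD[Row] =
        sc.parallelize[Row](Seq(((0, Array(2)), ((2L, Array(2)), 0))), 2)

      val edges = connect(firstRDD, secondRDD)

      // Vertices are derived from the edge endpoints, each with default attribute 0.
      val graph: Graph[Int, Int] = Graph.fromEdges(edges, 0)

      graph.edges.collect().foreach(println)
      println(s"vertices: ${graph.vertices.count()}, edges: ${graph.edges.count()}")
    } finally {
      sc.stop()
    }
  }
}
```

Graph.fromEdges is used here because the question only supplies edge endpoints; if the vertices need real attributes, Graph(vertices, edges) with an explicit vertex RDD would be the alternative. To grow the graph incrementally, later edges could be unioned into the edge RDD before rebuilding the graph, though that is a design choice the original answer does not cover.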