Does the DStream returned by updateStateByKey contain only one RDD?


Does the DStream returned by the updateStateByKey function contain only one RDD? If not, under what circumstances will the DStream contain more than one RDD?


There are 3 best solutions below

0
On BEST ANSWER

Yes, the DStream returned by updateStateByKey has only one RDD.

0
On

That does not seem to match what you said. The code below, which is part of my application, prints only once per batch, so I think every stateful DStream has only one RDD per batch.

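@transient val statefulDStream = lines.transform(...).map(x => (x, 1)).updateStateByKey(updateFuncs)

// foreachRDD calls this function once per batch interval, passing that batch's single RDD.
statefulDStream.foreachRDD { rdd =>
  // take(1) avoids the exception that first() throws on an empty RDD.
  rdd.take(1).foreach(println)
}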
6
On

It contains one RDD per batch. The DStream returned by updateStateByKey is a "state" DStream, but you can still treat it as a normal DStream. For every batch, its RDD represents the latest state (the key-value pairs) computed by the update function that you pass to updateStateByKey.
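
To make the per-batch behaviour concrete, here is a minimal sketch in plain Scala with no Spark dependency. It only models the idea: a DStream is treated as a sequence of batches, a Map stands in for an RDD, and each batch is folded into the previous state to produce exactly one state snapshot. The names StateModel, updateCount and stateSnapshots are hypothetical and are not part of the Spark API. Real Spark also needs a checkpoint directory and keeps the state distributed.

```scala
object StateModel {
  // Same shape as the update function passed to updateStateByKey:
  // (new values for a key in this batch, previous state) => new state.
  def updateCount(newValues: Seq[Int], state: Option[Int]): Option[Int] =
    Some(newValues.sum + state.getOrElse(0))

  // One element per batch goes in, one state snapshot per batch comes out.
  def stateSnapshots(batches: Seq[Seq[(String, Int)]]): Seq[Map[String, Int]] =
    batches.scanLeft(Map.empty[String, Int]) { (prev, batch) =>
      // Group this batch's values by key.
      val grouped: Map[String, Seq[Int]] =
        batch.groupBy(_._1).map { case (k, kvs) => k -> kvs.map(_._2) }
      // The update function runs for every key that has new data or existing state.
      (prev.keySet ++ grouped.keySet).flatMap { k =>
        updateCount(grouped.getOrElse(k, Seq.empty), prev.get(k)).map(v => k -> v)
      }.toMap
    }.tail // drop the initial empty state used to seed the scan

  def main(args: Array[String]): Unit = {
    val batches = Seq(
      Seq("a" -> 1, "b" -> 1, "a" -> 1),
      Seq("b" -> 1, "c" -> 1),
      Seq.empty[(String, Int)] // a batch with no new data
    )
    stateSnapshots(batches).zipWithIndex.foreach { case (state, i) =>
      println(s"batch $i: ${state.toSeq.sorted.mkString(", ")}")
    }
  }
}
```

In this model the third batch carries no new data, yet it still yields a snapshot with the unchanged counts. This mirrors how the state DStream produces one RDD holding the full current state for every batch interval.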