Does the DStream returned by updateStateByKey contain only one RDD?


Does the DStream returned by the updateStateByKey function contain only one RDD? If not, under what circumstances will the DStream contain more than one RDD?

3

There are 3 best solutions below

0
zwb On BEST ANSWER

Yes, the DStream returned by updateStateByKey has only one RDD.

6
Wesley Miao On

It contains one RDD per batch. The DStream returned by updateStateByKey is a "state" DStream, but you can still treat it as a normal DStream. For every batch, the RDD represents the latest state (key-value pairs) computed by the update function that you pass in to updateStateByKey.
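To make the "one RDD per batch" point concrete, here is a minimal plain-Scala sketch that only models the per-batch semantics. It does not use Spark: a Map[String, Int] stands in for the state RDD of a single batch, and the names StateStreamModel, updateCount, nextState and stateStream are hypothetical, chosen for illustration.

```scala
// Plain-Scala model of updateStateByKey's per-batch behaviour (no Spark needed).
// Assumption: a Map[String, Int] stands in for the state RDD of one batch.
object StateStreamModel {
  // Same shape as the update function passed to updateStateByKey:
  // (new values for a key in this batch, previous state) => new state.
  def updateCount(newValues: Seq[Int], previous: Option[Int]): Option[Int] =
    Some(previous.getOrElse(0) + newValues.sum)

  // Apply one batch of (key, value) pairs to the previous state snapshot.
  // Keys with existing state are updated even when the batch has no values for them.
  def nextState(prev: Map[String, Int], batch: Seq[(String, Int)]): Map[String, Int] = {
    val grouped = batch.groupBy(_._1).map { case (k, kvs) => k -> kvs.map(_._2) }
    val keys = prev.keySet ++ grouped.keySet
    keys.flatMap { k =>
      updateCount(grouped.getOrElse(k, Seq.empty), prev.get(k)).map(k -> _)
    }.toMap
  }

  // One state snapshot per batch; the "DStream" is the whole sequence over time.
  def stateStream(batches: Seq[Seq[(String, Int)]]): Seq[Map[String, Int]] =
    batches.scanLeft(Map.empty[String, Int])(nextState).tail
}
```

Calling StateStreamModel.stateStream with three batches yields three snapshots, one per batch, including a snapshot for an empty batch. In this model, a callback that runs once per snapshot would fire once per batch, while the stream as a whole still holds one snapshot for every batch.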

0
zwb On

It doesn't seem to work the way you describe. The code below, which is part of an application, prints only once per batch, so I think every stateful DStream has only one RDD.

@transient val statefulDStream = lines.transform(...).map(x => (x, 1)).updateStateByKey(updateFuncs)

statefulDStream.foreachRDD { rdd =>
  println(rdd.first())
}