Let's assume I have stateful Kafka Streams application that consumes data from topic with 3 partitions. At the moment I have 2 instances of the above application running. Let's put it like that: instance1 have partitions part1 and part2 assigned, instance2 has part3.
So now I want to add the new instance to utilize the parallelization completely.
In my understanding, as soon as I start a new instance, the rebalancing occurs: one of partitions part1 or part2 and corresponding local state stores will be migrated from the existing instance to the newly added instance. In this example, let's imagine that part1 migrates on instance3.
At the same time, I realize that new instance instance3 will not start processing new data until it restores the local state store from the changelog topic, which may take much time.
During the period from starting the application and until it restores the state store:
- does it mean that the data from
part1is not being processed and stuck untilinstance3finishes the start up? - if yes, then what are the approaches to estimate how much time will it take for
instance3to build the local state store? - during this time, are other instances not affected by rebalancing and keep processing data with no downtime (
instance1 - part2,instance2 - part3)?
Rebalancing has evolved with the recent releases:
from version 2.4.0 with KIP-429
=>
part2andpart3are not stuck and continued to be processedfrom version 2.6.0 with KIP-441
=>
part1continues to be processed oninstance1untilinstance3rebuilds the state store forpart1and ready to hand over of its processing