I have a stream which gives messages map to two different map() call and further is filtered and written to two different topics.
KStream<String, byte[]>[] stream = builder.<String, byte[]>stream("source-topic");
stream.map(logic1OnData).filter(
(key, value) -> {
if (key == null || value == null)
return false;
return value.data() != null;
}).to("topic1", Produced.with(Serdes.String(), Serdes.String())
stream.map(logic2OnData).filter(
(key, value) -> {
if (key == null || value == null)
return false;
return value.data() != null;
}).to("topic2", Produced.with(Serdes.String(), Serdes.String())
Is there a way I can run stream.map(logc1OnData)... and stream.map(logic2OnData) parallel? Looks like they are running one after other i.e. the first map is executed and written to topic1 and then second map is executed and written to topic2 FYI.. I don't want num.threads.count as my stream input is from single topic and I am running multiple instances of the same application to read from source-topic topic to achieve parallelism while consuming.
What I am looking is parallelism while executing and writing to different topics
What you are looking at is the order in which your operations are added to the topology. Once the topology is executed the recorder will flow through the otpology in the order they arrive but
logic2OnData
will not wait forlogic1OnData
to finish processing before it runs.If you are concerned about performance you can look into
stream threads
if you want to get more parallelism.EDIT: it seems I may have miss-understood the question.
A single sub-topology does not let you run each branch with parallelism. However you can use
repartition()
to make a logic2OnData into its own sub-topology and everything after therepartition()
call will be able to run in parallel with everything before it.