parallel writes different topics from single stream topic

328 Views Asked by Sudarshan sridhar At 27 July 2025 at 15:47

I have a stream which gives messages map to two different map() call and further is filtered and written to two different topics.

KStream<String, byte[]>[] stream = builder.<String, byte[]>stream("source-topic");

stream.map(logic1OnData).filter(
                (key, value) -> {
                    if (key == null || value == null)
                        return false;
                    return value.data() != null;
                }).to("topic1", Produced.with(Serdes.String(), Serdes.String())

stream.map(logic2OnData).filter(
                (key, value) -> {
                    if (key == null || value == null)
                        return false;
                    return value.data() != null;
                }).to("topic2", Produced.with(Serdes.String(), Serdes.String())

Is there a way I can run stream.map(logc1OnData)... and stream.map(logic2OnData) parallel? Looks like they are running one after other i.e. the first map is executed and written to topic1 and then second map is executed and written to topic2 FYI.. I don't want num.threads.count as my stream input is from single topic and I am running multiple instances of the same application to read from source-topic topic to achieve parallelism while consuming.

What I am looking is parallelism while executing and writing to different topics

Original Q&A

There are 1 best solutions below

wcarlson On 25 August 2020 at 19:32

What you are looking at is the order in which your operations are added to the topology. Once the topology is executed the recorder will flow through the otpology in the order they arrive but logic2OnData will not wait for logic1OnData to finish processing before it runs.

If you are concerned about performance you can look into stream threads if you want to get more parallelism.

EDIT: it seems I may have miss-understood the question.

A single sub-topology does not let you run each branch with parallelism. However you can use repartition() to make a logic2OnData into its own sub-topology and everything after the repartition() call will be able to run in parallel with everything before it.

parallel writes different topics from single stream topic

There are 1 best solutions below

Related Questions in APACHE-KAFKA

Related Questions in APACHE-KAFKA-STREAMS

Related Questions in KAFKA-PRODUCER-API

Related Questions in KAFKA-TOPIC

Related Questions in NODE-KAFKA-STREAMS

Trending Questions

Popular # Hahtags

Popular Questions