I am new to Kafka and have been reading about it for a while. I found this information on Confluent:
https://docs.confluent.io/current/streams/architecture.html
What I understood from this is the following. Say I have a topic called plain_text, to which I send a bunch of records as plain text, on a single broker with a single partition. I now start 2 consumer instances, ConsumerA and ConsumerB. Since the partition count is less than the consumer count, only one of the consumers should actively consume messages, leaving the other idle. Please correct me if I am wrong.
I ran a test using the kafka-console-* scripts
start a single ZooKeeper server
bin/zookeeper-server-start.sh config/zookeeper.properties
start a kafka broker on localhost:9092
bin/kafka-server-start.sh config/server.properties
create a topic plain_text with one partition
bin/kafka-topics.sh --create \
--bootstrap-server localhost:9092 \
--replication-factor 1 \
--partitions 1 \
--topic plain_text
start a producer
bin/kafka-console-producer.sh --broker-list localhost:9092 --topic plain_text
start 2 consumers belonging to the same group (run the same command twice)
bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 \
--topic plain_text \
--formatter kafka.tools.DefaultMessageFormatter \
--property print.key=true \
--property print.value=true \
--property group.id=test_group
So one of the two consumers should own that single partition (again, please correct me if I am wrong), but whatever I produce on the producer console shows up on both consumer consoles. Why are both consumers consuming messages from a single partition? Is there something I am missing, or do different rules apply to the kafka-console-* scripts?
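If no group is specified, each kafka-console-consumer run creates its own consumer group with a generated id of the form console-consumer-<number>, so the two consumers are in different groups and each receives every message. In your command, --property sets options for the message formatter, not for the consumer, so group.id=test_group never reaches the consumer. You can check this by listing the consumer groups with kafka-consumer-groups.sh.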
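A minimal sketch of that check, assuming the broker from the question on localhost:9092 and that you run it from the Kafka installation directory. The function name list_consumer_groups is only a label for this example; it is not part of Kafka.

```shell
# Hypothetical helper: lists every consumer group the broker knows about.
# Assumes the current directory is the Kafka installation directory and
# that the broker from the question is listening on localhost:9092.
list_consumer_groups() {
  bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 --list
}
```

Calling list_consumer_groups while both console consumers from the question are running should show two separate console-consumer-<number> entries rather than a single test_group.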
You can add
--group your_group_name
or
--consumer-property group.id=your_group_name
to explicitly set the group.id for your console consumers.
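Applied to the command from the question, this could look like the sketch below. It assumes the same broker address, topic, and group name as in the question; the function names consume_in_group and describe_group are made up for this example. The --formatter option is left out because kafka.tools.DefaultMessageFormatter is already the default.

```shell
# Hypothetical helper: starts a console consumer that joins test_group.
# Run it in two terminals to get two members of the same group.
consume_in_group() {
  bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 \
    --topic plain_text \
    --group test_group \
    --property print.key=true \
    --property print.value=true
}

# Hypothetical helper: shows which consumer currently owns each partition
# of the topics consumed by test_group.
describe_group() {
  bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
    --describe --group test_group
}
```

With both consumers started this way, only one of them should print the produced messages, and describe_group should show partition 0 of plain_text assigned to a single consumer id. If your Kafka version supports it, adding --members to the describe command should also list the idle member.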