Kafka consumer: how to validate that the consumer reads offsets sequentially


We are using the C++ library librdkafka, version 2.2.0, for both the producer and the consumer.

The Apache Kafka brokers are installed on RHEL machines; the broker version is 2.7.

On the producer side, we verified that messages are delivered to the relevant topic partition with sequential offsets, as follows:

    ==> KAFKA Logger: Message delivered , 202 bytes topic data.pipe.test, partition 12,offset 4736663
    ==> KAFKA Logger: Message delivered , 202 bytes topic data.pipe.test, partition 12,offset 4736664
    ==> KAFKA Logger: Message delivered , 202 bytes topic data.pipe.test, partition 12,offset 4736665
    ==> KAFKA Logger: Message delivered , 202 bytes topic data.pipe.test, partition 12,offset 4736666

.
.
.

On the consumer side, however, the consumer reads from the same topic partition (data.pipe.test, partition 12), but the offsets it reports are not sequential (see the log below).

If the log is accurate, the consumer is not reading offsets sequentially and data from the topic partition(s) is being lost, but we are not sure whether the consumer log reports the real status.

    ==> KAFKAConsumer getMessage from topic data.pipe.test, partition 12, at offset 4084746 <====
    ==> KAFKAConsumer getMessage from topic data.pipe.test, partition 12, at offset 4084798 <====
    ==> KAFKAConsumer getMessage from topic data.pipe.test, partition 12, at offset 4084812 <====
.
.
.
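One way to cross-check the consumer log on the client side is to track the last offset seen per topic partition and record every jump. The following is a minimal sketch, not part of librdkafka: `OffsetGapTracker` and `OffsetGap` are hypothetical names, and it assumes the values passed in come from `RdKafka::Message::topic_name()`, `partition()` and `offset()` in the consume loop. Note that a gap is not always data loss: on compacted topics, or when the producer uses transactions, the offsets a consumer sees can legitimately skip.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not a librdkafka type): one record per detected jump.
struct OffsetGap {
    std::string topic;
    int32_t partition;
    int64_t expected;  // offset we expected to see next
    int64_t actual;    // offset that actually arrived
};

// Remembers the last offset seen per (topic, partition) and records
// every message whose offset is not exactly last + 1.
class OffsetGapTracker {
public:
    // Call once per consumed message.
    // Returns true if the offset is the first seen for this partition
    // or the direct successor of the previous one, false otherwise.
    bool observe(const std::string& topic, int32_t partition, int64_t offset) {
        const auto key = std::make_pair(topic, partition);
        const auto it = last_.find(key);
        bool contiguous = true;
        if (it != last_.end() && offset != it->second + 1) {
            gaps_.push_back({topic, partition, it->second + 1, offset});
            contiguous = false;
        }
        last_[key] = offset;
        return contiguous;
    }

    const std::vector<OffsetGap>& gaps() const { return gaps_; }

private:
    std::map<std::pair<std::string, int32_t>, int64_t> last_;
    std::vector<OffsetGap> gaps_;
};
```

Fed with the three offsets from the log above, this tracker would record two gaps for partition 12 (expected 4084747 but got 4084798, then expected 4084799 but got 4084812). If the application log only prints a subset of messages, the tracker would show no gaps, which would point at the logging rather than at the consumer.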

The consumer group name is data_collection_group (we can also see the consumer group details with kafka-consumer-groups.sh --bootstrap-server kafka1:9093 --group data_collection_group --describe).

Note: the topic data.pipe.test was created with 70 partitions.

For reference, the librdkafka configuration documentation: https://github.com/confluentinc/librdkafka/blob/master/CONFIGURATION.md

Based on the above, I want to ask whether we can verify, from the Kafka cluster side or with Kafka tools (for example, as in https://strimzi.io/blog/2021/12/17/kafka-segment-retention/), that all messages were consumed correctly, or whether the consumer missed some offsets.

Additionally, the Redpanda guide on consumer lag (https://redpanda.com/guides/kafka-performance/kafka-consumer-lag) states:

"Kafka consumer lag is a key performance indicator for the popular Kafka streaming platform. All else equal, lower consumer lag means better Kafka performance."

The same guide also summarizes common causes of Kafka consumer lag in a table.

So is it correct to treat Kafka consumer lag as an indication that consumers are not reading all of the data?
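For context on what lag measures, here is a minimal sketch of the per-partition arithmetic. `partition_lag` is a hypothetical helper, and it assumes the two inputs are taken from the CURRENT-OFFSET and LOG-END-OFFSET columns printed by kafka-consumer-groups.sh --describe.

```cpp
#include <cstdint>

// Hypothetical helper: lag for one partition, i.e. how many offsets lie
// between the group's committed position and the end of the log.
// Inputs are assumed to come from the CURRENT-OFFSET and LOG-END-OFFSET
// columns of kafka-consumer-groups.sh --describe.
inline int64_t partition_lag(int64_t log_end_offset, int64_t committed_offset) {
    const int64_t lag = log_end_offset - committed_offset;
    return lag < 0 ? 0 : lag;  // clamp in case the two values were sampled at different times
}
```

With the numbers from the logs above, a log end offset of 4736667 and a committed offset of 4084813 give a lag of 651854. Lag only reflects how far the committed position is behind the log end: a consumer that skipped messages but committed up to the log end would still show a lag of 0, so lag alone would not reveal missed offsets.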
