The problem
Hi, I am trying to visualize Kafka lags using Grafana. I have been trying to log Kafka lags using Metricbeat and doing the math myself, since the Metricbeat version I am using does not support logging Kafka lags (although this has been implemented recently). Instead of using max(partition.offset.newest) - max(consumergroup.offset) to calculate the lags, I am using sum(partition.offset.newest) - sum(consumergroup.offset) filtered on a particular kafka.topic.name. However, the sums do not tally. Upon further investigation, I found that the counts do not even tally! The count for partition offsets is 30 per 10s, while the count for consumergroup offsets is 12 per 10s. I expected both counts to be the same.
I do not understand why Metricbeat logs the partition more often than the consumergroup. At first I thought it was because of my Metricbeat configuration, where I had 2 host groups defined, which might have caused events to be logged multiple times. However, after changing my configuration, both counts just dropped by half.
TL;DR
Why are the Metricbeat counts for partition and consumergroup different?
Setup
- Kafka 2 brokers
- Kafka topic partitions:
Topic: xxx PartitionCount:3 ReplicationFactor:2 Configs:
Topic: xxx Partition: 0 Leader: 2 Replicas: 2,1 Isr: 2,1
Topic: xxx Partition: 1 Leader: 1 Replicas: 1,2 Isr: 1,2
Topic: xxx Partition: 2 Leader: 2 Replicas: 2,1 Isr: 2,1
- Metricbeat config (modules.d/kafka.yml):
  - module: kafka
    #metricsets:
    #  - partition
    #  - consumergroup
    period: 10s
    hosts: ["xxx.yyy:9092"]
Versions
- Kafka 2.11-0.11.0.0
- Elasticsearch-7.2.0
- Kibana-7.2.0
- Metricbeat-7.2.0
After much debugging, I have figured out what is wrong:
kafka.partition.topic.keyword: 'xxx'
instead. This made the ratio of my Kafka producer offset to consumer offset 2:1.
NOT kafka.partition.partition.is_leader: false
to get all partition leaders. This made the consumer to partition ratio 1:1.
After the 3 steps are done, I can use the formula
sum(partition.offset.newest) - sum(consumergroup.offset)
to get the lags.
However, I do not know why broker 1 doesn't have the consumer information.
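To illustrate why the leader filter matters for the sum-based formula, here is a minimal Python sketch. It assumes the events have already been pulled out of Elasticsearch into plain dictionaries. The keys (`partition`, `is_leader`, `newest_offset`, `offset`) and the sample numbers are hypothetical simplifications, not the actual Metricbeat field names or real data.

```python
def total_lag(partition_events, consumer_events):
    """Sum of newest leader offsets minus sum of committed consumer offsets."""
    # Keep one newest offset per partition, taken from the leader replica only.
    newest = {
        e["partition"]: e["newest_offset"]
        for e in partition_events
        if e["is_leader"]
    }
    # Keep one committed offset per partition for the consumer group.
    committed = {e["partition"]: e["offset"] for e in consumer_events}
    return sum(newest.values()) - sum(committed.values())


# Hypothetical sample: 3 partitions with replication factor 2, so every
# partition shows up twice in the partition events (once per replica).
partition_events = [
    {"partition": 0, "is_leader": True, "newest_offset": 100},
    {"partition": 0, "is_leader": False, "newest_offset": 100},
    {"partition": 1, "is_leader": True, "newest_offset": 200},
    {"partition": 1, "is_leader": False, "newest_offset": 200},
    {"partition": 2, "is_leader": True, "newest_offset": 300},
    {"partition": 2, "is_leader": False, "newest_offset": 300},
]
consumer_events = [
    {"partition": 0, "offset": 90},
    {"partition": 1, "offset": 180},
    {"partition": 2, "offset": 300},
]

# Summing every replica double-counts the producer side.
naive_lag = (
    sum(e["newest_offset"] for e in partition_events)
    - sum(e["offset"] for e in consumer_events)
)
filtered_lag = total_lag(partition_events, consumer_events)

print(naive_lag)     # 1200 - 570 = 630
print(filtered_lag)  # 600 - 570 = 30
```

In this sample, the unfiltered sum counts each partition's newest offset once per replica, which is what produces the 2:1 producer to consumer ratio. Restricting the sum to leaders brings the ratio back to 1:1.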