I'm using Kafka exporter to monitor the Kafka metrics which is then queried in prometheus. I have a Kafka topic with 3 consumer groups, these 3 consumer groups are used by 3 different services. I am trying to write a query to have an alert when either of these consumer group lag increases beyond the average lag.
The query I have so far:
kafka_consumer_group_lag{group_id=~"consumer_group.*"} > avg_over_time(kafka_consumer_group_lag{group_id=~"consumer_group.*"}[5m])
But this doesn't seem to work. I'm not sure how to form the calculation to get this.
Can someone help me in understanding how to form this query? The entire group_id will not be known, the starting of the group_id will be consumer_group hence I'm using the wild card.