I'd like to detect times when an IP is jumping between nodes. Each time an IP jumps, it is announced by the node, and that is visible via this Prometheus metric: metallb_speaker_announced
This metric will show the following info:
metallb_speaker_announced{app_kubernetes_io_component="speaker", app_kubernetes_io_instance="metallb-system", app_kubernetes_io_name="metallb", instance="10.147.52.129:7472", ip="192.168.1.21", job="kubernetes-pods", kubernetes_namespace="metallb", kubernetes_node_name="node01", kubernetes_pod_name="metallb-system-spk-5whj5", node="node01", protocol="layer2", service="metallb/service-1"}
What would the PromQL expression look like if we wanted to detect whether an IP has been announced at least 3 times from at least 2 different nodes in the last 5 minutes?
For better context: metallb_speaker_announced events are triggered by different types of events, and they are harmless as long as the Kubernetes node making the announcement stays the same. If the Kubernetes node making the announcement alternates, that is a relevant problem that could be the consequence of things like the node having a flapping NIC or other conditions.
I'm unable to repro your example as I don't have MetalLB and a bunch of nodes but...
If we can assume that metallb_speaker_announced only triggers on a new node, the first firing will come from the 1st node and the second firing from a different, 2nd node. Any subsequent firing (e.g. the 3rd) comes either from the 1st node again or from a 3rd node. So 2+ firings are guaranteed to involve >=2 nodes.
Then, I think you can use
sum_over_time(metallb_speaker_announced{}[5m])
to sum all the announcements for the last 5 minutes.
And then you can use
sum by(ip) (sum_over_time(metallb_speaker_announced{}[5m]))
to get the results summarized by ip.
And then you can use
sum by(ip) (sum_over_time(metallb_speaker_announced{}[5m])) >= 3
to filter the results to those ip's that occurred >=3 times.
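One caveat on the approach above: sum_over_time adds up the value of every scraped sample in the window. If metallb_speaker_announced behaves like a gauge that stays at 1 for as long as a node announces an IP (an assumption on my part, not verified against a live MetalLB), a single stable node would exceed the threshold purely because of the scrape interval. A hedged alternative is to count the distinct values of the node label per ip directly. This is an untested sketch that relies only on the ip and node labels shown in the question:

```promql
# Inner aggregation: one series per (ip, node) pair that reported
# the value 1 at least once during the last 5 minutes.
# Outer aggregation: number of distinct nodes per ip.
count by (ip) (
  count by (ip, node) (
    max_over_time(metallb_speaker_announced[5m]) == 1
  )
) >= 2
```

This returns only those ip values announced by 2 or more different nodes within the window, which is the "node alternates" condition from the question. It does not count how many times the announcement moved back and forth, so the "at least 3 times" part would need a separate condition if it matters.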