Grafana Alert mismatching timestamps


We have alert rules on Azure Managed Grafana (v9.5.13) that create alert instances which should be neither grouped nor repeated (hence the high repeat interval).

Notification policy:

{
    "receiver": "Kafka",
    "group_by": [
        "..."
    ],
    "repeat_interval": "999h",
    "group_wait": "0s",
    "group_interval": "5m"
}
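For reference, `"group_by": ["..."]` is the special Alertmanager value that disables grouping entirely, so every alert instance should get its own notification. On the receiving side, a "firing" alert is recognisable by its zero-value end timestamp: Grafana/Alertmanager serialise an unset `endsAt` as Go's zero time, `0001-01-01T00:00:00Z`, which is exactly the value in the samples below. A minimal sketch of that check on the consumer side (payload field names follow the Alertmanager webhook format; the sample dict is illustrative):

```python
from datetime import datetime, timezone

# Go's zero time: Grafana/Alertmanager serialise an unset endsAt as this value.
GO_ZERO_TIME = datetime(1, 1, 1, tzinfo=timezone.utc)

def is_still_firing(alert: dict) -> bool:
    """True if the alert has no real end time yet (endsAt is Go's zero value)."""
    ends_at = datetime.fromisoformat(alert["endsAt"].replace("Z", "+00:00"))
    return ends_at == GO_ZERO_TIME

# Illustrative payload fragment, modelled on the samples in this question.
alert = {
    "status": "firing",
    "startsAt": "2023-12-11T12:40:00Z",
    "endsAt": "0001-01-01T00:00:00Z",
}
print(is_still_firing(alert))  # True
```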

Expectation: each alert instance triggers its own alert notification, and the starts_at timestamp in the "resolved" alert matches the one in the initial "firing" alert.

Reality: mismatched timestamps for "resolved" alerts, and some "firing" alerts are never "resolved". While an alert instance is still active, no duplicate alert should be triggered.

Sample (starts_at - resolved_at - status), newest first:

F: 2023-12-11 12:40:00 +0000 UTC - 2023-12-11 12:50:00 +0000 UTC - resolved
E: 2023-12-11 12:40:00 +0000 UTC - 0001-01-01 00:00:00 +0000 UTC - firing
D: 2023-12-11 12:14:00 +0000 UTC - 2023-12-11 12:19:00 +0000 UTC - resolved
C: 2023-12-11 12:11:00 +0000 UTC - 0001-01-01 00:00:00 +0000 UTC - firing
B: 2023-12-11 11:52:00 +0000 UTC - 2023-12-11 11:55:00 +0000 UTC - resolved
A: 2023-12-11 11:52:00 +0000 UTC - 0001-01-01 00:00:00 +0000 UTC - firing

  • A triggered and was resolved with the same timestamp in B.
  • C triggered with no matching timestamp; the same holds for resolved D.
  • Assumption: D belongs to C, but then why is the starts_at timestamp different? Or were two further alerts belonging to C+D simply never sent?
  • E and F behave normally again.
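Pairing the sample rows on starts_at makes the problem concrete: A+B and E+F pair up cleanly, while C and D each have no counterpart. A small sketch, with the sample values copied from above:

```python
from collections import defaultdict

# The six sample notifications from the question (name, starts_at, resolved_at, status).
# None stands for the zero-value timestamp 0001-01-01 00:00:00 UTC.
samples = [
    ("A", "2023-12-11 11:52:00", None, "firing"),
    ("B", "2023-12-11 11:52:00", "2023-12-11 11:55:00", "resolved"),
    ("C", "2023-12-11 12:11:00", None, "firing"),
    ("D", "2023-12-11 12:14:00", "2023-12-11 12:19:00", "resolved"),
    ("E", "2023-12-11 12:40:00", None, "firing"),
    ("F", "2023-12-11 12:40:00", "2023-12-11 12:50:00", "resolved"),
]

# Group by starts_at: a well-behaved stream yields exactly one "firing" and
# one "resolved" notification per key.
by_start = defaultdict(list)
for name, starts_at, _, status in samples:
    by_start[starts_at].append((name, status))

for starts_at, entries in sorted(by_start.items()):
    statuses = {status for _, status in entries}
    ok = statuses == {"firing", "resolved"}
    print(starts_at, entries, "paired" if ok else "UNPAIRED")
```

Running this flags the 12:11 firing (C) and the 12:14 resolved (D) as unpaired, which is exactly the mismatch described above.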

As those alert instances all carry the same labels and therefore the same fingerprint, I could not identify them as unique by their ID either. When checking the state history in the UI, some alerts that were sent via the notification policy are also missing there.
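The fingerprint is a function of the label set alone, so two instances with identical labels are indistinguishable to the notification pipeline. A sketch of the collision and of one possible workaround, adding a per-instance label in the alert rule so each instance gets its own fingerprint (the comparison function and the `instance_id` label name are illustrative, not Alertmanager's actual FNV hash):

```python
# Illustrative only: the real fingerprint is an FNV-based hash over the label
# set in Prometheus/Alertmanager, but any function of the labels alone
# reproduces the collision.
def label_key(labels: dict) -> tuple:
    """Canonical identity of an alert instance: its sorted label pairs."""
    return tuple(sorted(labels.items()))

a = {"alertname": "HighLoad", "severity": "critical"}
b = {"alertname": "HighLoad", "severity": "critical"}  # second instance, same labels
print(label_key(a) == label_key(b))  # True -> instances are indistinguishable

# Workaround sketch: a distinguishing label (hypothetical name "instance_id")
# makes each instance unique.
b2 = dict(b, instance_id="2")
print(label_key(a) == label_key(b2))  # False -> now separable
```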

Is anything else still wrong in the notification policy JSON, or is there some other trick I'm missing? :-(
