How do I combine multiple GCP alerting conditions so that only one incident is created?

I'm trying to create an alerting policy for a Gauge type metric in Google Cloud Platform that is expected to trigger when:

  • the gauge's value is below 1
  • data is absent

Using Terraform, I came up with the following definition:

resource "google_monitoring_alert_policy" "alert_policy_prometheus_metric" {
  display_name = "Metric check failed"

  conditions {
    display_name = "Metric violation"
    condition_threshold {
      filter                  = "resource.type = \"prometheus_target\" AND resource.labels.cluster = \"${var.cluster_name}\" AND metric.type = \"prometheus.googleapis.com/elasticsearch_cluster_health_status/gauge\" AND metric.labels.color = \"green\""
      evaluation_missing_data = "EVALUATION_MISSING_DATA_ACTIVE"
      comparison              = "COMPARISON_LT"
      duration                = "60s"
      trigger {
        count = 1
      }
      threshold_value = 1
    }
  }
  conditions {
    display_name = "Metric absent"
    condition_absent {
      duration = "120s"
      filter   = "resource.type = \"prometheus_target\" AND resource.labels.cluster = \"${var.cluster_name}\" AND metric.type = \"prometheus.googleapis.com/elasticsearch_cluster_health_status/gauge\" AND metric.labels.color = \"green\""
    }
  }
  combiner = "OR"
  # Reference the variable directly; interpolation-only strings are deprecated since Terraform 0.12.
  notification_channels = [var.monitoring_email_group_name]
}

This works, but it creates two separate incidents when the following happens:

  1. the metric starts being absent
  2. the metric then reappears with a value below 1

This is surprising to me as I use OR as the combiner for the conditions. Is there anything I can do to merge the two conditions into a single incident?

Best answer

When the multi-condition trigger is set to “Any condition is met” (i.e., the combiner value is "OR") and either condition 1 or condition 2 is met, only a single incident is created.

When the multi-condition trigger is set to “All conditions are met” and both conditions are met, two incidents are created: one for the event that crossed the threshold of condition 1, and one for the event that crossed the threshold of condition 2.

When the multi-condition trigger is set to “All conditions are met” and only one of the two conditions is met, no incident is created.

According to the following document, you cannot reduce the number of incidents to one when a policy contains multiple conditions. The document states that this feature is currently not available:

“You can't configure Cloud Monitoring to create a single incident and send a single notification when the policy contains multiple conditions.”

This feature may be implemented for GCP alerting and incidents in the future. A similar request was raised by another GCP customer, and you can track it in this issue tracker.

Note: Also take a look at the “All conditions are met even for different resources for each condition” option (i.e., the combiner value is "AND").
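
As a possible workaround, the two conditions might be collapsed into one. The following is a minimal sketch, not a confirmed fix. It reuses the threshold condition from the question and relies on evaluation_missing_data = "EVALUATION_MISSING_DATA_ACTIVE", which asks Cloud Monitoring to treat missing data points as values that violate the condition. It assumes that this setting covers the absent-data case well enough that the separate condition_absent block can be dropped. Whether an incident opens as promptly as it would with condition_absent needs to be verified in your own project. The resource name and the condition display name are placeholders.

```hcl
# Hypothetical single-condition variant of the policy from the question.
resource "google_monitoring_alert_policy" "alert_policy_prometheus_metric_single" {
  display_name = "Metric check failed"

  # The combiner argument is required even when there is only one condition.
  combiner = "OR"

  conditions {
    display_name = "Metric below 1 or missing"
    condition_threshold {
      filter = "resource.type = \"prometheus_target\" AND resource.labels.cluster = \"${var.cluster_name}\" AND metric.type = \"prometheus.googleapis.com/elasticsearch_cluster_health_status/gauge\" AND metric.labels.color = \"green\""

      # Assumption: treating missing data as a violation stands in for the
      # separate condition_absent block, so only this condition opens incidents.
      evaluation_missing_data = "EVALUATION_MISSING_DATA_ACTIVE"

      comparison      = "COMPARISON_LT"
      threshold_value = 1
      duration        = "60s"
      trigger {
        count = 1
      }
    }
  }

  notification_channels = [var.monitoring_email_group_name]
}
```

With a single condition there is only one source of incidents, so the absent-then-below-1 sequence from the question should not produce a second incident from a different condition. The trade-off is that the incident no longer distinguishes "value below 1" from "no data".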