I'm trying to grok the datadog_monitor resource, and was wondering if anyone knows why the threshold that triggers an alert is repeated in both the query and the monitor_thresholds section?
For instance:
resource "datadog_monitor" "p90_latency" {
  name    = "Latency is high on the airgap telemetry backend in ${var.stage}"
  type    = "query alert"
  query   = "avg(last_1h):avg:aws.apigateway.latency.p90{apiname:${var.stage}-airgap-telemetry-backend} > 5000"
  message = <<EOF
The p90 latency on the airgap telemetry backend is {{value}} @slack-${var.monitor_slack_channel}
EOF

  monitor_thresholds {
    critical = 5000
    warning  = 2000
  }
}
In my head, the query should just be query = "avg(last_1h):avg:aws.apigateway.latency.p90{apiname:${var.stage}-airgap-telemetry-backend}"
so that it returns a concrete value, and the monitor_thresholds block alone would decide when to trigger (since the query as-is surely just returns a boolean yes/no?).
Relatedly, how does the warning threshold work? Does Datadog magically substitute the threshold value into the query? If so, does it matter what the value in the query actually is (i.e. can we use a placeholder such as 0, or does it have to match the critical value)?
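For what it's worth, if the query value does have to match the critical threshold, this is a minimal sketch of how I would keep the two from drifting apart: define each threshold once and interpolate it in both places. The local names (p90_latency_critical, p90_latency_warning) are hypothetical and made up for illustration; var.stage and var.monitor_slack_channel are assumed to be declared elsewhere, as in my config above.

```hcl
# Hypothetical locals: each threshold is defined exactly once.
locals {
  p90_latency_critical = 5000
  p90_latency_warning  = 2000
}

resource "datadog_monitor" "p90_latency" {
  name = "Latency is high on the airgap telemetry backend in ${var.stage}"
  type = "query alert"

  # The comparison value at the end of the query is taken from the same
  # local as monitor_thresholds.critical, so the two always agree.
  query = "avg(last_1h):avg:aws.apigateway.latency.p90{apiname:${var.stage}-airgap-telemetry-backend} > ${local.p90_latency_critical}"

  message = <<EOF
The p90 latency on the airgap telemetry backend is {{value}} @slack-${var.monitor_slack_channel}
EOF

  monitor_thresholds {
    critical = local.p90_latency_critical
    warning  = local.p90_latency_warning
  }
}
```

This only removes the duplication on the Terraform side; it does not explain why the value has to appear in the query in the first place.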
If the threshold in monitor_thresholds differs from the one in the query, the following error seems to occur.