I'm trying to create an alert on GCP/Stackdriver using the http/server/response_count metric for App Engine. This metric has a response_code field that I can group_by:
fetch gae_app::appengine.googleapis.com/http/server/response_count
| filter metric.response_code>=500 && metric.response_code<600
| every 10m
| group_by [metric.response_code], sum(val())
But say I want to merge all 500+ responses under a 5xx class of response and then aggregate them to a single count for the range. Is it possible to pre-process the data so that the group_by in the above example yields a single time series, e.g. 5xx? I notice that one of the load balancer metrics has a "response_code_class" field of this kind, but it is NOT available for this metric.
After that, I'm looking for a ratio of 5xx requests to all requests. Would that even be possible with this metric?
Below is a query that does the following:
- Use group_by to count the 5xx responses in a 10-minute sliding window.
- In the same group_by, also count all responses in the same 10-minute sliding window.
- After the group_by, simply compute the ratio of the two counts.
[Screenshot of the chart produced by a similar query]
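A minimal sketch of a query that follows the three steps listed above. The field names countAll and count5xx and the overall group_by / value shape come from the note in this answer; using if() to zero out the non-5xx points is an assumption about how the 5xx count is computed, so treat this as one possible form rather than the definitive query.

```
fetch gae_app::appengine.googleapis.com/http/server/response_count
| group_by [], sliding(10m),
    [ countAll: sum(val()),
      # assumption: a point contributes to count5xx only if its code is in 500..599
      count5xx: sum(if(metric.response_code >= 500 && metric.response_code < 600,
                       val(), 0)) ]
| value (count5xx / countAll)
```

Because the first argument of group_by is an empty list, all response codes are merged into a single time series, which is also what the "single 5xx series" part of the question asks for.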
The output of the above query is a ratio of 5xx responses to all responses.
Note: if you wanted to compute these ratios by zone, for example, simply add zone to the first argument of group_by, like this:
group_by [zone], sliding(10m), [countAll: ..., count5xx: ...]
| value (count5xx / countAll)