Prometheus query to calculate uptime with avg_over_time while ignoring planned downtime per URL


I am using the query below to display the average uptime of a website as a percentage for SLA monitoring.

(avg_over_time(probe_success{job="$http_job"}[24h]) * 100)

I am trying to exclude planned downtime (15 minutes or 1 hour, depending on the URL), which can also differ by environment (e.g. production). Can anyone please help with this?

I tried using offset, but it did not work.

Answer by markalex

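You can append unless on() (day_of_month() == 10 and hour() == 1 and minute() < 15) to your metric to ignore its values during a given period. Note that minute() < 15 covers minutes 0 to 14; a condition such as minute() > 0 < 15 would leave minute 0 in the result.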

For example,

up unless on() (day_of_month() == 10 and hour() == 1 and minute() >= 0 <= 59)

will return values except for 01:00 - 02:00 on the 10th of every month. The time functions work in UTC, so the window must be given in UTC.

Demo can be seen here.
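If the maintenance window is defined in local time rather than UTC, one possible approach is to shift the timestamp before passing it to the time functions. The sketch below assumes a fixed UTC+1 offset (3600 seconds) and does not handle daylight saving time; the offset is an illustrative value, not something taken from the question.

```promql
# Exclude 01:00 - 02:00 local time (assumed UTC+1) on the 10th of every month.
# vector(time() + 3600) is the evaluation time shifted by one hour.
up unless on() (
  day_of_month(vector(time() + 3600)) == 10
  and hour(vector(time() + 3600)) == 1
)
```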

In the case of your query, you will also need subquery syntax: a plain range selector such as [24h] can only be applied directly to a metric selector, while a more complex expression needs the subquery form [24h:] (note the colon).

Your resulting query will look something like this:

avg_over_time(
 (
  probe_success{job="$http_job"}
  # drop samples taken between 01:00 and 01:14 UTC on the 10th
  unless on() (
    day_of_month() == 10
    and hour() == 1
    and minute() < 15)
 )[24h:]
) * 100
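The question also asks for windows that differ per URL. One possible extension, offered as a sketch rather than a tested solution, is to match on a label instead of on(), so that each series is dropped only during its own window. It assumes the probed URL is stored in the instance label (a common blackbox exporter relabeling, but this depends on your scrape config); both URLs and both schedules are hypothetical.

```promql
avg_over_time(
 (
  probe_success{job="$http_job"}
  # Drop a series only while its own maintenance window is active.
  unless on(instance) (
    # Hypothetical URL A: 01:00 - 01:14 UTC on the 10th of every month.
    (
      probe_success{job="$http_job", instance="https://a.example.com"}
      and on() (day_of_month() == 10 and hour() == 1 and minute() < 15)
    )
    or
    # Hypothetical URL B: 02:00 - 02:59 UTC every Sunday (day_of_week() == 0).
    (
      probe_success{job="$http_job", instance="https://b.example.com"}
      and on() (day_of_week() == 0 and hour() == 2)
    )
  )
 )[24h:]
) * 100
```

Because unless removes the samples before avg_over_time runs, excluded minutes count neither as up nor as down: the average is taken only over the samples that remain.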