I want to alert on the failure of any Dataproc Serverless job. I think I may need to create a log-based metric and then an alerting policy based on that metric.
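For the log-based metric part, I was picturing something roughly like the sketch below; the resource name, metric name, and exact filter are just my assumptions at this point:

# Counter over ERROR-severity log entries from Dataproc Serverless batches (sketch)
resource "google_logging_metric" "dataproc_batch_errors" {
  name   = "dataproc_batch_errors"
  filter = "resource.type=\"cloud_dataproc_batch\" severity>=ERROR"

  metric_descriptor {
    metric_kind = "DELTA"
    value_type  = "INT64"
  }
}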
I tried creating an alerting policy with the filter below:
filter = "metric.type=\"logging.googleapis.com/log_entry_count\" resource.type=\"cloud_dataproc_batch\" metric.label.\"severity\"=\"ERROR\""
I was expecting an alert to trigger upon failure, but this metric does not seem to be active.
I tried creating Dataproc batch jobs using both the standard procedure and a custom procedure, following this public documentation: Run an Apache Spark batch workload.
At the step where the Dataproc job is created, I changed the value of the Arguments field to 0 instead of 1000, so that the job fails and an ERROR is written to Cloud Logging.
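For reference, this is the equivalent gcloud command from that guide with the argument set to 0 instead of 1000 (the region here is just a placeholder):

gcloud dataproc batches submit spark \
    --region=us-central1 \
    --class=org.apache.spark.examples.SparkPi \
    --jars=file:///usr/lib/spark/examples/jars/spark-examples.jar \
    -- 0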
From Cloud Logging, I then tried the filter below:
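(something along these lines; the exact text is an assumption based on the results described next:)

resource.type="audited_resource"
severity="ERROR"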
and it successfully returned the audited_resource entries from Cloud Logging with severity: "ERROR".
I also tried removing the severity = "ERROR" clause to check the entries with severity: "NOTICE" in Cloud Logging.
Example Output: