Horizontal Pod Autoscaler with Stackdriver Custom Metric in GKE fails with "Invalid metric name" error

We're trying to use a custom JVM metric (jvm_memory_bytes_used{area="heap"}) to scale a deployment in our GKE cluster using a Horizontal Pod Autoscaler (HPA).

Setup:

  • Enabled Stackdriver Managed Prometheus
  • Installed JMX Exporter on JVMs and configured it to export the desired metric
  • Deployed the Stackdriver Adapter for Custom Metrics
  • Created an HPA referencing the custom metric:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-autoscale
  namespace: somenamespace
spec:
  maxReplicas: 3
  metrics:
  - pods:
      metric:
        name: jvm_memory_bytes_used{area="heap"}  # Metric name in question
      target:
        averageValue: 2G
        type: AverageValue
    type: Pods
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-java-app

Problem:

The HPA creation fails with the following error:

Error 400: Invalid metric name: custom.googleapis.com/jvm_memory_bytes_used{area="heap"},

We've tried various combinations of quotes around the metric name and the area label, but none worked.

Question:

Is it possible to use this specific custom metric for HPA scaling in GKE? If so, what's the correct way to specify it in the HPA configuration?

Best answer, by brandizzi

I had to change three things to get it to work:

  1. Add the proper prefix and suffix to the metric name;
  2. Use a selector clause for filtering metrics by label; and
  3. Add the proper prefix to the metric label name.

Prometheus metrics have a prefix and a suffix

From Horizontal pod autoscaling (HPA) | Operations Suite | Google Cloud:

Prometheus metrics are stored with the following conventions:

  • The prefix prometheus.googleapis.com.
  • This suffix is usually one of gauge, counter, summary, or histogram, although untyped metrics might have the unknown or unknown:counter suffix. To verify the suffix, look up the metric in Cloud Monitoring by using Metrics Explorer.

So I did that: I went to Metrics Explorer, enabled the builder, and searched for the metric I wanted to use:

[Screenshot: Metrics Explorer showing how to find the metric's full name]
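As a rough illustration of the convention quoted above, the name the HPA expects is built from the prefix, the Prometheus metric name, and the type suffix, joined with | characters. The mapping below is only a sketch, not a Kubernetes manifest: the examples, prometheus, and hpa_name keys are made up for illustration, http_requests_total is a hypothetical metric, and every suffix should still be confirmed in Metrics Explorer.

```yaml
# Illustrative mapping only (not a manifest).
# Pattern: prometheus.googleapis.com|<prometheus_metric_name>|<suffix>
examples:
  - prometheus: jvm_memory_bytes_used    # the gauge from this question
    hpa_name: prometheus.googleapis.com|jvm_memory_bytes_used|gauge
  - prometheus: http_requests_total      # hypothetical counter, for comparison
    hpa_name: prometheus.googleapis.com|http_requests_total|counter
```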

Using a selector for metric labels

I wanted to filter by the area label, but the label must not be embedded in the metric name. Filtering by labels is done with a selector in the metric clause.

Also...

Metric label names have a prefix

We cannot just add a bare label name to matchLabels / matchExpressions. Every metric label name must be prefixed with metric.labels.:

metric:
  name: prometheus.googleapis.com|jvm_memory_bytes_used|gauge
  selector:
    matchLabels:
      metric.labels.area: heap
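The matchExpressions form mentioned above would presumably follow the same rule. This is a minimal sketch, assuming the adapter translates set-based selectors the same way it handles matchLabels; that assumption is not confirmed in this answer, so check it with kubectl describe hpa before relying on it.

```yaml
metric:
  name: prometheus.googleapis.com|jvm_memory_bytes_used|gauge
  selector:
    matchExpressions:
      # The key carries the same metric.labels. prefix as in matchLabels.
      - key: metric.labels.area
        operator: In
        values:
          - heap
```

With a single value this should select the same series as the matchLabels version. The set-based form only becomes useful when more than one label value should match.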

The end result is this:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-autoscale
  namespace: somenamespace
spec:
  maxReplicas: 3
  metrics:
  - pods:
      metric:
        name: prometheus.googleapis.com|jvm_memory_bytes_used|gauge
        selector:
          matchLabels:
            metric.labels.area: heap
      target:
        averageValue: 2G
        type: AverageValue
    type: Pods
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-java-app

With that, I managed to make HPA respond to the custom metrics.