I'm trying to scale a Kubernetes service using the Horizontal Pod Autoscaler and an external metric from SQS. I see there are two separate CloudWatch metrics, ApproximateNumberOfMessagesVisible and ApproximateNumberOfMessagesNotVisible.

Using the number of messages visible causes processing pods to be marked for termination immediately after they pick up a message from the queue, because the message is no longer visible. If I use the number of messages not visible, the HPA will never scale up.

Here is the ExternalMetric template:
apiVersion: metrics.aws/v1alpha1
kind: ExternalMetric
metadata:
  name: metric-name
spec:
  name: metric-name
  queries:
    - id: metric_name
      metricStat:
        metric:
          namespace: "AWS/SQS"
          metricName: "ApproximateNumberOfMessagesVisible"
          dimensions:
            - name: QueueName
              value: "queue_name"
        period: 60
        stat: Average
        unit: Count
      returnData: true
Here is the HPA template:
kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2beta1
metadata:
  name: hpa-name
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta1
    kind: Deployment
    name: deployment-name
  minReplicas: 1
  maxReplicas: 50
  metrics:
    - type: External
      external:
        metricName: metric-name
        targetAverageValue: 1
The problem would be solved if I could define another custom metric that is the sum of these two metrics. How else can I solve this problem?
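For reference, if the metrics adapter passes its queries straight through to CloudWatch GetMetricData, the sum might be expressible with a metric math expression inside the ExternalMetric itself. This is only a sketch, assuming the adapter supports an expression field in its queries (worth verifying against your adapter's documentation); the ids visible, not_visible, and total are arbitrary names:

apiVersion: metrics.aws/v1alpha1
kind: ExternalMetric
metadata:
  name: metric-name
spec:
  name: metric-name
  queries:
    # Fetch both metrics but return neither directly.
    - id: visible
      metricStat:
        metric:
          namespace: "AWS/SQS"
          metricName: "ApproximateNumberOfMessagesVisible"
          dimensions:
            - name: QueueName
              value: "queue_name"
        period: 60
        stat: Average
        unit: Count
      returnData: false
    - id: not_visible
      metricStat:
        metric:
          namespace: "AWS/SQS"
          metricName: "ApproximateNumberOfMessagesNotVisible"
          dimensions:
            - name: QueueName
              value: "queue_name"
        period: 60
        stat: Average
        unit: Count
      returnData: false
    # Return only the metric math sum of the two.
    - id: total
      expression: visible + not_visible
      returnData: true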
We used a Lambda to fetch the two metrics and publish a custom metric that is the sum of the messages in flight and waiting, and triggered the Lambda with a CloudWatch Events rule at whatever frequency you want: https://console.aws.amazon.com/cloudwatch/home?region=us-east-1#rules:action=create
Here is Lambda code for reference (a minimal sketch in Python with boto3; the queue name, custom namespace, and custom metric name are placeholders to adjust):
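import datetime
import boto3

cloudwatch = boto3.client("cloudwatch")

# Placeholders -- substitute your own queue name, namespace, and metric name.
QUEUE_NAME = "queue_name"
CUSTOM_NAMESPACE = "Custom/SQS"
CUSTOM_METRIC = "ApproximateTotalNumberOfMessages"


def fetch_metric(metric_name):
    """Fetch the most recent datapoint for one AWS/SQS metric."""
    now = datetime.datetime.utcnow()
    resp = cloudwatch.get_metric_statistics(
        Namespace="AWS/SQS",
        MetricName=metric_name,
        Dimensions=[{"Name": "QueueName", "Value": QUEUE_NAME}],
        StartTime=now - datetime.timedelta(minutes=5),
        EndTime=now,
        Period=60,
        Statistics=["Average"],
    )
    datapoints = sorted(resp["Datapoints"], key=lambda d: d["Timestamp"])
    return datapoints[-1]["Average"] if datapoints else 0.0


def lambda_handler(event, context):
    # Sum of messages waiting (visible) and in flight (not visible).
    total = (
        fetch_metric("ApproximateNumberOfMessagesVisible")
        + fetch_metric("ApproximateNumberOfMessagesNotVisible")
    )

    # Publish the sum as a single custom metric for the adapter to query.
    cloudwatch.put_metric_data(
        Namespace=CUSTOM_NAMESPACE,
        MetricData=[
            {
                "MetricName": CUSTOM_METRIC,
                "Dimensions": [{"Name": "QueueName", "Value": QUEUE_NAME}],
                "Value": total,
                "Unit": "Count",
            }
        ],
    )
    return {"published_value": total}

Point the ExternalMetric's namespace and metricName at CUSTOM_NAMESPACE and CUSTOM_METRIC, and the HPA can then scale on the combined count.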