I have an HPA configured with a target of 50% average CPU:
kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
The problem is that only one pod is receiving traffic, so its CPU usage is higher than 50% of its CPU request.
The HPA then starts scaling up new pods, but those sometimes do not receive any traffic yet, so their CPU consumption is very low.
I expected the pods that use no CPU to be scaled down at some point (how long should that take?), but it is not happening. I believe the reason is that the first pod, with CPU usage above 50%, is forcing the other pods to stay up.
What I need is for those pods to scale up and down until they can start receiving traffic, which depends on the node they are deployed on.
Any suggestions on how to accomplish this?
HPA CPU Utilization:
A targetCPUUtilizationPercentage of 50 means that if the average CPU utilization across all Pods rises above 50%, the HPA scales the deployment up, and if the average falls below 50%, the HPA scales the deployment down, provided there is more than one replica. The decision is based on the average, not on individual Pods, so idle Pods are not removed while a busy Pod keeps the average above the target.
I checked the code and found that the target utilization percentage is calculated relative to the Pods' resource requests. See the code here:
https://github.com/kubernetes/kubernetes/blob/v1.9.0/pkg/controller/podautoscaler/metrics/utilization.go#L49
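To make the averaging concrete, here is a rough sketch of the documented HPA formula, desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization), applied to the situation in the question. The function name hpa_desired and all the numbers (a 200m CPU request per pod, one pod using 500m, three idle pods using 2m) are made up for illustration. It also assumes the default 10% tolerance, and it ignores min/max replicas, pod readiness and missing metrics, so treat it as an approximation rather than the controller's exact behavior.

```shell
# hpa_desired REQUEST_M TARGET_PCT USAGE_M...
# Simplified sketch of the HPA replica calculation (hypothetical helper).
# Assumes every pod has the same CPU request, given in millicores.
hpa_desired() {
    request_m=$1; target_pct=$2; shift 2
    echo "$@" | awk -v req="$request_m" -v target="$target_pct" '{
        n = NF; total = 0
        for (i = 1; i <= NF; i++) total += $i      # total CPU usage of all pods
        util = int(total * 100 / (req * n))        # usage as % of total requests
        ratio = util / target
        # Assumed default tolerance: no scaling if within 10% of the target.
        if (ratio >= 0.9 && ratio <= 1.1) { print n; exit }
        desired = n * ratio
        if (desired > int(desired)) desired = int(desired) + 1   # round up
        print desired
    }'
}

# One busy pod (500m) and three idle pods (2m each), 200m request, 50% target.
hpa_desired 200 50 500 2 2 2   # prints 6
```

With these numbers the average utilization is 63% of the request, which is still above the 50% target, so the sketch asks for more replicas (6) instead of removing the idle ones. The idle pods would only be scaled down once the average drops below the target.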
There is an official walkthrough focusing on the HPA and its scaling:
Kubernetes.io: Docs: Tasks: Run application: Horizontal pod autoscale: Walkthrough
You could use newer fields such as behavior and stabilizationWindowSeconds to tune scaling to your workload's specific needs.
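As a minimal sketch of what that could look like, the snippet below writes a manifest for the same php-apache HPA with a scale-down behavior section. It assumes a cluster that serves the autoscaling/v2 API. The file name and the specific values (a 120 second window, removing at most one pod per minute) are illustrative only and not a recommendation. When not set, the scale-down stabilization window defaults to 300 seconds, which is also roughly how long a scale-down takes to kick in once the average drops below the target.

```shell
# Write an example HPA manifest (hypothetical file name and values).
cat > php-apache-hpa.yaml <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 120   # illustrative; the default is 300
      policies:
      - type: Pods
        value: 1                        # remove at most one pod...
        periodSeconds: 60               # ...per 60 seconds
EOF
```

You could then apply it with kubectl apply -f php-apache-hpa.yaml. Note that this only changes how quickly the HPA reacts. It still scales on the average across all pods, so it does not by itself remove idle pods while one busy pod keeps the average above 50%.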