I'm wondering what the graceful way is to reduce the number of nodes in a Kubernetes cluster on GKE.
I have some nodes, each of which runs some pods that watch a shared job queue and execute jobs. I also have a script that monitors the length of the job queue and increases the number of instances when the length exceeds a threshold, by executing the gcloud compute instance-groups managed resize command, and it works OK.
But I don't know the graceful way to reduce the number of instances when the length falls below the threshold.
Is there a good way to stop the pods running on an instance before that instance gets terminated? Or is there any other good practice?
Note
- Each job can take between 30 minutes and 1 hour
- It is acceptable if a job gets executed more than once (in the worst case...)
I think the best approach is to use the Kubernetes Job object instead of a long-running pod to run your tasks. That way, when the task completes, the Job's container terminates. You would only need a small pod that creates Kubernetes Jobs based on the queue.
The more Jobs that get created, the more resources will be consumed, and the cluster autoscaler will see that it needs to add more nodes. A Job has to run to completion: if its pod gets terminated, it will be rescheduled until it completes.
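As a rough illustration of the dispatcher idea above, here is a minimal Python sketch that builds a batch/v1 Job manifest for one queue item and decides how many Jobs to start. The image name, the `--job-id` argument, the resource requests, and the `max_parallel` cap are all hypothetical placeholders, not something from the original setup.

```python
import json


def build_job_manifest(job_id, image="example.com/worker:latest",
                       cpu="500m", memory="512Mi"):
    """Return a batch/v1 Job manifest (as a dict) for one queue item.

    The image, args and resource requests are illustrative assumptions.
    """
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": f"queue-worker-{job_id}"},
        "spec": {
            "template": {
                "spec": {
                    # Jobs require restartPolicy to be Never or OnFailure.
                    "restartPolicy": "OnFailure",
                    "containers": [{
                        "name": "worker",
                        "image": image,
                        "args": ["--job-id", str(job_id)],
                        # Requests tell the scheduler how much room each Job needs.
                        "resources": {"requests": {"cpu": cpu, "memory": memory}},
                    }],
                }
            }
        },
    }


def jobs_to_start(queue_length, running_jobs, max_parallel=10):
    """How many new Jobs to create without exceeding a (hypothetical) cap."""
    return max(0, min(queue_length, max_parallel - running_jobs))


if __name__ == "__main__":
    # Print one manifest as JSON; kubectl accepts JSON as well as YAML.
    print(json.dumps(build_job_manifest(42), indent=2))
```

The printed manifest could be piped to `kubectl apply -f -`. How the dispatcher reads the queue is left out, since that depends on the queue system in use.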
The GKE docs do not say directly whether a node will be removed while a Job is running on it, but the stipulation seems to be that if a node's resources are under-utilized and its pods can easily be moved to another node, the autoscaler will drain the node.
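If you keep resizing the instance group yourself rather than relying on the autoscaler, one possible manual sequence is to drain a node with kubectl and then remove that specific instance from the managed group. The sketch below only builds the two command lines. It assumes the Kubernetes node name matches the Compute Engine instance name, that the worker finishes its current job when it receives SIGTERM, and that a one-hour grace period is enough; the function name and default are hypothetical.

```python
def scale_down_commands(node, group, zone, grace_seconds=3600):
    """Build (but do not run) the commands to retire one node gracefully.

    Assumes the node name equals the instance name and that the worker
    uses the grace period to finish its current job.
    """
    # Cordon the node and evict its pods, giving each pod time to finish.
    drain = [
        "kubectl", "drain", node,
        "--ignore-daemonsets",
        f"--grace-period={grace_seconds}",
    ]
    # Remove this specific instance from the managed instance group.
    delete = [
        "gcloud", "compute", "instance-groups", "managed",
        "delete-instances", group,
        f"--instances={node}",
        f"--zone={zone}",
    ]
    return [drain, delete]


if __name__ == "__main__":
    for cmd in scale_down_commands("my-node", "my-group", "us-central1-a"):
        print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once you are happy with the printed commands. Since a job may run more than once in your case, an interrupted job can simply be picked up again by another worker.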
References
https://cloud.google.com/container-engine/docs/cluster-autoscaler
http://kubernetes.io/docs/user-guide/kubectl/kubectl_drain/
http://kubernetes.io/docs/user-guide/jobs/