How to reduce nodes(vm) running in a Kubernetes cluster of GKE gracefully?

Question

1k Views Asked by k-kawa At 17 August 2025 at 05:54

I'm looking for a graceful way to reduce the number of nodes in a Kubernetes cluster on GKE.

I have several nodes, each running pods that watch a shared job queue and execute jobs from it. I also have a script that monitors the length of the job queue and, when the length exceeds a threshold, increases the number of instances by running the gcloud compute instance-groups managed resize command. That part works fine.

But I don't know the graceful way to reduce the number of instances when the length falls below the threshold.

Is there a good way to stop the pods running on an instance before that instance gets terminated? Or is there some other good practice?

Note

Each job takes between 30 minutes and 1 hour.
It is acceptable for a job to be executed more than once (in the worst case).

Original Q&A

There are 2 best solutions below

Priyatham On 15 January 2020 at 08:30

Before resizing the cluster, set the project context in Cloud Shell by running the commands below:

gcloud config set project [PROJECT_ID]
gcloud config set compute/zone [COMPUTE_ZONE]
gcloud config set compute/region [COMPUTE_REGION]
gcloud components update

Note: You can also pass the project, compute zone, and region as flags to the command below using --project, --zone, and --region.

gcloud container clusters resize [CLUSTER_NAME] --node-pool [POOL_NAME] --num-nodes [NUM_NODES]

Run the above command for each node pool. You can omit the --node-pool flag if you have only one node pool.
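
As a rough sketch of how the monitoring script from the question could drive this command, the snippet below derives a node count from the queue length and prints the resize command instead of running it. The function names, the jobs-per-node ratio, and the minimum and maximum bounds are assumptions made for illustration; only the gcloud invocation itself comes from the answer above, plus the global --quiet flag to skip the confirmation prompt.

```shell
#!/bin/sh
# Hypothetical helper: pick a node count for a given queue length.
# Args: queue_length jobs_per_node min_nodes max_nodes
target_nodes() {
  queue_len=$1; per_node=$2; min=$3; max=$4
  # Ceiling division: nodes needed to hold every queued job.
  want=$(( (queue_len + per_node - 1) / per_node ))
  # Clamp the result to the allowed range.
  if [ "$want" -lt "$min" ]; then want=$min; fi
  if [ "$want" -gt "$max" ]; then want=$max; fi
  echo "$want"
}

# Print (do not run) the resize command so it can be reviewed first.
# Args: cluster_name pool_name num_nodes   (placeholders, as in the answer)
resize_command() {
  echo "gcloud container clusters resize $1 --node-pool $2 --num-nodes $3 --quiet"
}
```

For example, resize_command my-cluster my-pool "$(target_nodes 25 4 1 10)" prints a command that resizes the pool to 7 nodes. Note that this only chooses the size; it does not control which nodes are removed or wait for running jobs to finish.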

Reference: https://cloud.google.com/kubernetes-engine/docs/how-to/resizing-a-cluster

**feelobot** · Accepted Answer

I think the best approach is to run your tasks as Kubernetes Job objects instead of in long-running pods. That way, when a task completes, the Job terminates its container. You would only need one small pod that creates Kubernetes Jobs based on the queue.

The more Jobs that get created, the more resources will be requested, and the cluster autoscaler will see that it needs to add more nodes. A Job has to run to completion: if its pod gets terminated before finishing, the pod is rescheduled so the work can complete.
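
A minimal sketch of the dispatcher idea described above, assuming one Job per queue item. The worker- name prefix, the image name, and the /worker command with its --job-id flag are hypothetical; kubectl create job is the only real interface used, and the function only prints the command so it can be reviewed before being run.

```shell
#!/bin/sh
# Normalize a queue item ID for use in a Job name:
# lowercase it and map '_' to '-'. The "worker-" prefix is illustrative.
job_name() {
  printf 'worker-%s\n' "$(printf '%s' "$1" | tr 'A-Z_' 'a-z-')"
}

# Print the command that would create one Job for one queue item.
# Args: item_id image   (/worker and --job-id are hypothetical)
job_command() {
  echo "kubectl create job $(job_name "$1") --image=$2 -- /worker --job-id $1"
}
```

The dispatcher pod would call something like job_command for each item it takes from the queue, then run the resulting command. Since the question allows a job to run more than once, a rescheduled Job pod repeating its work is acceptable here.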

The GKE docs do not say directly whether a scale-down will happen while a Job is running on a node, but the rule seems to be that if the node is under-utilized and its pods can easily be moved to another node, the autoscaler will drain it.
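
If you manage the instance group yourself, as in the question, draining can also be done by hand before a specific VM is removed. The sketch below is one possible sequence, not a confirmed GKE recommendation. It assumes the Kubernetes node name equals the Compute Engine instance name, that the workers finish their current job and exit on SIGTERM, and that a one-hour grace period is enough. The function name and the RUN dry-run switch are hypothetical.

```shell
#!/bin/sh
# RUN defaults to "echo", so commands are only printed.
# Export RUN as an empty string to execute them for real.
RUN=${RUN-echo}

# Args: node_name instance_group zone   (all placeholders)
drain_and_remove() {
  node=$1; group=$2; zone=$3
  # Stop new pods from being scheduled onto the node.
  $RUN kubectl cordon "$node" || return 1
  # Evict the pods; 3600s is an assumed grace period matching 1h jobs.
  $RUN kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data \
    --grace-period=3600 --timeout=3700s || return 1
  # Delete this specific VM, which also shrinks the group by one.
  $RUN gcloud compute instance-groups managed delete-instances "$group" \
    --instances="$node" --zone="$zone"
}
```

Deleting a named instance, rather than resizing the group, lets you choose which node goes away, so only the drained node is removed.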
