We are trying to set up a Dataproc cluster in GCP.
As part of that, we want the secondary nodes to be either Spot VMs or Standard preemptible VMs (note: Spot VMs are also preemptible).
When we use graceful decommissioning on the Spot / Standard preemptible VMs, will it actually take effect? (That is, the node will not be reclaimed until the tasks running on it are complete or the graceful decommissioning timeout is reached.)
[OR]
Will the VMs be reclaimed immediately by GCP? (forceful decommissioning)
Could someone please help?
Thanks in advance.
The latter is correct: GCP reclaims the VMs immediately (forceful decommissioning). Dataproc is built on top of GCE. Graceful decommissioning is a Dataproc feature that works at the YARN cluster level, while Spot/preemptible VMs are a GCE feature. Dataproc has no control over the underlying GCE behavior, so a preemption is not delayed by the graceful decommissioning timeout; the timeout only governs node removals that Dataproc itself initiates, such as scaling the cluster down. If you need graceful decommissioning to be honored, avoid Spot/preemptible VMs.
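
To make the last point concrete, here is a minimal sketch in Python that only builds (and prints, but does not run) the two gcloud commands involved: one that creates a cluster whose secondary workers are non-preemptible, and one that scales the secondary workers down with a graceful decommissioning timeout. The cluster name, region, worker counts, and timeout value are hypothetical placeholders, and the flags shown reflect my understanding of the gcloud CLI, so please check them against `gcloud dataproc clusters create --help` and `gcloud dataproc clusters update --help` for your installed version.

```python
import shlex


def create_cluster_cmd(cluster, region, num_secondary):
    """Build a gcloud command that creates a cluster with non-preemptible secondary workers."""
    return [
        "gcloud", "dataproc", "clusters", "create", cluster,
        f"--region={region}",
        f"--num-secondary-workers={num_secondary}",
        # Non-preemptible secondary workers are not subject to GCE preemption.
        "--secondary-worker-type=non-preemptible",
    ]


def scale_down_cmd(cluster, region, num_secondary, timeout="1h"):
    """Build a gcloud command that resizes secondary workers with graceful decommissioning."""
    return [
        "gcloud", "dataproc", "clusters", "update", cluster,
        f"--region={region}",
        f"--num-secondary-workers={num_secondary}",
        # Applies only to this Dataproc-initiated scale-down, not to GCE preemption.
        f"--graceful-decommission-timeout={timeout}",
    ]


if __name__ == "__main__":
    # Hypothetical names and values, for illustration only.
    print(shlex.join(create_cluster_cmd("my-cluster", "us-central1", 4)))
    print(shlex.join(scale_down_cmd("my-cluster", "us-central1", 2, "30m")))
```

The commands are kept as argument lists so they could be passed to `subprocess.run` unchanged once you have verified them; printing first makes it easy to review exactly what would be executed.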