Is it possible to set a fully customized metric for auto scale-out of Dataproc worker nodes in GCP (Google Cloud Platform)?

I want to run distributed Spark processing with Dataproc in GCP, but I want to horizontally scale the worker nodes out based on a fully customized metric. The reason I ask is that a prediction of the data volume expected to be processed in the future is available.

now / now+1 / now+2 / now+3
1GB / 2GB / 1GB / 3GB <=== expected data volume (metric)

So could I scale out/in predictively according to the expected future data volume? Thanks in advance.

No, currently Dataproc autoscales clusters based only on YARN memory metrics.
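
For reference, an autoscaling policy only exposes YARN-based signals, so there is no field where a custom metric could be plugged in. A minimal policy sketch (the instance counts, factors, and timings below are placeholder values):

```yaml
# autoscaling-policy.yaml -- sketch only; tune the values for your workload.
workerConfig:
  minInstances: 2
  maxInstances: 20
basicAlgorithm:
  cooldownPeriod: 4m
  yarnConfig:
    # Scaling is driven by pending/available YARN memory; there is no hook
    # here for a user-defined metric.
    scaleUpFactor: 0.5
    scaleDownFactor: 1.0
    gracefulDecommissionTimeout: 1h
```

You import it with `gcloud dataproc autoscaling-policies import` and attach it to the cluster with `--autoscaling-policy` at cluster creation time.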

You need to write your Spark job so that it requests more Spark executors (and, as a result, more YARN memory) when it processes more data. Usually that means splitting and partitioning your data into more pieces as the data size increases.
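
Since you already have a forecast of the data volume, you could pass it into the job and size the partitioning from it. Below is a minimal PySpark sketch of that idea; the paths, column name, target partition size, and executor caps are hypothetical, and it assumes dynamic allocation with the external shuffle service (enabled by default on Dataproc):

```python
import sys

from pyspark.sql import SparkSession

# Predicted data volume for this run, passed as a job argument (e.g. 1, 2, 3 GB).
expected_gb = float(sys.argv[1]) if len(sys.argv) > 1 else 1.0

# Aim for roughly 128 MiB of input per partition (a common rule of thumb).
TARGET_PARTITION_MB = 128
num_partitions = max(8, int(expected_gb * 1024 / TARGET_PARTITION_MB))

spark = (
    SparkSession.builder
    .appName("size-driven-scaling")
    # Dynamic allocation lets Spark request more executors from YARN as tasks
    # queue up; the resulting pending YARN memory is what Dataproc autoscaling
    # reacts to, so bigger inputs indirectly add worker nodes.
    .config("spark.dynamicAllocation.enabled", "true")
    .config("spark.dynamicAllocation.minExecutors", "2")
    .config("spark.dynamicAllocation.maxExecutors", "100")
    .config("spark.shuffle.service.enabled", "true")
    .config("spark.sql.shuffle.partitions", str(num_partitions))
    .getOrCreate()
)

df = spark.read.parquet("gs://my-bucket/input/")                # hypothetical path
result = df.repartition(num_partitions).groupBy("key").count()  # hypothetical transform
result.write.mode("overwrite").parquet("gs://my-bucket/output/")
```

The larger the expected volume, the more partitions and pending tasks the job creates, and the autoscaler adds worker nodes to satisfy the extra YARN memory demand.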