I am getting two types of errors while running a job on Google Dataproc, and they cause executors to be lost one by one until the last executor is lost and the job fails. I have set my master node to n1-highmem-2 (2 vCPU, 13 GB memory) and my two worker nodes to n1-highmem-8 (8 vCPU, 52 GB memory). The two errors I get are:
- "Container exited from explicit termination request."
- "Lost executor x: Executor heartbeat timed out"
From what I could find online, I understand that I need to increase spark.executor.memoryOverhead. I don't know whether this is the right answer, but I can't see how to change this property in the Google Dataproc console, and I don't know what to change it to. Any help would be great!
Thanks, jim
You can set Spark properties at the cluster level with `gcloud dataproc clusters create --properties`, and/or at the job level with `gcloud dataproc jobs submit spark --properties`. The former requires the `spark:` prefix, the latter doesn't. If both are set, the job-level value takes precedence. See more details in the Dataproc [cluster properties doc](https://cloud.google.com/dataproc/docs/concepts/configuring-clusters/cluster-properties).
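
For example, something like the following should set `spark.executor.memoryOverhead` both ways. The cluster name, region, jar path, main class, and the `2g` value are placeholders; tune the overhead to your workload:

```sh
# Cluster level: Spark properties need the "spark:" prefix.
gcloud dataproc clusters create my-cluster \
    --region=us-central1 \
    --properties='spark:spark.executor.memoryOverhead=2g'

# Job level: no prefix; this overrides the cluster-level value.
gcloud dataproc jobs submit spark \
    --cluster=my-cluster \
    --region=us-central1 \
    --class=com.example.MyJob \
    --jars=gs://my-bucket/my-job.jar \
    --properties='spark.executor.memoryOverhead=2g'
```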