When I launch a Spark program in local-cluster mode, I get the following error:
17:45:33.930 [ExecutorRunner for app-20231004174533-0000/0] ERROR org.apache.spark.deploy.worker.ExecutorRunner - Error running executor
java.lang.IllegalStateException: Cannot find any build directories.
at org.apache.spark.launcher.CommandBuilderUtils.checkState(CommandBuilderUtils.java:228) ~[spark-launcher_2.13-3.5.0.jar:3.5.0]
at org.apache.spark.launcher.AbstractCommandBuilder.getScalaVersion(AbstractCommandBuilder.java:241) ~[spark-launcher_2.13-3.5.0.jar:3.5.0]
at org.apache.spark.launcher.AbstractCommandBuilder.buildClassPath(AbstractCommandBuilder.java:195) ~[spark-launcher_2.13-3.5.0.jar:3.5.0]
at org.apache.spark.launcher.AbstractCommandBuilder.buildJavaCommand(AbstractCommandBuilder.java:118) ~[spark-launcher_2.13-3.5.0.jar:3.5.0]
at org.apache.spark.launcher.WorkerCommandBuilder.buildCommand(WorkerCommandBuilder.scala:39) ~[spark-core_2.13-3.5.0.jar:3.5.0]
at org.apache.spark.launcher.WorkerCommandBuilder.buildCommand(WorkerCommandBuilder.scala:45) ~[spark-core_2.13-3.5.0.jar:3.5.0]
at org.apache.spark.deploy.worker.CommandUtils$.buildCommandSeq(CommandUtils.scala:63) ~[spark-core_2.13-3.5.0.jar:3.5.0]
at org.apache.spark.deploy.worker.CommandUtils$.buildProcessBuilder(CommandUtils.scala:51) ~[spark-core_2.13-3.5.0.jar:3.5.0]
at org.apache.spark.deploy.worker.ExecutorRunner.org$apache$spark$deploy$worker$ExecutorRunner$$fetchAndRunExecutor(ExecutorRunner.scala:160) [spark-core_2.13-3.5.0.jar:3.5.0]
at org.apache.spark.deploy.worker.ExecutorRunner$$anon$1.run(ExecutorRunner.scala:80) [spark-core_2.13-3.5.0.jar:3.5.0]
Analyzing the Spark source code leads to the following snippet, which causes the error:
(the following is part of the Spark 3.5.0 source code: AbstractCommandBuilder.java, line 227)
String getScalaVersion() {
  String scala = getenv("SPARK_SCALA_VERSION");
  if (scala != null) {
    return scala;
  }
  String sparkHome = getSparkHome();
  File scala213 = new File(sparkHome, "launcher/target/scala-2.13");
  checkState(scala213.isDirectory(), "Cannot find any build directories.");
  return "2.13";
  // ...
}
The intention of this function is to check for the existence of "SPARK_HOME/launcher/target/scala-2.13", and thereby ensure that the deployed Spark was built with the expected Scala version. Unfortunately, this directory only exists in the Spark source project; the binary distribution of Spark doesn't have it.
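To see this for yourself, here is a minimal, self-contained Scala sketch (the object name is mine, not Spark's) that mirrors the launcher's check against your own SPARK_HOME; on a binary distribution it reports that the directory is missing:

  // A sketch that mirrors the launcher's check, so you can see which directory
  // it probes. Run it with SPARK_HOME pointing at the distribution you use.
  import java.io.File

  object CheckLauncherBuildDir {
    def main(args: Array[String]): Unit = {
      val sparkHome = sys.env.getOrElse("SPARK_HOME", sys.error("SPARK_HOME is not set"))
      // The same path getScalaVersion() checks when SPARK_SCALA_VERSION is unset.
      val scala213 = new File(sparkHome, "launcher/target/scala-2.13")
      println(s"${scala213.getPath} exists: ${scala213.isDirectory}")
    }
  }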
Should this function be improved to be compatible with both the source build and the binary distribution?
UPDATE 1: Thanks a lot to Anish for suggesting that the Spark distribution doesn't contain the Scala binaries. But in fact it does: the binary distribution ships the Scala library jars under SPARK_HOME/jars.
This could be more reliable evidence for determining the Scala version, but at the moment it isn't used.
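For illustration only, here is one way the Scala binary version could be derived from the bundled scala-library jar. Spark 3.5.0 does not do this; the object name and regex below are my own assumptions:

  // Illustration: derive the Scala binary version from SPARK_HOME/jars.
  import java.io.File

  object ScalaVersionFromJars {
    def main(args: Array[String]): Unit = {
      val sparkHome = sys.env.getOrElse("SPARK_HOME", sys.error("SPARK_HOME is not set"))
      val jarNames = Option(new File(sparkHome, "jars").listFiles())
        .getOrElse(Array.empty[File])
        .map(_.getName)
      // e.g. "scala-library-2.13.8.jar" yields "2.13"
      val ScalaLibrary = """scala-library-(\d+\.\d+)\.\d+\.jar""".r
      val version = jarNames.collectFirst { case ScalaLibrary(binaryVersion) => binaryVersion }
      println(version.getOrElse("no scala-library jar found"))
    }
  }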

The Spark code at org.apache.spark.launcher.AbstractCommandBuilder#getScalaVersion() comes from commit 2da6d1a and PR 43125, with SPARK-32434 before that. That check is pretty much hard-coded, which means that before launching your Spark application you need to set the SPARK_SCALA_VERSION environment variable to the Scala version you are using. That should bypass the directory check that is failing in getScalaVersion().
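For completeness, here is a minimal sketch of the workaround. My assumption is that exporting the variable in the shell that starts the driver JVM is enough, because in local-cluster mode the Worker runs inside that JVM and the executor launcher reads the process environment; the object name, paths, and launch command are illustrative only:

  // Launch with the variable exported in the parent shell, for example:
  //   SPARK_SCALA_VERSION=2.13 SPARK_HOME=/path/to/spark-3.5.0-bin-hadoop3 sbt run
  import org.apache.spark.sql.SparkSession

  object LocalClusterDemo {
    def main(args: Array[String]): Unit = {
      val spark = SparkSession.builder()
        .appName("local-cluster-demo")
        .master("local-cluster[2,1,1024]") // 2 workers, 1 core and 1024 MB each
        .getOrCreate()
      // Any job will do; this forces tasks to be scheduled on the executors.
      println(spark.sparkContext.parallelize(1 to 10).sum())
      spark.stop()
    }
  }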