I'm writing a service that runs inside a long-running Spark application launched by a single spark-submit. The service won't know which jars need to go on the classpaths at the time of that initial spark-submit, so I can't include them using `--jars`. The service then listens for requests that can include extra jars, which I want to load onto my Spark nodes so work can be done using those jars.
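For context, here's a minimal sketch of the flow I have in mind; the handler, UDF, and table names are all placeholders for request-specific logic:

```java
import org.apache.spark.sql.SparkSession;

public class JarRequestService {
    // Hypothetical handler running inside the long-lived driver process;
    // jarPath only becomes known at runtime, after the original spark-submit.
    static void handleRequest(SparkSession spark, String jarPath) {
        // The open question: how to make jarPath visible to the driver
        // and executor classloaders of the already-running application.

        // Work that depends on classes/UDFs from the jar; "my_udf" and
        // "some_table" stand in for whatever the request needs.
        spark.sql("SELECT my_udf(value) FROM some_table").show();
    }
}
```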
My goal is to call spark-submit only once, at the very beginning, to launch my service. After that, I'm trying to add jars from requests to the Spark session by creating a new `SparkConf` and building a new `SparkSession` out of it, something like:
```java
import org.apache.spark.SparkConf;
import org.apache.spark.sql.SparkSession;

SparkConf conf = new SparkConf();
conf.set("spark.driver.extraClassPath", "someClassPath");
conf.set("spark.executor.extraClassPath", "someClassPath");
SparkSession spark = SparkSession.builder().config(conf).getOrCreate();
```
I tried this approach, but the jars don't appear to be getting loaded onto the executor classpaths: my jobs don't recognize the UDFs from the jars. I'm running this in Spark client mode right now.
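For reference, the way I register and invoke a UDF from one of these jars looks roughly like the snippet below; the class and UDF names are placeholders. The registration step is where the class from the jar isn't found:

```java
import org.apache.spark.sql.types.DataTypes;

// "com.example.MyUdf" stands in for a UDF class shipped in the request jar.
spark.udf().registerJava("my_udf", "com.example.MyUdf", DataTypes.StringType);

// Fails because the executors (and driver) can't see the class from the jar.
spark.sql("SELECT my_udf(value) FROM some_table").show();
```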
- Is there a way to add these jars AFTER a spark-submit has been called, i.e. just update the existing Spark application, or is it only possible with another spark-submit that includes these jars using `--jars`? (See the sketch after this list for the kind of runtime hook I mean.)
- Would using cluster mode vs. client mode matter in this kind of situation?
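Regarding the first question, the closest runtime hook I've come across is `SparkContext#addJar`, but I'm not sure it solves the classloading problem for UDFs; a minimal sketch of what I mean (the jar path is a placeholder):

```java
// addJar ships the jar to executors for subsequent tasks, but as far as
// I understand it does not alter the driver's classpath, so I'm unsure
// whether UDF classes become resolvable this way.
SparkSession spark = SparkSession.builder().getOrCreate();
spark.sparkContext().addJar("/path/to/request-provided.jar");
```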