I'm trying to set a custom logging configuration. If I copy the log4j.properties file onto the cluster and reference it in my spark-submit, the configuration takes effect. But if I try to pass the file using --files s3://... then it doesn't work.
Works (assuming I placed the file in the home dir):
spark-submit \
--master yarn \
--conf spark.driver.extraJavaOptions=-Dlog4j.configuration=file:log4j.properties \
--conf spark.executor.extraJavaOptions=-Dlog4j.configuration=file:log4j.properties \
Doesn't work:
spark-submit \
--master yarn \
--files s3://my_path/log4j.properties \
--conf spark.driver.extraJavaOptions=-Dlog4j.configuration=log4j.properties \
--conf spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.properties \
How can I use a config file stored in S3 to set the logging configuration?
You can't, at least not directly: Log4J always loads its configuration files from the local filesystem.
You can, however, use a config packaged inside a JAR, and since Spark will download JARs along with your job, you should be able to get at it indirectly. Create a JAR containing only the log4j.properties file and tell Spark to load it with the job.
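A minimal sketch of that approach, not a tested recipe. It assumes the JDK jar tool and the AWS CLI are installed, that your cluster can resolve s3:// paths passed to --jars (as is typically the case on EMR), and that the JAR lands on the driver and executor classpaths before Log4J initializes. The names log4j-config.jar and my_app.jar are hypothetical; s3://my_path is the path from the question.

```shell
# Package only the properties file, at the root of the JAR,
# so Log4J can look it up as a classpath resource.
jar cf log4j-config.jar log4j.properties

# Upload the JAR to S3 (bucket path taken from the question).
aws s3 cp log4j-config.jar s3://my_path/log4j-config.jar

# Ship the JAR with the job. The bare name (no file: prefix) is meant
# to be resolved from the classpath rather than the local filesystem.
# my_app.jar is a placeholder for your application JAR.
spark-submit \
--master yarn \
--jars s3://my_path/log4j-config.jar \
--conf spark.driver.extraJavaOptions=-Dlog4j.configuration=log4j.properties \
--conf spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.properties \
my_app.jar
```

If the driver still comes up with the default logging configuration, the JAR may be reaching the classpath too late for your deploy mode, so check that before ruling the approach out.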