Unable to set environment variables in Spark using livy and sparkmagic


Scenario:

I have set up a Spark cluster in my Kubernetes environment:

  • Livy Pod for submission of jobs
  • Spark Master Pod
  • Spark Worker Pod for execution

What I want to achieve is as follows: I have a Jupyter notebook with a PySpark kernel running as a pod in the same environment. When a cell is executed, a Spark session is created and all my code is executed through Livy's POST /statements request. I was able to get this scenario working.

Note: There is no YARN, HDFS, or Hadoop in my environment. I am using only Kubernetes, Spark standalone, and Jupyter.

Issue: Now, when I run my PySpark code and it executes on the Spark worker, I would like to send the following to that execution environment:

  1. environment variables which I have used in the notebook
  2. pip packages which I have used in the notebook
  3. or a custom virtualenv in which I could provide all the packages used together

I am unable to do any of this.

Things I have tried so far: Since I am using sparkmagic, I have tried to set environment variables in the following ways, which I found in the documentation and other answers.

%%configure
{
  "conf": {
    "spark.executorEnv.TESTVAR": "<value>",
    "spark.appMasterEnv.TESTVAR": "<value>",
    "spark.driver.TESTVAR": "<value>",
    "spark.driverenv.TESTVAR": "<value>",
    "spark.kubernetes.driverenv.TESTVAR": "<value>",
    "spark.kubernetes.driver.TESTVAR": "<value>",
    "spark.yarn.executorEnv.TESTVAR": "<value>",
    "spark.yarn.appMasterEnv.TESTVAR": "<value>",
    "spark.workerenv.TESTVAR": "<value>"
  }
}

I have bunched these together for reference; I tried each of the above options individually.
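
For anyone trying to reproduce this, here is a rough sketch of how one might check whether a variable actually reaches the executor processes rather than only the driver. It assumes `sc` is the SparkContext that the sparkmagic/Livy session exposes; the helper names `read_env_on_worker` and `executor_env_values` are made up for illustration.

```python
import os


def read_env_on_worker(name):
    """Return a function that looks up `name` in the environment of
    whichever process ends up running it (driver or executor)."""
    def _lookup(_):
        return os.environ.get(name)
    return _lookup


def executor_env_values(sc, name, partitions=2):
    """Run the lookup inside Spark tasks and collect what each task saw.

    `sc` is assumed to be the SparkContext provided by the Livy session.
    A list of None values means the variable did not reach the executors.
    """
    rdd = sc.parallelize(range(partitions), partitions)
    return rdd.map(read_env_on_worker(name)).collect()


# In a notebook cell one would call, for example:
#   executor_env_values(sc, "TESTVAR")
```

Comparing that result with `os.environ.get("TESTVAR")` evaluated directly in a cell (which runs on the driver) should show on which side the variable is missing.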

I have also tried sending a plain POST request directly to the Livy pod's service name, but still no luck: the variables are not detected.
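
To make the direct request concrete, this is roughly the shape of a POST /sessions call that passes Spark properties through the `conf` field. It is only a sketch: the URL `http://livy:8998` is a hypothetical service name with Livy's default port, and `build_session_payload` and `create_session` are illustrative helpers, not part of any library.

```python
import json
import urllib.request

# Assumption: hypothetical in-cluster service name; 8998 is Livy's default port.
LIVY_URL = "http://livy:8998"


def build_session_payload(env, kind="pyspark"):
    """Build the JSON body for POST /sessions, mapping each variable
    to the Spark property spark.executorEnv.<NAME>."""
    conf = {f"spark.executorEnv.{name}": str(value) for name, value in env.items()}
    return {"kind": kind, "conf": conf}


def create_session(payload, base_url=LIVY_URL):
    """Send the payload to Livy and return the decoded JSON response."""
    request = urllib.request.Request(
        f"{base_url}/sessions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


payload = build_session_payload({"TESTVAR": "hello"})
print(json.dumps(payload, indent=2))
# create_session(payload)  # only works from a pod that can reach the Livy service
```

Of the keys listed earlier, `spark.executorEnv.[Name]` is the one Spark documents for executor environment variables in general. The `spark.yarn.*` and `spark.kubernetes.*` keys apply only when YARN or Kubernetes is the cluster manager, so they would not be expected to do anything on a standalone master.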

After this, I tried setting the same values manually in spark-defaults.conf on the Spark cluster, but that did not work either. I would appreciate any help on the matter. Also, this is my first SO question, so please let me know if there are any issues.
