I have a spark cluster which master is on 192.168.0.60:7077
I used to use jupyter notebook to make some pyspark scripts. I am now willing to move on to scala. I don't know scala's world. I am trying to use Apache Toree. I installed it, downloaded the scala kernels, and runned it to the point to open a scala notebook . Till there everything seems ok :-/
But I can't find the spark context, and there are errors in the jupyter's server logs :
[I 16:20:35.953 NotebookApp] Kernel started: afb8cb27-c0a2-425c-b8b1-3874329eb6a6
Starting Spark Kernel with SPARK_HOME=/Users/romain/spark
Error: Master must start with yarn, spark, mesos, or local
Run with --help for usage help or --verbose for debug output
[I 16:20:38.956 NotebookApp] KernelRestarter: restarting kernel (1/5)
As I don't know scala, I am not sure of the issue here ? It could be :
- I need a spark kernel (according to https://github.com/ibm-et/spark-kernel/wiki/Getting-Started-with-the-Spark-Kernel )
I need to add an option on the server (the error message says 'Master must start with yarn, spark, mesos, or local' )
or something else :-/
I was just willing to migrate from python to scala, and I spend a few hours lost just on starting up the jupyter IDE :-/
It looks like you are using Spark in a standalone deploy mode. As Tzach suggested in his comment, following should work:
SPARK_OPTS
expects usualspark-submit
parameter list.If that does not help, you would need to check the
SPARK_MASTER_PORT
value inconf/spark-env.sh
(7077 is the default).