I have spark-master and spark-worker running in an SAP Kyma environment (a managed flavour of Kubernetes), along with JupyterLab, with ample CPU and RAM allocated.
I can access the Spark Master UI and can see that the workers are registered as well (screenshot below).
I am using Python 3 to submit the job (snippet below):
import pyspark

# Point the driver at the standalone master running in the cluster
conf = pyspark.SparkConf()
conf.setMaster('spark://spark-master:7077')
sc = pyspark.SparkContext(conf=conf)
sc
The Spark context is shown as the output of sc. After this, I prepare the data and submit the job to the spark-master (snippet below):
words = 'the quick brown fox jumps over the lazy dog the quick brown fox jumps over the lazy dog'
seq = words.split()

# Distribute the words across the cluster and count occurrences of each
data = sc.parallelize(seq)
counts = data.map(lambda word: (word, 1)).reduceByKey(lambda a, b: a + b).collect()
dict(counts)

sc.stop()
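For reference, the same word count in plain Python (no cluster needed), which shows the output I expect from the Spark job:

```python
from collections import Counter

# Same word count as the Spark job, computed locally as a sanity check
words = 'the quick brown fox jumps over the lazy dog the quick brown fox jumps over the lazy dog'
counts = dict(Counter(words.split()))
print(counts)
# → {'the': 4, 'quick': 2, 'brown': 2, 'fox': 2, 'jumps': 2, 'over': 2, 'lazy': 2, 'dog': 2}
```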
However, it starts logging warning messages in the notebook (snippet below) and runs forever until I kill the job from the Spark Master UI:
22/01/27 19:42:39 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
22/01/27 19:42:54 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
I am new to Kyma (Kubernetes) and Spark. Any help would be much appreciated.
Thanks
For those who stumble upon the same question:
Check your infrastructure certificate. It turned out that Kubernetes was issuing the wrong internal certificate, which was not recognised by the pods.
After we fixed the certificate, everything started working.
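If you hit the same symptom, one way to check the certificate situation is to verify, from inside any pod, that the API server's certificate validates against the cluster CA mounted into the pod. The paths below are the standard in-cluster service-account mounts; this is a diagnostic sketch, not a fix:

```shell
# From inside a pod: fetch the API server certificate, verify it against the
# mounted cluster CA, and print its subject, issuer and validity window
openssl s_client -connect kubernetes.default.svc:443 \
  -CAfile /var/run/secrets/kubernetes.io/serviceaccount/ca.crt \
  </dev/null 2>/dev/null \
  | openssl x509 -noout -subject -issuer -dates
```

If verification fails or the issuer looks wrong, that points at the same kind of internal-certificate problem we had.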