This is my Spark configuration for the spark-context in Clojure. I am running Spark 2.2.1 with flambo 0.8.2. Flambo is a Clojure DSL for Apache Spark, and flambo 0.8.x is compatible with Spark 2.x.
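For context, a minimal project.clj for this setup would look roughly like the sketch below. The artifact coordinates are my assumption of the usual Clojars/Maven entries, and flambo's README recommends AOT-compiling namespaces that use its macros, so treat this as a sketch rather than my exact file.
;; Sketch of a minimal project.clj (coordinates assumed, Clojure
;; version assumed; :aot :all per flambo's AOT recommendation)
(defproject some-namespace "0.1.0-SNAPSHOT"
  :dependencies [[org.clojure/clojure "1.8.0"]
                 [yieldbot/flambo "0.8.2"]
                 [org.apache.spark/spark-core_2.11 "2.2.1"]]
  :aot :all)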
(ns some-namespace
  (:require
   [flambo.api :as f]
   [flambo.conf :as conf]))
;; Spark configuration
(def c
  (-> (conf/spark-conf)
      (conf/master "spark://abhi:7077")
      (conf/app-name "My-app")
      ;; Add every jar on the JVM classpath so the executors can
      ;; load the project's classes
      (conf/jars (map #(.getPath %) (.getURLs (java.lang.ClassLoader/getSystemClassLoader))))
      (conf/set "spark.cores.max" "4")
      (conf/set "spark.executor.memory" "2g")
      (conf/set "spark.home" (get (System/getenv) "SPARK_HOME"))))
;; Initializing spark-context
(def sc (f/spark-context c))
;; Returns data in `org.apache.spark.api.java.JavaRDD` form from a file
(def data (f/text-file sc "Path-to-a-csv-file"))
;; Collecting data from RDD
(.collect data)
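For what it is worth, a simpler job over in-memory data can help narrow things down; a sketch using flambo's parallelize, map, and serializable fn is below. If this collects on the cluster but the text-file job does not, the failure is more likely in how the file-backed job's closures are shipped than in cluster connectivity.
;; Minimal smoke test: square a small in-memory collection and collect
(-> (f/parallelize sc [1 2 3 4 5])
    (f/map (f/fn [x] (* x x)))
    (.collect))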
I can run this code with master set to local[*], but when I run it against the Spark cluster I get the following error while collecting the JavaRDD.
18/04/19 05:50:39 ERROR util.Utils: Exception encountered
java.lang.ArrayStoreException: [Z
at scala.runtime.ScalaRunTime$.array_update(ScalaRunTime.scala:90)
at org.apache.spark.util.collection.PrimitiveVector.$plus$eq(PrimitiveVector.scala:43)
at org.apache.spark.util.collection.SizeTrackingVector.$plus$eq(SizeTrackingVector.scala:30)
at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:216)
at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1038)
at org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:1029)
at org.apache.spark.storage.BlockManager.doPut(BlockManager.scala:969)
at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1029)
at org.apache.spark.storage.BlockManager.putIterator(BlockManager.scala:792)
at org.apache.spark.storage.BlockManager.putSingle(BlockManager.scala:1350)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:231)
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1303)
at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:206)
at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:66)
at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:66)
at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:96)
at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:81)
at org.apache.spark.scheduler.Task.run(Task.scala:108)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
I am not sure what is going wrong. I have searched this issue a lot and tried many different things, but I still get the very same error. I have also added all the jars in the Spark configuration, as shown above. Please let me know if you have any thoughts about this.