I'm hitting issues trying to use Spark packages; for example:
java.lang.ClassNotFoundException: Failed to find data source: com.mongodb.spark.sql.DefaultSource
I have listed the files in the lib dir:
!find ~/data/libs/
I can see my jars are installed:
/gpfs/fs01/user/xxxx/data/libs/
/gpfs/fs01/user/xxxx/data/libs/scala-2.11
/gpfs/fs01/user/xxxx/data/libs/scala-2.11/mongo-spark-connector_2.11-2.0.0.jar
/gpfs/fs01/user/xxxx/data/libs/scala-2.11/mongo-java-driver-3.2.2.jar
/gpfs/fs01/user/xxxx/data/libs/pixiedust.jar
/gpfs/fs01/user/xxxx/data/libs/spark-csv_2.11-1.3.0.jar
However, the error suggests that Spark is unable to see the jar.
How can I list the jars available to spark?
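One quick way to answer this, assuming a Spark 2.x notebook where sc is the predefined SparkContext, is to ask the context itself which jars were attached to it, and to dump the driver JVM's classpath:

// Jars explicitly added to the SparkContext (e.g. via --jars or sc.addJar)
sc.listJars().foreach(println)

// The driver JVM's own classpath; a jar that is on disk but absent here
// was never picked up by the JVM
println(System.getProperty("java.class.path"))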
I created a Scala notebook and ran the following code (attribution: https://gist.github.com/jessitron/8376139):
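The snippet is not reproduced above; it was along the lines of the gist, which walks the classloader hierarchy and prints the URL of every jar visible to each URLClassLoader (my reconstruction from that gist, so treat it as a sketch):

// Collect the URLs known to every URLClassLoader in the chain
def urlses(cl: ClassLoader): Array[java.net.URL] = cl match {
  case null => Array()
  case u: java.net.URLClassLoader => u.getURLs() ++ urlses(cl.getParent)
  case _ => urlses(cl.getParent)
}

val urls = urlses(getClass.getClassLoader)
println(urls.mkString("\n"))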
Running this highlighted an issue with the JVM loading the MongoDB driver.
This made me realise that although the jar file was present in the correct location, it was not getting loaded into the JVM.
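A direct way to confirm this (a hypothetical check, not one from the post) is to ask the JVM for the class named in the original error; a ClassNotFoundException here proves the jar is on disk but not on the driver's classpath:

// Throws ClassNotFoundException if the connector jar was never loaded
Class.forName("com.mongodb.spark.sql.DefaultSource")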