How can I read data from DataStax Cassandra in Spark 2.0?
This is what I tried:
val df = spark.read
  .format("org.apache.spark.sql.cassandra")
  .options(Map(
    "keyspace" -> "my_keyspace",
    "table" -> "my_table",
    "spark.cassandra.connection.config.cloud.path" -> "file:///home/training/secure-connect-My_path.zip",
    "spark.cassandra.auth.password" -> "password",
    "spark.cassandra.auth.username" -> "Username"
  ))
  .load()
I'm getting this error:
Exception in thread "main" java.lang.ClassNotFoundException: Failed to find data source: org.apache.spark.sql.cassandra. Please find packages at http://spark.apache.org/third-party-projects.html
Since I'm using the DataStax zip file, why would I need to install Cassandra or take any additional steps?
Using the same zip file, I can read the data from a Java program. Why am I unable to read it into Spark?
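You're on the right track. The ClassNotFoundException means the Spark Cassandra Connector itself is not on Spark's classpath; the zip file only carries the connection details and certificates, not the connector. If you were connecting from a Spark shell, you would pull in the connector with --packages, ship the zip with --files, and pass the cloud path and credentials with --conf options.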
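As a rough sketch of such a launch command, reusing the zip path and credentials from the question: the connector coordinates below are my assumption, so pick the build that matches your Spark and Scala versions. As far as I know, support for the secure connect zip arrived in connector 2.5.0, so a very old Spark install may need upgrading first.

```shell
# Assumed coordinates: replace with the connector build matching your Spark/Scala versions.
# --files ships the zip with the job, so cloud.path can refer to it by file name only.
spark-shell \
  --packages com.datastax.spark:spark-cassandra-connector_2.12:3.0.0 \
  --files /home/training/secure-connect-My_path.zip \
  --conf spark.cassandra.connection.config.cloud.path=secure-connect-My_path.zip \
  --conf spark.cassandra.auth.username=Username \
  --conf spark.cassandra.auth.password=password
```

The same --packages, --files and --conf options should also work with spark-submit if you are running a packaged application rather than the shell.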
With the connection settings supplied at launch, your read code only needs the keyspace and table.
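A minimal sketch of the read, assuming the connector and connection settings were supplied when the shell was launched, and reusing the keyspace and table names from the question. Here `spark` is the session that spark-shell creates for you, and `cassandraFormat` is a helper the connector adds to the reader through its implicits.

```scala
import org.apache.spark.sql.cassandra._

// `spark` is the SparkSession provided by spark-shell.
// cassandraFormat takes the table name first, then the keyspace.
val df = spark.read
  .cassandraFormat("my_table", "my_keyspace")
  .load()

// Print a few rows to confirm the connection works.
df.show(5)
```

The longer form from the question, format("org.apache.spark.sql.cassandra") with "keyspace" and "table" options, should behave the same once the connector is on the classpath.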
For details, see the Spark Cassandra Connector documentation on connecting to Astra. There's also this blog post from Alex Ott, "Advanced Apache Cassandra Analytics Now Open For All". Cheers!