Connecting SparkR with Redshift: Failed to find data source: com.databricks.spark.redshift

I have a Spark cluster set up on Amazon EMR with RStudio installed on top of it. I am trying to connect SparkR to Redshift using the package spark-redshift_2.11-0.5.0.jar, but I get the error: Failed to find data source: com.databricks.spark.redshift

I have placed spark-redshift_2.11-0.5.0.jar in /usr/lib/spark/jars, where all the other Spark jar files are located. I am using the code snippet from the "Reading data using R:" section of the GitHub repo https://github.com/databricks/spark-redshift:

.libPaths(c(.libPaths(), '/usr/lib/spark/R/lib'))
Sys.setenv(SPARK_HOME = '/usr/lib/spark') 
library(SparkR, lib.loc = c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib")))
sc <- sparkR.init(master = "local[*]", sparkEnvir = list(spark.driver.memory="50g"))
sqlContext <- sparkRSQL.init(sc) 
sc <- sparkR.init(master = "local[*]", sparkEnvir = list(spark.driver.memory="5g",spark.driver.library.path="/usr/lib/spark/jars"))
sc <- sparkR.init(sparkPackages="com.databricks:spark-redshift_2.11:0.5.0")
df <- read.df(NULL,"com.databricks.spark.redshift",tempdir = "s3n://location",dbtable = "schemaname.tablename",url ="redshift://hostname:5439/dbname?user=user&password=pwd")

I expected the code to pull the data from Redshift and hold it in a DataFrame, but I get the error below instead:

Caused by: java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.redshift. Please find packages at http://spark.apache.org/third-party-projects.html
    at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:657)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:194)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:167)
    ... 36 more
Caused by: java.lang.ClassNotFoundException: com.databricks.spark.redshift.DefaultSource
    at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
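
Since the trace ends in a ClassNotFoundException for com.databricks.spark.redshift.DefaultSource, one thing that can be checked from R is whether the jar file I copied actually contains that class. The following is only a minimal sketch in base R, assuming the jar path mentioned above; jar_has_class is a hypothetical helper written for this check, not part of SparkR or spark-redshift.

```r
# Hypothetical helper (not part of SparkR): TRUE if the jar at `jar_path`
# contains a .class entry for the fully qualified `class_name`.
jar_has_class <- function(jar_path, class_name) {
  # A missing file cannot contain the class.
  if (!file.exists(jar_path)) return(FALSE)
  # Convert "a.b.C" into the archive entry name "a/b/C.class".
  entry <- paste0(gsub(".", "/", class_name, fixed = TRUE), ".class")
  # A jar is a zip archive, so list its entries without extracting them.
  entry %in% utils::unzip(jar_path, list = TRUE)$Name
}

# Assumed location, taken from the description above.
jar <- "/usr/lib/spark/jars/spark-redshift_2.11-0.5.0.jar"
jar_has_class(jar, "com.databricks.spark.redshift.DefaultSource")
```

A TRUE result would only show that the class is present in the file on disk. It would not show that the JVM started by SparkR has the jar on its classpath.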