How to query SQL Server from Spark (Scala)?


Env: Spark 1.6 with Scala, Cloudera
SQL Server 2012, version 11.0

I am trying to query SQL Server from Spark.

import java.util.Properties

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object ConnTest extends App {
  val conf = new SparkConf()
  val sc = new SparkContext(conf.setAppName("Spark Ingestion").setMaster("local[*]"))
  val sqlcontext = new SQLContext(sc)

  val prop = new Properties()
  val url2 = "jdbc:sqlserver://xxx.xxx.xxx:1511;user=username;password=mypassword;database=SessionMonitor"
  prop.setProperty("user", "username")
  prop.setProperty("password", "mypassword")
  val test = sqlcontext.read.jdbc(url2, "Service", prop)

  val dd = sqlcontext.sql("select count(*) as TOT from Service")
  dd.foreach(println)
}

My pom.xml has this dependency:

<!-- https://mvnrepository.com/artifact/com.microsoft.sqlserver/mssql-jdbc -->
        <dependency>
            <groupId>com.microsoft.sqlserver</groupId>
            <artifactId>mssql-jdbc</artifactId>
            <version>6.1.0.jre8</version>
        </dependency>

I did not download any jar file separately, install a jar into the local Maven repository, or add a jar to the classpath. My Hadoop cluster does not have a connection to the internet. After building the Maven package, I tried to submit using

spark-submit --class ConnTest /Hadoopshare/tmp/sqldb-1.0-SNAPSHOT.jar

Error:

Exception in thread "main" java.sql.SQLException: No suitable driver

Best Answer

The JDBC driver class should be set explicitly in your code. For the mssql-jdbc driver you are using, that is:

    prop.setProperty("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")

In my case (MySQL), I used this and it worked fine:

val jdbcDF = sqlContext.read
      .format("jdbc")
      .option("driver" , "com.mysql.jdbc.Driver")
      .option("url", "jdbc:mysql://<<>Servername>:3306/<<DatabaseName>>")
      .option("dbtable", "(SELECT id, name FROM partner) tmp")
      .option("user", "username")
      .option("password", "******")
      .load()
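Also note that the sqlcontext.sql("select count(*) as TOT from Service") call in the question can only find the table if the loaded DataFrame is registered first. A minimal sketch using the Spark 1.6 API (assuming test is the DataFrame returned by the read.jdbc call above):

    // Register the DataFrame so it is visible to SQL queries by name.
    test.registerTempTable("Service")
    val dd = sqlcontext.sql("select count(*) as TOT from Service")
    dd.show()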

Hope this works.