Env: Spark 1.6 with Scala, Cloudera
SQL Server 2012, version 11.0
I am trying to query SQL Server from Spark.
import java.util.Properties

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object ConnTest extends App {
  val conf = new SparkConf()
  val sc = new SparkContext(conf.setAppName("Spark Ingestion").setMaster("local[*]"))
  val sqlcontext = new SQLContext(sc)

  // JDBC connection details for the SQL Server instance
  val prop = new Properties()
  val url2 = "jdbc:sqlserver://xxx.xxx.xxx:1511;user=username;password=mypassword;database=SessionMonitor"
  prop.setProperty("user", "username")
  prop.setProperty("password", "mypassword")

  // Read the Service table over JDBC, then run a count
  val test = sqlcontext.read.jdbc(url2, "Service", prop)
  val dd = sqlcontext.sql("select count(*) as TOT from Service")
  dd.foreach(println)
}
My pom.xml has this dependency:
<!-- https://mvnrepository.com/artifact/com.microsoft.sqlserver/mssql-jdbc -->
<dependency>
    <groupId>com.microsoft.sqlserver</groupId>
    <artifactId>mssql-jdbc</artifactId>
    <version>6.1.0.jre8</version>
</dependency>
I did not download any jar file, install a jar into the local Maven repository, or add a jar to the classpath; my Hadoop cluster has no connection to the internet. After creating the Maven package, I tried to submit using
spark-submit --class ConnTest /Hadoopshare/tmp/sqldb-1.0-SNAPSHOT.jar
Error:
Exception in thread "main" java.sql.SQLException: No suitable driver
This should be added to your code: the connection Properties only carry user and password, so Spark has no driver class to load for the sqlserver URL.
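A minimal sketch of that addition, assuming the Microsoft driver class that ships with the mssql-jdbc artifact in your pom:

// Tell Spark's JDBC reader which driver class to load for this URL.
// Assumes the mssql-jdbc jar (which contains this class) is on the classpath at runtime.
prop.setProperty("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")

On its own this only names the class; the jar that contains it still has to be visible to the Spark driver and executors when the job runs.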
In my case I used this and it worked fine:
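Since the cluster has no internet access and a plain mvn package does not bundle dependencies into sqldb-1.0-SNAPSHOT.jar, the driver jar presumably has to be handed to spark-submit explicitly. A sketch under that assumption (the path to mssql-jdbc-6.1.0.jre8.jar is hypothetical; point it at wherever you copy the driver jar on the cluster):

# hypothetical location of the driver jar copied onto the cluster
spark-submit --class ConnTest \
  --driver-class-path /Hadoopshare/tmp/mssql-jdbc-6.1.0.jre8.jar \
  --jars /Hadoopshare/tmp/mssql-jdbc-6.1.0.jre8.jar \
  /Hadoopshare/tmp/sqldb-1.0-SNAPSHOT.jar

--driver-class-path puts the jar on the Spark driver's classpath and --jars distributes it to the executors, so DriverManager can find a suitable driver on both sides.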
Hope this works.