How can data be fetched from SQL Server in SparkCLR?
Accessing SQL Server data from SparkCLR

You can use the following SparkCLR code as a reference for loading data from SQL Server, Azure SQL Database, or any other JDBC-compliant data source into a Spark DataFrame from C#.
// C# sample to load SQL Server data as a Spark DataFrame using JDBC
using System;
using System.Collections.Generic;
using Microsoft.Spark.CSharp.Core;
using Microsoft.Spark.CSharp.Sql;

var sparkConf = new SparkConf();
var sparkContext = new SparkContext(sparkConf);
var sqlContext = new SqlContext(sparkContext);
// "xyz" is the table to load; the empty dictionary means no extra connection properties
var dataFrame = sqlContext.Read()
    .Jdbc("jdbc:sqlserver://localhost:1433;databaseName=Temp;integratedSecurity=true;", "xyz",
        new Dictionary<string, string>());
dataFrame.ShowSchema();
var rowCount = dataFrame.Count();
Console.WriteLine("Row count is " + rowCount);
A few things to note:
- This sample code uses the Microsoft JDBC driver. If you use a different driver or JDBC data source, you need to update the URL accordingly.
- You need to include the driver JAR file when submitting your SparkCLR job.
The SparkCLR project for this sample is available at https://github.com/Microsoft/SparkCLR/tree/master/examples/JdbcDataFrame
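As a rough sketch of the first note above, this is how the same call might look when connecting with a SQL Server login instead of integrated security. The user name, password placeholder, and class name are hypothetical. The "user", "password", and "driver" keys are the connection properties Spark's JDBC data source normally accepts, and I am assuming SparkCLR passes the dictionary through to it unchanged.

```csharp
using System.Collections.Generic;
using Microsoft.Spark.CSharp.Core;
using Microsoft.Spark.CSharp.Sql;

class JdbcWithSqlLoginSketch
{
    static void Main(string[] args)
    {
        var sparkContext = new SparkContext(new SparkConf());
        var sqlContext = new SqlContext(sparkContext);

        // Same server and database as the sample above, without integratedSecurity
        var url = "jdbc:sqlserver://localhost:1433;databaseName=Temp;";

        // Hypothetical credentials; replace with your own
        var connectionProperties = new Dictionary<string, string>
        {
            { "user", "spark_reader" },
            { "password", "<password>" },
            // Driver class of the Microsoft JDBC driver; change it for another driver
            { "driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver" }
        };

        var dataFrame = sqlContext.Read().Jdbc(url, "xyz", connectionProperties);
        dataFrame.ShowSchema();

        sparkContext.Stop();
    }
}
```

For a different database, the URL prefix and the driver class would both change, and the matching driver JAR still has to be supplied when the job is submitted.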
My recommendation is to use JDBC to connect to SQL Server, then query against the DataFrame.
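A minimal sketch of that approach, assuming the same connection string and table ("xyz") as the first answer: load the table over JDBC, register the DataFrame as a temporary table, and run Spark SQL against it. The temporary table name "xyz_view" and the class name are made up for illustration.

```csharp
using System.Collections.Generic;
using Microsoft.Spark.CSharp.Core;
using Microsoft.Spark.CSharp.Sql;

class QueryJdbcDataFrameSketch
{
    static void Main(string[] args)
    {
        var sparkContext = new SparkContext(new SparkConf());
        var sqlContext = new SqlContext(sparkContext);

        // Load the SQL Server table as a DataFrame over JDBC
        var dataFrame = sqlContext.Read()
            .Jdbc("jdbc:sqlserver://localhost:1433;databaseName=Temp;integratedSecurity=true;",
                "xyz", new Dictionary<string, string>());

        // Expose the DataFrame to Spark SQL under a hypothetical name
        dataFrame.RegisterTempTable("xyz_view");

        // Query the DataFrame with Spark SQL and print the result
        var result = sqlContext.Sql("SELECT COUNT(*) AS row_count FROM xyz_view");
        result.Show();

        sparkContext.Stop();
    }
}
```

Any further filtering or aggregation can then be written as Spark SQL against the temporary table, or with the DataFrame methods directly.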