Databricks with Python 3 for Azure SQL Database


I am trying to use Azure Databricks in order to:

1- Insert rows into a table of an Azure SQL Database with Python 3. I cannot find documentation about inserting rows. (I have used this link to connect to the database: Doc, and it is working.)

2- Save a CSV file in my Data Lake

3- Create a table from a DataFrame, if possible

Thanks for your help, and sorry for my novice questions.

**1- Insert rows into a table of an Azure SQL Database with Python 3**

Azure Databricks comes with the SQL Server JDBC driver installed. We can use the JDBC driver to write a DataFrame to Azure SQL Database. For more details, please refer to here.

For example:

jdbcHostname = "<hostname>.database.windows.net"
jdbcPort = 1433
jdbcDatabase = "<database>"
jdbcUsername = "<username>"
jdbcPassword = "<password>"
jdbcUrl = "jdbc:sqlserver://{0}:{1};database={2}".format(jdbcHostname, jdbcPort, jdbcDatabase)
connectionProperties = {
  "user": jdbcUsername,
  "password": jdbcPassword,
  "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver"
}

# write: create a small DataFrame and save it to the "users" table
# (mode="overwrite" drops and recreates the table if it already exists)
df = spark.createDataFrame([(1, "test1"), (2, "test2")], ["id", "name"])
df.write.jdbc(url=jdbcUrl, table="users", mode="overwrite", properties=connectionProperties)

# check: read the table back and display it
df1 = spark.read.jdbc(url=jdbcUrl, table="users", properties=connectionProperties)
display(df1)
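
Note that mode="overwrite" replaces the whole table. To actually insert rows into an existing table, write with append mode instead, for example:

# append: insert additional rows without touching the existing ones
df2 = spark.createDataFrame([(3, "test3")], ["id", "name"])
df2.write.jdbc(url=jdbcUrl, table="users", mode="append", properties=connectionProperties)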


**2- Create a table from a DataFrame**

If you want to create a Databricks table from a DataFrame, you can use the method registerTempTable or saveAsTable.

registerTempTable creates a temporary view that is scoped to the Spark session in which it was created. No data is written to storage; the data is only materialized in memory (in Spark's optimized columnar format) if you explicitly cache the table.

saveAsTable creates a permanent, physical table stored by default as Parquet files in the workspace's root storage (DBFS, backed by Azure storage rather than S3 on Azure Databricks). This table is accessible to all clusters in the workspace. The table metadata, including the location of the file(s), is stored within the Hive metastore.
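
A minimal sketch of both approaches, reusing the df DataFrame from the example above (the names users_view and users_table are placeholders):

# temporary view: visible only in this Spark session; nothing is written to storage
# (createOrReplaceTempView is the current name; registerTempTable is its deprecated alias)
df.createOrReplaceTempView("users_view")
spark.sql("SELECT * FROM users_view").show()

# permanent table: data is written as Parquet and registered in the Hive metastore,
# so other clusters in the workspace can query it as well
df.write.saveAsTable("users_table")
spark.sql("SELECT * FROM users_table").show()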

For more details, please refer to here and here.