Databricks PySpark Connect to Azure SQL with Service Principal

Question

Databricks PySpark Connect to Azure SQL with Service Principal

502 Views Asked by Developer KE At 14 September 2023 at 11:29

We would like to connect azure DB from azure databricks notebook, can you please someone help us to provide samples, I saw many of the answer and story but not clear about for me below two points.

what is the service principle id and secrete, where should I get it and if we have AD access, what should I keep inform to my admin to enable for this.

where we can get it the above two point in azure portal.

below stack overflow reference does not have any answer.

https://stackoverflow.com/collectives/azure/articles/75189853/how-to-connect-from-azure-databricks-to-azure-sql-db-using-service-principal

as per above link, below code available, how to use it with secrete credentials .

Note : my requirement is azure databricks, we have new job cluster, it is not a fulltime job cluster. we have to use when we required, our job cluster will be start and stop.

Original Q&A

There are 1 best solutions below

**DileeprajnarayanThumula** · Answer 1 · 2023-09-15T04:10:52.527000

Thanks to @The data swamp. for the excellent documentation

The below approach will help you connect to Azure SQL database.

Step 1

You will need to install the SQL Spark Connector and the Microsoft Azure Active Directory Authentication Library for Python.

In your databricks cluster install com.microsoft.azure:spark-mssql-connector_2.12_3.0:1.0.0-alpha from Maven and adal from PyPI.

enter image description here

In the key vault, generate secrets that represent the values from the app registration.

The Client Id and Client Secret

Create a secret scope to your Key Vault for your client Id, secret, and tenant Id will be generated.

Learn More how to create Secretscope in databricks

Step 2

Create user (SPN) in the database and then grant permissions on the objects.

CREATE USER [thedataswamp-dbr-dev] FROM EXTERNAL PROVIDER  WITH DEFAULT_SCHEMA=[dbo]
GO
GRANT SELECT ON SCHEMA :: dbo TO [thedataswamp-dbr-dev];

Table:

CREATE TABLE Emperor 
(
    Id INT,
    Emperor NVARCHAR(25)
)

INSERT INTO dbo.Emperor (Id, Emperor)
VALUES (1, 'Augustus'),
       (2, 'Tiberius'),
       (3, 'Caligula'),
       (4, 'Claudius'),
       (4, 'Nero');

Step 3 - connecting to Azure SQL database

Retrieving service principal, secret, and tenant id from Key Vault. The database URL, database, and table

import adal
resource_app_id_url = "https://database.windows.net/"
service_principal_id = dbutils.secrets.get(scope = "defaultScope", key = "DatabricksSpnId")
service_principal_secret = dbutils.secrets.get(scope = "defaultScope", key = "DatabricksSpnSecret")
tenant_id = dbutils.secrets.get(scope = "defaultScope", key = "TenantId")
authority = "https://login.windows.net/" + tenant_id
azure_sql_url = "jdbc:sqlserver://thedataswamp.database.windows.net"
database_name = "dev"
db_table = "dbo.Emperor" 
encrypt = "true"
host_name_in_certificate = "*.database.windows.net"
context = adal.AuthenticationContext(authority)
token = context.acquire_token_with_client_credentials(resource_app_id_url, service_principal_id, service_principal_secret)
access_token = token["accessToken"]
emperorDf = spark.read \
             .format("com.microsoft.sqlserver.jdbc.spark") \
             .option("url", azure_sql_url) \
             .option("dbtable", db_table) \
             .option("databaseName", database_name) \
             .option("accessToken", access_token) \
             .option("encrypt", "true") \
             .option("hostNameInCertificate", "*.database.windows.net") \
             .load()
             display(emperorDf)

enter image description here

This code performs connection to Azure SQL retrieves an authentication token from Active Directory application is registered.
Next it creates a df containing emperors using the configuration and authentication token retrieved
Displays df containing emperors

Databricks PySpark Connect to Azure SQL with Service Principal

There are 1 best solutions below

Related Questions in AZURE

Related Questions in PYSPARK

Related Questions in AZURE-SQL-DATABASE

Related Questions in AZURE-DATABRICKS

Related Questions in AZURE-CLOUD-SERVICES

Trending Questions

Popular # Hahtags

Popular Questions