Azure Databricks API, cannot add repos using service principal and API calls

I need to add Azure DevOps repos to Azure Databricks Repos using the Databricks API. I am using service principal credentials for this. The service principal has already been added to Databricks as an admin user. With it I can list repos and even delete them, but when I try to add a repo to a folder, the API returns the following error:

{
    "error_code": "PERMISSION_DENIED",
    "message": "Missing Git provider credentials. Go to User Settings > Git Integration to add your personal access token."
}

I am not using my own credentials with a PAT. Instead, I get a bearer token by sending a request to https://login.microsoftonline.com/directory-id/oauth2/token and use it to authenticate. This works for listing repos (GET /repos), deleting repos, and getting a single repo (GET /repos/repo-id). It fails only when creating a repo (POST to /repos).
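
For reference, here is a minimal Python sketch of that token request, assuming the client-credentials grant against the endpoint quoted above. The function name and its parameters are hypothetical, and the default resource (the Azure Databricks application ID) is an assumption about what was requested. The request is only built here, not sent.

```python
import urllib.parse
import urllib.request

# Assumption: the token is requested for the Azure Databricks resource,
# identified by its application ID.
DATABRICKS_RESOURCE = "2ff814a6-3304-4ab8-85cb-cd0e6f879c1d"


def build_token_request(tenant_id, client_id, client_secret,
                        resource=DATABRICKS_RESOURCE):
    """Build (without sending) a client-credentials token request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/token"
    form = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "resource": resource,
    })
    # urllib sends bytes data as application/x-www-form-urlencoded by default.
    return urllib.request.Request(url, data=form.encode("ascii"), method="POST")
```

Sending the request with urllib.request.urlopen and reading the access_token field of the JSON response gives the value that goes after "Bearer" in the Authorization header.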

If I use a PAT instead of the bearer token, I get the following error:

{
    "error_code": "PERMISSION_DENIED",
    "message": "Azure Active Directory credentials missing. Ensure you are either logged in with your Azure Active Directory account or have setup an Azure DevOps personal access token (PAT) in User Settings > Git Integration. If you are not using a PAT and are using Azure DevOps with the Repos API, you must use an AAD access token. See https://learn.microsoft.com/en-us/azure/databricks/dev-tools/api/latest/aad/app-aad-token for steps to acquire an AAD access token."
}

I am using Postman to construct the requests. The following request reproduces the error:

method: post

url-endpoint: https://adb-databricksid.azuredatabricks.net/api/2.0/repos

body:

url: azure-devops-repo
provider: azureDevOpsServices
path: /Repos/folder-name/testrepo

header:

Authorization: Bearer eyJ0eXAiOiJKV1QiLCJhbG... (constructed by appending the bearer token to the keyword "Bearer")
X-Databricks-Azure-SP-Management-Token: management token (obtained the same way as the bearer token, using the resource https://management.core.windows.net/)
X-Databricks-Azure-Workspace-Resource-Id: /subscriptions/azure-subscription-id/resourceGroups/resourcegroup-name/providers/Microsoft.Databricks/workspaces/workspace-name
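
The same request can be written out in Python, which makes it easier to see exactly what is sent. This is a sketch of the request as described above, not a fix: the function name and parameters are hypothetical, and the body is assumed to be sent as JSON. It builds the request without sending it.

```python
import json
import urllib.request


def build_create_repo_request(workspace_url, bearer_token, management_token,
                              workspace_resource_id, repo_url, repo_path):
    """Build (without sending) the POST /api/2.0/repos request described above."""
    body = {
        "url": repo_url,
        "provider": "azureDevOpsServices",
        "path": repo_path,
    }
    headers = {
        "Authorization": f"Bearer {bearer_token}",
        "X-Databricks-Azure-SP-Management-Token": management_token,
        "X-Databricks-Azure-Workspace-Resource-Id": workspace_resource_id,
        "Content-Type": "application/json",
    }
    return urllib.request.Request(
        f"{workspace_url.rstrip('/')}/api/2.0/repos",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

Note that nothing in this request carries credentials for Azure DevOps itself; all three headers authenticate the caller to Azure and to Databricks only.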

Please note that I have used exactly the same authentication method to create clusters and jobs and to delete repos. It fails only for adding and updating repos. How can I resolve the PERMISSION_DENIED error shown above?

There are 3 best solutions below

11
On BEST ANSWER

Have you set up the Git credentials using this endpoint before creating the repo through the API?

https://docs.databricks.com/dev-tools/api/latest/gitcredentials.html#section/Authentication

If you do not set this up first, you can get this error when trying to create a repo.

Listing and deleting a repo only require valid authentication to Databricks (bearer token or PAT); they do not require valid Git credentials. Creating a repo also requires authorization on the target repository, which is on Azure DevOps in your case.

So you need to call the git-credentials endpoint (the syntax is the same on AWS and Azure) to create the credentials.

Once your Git credentials are up to date, creating the repo should work as intended.
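
A minimal Python sketch of the git-credentials call, in case it helps. The function name and parameters are hypothetical; the body fields (git_provider, git_username, personal_access_token) are the ones the Git Credentials API accepts, and the provider value is assumed to match the one used in the question. The request is built but not sent.

```python
import json
import urllib.request


def build_git_credential_request(workspace_url, databricks_token,
                                 git_username, git_token,
                                 git_provider="azureDevOpsServices"):
    """Build (without sending) a POST /api/2.0/git-credentials request."""
    body = {
        "git_provider": git_provider,
        "git_username": git_username,
        # Credential for the Git provider (Azure DevOps here).
        "personal_access_token": git_token,
    }
    return urllib.request.Request(
        f"{workspace_url.rstrip('/')}/api/2.0/git-credentials",
        data=json.dumps(body).encode("utf-8"),
        headers={
            # Authenticates the caller to Databricks, not to the Git provider.
            "Authorization": f"Bearer {databricks_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The two tokens play different roles: the one in the Authorization header identifies the caller to Databricks, while the one in the body is what Databricks later presents to Azure DevOps when cloning.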

2
On

I followed the steps below, and it works for me.

  1. Get an access token for the service principal with the Databricks scope, 2ff814a6-3304-4ab8-85cb-cd0e6f879c1d/.default. This token is used to authorize all API calls.
  2. Get an access token for the service principal with the DevOps scope, 499b84ac-1321-427f-aa17-267ca6975798/.default. This token is used only to create the Git credentials entry.
  3. Azure Databricks REST API, POST: create a Git credentials entry. Use the first token for authorization and the second token as personal_access_token in the body. https://docs.databricks.com/api/workspace/gitcredentials/create
  4. Create a repo: https://docs.databricks.com/api/workspace/repos/create The repo should then be created for the service principal in Databricks, with the default branch checked out.
  5. To switch a repo to a different branch, use the update-a-repo API call: https://docs.databricks.com/api/workspace/repos/update
5
On

To make a service principal work with Databricks Repos, you need the following:

  • Create an Azure DevOps personal access token (PAT) for it. Azure DevOps Git repositories don't support service principal authentication via AAD tokens (see documentation). (The service connection that you configured for the SP is used to connect to other Azure services, not to DevOps itself.)

  • Put that PAT into the Databricks workspace using the Git Credentials API. This must be done when configuring for the first time and again whenever the token expires. When calling this API, authenticate with the AAD token of the service principal. (By the way, this can be done via Terraform as well.)

  • After that, you can use the Databricks Repos API or databricks-cli to perform operations on repos: create, update, or delete them. (See the previous answer on updating the repo.)
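
Because the PAT has to be stored once and then replaced when it expires, a small helper that chooses between creating and updating the credentials entry can be useful. The sketch below is a pure function with a hypothetical name; it assumes the list endpoint (GET /api/2.0/git-credentials) returns a "credentials" array whose entries carry credential_id and git_provider, and that updates go to PATCH /api/2.0/git-credentials/{credential_id}.

```python
def plan_credential_call(list_response, git_provider, git_username, pat):
    """Decide between create and update for the Git Credentials API.

    `list_response` is the parsed JSON from GET /api/2.0/git-credentials.
    Returns (method, path, body) for the call that stores `pat`.
    """
    body = {
        "git_provider": git_provider,
        "git_username": git_username,
        "personal_access_token": pat,
    }
    for entry in list_response.get("credentials", []):
        if entry.get("git_provider", "").lower() == git_provider.lower():
            # An entry exists (for example with an expired PAT): replace it.
            path = f"/api/2.0/git-credentials/{entry['credential_id']}"
            return "PATCH", path, body
    # No entry for this provider yet: first-time configuration.
    return "POST", "/api/2.0/git-credentials", body
```

The returned method, path, and body would then be sent to the workspace with the service principal's AAD token in the Authorization header.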