Use Azure Python Function and Managed Identity to Download from Storage Account


I've created an Azure Function called "transformerfunction", written in Python, which should upload and download data to and from an Azure Data Lake / Storage account. I've also turned on the system-assigned managed identity and assigned the Function the "Storage Blob Data Contributor" role on my storage account:

[Screenshot: role assignment on the storage account]

To authenticate and download a file, I use the following part of my code, which basically follows the docs:

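from azure.identity import ChainedTokenCredential, ManagedIdentityCredential
from azure.storage.filedatalake import DataLakeServiceClient

# account_url, file_system_container and file_name are defined elsewhere
managed_identity = ManagedIdentityCredential()
credential_chain = ChainedTokenCredential(managed_identity)
client = DataLakeServiceClient(account_url, credential=credential_chain)

file_client = client.get_file_client(file_system_container, file_name)
downloaded_file = file_client.download_file()
# f is a writable binary file object opened elsewhere
downloaded_file.readinto(f)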

If my understanding is correct, Azure should use the identity of the Function for authentication, and since this identity has Storage Blob Data Contributor permissions on the storage account, the download should work.
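One way to check that assumption is to look at the claims inside the access token the credential returns. The sketch below is only an illustration: it decodes the payload of a JWT without verifying it, and it runs against a dummy token with made-up claims so that it works outside Azure. The helper names `decode_jwt_claims` and `_b64` are hypothetical. Inside the Function, the real token would presumably come from `ManagedIdentityCredential().get_token("https://storage.azure.com/.default").token`.

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Decode the payload segment of a JWT (no signature verification)."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def _b64(obj: dict) -> str:
    """Encode a dict as an unpadded base64url JSON segment (demo helper)."""
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Dummy token with made-up claims, only so the sketch runs locally.
# In the Function, use the token from the managed identity credential instead.
demo_token = ".".join([
    _b64({"alg": "none"}),
    _b64({"aud": "https://storage.azure.com",
          "oid": "00000000-0000-0000-0000-000000000000"}),
    "signature",
])

claims = decode_jwt_claims(demo_token)
print(claims["aud"], claims["oid"])
```

If the `oid` claim of the real token matches the object ID shown for the Function's managed identity, and `aud` points at Azure Storage, the identity side is probably fine and the problem lies elsewhere.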

However, when I call the function and take a look in the logs, this is what I see:

2020-11-23 20:04:11.396 Function called
2020-11-23 20:04:11.397 ManagedIdentityCredential will use App Service managed identity
2020-11-23 20:04:13.105
Result: Failure Exception: HttpResponseError: This request is not authorized to perform this operation. 
RequestId:1f6a2a1c-b01e-0090-26d3-c1d0c0000000 Time:2020-11-23T20:04:13.0679405Z ErrorCode:AuthorizationFailure Error:None Stack:
File "/azure-functions-host/workers/python/3.6/LINUX/X64/azure_functions_worker/dispatcher.py", line 357, in _handle__invocation_request self.__run_sync_func, invocation_id, fi.func, args)
File "/usr/local/lib/python3.6/concurrent/futures/thread.py", line 56, in run result = self.fn(*self.args, **self.kwargs)
File "/azure-functions-host/workers/python/3.6/LINUX/X64/azure_functions_worker/dispatcher.py", line 542, in __run_sync_func return func(**params)
File "/home/site/wwwroot/shared/datalake.py", line 65, in download downloaded_file = client.download_file()
File "/home/site/wwwroot/.python_packages/lib/python3.6/site-packages/azure/storage/filedatalake/_data_lake_file_client.py", line 593, in download_file downloader = self._blob_client.download_blob(offset=offset, length=length, **kwargs)
File "/home/site/wwwroot/.python_packages/lib/python3.6/site-packages/azure/core/tracing/decorator.py", line 83, in wrapper_use_tracer return func(*args, **kwargs)
File "/home/site/wwwroot/.python_packages/lib/python3.6/site-packages/azure/storage/blob/_blob_client.py", line 674, in download_blob return StorageStreamDownloader(**options)
File "/home/site/wwwroot/.python_packages/lib/python3.6/site-packages/azure/storage/blob/_download.py", line 316, in __init__ self._response = self._initial_request()
File "/home/site/wwwroot/.python_packages/lib/python3.6/site-packages/azure/storage/blob/_download.py", line 403, in _initial_request process_storage_error(error)
File "/home/site/wwwroot/.python_packages/lib/python3.6/site-packages/azure/storage/blob/_shared/response_handlers.py", line 147, in process_storage_error raise error

This pretty clearly says that the Function is not authorized to download the blob. But why? What do I have to do differently?

Edit:

I found the cause of the problem: I had restricted access to my Data Lake storage in the network settings like so:

[Screenshot: storage account network settings]

My assumption was that "Allow trusted Microsoft services to access this storage account" would always allow Functions running on Azure to access the storage, regardless of whether or which networks are selected. That is not the case.
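In hindsight, the error code in the log may have been a clue. As far as I can tell, `AuthorizationFailure` tends to show up when network rules block the request, while a missing role assignment is usually reported as `AuthorizationPermissionMismatch`. A minimal sketch that pulls the code out of the error message and maps it to a hint follows; the hint table is an assumption based on commonly reported behaviour, not an official mapping, and the function names are hypothetical.

```python
import re
from typing import Optional

# Assumed mapping from storage error codes to the likely place to look.
HINTS = {
    "AuthorizationFailure": "check the storage account's firewall / network rules",
    "AuthorizationPermissionMismatch": "check the role assignment of the identity",
}


def extract_error_code(message: str) -> Optional[str]:
    """Return the value following 'ErrorCode:' in an error message, if present."""
    match = re.search(r"ErrorCode:\s*(\w+)", message)
    return match.group(1) if match else None


def hint_for(message: str) -> str:
    """Return a troubleshooting hint for the error code found in the message."""
    return HINTS.get(extract_error_code(message), "no hint available")


# The relevant line from the log in the question.
log_line = (
    "RequestId:1f6a2a1c-b01e-0090-26d3-c1d0c0000000 "
    "Time:2020-11-23T20:04:13.0679405Z ErrorCode:AuthorizationFailure Error:None"
)

print(extract_error_code(log_line))
print(hint_for(log_line))
```

In the Function itself, the same check could be applied to `str(error)` in an `except HttpResponseError as error:` block around the download call.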

Best answer

I'm not sure what the cause is on your side, but the code below works perfectly for me:

import azure.functions as func
from azure.identity import ChainedTokenCredential, ManagedIdentityCredential
from azure.storage.filedatalake import DataLakeServiceClient


def main(req: func.HttpRequest) -> func.HttpResponse:
    # Authenticate with the Function's system-assigned managed identity.
    msi_credential = ManagedIdentityCredential()
    credential_chain = ChainedTokenCredential(msi_credential)

    client = DataLakeServiceClient(
        "https://<Azure Data Lake Gen2 account name>.dfs.core.windows.net",
        credential=credential_chain,
    )

    # Download the file and return its content as the HTTP response body.
    file_client = client.get_file_client("container name", "filename.txt")
    stream = file_client.download_file()

    return func.HttpResponse(stream.readall())

Config for my function MSI: [two screenshots]

Content of my test file: [screenshot]

Test result: [screenshot]