MS Power Platform: Use data inside an Azure Data Lake via a Flow


Edit: Still no closer to an answer. Help appreciated...

My company has some simple lists with data I need, and we already have these lists in .parquet format inside a data lake. (Edit: Gen2)

I'm supposed to build a small PowerApp that uses some of the information inside these lists, but I can't figure out the correct way to get their content via a Flow.

There's a connector called "Azure Blob Storage: Get Blob Content" which sounds like the right one, and it does output a cryptic content string. But how do I get from that to an actually readable table whose items I can use? Or is this the wrong connector for this?

(I'm very new to all this Microsoft stuff and don't really know anything about how this data lake is set up. Not sure whether this helps, but the following Python script works and does exactly what I need the Flow to do automatically every day:)

from io import BytesIO

import pandas as pd
from azure.storage.blob import BlobClient

# Connect to the blob that holds the Parquet file (connection string, container, blob name)
blob = BlobClient.from_connection_string(MY_CONNECTION_STRING, "myContainer", "myFile.parquet")

# Download the raw bytes and load them into a pandas DataFrame
df = pd.read_parquet(BytesIO(blob.download_blob().readall()))

Thanks for any help :)

To clarify: by no means do I have to use this exact process. If you tell me "the standard way is to build a Python REST API on top of the data lake that answers this", that's perfectly fine. I just need to know the easiest and most standard way to access data inside a data lake.


2 Answers

Answer 1

If the Azure Data Lake is Gen1, then you don't need Power Automate to access it. You can connect directly from a canvas app using the Azure Data Lake connector.

https://learn.microsoft.com/en-us/connectors/azuredatalake/

You can call the ReadFile operation to read the contents of your file. This returns binary content, which you can then convert to a string and work with from there.
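
For reference, here's roughly what that looks like outside PowerApps, sketched in Python with the azure-datalake-store SDK (Gen1 only; the store name, credentials, and file path are all placeholders):

from io import BytesIO

import pandas as pd
from azure.datalake.store import core, lib

# Hypothetical service principal credentials for the Gen1 account
token = lib.auth(tenant_id="MY_TENANT", client_id="MY_CLIENT", client_secret="MY_SECRET")
adls = core.AzureDLFileSystem(token, store_name="myDataLakeGen1")

# Read the file's raw bytes (the equivalent of the connector's ReadFile operation)
# and parse the Parquet content into a DataFrame
with adls.open("/myFolder/myFile.parquet", "rb") as f:
    df = pd.read_parquet(BytesIO(f.read()))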

Answer 2

Since your data is in ADLS Gen2, I don't think there's a direct connector in Power Automate that can transform Parquet data for this.

I would not bother ingesting and transforming Parquet files in Power Automate, as you're bordering on a requirement for an ETL tool at this stage.

I would look at transforming the file(s) using Azure Data Factory pipelines and perhaps landing them in a database of your choice. Power Automate can then pick the data up from there, as it has connectors for most databases. You could also convert them to CSV files if a database is overkill for your setup.
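
If you do go the CSV route, the conversion step itself is simple. A minimal pandas sketch of what the pipeline would do (file names are placeholders):

import pandas as pd

# The same transformation an ADF copy activity would perform:
# read the Parquet file and write it back out as CSV
df = pd.read_parquet("myFile.parquet")
df.to_csv("myFile.csv", index=False)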

I would use Power Automate as my orchestration layer: it calls the Data Factory pipeline, waits for it to complete, and picks up the result from there.
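
For what it's worth, the same trigger-and-wait pattern sketched in Python with the azure-mgmt-datafactory SDK, in case you ever want to script it instead (subscription, resource group, factory, and pipeline names are all placeholders):

import time

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient

adf = DataFactoryManagementClient(DefaultAzureCredential(), "MY_SUBSCRIPTION_ID")

# Kick off the pipeline run (all resource names here are hypothetical)
run = adf.pipelines.create_run("myResourceGroup", "myFactory", "myParquetToCsvPipeline")

# Poll until the run leaves the Queued/InProgress states
status = adf.pipeline_runs.get("myResourceGroup", "myFactory", run.run_id).status
while status in ("Queued", "InProgress"):
    time.sleep(30)
    status = adf.pipeline_runs.get("myResourceGroup", "myFactory", run.run_id).status

print("Pipeline finished with status:", status)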

https://learn.microsoft.com/en-us/azure/data-factory/format-parquet

https://learn.microsoft.com/en-au/connectors/azuredatafactory/#actions