ADF dataflow creates blobs with 0 committed blocks in ADLS Gen 2

I'm creating files in an ADLS Gen 2 account using an ADF dataflow. The blob type is shown as "Block blob" in the storage UI, and the file content looks good. However, when I read the blob's block list programmatically, the committed block count is 0. This started happening after upgrading the blob storage account to ADLS Gen 2. I'm not sure whether this is an ADF issue or an ADLS Gen 2 issue.

Sample code

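using System;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;
using Azure.Storage.Blobs.Specialized; // BlockBlobClient and GetBlockBlobClient

string connectionString = "";
string containerName = "test";
string blobName = "test.json";

BlobServiceClient blobServiceClient = new BlobServiceClient(connectionString);
BlobContainerClient containerClient = blobServiceClient.GetBlobContainerClient(containerName);
BlockBlobClient blobClient = containerClient.GetBlockBlobClient(blobName);

// Fetch both the committed and the uncommitted block lists.
var blockList = await blobClient.GetBlockListAsync(BlockListTypes.All);

// For the affected blobs this loop prints nothing: CommittedBlocks is empty.
foreach (var block in blockList.Value.CommittedBlocks)
{
    Console.WriteLine($"Block Name: {block.Name}");
}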

There are 2 answers below.

Best answer:

After trying several approaches, I found that only SAS authentication from ADF to the Gen 2 account creates blobs with committed blocks, and only when you use the Azure Blob Storage linked service rather than the ADLS Gen 2 linked service. This looks like a bug or design flaw in ADF. It should be publicly documented so that customers know the ADF to ADLS Gen 2 integration is inefficient for processing large blobs.
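
As an illustration of the setup described above, an Azure Blob Storage linked service using SAS authentication could be defined roughly as follows. This is a minimal sketch, not the exact definition used here: the linked service name and the `sasUri` value are placeholders, and any integration runtime settings are omitted.

```json
{
    "name": "AzureBlobStorageSasLinkedService",
    "properties": {
        "type": "AzureBlobStorage",
        "typeProperties": {
            "sasUri": {
                "type": "SecureString",
                "value": "<SAS URI of the storage account or container>"
            }
        }
    }
}
```

The dataflow sink dataset would then reference this linked service instead of one of type ADLS Gen 2.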

Second answer:

To read the content of a block blob in ADLS Gen2, you may not need to list committed blocks as you would with Azure Blob Storage. You can simply download the blob's content directly, like this:

using System;
using System.IO;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;

string connectionString = "<connectionString>";
string containerName = "<containerName>";
string blobName = "<blobName>";

BlobServiceClient blobServiceClient = new BlobServiceClient(connectionString);
BlobContainerClient containerClient = blobServiceClient.GetBlobContainerClient(containerName);
BlobClient blobClient = containerClient.GetBlobClient(blobName);

BlobDownloadInfo blobDownloadInfo = await blobClient.DownloadAsync();
using (StreamReader reader = new StreamReader(blobDownloadInfo.Content))
{
    string content = await reader.ReadToEndAsync();
    Console.WriteLine("Blob Content:");
    Console.WriteLine(content);
}

This reads the blob content and prints it to the console.

  • Confirm that the ADF dataflow or activity that creates the blob in ADLS Gen2 is configured correctly and completes without errors. A misconfigured or failed run may be one reason the block list comes back empty.
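
One way to check the point above from code is to compare the blob's reported size with its block list. The following is a minimal sketch under the assumption that the connection string, container name, and blob name placeholders are replaced with real values; the interpretation printed at the end is a heuristic, not something the SDK reports.

```csharp
using System;
using System.Linq;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;
using Azure.Storage.Blobs.Specialized;

// Placeholders (assumptions): replace with your own values.
string connectionString = "<connectionString>";
string containerName = "<containerName>";
string blobName = "<blobName>";

BlobContainerClient containerClient =
    new BlobServiceClient(connectionString).GetBlobContainerClient(containerName);
BlockBlobClient blockBlobClient = containerClient.GetBlockBlobClient(blobName);

// Blob-level view: type and total size as reported by the service.
BlobProperties properties = (await blockBlobClient.GetPropertiesAsync()).Value;
Console.WriteLine($"Blob type: {properties.BlobType}, length: {properties.ContentLength} bytes");

// Block-level view: committed and uncommitted block lists.
BlockList blockList = (await blockBlobClient.GetBlockListAsync(BlockListTypes.All)).Value;
int committedCount = blockList.CommittedBlocks.Count();
int uncommittedCount = blockList.UncommittedBlocks.Count();
Console.WriteLine($"Committed blocks: {committedCount}, uncommitted blocks: {uncommittedCount}");

if (properties.ContentLength == 0)
{
    // Nothing was written: check the ADF run itself.
    Console.WriteLine("The blob is empty; check that the dataflow run completed successfully.");
}
else if (committedCount == 0)
{
    // Heuristic: the data is there, but it was not written as a list of blocks.
    Console.WriteLine("The blob has content but no committed blocks; read it by downloading it directly.");
}
```

If the length is non-zero while the committed block count is 0, the dataflow did write the data, and the direct download shown earlier is the simpler way to read it.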