Write a file to Azure Blob File Share line by line with ShareFileClient in C# Azure Function

1.2k Views Asked by At

I want to upload / create a file in an Azure Blob File Share step by step as I don't know the total amount of data to write (reading from a database line by line):

ShareClient share = new ShareClient(targetBlobStorageConnectionString, directoryName);
share.CreateIfNotExists();
ShareDirectoryClient directory = share.GetDirectoryClient("");
ShareFileClient file = directory.GetFileClient(fileName);

The problem is: If I create the file (or want to write it with overwrite=true) I've to state the maxLength - that I don't known.

var fileStream = file.OpenWrite(overwrite:false,position:0);
...
while(read) {
    var data = System.Text.Encoding.UTF8.GetBytes(GetNewLine(...)+ "\n");
    fileStream.Write(data, 0, data.Length);
}
fileStream.Close();

I hoped to resize the file with every line I write, but this isn't possible with a ShareFileClient.

Are there any other ways to write a file to Azure Blob File Share line by line?

1

There are 1 best solutions below

0
On

ShareFileClient.OpenWrite() signature is:

public virtual System.IO.Stream OpenWrite (
  bool overwrite, 
  long position, 
  Azure.Storage.Files.Shares.Models.ShareFileOpenWriteOptions options = default,
  System.Threading.CancellationToken cancellationToken = default
);

Not sure where are you getting this maxLength from. Second param is "The offset within the blob to begin writing from." not maxLength.

So pseudo code would be:

file = directory.GetFileClient(fileName);

stream = file.OpenWrite(position=0)
byte[] hello = b'hello'
stream.write(hello)
stream.close()

num_bytes_in_hello = size(hello)
stream = file.OpenWrite(position=num_bytes_in_hello)
byte[] world = b'world'
stream.write(world)
stream.close()

That said, this is a very expensive/bad idea. Don't underestimate the cost per REST API call.

Say:

  • you have 1M (M=million) rows in your DB
  • each row you write is 2KB

With your code you'll pay 1,000,000/10,000 * 2 * 0.065 = $13. These add up when you do this a few times day and every day of the month. (2 calls per row: write() + close() probably calls flush REST API)

Instead if you were to write in chunks of 100 rows per write, you'll pay $0.13.

Actual number would vary based on region/tier etc. but scale remains the same.