Blob Storage - handle files directly on it


I'm looking for an efficient way to iterate through large files stored in Azure Blob Storage from C#; essentially, I want to treat blob storage like local storage. In my research I've only found solutions where the file has to be downloaded or streamed first.

Is it possible to directly iterate through a file on blob storage without downloading it locally or to a stream?


There are 3 best solutions below

Answer (score 0):

Is it possible to directly iterate through a file on blob storage without downloading it locally or to a stream?

Simple answer: no. You can list blobs and read their properties, but if you want to work with the actual content, you need to download it from the storage account. Blob storage is a pure object store, not a file system.
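As a minimal sketch with the Azure.Storage.Blobs v12 SDK (the connection string, container, and blob names below are placeholders), `OpenRead` hands you a `Stream` that fetches the blob in ranges as you read, so you can iterate line by line without writing the file to disk; the bytes still have to travel from the storage account to your machine, though:

```csharp
using System.IO;
using Azure.Storage.Blobs;

class Program
{
    static void Main()
    {
        // Placeholder connection details -- replace with your own.
        var blob = new BlobClient(
            connectionString: "<storage-connection-string>",
            blobContainerName: "my-container",
            blobName: "large-file.txt");

        // OpenRead returns a Stream that pulls the blob down in ranges
        // as you consume it; the content is still transferred, just lazily.
        using Stream stream = blob.OpenRead();
        using var reader = new StreamReader(stream);

        string? line;
        while ((line = reader.ReadLine()) != null)
        {
            // process each line without materializing the whole file
        }
    }
}
```

This avoids a temporary file, but it does not avoid the download itself, which is the point of the answer above.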

Answer (score 7):

As already mentioned, this is not possible. You could, however, split your files into smaller chunks and upload/download only those for manipulation. For example, if you have a 1 MB file, split it into 10 KB chunks and append an integer suffix to the original name to distinguish the individual chunks. If you need a specific part of the file, calculate the required chunk ids and download only those blobs.

Whether this approach works for you depends heavily on your data: can it be split by size or some other criterion (e.g. number of lines, JSON elements), and does changing the data in one chunk affect other chunks (e.g. if you split by rows and then insert a row into one chunk)? So be careful with this approach and choose wisely.
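The chunk-id calculation above is simple integer division. A sketch, assuming fixed-size 10 KB chunks stored as blobs named `large-file.txt.0`, `large-file.txt.1`, and so on (the size and suffix scheme are just the example from the answer):

```csharp
using System;

class ChunkMath
{
    const long ChunkSize = 10_000; // 10 KB chunks, as in the example above

    // Returns the inclusive range of chunk ids covering bytes
    // [offset, offset + length) of the original file.
    static (long First, long Last) ChunksFor(long offset, long length)
    {
        if (offset < 0 || length <= 0)
            throw new ArgumentOutOfRangeException();
        long first = offset / ChunkSize;
        long last = (offset + length - 1) / ChunkSize;
        return (first, last);
    }

    static void Main()
    {
        // Bytes 25,000..44,999 of the original file live in chunks 2..4,
        // i.e. the blobs "large-file.txt.2" through "large-file.txt.4".
        var (first, last) = ChunksFor(offset: 25_000, length: 20_000);
        for (long id = first; id <= last; id++)
        {
            Console.WriteLine($"download blob: large-file.txt.{id}");
        }
    }
}
```

You would then download only those blobs, concatenate them, and slice out the exact byte range you need.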

Answer (score 1):

I have not tried it, but I would store the large files on an Azure file share instead. Create a Docker image with your processing code and push it to Azure Container Registry, then create a container instance from that image and mount the file share into it. Azure Container Instances are billed only for the time they run, so using the SDK you can start an instance whenever you need to process a large file.