Azure Storage account upload throttling for millions of files

I have a VM with four 4 TB disks mounted (disk type StandardSSD_LRS). Each disk is full. In total there are more than 2 million files and around 16 TB of data.

I want to store all of that data in a single container in an Azure storage account. I created a storage account and a SAS URL, and I use the SAS URL to authorize access to the account.

This VM is used only to upload this data to the Azure storage account and does no other work. The storage account does not contain any other data either.

The issue is that after a certain amount of data has been uploaded, upload performance drops drastically. I have used both azcopy and rclone, and I see a similar trend with each. With azcopy I also hit an additional OOM issue, which I did not see with rclone.

How can I get consistent performance while uploading all this data? Or is there an alternative way to upload this much data to Azure Blob Storage?

AzCopy command: azcopy sync $SRC_ROOT $container_uri --recursive

Rclone command: rclone copy $SRC_ROOT az:${storage_container_name} --config rclone.conf -v

[Three images (graphs) were attached here.]

EDIT:

  1. AzCopy does not work in my case: with this much volume, AzCopy crashes with an OOM error even before any transfer starts.

  2. With Rclone, I allow only 4 parallel transfers with a chunk size of 4 MB. This is the default config in rclone, and the VM metrics show that resource consumption is consistent. As I mentioned, the only purpose of this VM is to upload the data to Blob Storage, so no other process is consuming any resources (network, IOPS, etc.). Also, the graphs show that some throttling is happening on the storage account side.

  3. The number of transfers decreases after a certain amount of data has been uploaded, so I do not think it is a library issue. Rather, some throttling seems to be happening on the Azure Blob Storage side: for hours I get maximum performance without any drop, and then it suddenly drops.

What I want to know is why upload performance drops, because no other factor has any impact on either the storage account or the VM.

There is 1 best solution below

To be honest, I don't know whether the issue comes from your VM, your network, or the storage account.

But I'd check on your VM whether you hit any CPU or RAM limit while uploading. Maybe azcopy keeps consuming VM resources even after the upload of each file has finished.
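One way to do that check is to log memory and CPU load on the VM while the upload runs. The following is only a minimal sketch, assuming the VM runs Linux and exposes `/proc/meminfo` with a `MemAvailable` field; the function names (`parse_meminfo`, `memory_used_percent`, `cpu_load_per_core`, `snapshot`) are made up for this illustration and are not part of azcopy or rclone.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: number}.

    Most fields are reported in kB; a few (e.g. HugePages counters) are plain counts.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def memory_used_percent(info):
    """Percentage of RAM in use, derived from MemTotal and MemAvailable."""
    total = info["MemTotal"]
    available = info["MemAvailable"]
    return 100.0 * (total - available) / total


def cpu_load_per_core(load_1min, cores):
    """1-minute load average divided by core count; values near or above 1.0 suggest CPU saturation."""
    return load_1min / max(cores, 1)


def snapshot():
    """Take one reading from the local machine (assumption: Linux with /proc mounted)."""
    with open("/proc/meminfo") as f:
        info = parse_meminfo(f.read())
    load_1min = os.getloadavg()[0]
    return {
        "mem_used_pct": memory_used_percent(info),
        "load_per_core": cpu_load_per_core(load_1min, os.cpu_count() or 1),
    }
```

Calling `snapshot()` every few seconds during the transfer and writing the result to a log would show whether memory use climbs or the load per core stays high around the time the upload rate drops. If both stay flat, the VM is less likely to be the bottleneck.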