Copying storage data from one Azure account to another and organizing it into date folders


azcopy copy 'https://mysourceaccount.blob.core.windows.net/mycontainer' 'https://mydestinationaccount.blob.core.windows.net/mycontainer' --recursive

The above will copy everything from one storage account to another. My requirement is that data should be organized in the target folder based on the last modified date of the files in the source container. The source container is one big directory with lots of files.

Suppose the files in the source look like this:

https://mysourceaccount.blob.core.windows.net/mycontainer/file1 with last modified date of 2023-12-13

https://mysourceaccount.blob.core.windows.net/mycontainer/file2 with last modified date of 2023-12-14

https://mysourceaccount.blob.core.windows.net/mycontainer/file3 with last modified date of 2023-12-15

The target should end up like the listing below, which means creating a date path (year/month/day) from each file's last modified date in the source.

https://mydestinationaccount.blob.core.windows.net/mycontainer/2023/12/13/file1

https://mydestinationaccount.blob.core.windows.net/mycontainer/2023/12/14/file2

https://mydestinationaccount.blob.core.windows.net/mycontainer/2023/12/15/file3
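
To make the required mapping concrete, here is a minimal Python sketch of the path rule only. It does not talk to Azure; the function names `destination_blob_path` and `destination_url` are made up for illustration, and it assumes the last modified date should be interpreted in UTC.

```python
from datetime import datetime, timezone


def destination_blob_path(blob_name: str, last_modified: datetime) -> str:
    """Return '<yyyy>/<MM>/<dd>/<blob_name>' for the blob's last modified date."""
    # Assumption: timezone-aware values are converted to UTC first, so the
    # folder does not depend on the zone the timestamp happened to carry.
    if last_modified.tzinfo is not None:
        last_modified = last_modified.astimezone(timezone.utc)
    return f"{last_modified:%Y/%m/%d}/{blob_name}"


def destination_url(container_url: str, blob_name: str, last_modified: datetime) -> str:
    """Join the destination container URL with the date-based blob path."""
    return f"{container_url.rstrip('/')}/{destination_blob_path(blob_name, last_modified)}"


print(destination_url(
    "https://mydestinationaccount.blob.core.windows.net/mycontainer",
    "file1",
    datetime(2023, 12, 13, tzinfo=timezone.utc),
))
# prints https://mydestinationaccount.blob.core.windows.net/mycontainer/2023/12/13/file1
```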


To copy the data from one storage account to another and organize it into date folders, you can build an Azure Data Factory pipeline around the Get Metadata activity. Follow the steps below:

  1. First, create the blob storage datasets that the pipeline will use.
  • The first dataset points at the source container and is used to list its files.
  • The second dataset is used to read the last modified date of a particular file. Create a filename parameter and reference it in the dataset connection.
  • The third dataset is used to create folders in the destination container based on the last modified date. Create filename and folderpath parameters and reference them in the dataset connection.
  2. Create a pipeline and add a Get Metadata activity to list the files in the source container. Use the first dataset and set Field list to Child items.
  3. Add a ForEach activity and pass it the output of the Get Metadata activity with the expression @activity('Get Metadata1').output.childItems.
  4. Inside the ForEach activity, add a second Get Metadata activity to get the last modified date of each file. Use the second dataset, set its filename parameter to @item().name, and set Field list to Last modified.
  5. Still inside the ForEach activity, add a Copy activity after the second Get Metadata activity to copy each file into a folder based on its last modified date. Source: the second dataset, with filename set to @item().name. Sink: the third dataset, with filename set to @item().name and folderpath set to @formatDateTime(activity('Get Metadata2').output.lastModified,'yyyy/MM/dd').
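
The control flow of the steps above can be sketched in plain Python to show what each ForEach iteration computes. This is only an illustration of the logic, not pipeline code: `plan_copies`, `format_yyyy_mm_dd` and the `get_last_modified` callback are hypothetical names, and the sketch assumes the last modified value arrives as an ISO 8601 string such as 2023-12-13T08:30:00Z.

```python
from datetime import datetime


def format_yyyy_mm_dd(last_modified: str) -> str:
    """Rough stand-in for formatDateTime(<timestamp>, 'yyyy/MM/dd')."""
    # fromisoformat() rejects a trailing 'Z' on older Python versions,
    # so rewrite it as an explicit UTC offset before parsing.
    parsed = datetime.fromisoformat(last_modified.replace("Z", "+00:00"))
    return parsed.strftime("%Y/%m/%d")


def plan_copies(child_items, get_last_modified):
    """Return one sink parameter set (filename, folderpath) per child item."""
    plan = []
    for item in child_items:                     # ForEach over childItems
        name = item["name"]                      # @item().name
        last_modified = get_last_modified(name)  # inner Get Metadata, Last modified
        plan.append({
            "filename": name,
            "folderpath": format_yyyy_mm_dd(last_modified),
        })
    return plan


# Hypothetical listing and timestamps, standing in for the two Get Metadata outputs.
items = [{"name": "file1", "type": "File"}, {"name": "file2", "type": "File"}]
stamps = {"file1": "2023-12-13T08:30:00Z", "file2": "2023-12-14T23:59:59Z"}
for entry in plan_copies(items, stamps.__getitem__):
    print(entry["folderpath"] + "/" + entry["filename"])
```

Each dictionary in the returned plan corresponds to the parameters passed to the sink dataset of the Copy activity for one file.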
