I have a seemingly simple task: integrate multiple JSON files that reside in a Data Lake Gen2 account. The problem is that the files to be integrated are spread across many folders. This is a typical structure that I am dealing with:

Folder1\Folder2\Folder3\Folder4\Folder5\2022\Month\Day\Hour\Minute\ <--- 1 file in the Minute folder

The same structure is repeated for 2023, so to collect all the files I have to go to the bottom of the structure, which is the Minute folder. If I use a wildcard path, it looks like this:

Wildcard paths: 'source from dataset' / *.json

It copies everything, including all the folders, and I just want the files. I tried to narrow it down and copy only the 2022 files first, but whatever I do with the wildcard paths is not working. Help is much appreciated.
Trying different wildcard combinations did not help; obviously I am doing something wrong.
There is no built-in option to copy files from multiple subfolders into a single destination folder while keeping their names. The Flatten hierarchy copy behavior does place everything in one folder, but the file names in the target are autogenerated.
Instead, you can follow the below approach.
Use a Lookup activity with an HTTP connector, and give the base URL as

https://<storage_account_name>.blob.core.windows.net/<container>?restype=directory&comp=list

(replace <storage_account_name> and <container> with the appropriate names in the above URL).

Then use a Filter activity to keep only the entries ending in .json from the Lookup activity output. Settings of the Filter activity:
items:
@activity('Lookup1').output.value[0].EnumerationResults.Blobs.Blob

condition:
@endswith(item().URL,'.json')

The output of the Filter activity contains only the .json entries.
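To make the Lookup and Filter steps concrete, here is a minimal Python sketch of the same logic outside Data Factory: parse a blob listing and keep only the entries whose URL ends with .json. The sample XML, the account and container names, and the helper names parse_blobs and filter_json are all made up for illustration; a real listing response carries more fields, and the exact field names may differ.

```python
import xml.etree.ElementTree as ET

# Made-up sample of a blob listing; only the fields used below are included.
SAMPLE_LISTING = """<EnumerationResults>
  <Blobs>
    <Blob>
      <Name>Folder1/2022/01/05/13/45</Name>
      <Url>https://example.blob.core.windows.net/data/Folder1/2022/01/05/13/45</Url>
    </Blob>
    <Blob>
      <Name>Folder1/2022/01/05/13/45/events_a.json</Name>
      <Url>https://example.blob.core.windows.net/data/Folder1/2022/01/05/13/45/events_a.json</Url>
    </Blob>
    <Blob>
      <Name>Folder1/2023/02/10/08/30/events_b.json</Name>
      <Url>https://example.blob.core.windows.net/data/Folder1/2023/02/10/08/30/events_b.json</Url>
    </Blob>
  </Blobs>
</EnumerationResults>"""


def parse_blobs(listing_xml):
    """Return one dict per <Blob> entry, similar to the items the Filter receives."""
    root = ET.fromstring(listing_xml)
    return [
        {"name": blob.findtext("Name"), "url": blob.findtext("Url")}
        for blob in root.findall("./Blobs/Blob")
    ]


def filter_json(blobs):
    """Mirror of the Filter condition: keep entries whose URL ends with .json."""
    return [blob for blob in blobs if blob["url"].endswith(".json")]


json_blobs = filter_json(parse_blobs(SAMPLE_LISTING))
for blob in json_blobs:
    print(blob["name"])
```

The folder entry is dropped because its URL does not end with .json, which is the effect the Filter activity has on the Lookup output.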
Add a ForEach activity after the Filter activity and set its items to

@activity('Filter1').output.value

Inside the ForEach activity, add a Copy activity.
For the source, use an HTTP connector with a JSON dataset, and give the base URL as

https://<account-name>.blob.core.windows.net/<container-name>/

Create a parameter for the relative URL and set its value to

@item().name

For the target file name, use

@split(item().name,'/')[sub(length(split(item().name,'/')),1)]

This expression takes the file name from the relative URL value.
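As a rough illustration of what the split expression does, the following Python sketch mirrors it step by step and pairs each source URL with a flat target file name, as the Copy activity inside the ForEach would. The sample relative URLs and the helper names file_name_from_relative_url and plan_copies are assumptions for the example, not part of Data Factory.

```python
# Base URL with the same placeholders used in the answer; fill in real values.
BASE_URL = "https://<account-name>.blob.core.windows.net/<container-name>/"


def file_name_from_relative_url(relative_url):
    """Python mirror of split(name,'/')[sub(length(split(name,'/')),1)]."""
    parts = relative_url.split("/")
    return parts[len(parts) - 1]  # last segment, i.e. the file name


def plan_copies(relative_urls):
    """Pair each full source URL with the flat target file name."""
    return [
        (BASE_URL + rel, file_name_from_relative_url(rel))
        for rel in relative_urls
    ]


# Made-up relative URLs standing in for item().name in each iteration.
items = [
    "Folder1/2022/01/05/13/45/events_a.json",
    "Folder1/2023/02/10/08/30/events_b.json",
]

for source_url, target_name in plan_copies(items):
    print(source_url, "->", target_name)
```

Because only the last path segment is kept, every file lands directly in the target folder under its original name rather than an autogenerated one.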