I have an Azure Blob Storage container with 8 .csv files in it. I want each file to auto-create a table in the Azure SQL database, and if the table already exists, truncate it and fill it again whenever the file is renewed.
Here is what I have so far:
- Get Metadata activity - to get all the files from the storage.
- ForEach (iterating over the items from the Get Metadata activity)

What is in the ForEach?
- Set Variable (table name minus the .csv extension)
- Lookup activity (to get metadata from the database, like the delimiter and the Azure function)
- Copy Data activity to load each file from the blob to the SQL database, with auto-create table if it does not exist.

My goal is to make a dynamic pipeline that loads all the data files from the blob to the SQL database, so that when a new file lands in the storage the pipeline only has to run again to put it into the SQL database.
Here are some examples of the files:
01_data.csv

```
Jaar,Special,Data
2023,cd,september
2022,ps,oktober
```

02_data.csv

```
JAAR,mechanic,Data
2021,paus,march
2000,klank,januari
```

top100.csv

```
number, id, place,Jaar
1,1234,99,2000
```
Do you get the picture?
I get an error saying the failure happened on the sink side of the Copy Data activity.
I tried a lot but nothing works, so can anyone help?
SQL doesn't support (unquoted) table names starting with numbers, like the ones derived from your file names (01_data, 02_data). So, you need to cross-check those.
You can use the Auto create table option and a pre-copy script in the copy activity sink to achieve your requirement.
But in this case, you need two kinds of pipelines. One is for the existing files and will run only once, started manually.
The second pipeline is for newly created files and will be executed by a storage event trigger.
Follow the below approach for the first pipeline.
These are my starting source files.
I have a sample2 table with existing data. First, use the Get Metadata activity to get the files list and give that files array to the ForEach activity.
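In the ForEach activity's Items field, that array would typically be referenced with an expression like the one below (the activity name Get Metadata1 is just an assumed default, adjust it to your own, and make sure the Child items field is selected in the Get Metadata activity):

```
@activity('Get Metadata1').output.childItems
```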
Inside ForEach, extract the table name from the file name.
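As a rough sketch (the variable name table_name is just an example), a Set variable activity can strip the extension from the current item like this:

```
@replace(item().name, '.csv', '')
```

If you also want to avoid names starting with a digit, as noted above, an expression like `@concat('tbl_', replace(item().name, '.csv', ''))` would add a prefix at the same step.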
Take the copy activity and use a dataset parameter for the file name of the source file. Give the SQL dataset as the sink and use a dataset parameter for the table name as well.
In the copy activity sink, pass the table name stored in the variable to that parameter. Turn on Auto create table and put a script like the one below in the pre-copy script.
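The original screenshot with the script isn't reproduced here, but a minimal pre-copy script for this pattern would look roughly like the following (entered as dynamic content, and assuming the table name lives in a pipeline variable called table_name). It truncates the table only when it already exists, so Auto create table can still handle brand-new tables:

```sql
IF OBJECT_ID('[dbo].[@{variables('table_name')}]', 'U') IS NOT NULL
    TRUNCATE TABLE [dbo].[@{variables('table_name')}];
```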
Now, debug the pipeline and you can see that the data in sample2 was truncated and new data inserted, and the other two tables were created.

Now for the second pipeline, use the storage event trigger. Get the newly created file name from the trigger parameter @triggerBody().fileName. Use the same copy activity with the pre-copy script as above, but don't use any loop, as we are not dealing with multiple files this time.
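One way to wire that up (the parameter and variable names here are assumptions, not from the original post) is to map the trigger output to a pipeline parameter and derive the table name from it inside the pipeline:

```
Pipeline parameter file_name, set in the trigger:
@triggerBody().fileName

Set variable table_name inside the pipeline:
@replace(pipeline().parameters.file_name, '.csv', '')
```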
Go through this SO answer to understand the Storage event trigger.