How to check date format in Azure Data Factory

3.8k Views Asked by At

I am creating a pipeline where the source is csv files and sink is SQL Server.

The date column in CSV file may have values like

12/31/2020
10162018
20201017
31/12/1982
1982/12/31

I do not find the function which checks the format of the date. How do I check the format and convert the above values to yyyy-MM-dd format.

3

There are 3 best solutions below

0
On BEST ANSWER

The solution is given by HimanshuSinha-msft

Solved the issues using expression builder in Derived Column in Mapping Data Flow.

coalesce(toDate(Somedate,'MM/dd/yyyy'),toDate(Somedate,'yyyy/MM/dd'),toDate(Somedate,'dd/MM/yyyy'),toDate(Somedate,'MMddyyyy'),toDate(Somedate,'yyyyddMM'),toDate(Somedate,'MMddyyyy'),toDate(Somedate,'yyyyMMdd'))
0
On

You can configure this in the Mapping tab of your copy activity. The datetime format can be specified, but it only supports one format type. If you have a mix of formats like in your example then it will not work.

One option would be to ingest the column into a staging table as a nvarchar. Then in another copy activity use a custom select statement to detect the column format and cast the date as needed. You should be able to do this using a CASE SQL statement in your SELECT from the staging table.

FYI: data type mapping

https://learn.microsoft.com/en-us/azure/data-factory/copy-activity-schema-and-type-mapping#data-type-mapping

0
On

This coalesce function answer will not actually solve the problem. It just gets rid of the errors. There are plenty of dates that are valid in multiple formats. For example: "2/1/2020" (mm/dd/yyyy) and "1/2/2020" (dd/mm/yyyy). The previous answer just gets rid of errors, but your analyses downstream will be very incorrect.

You need to do an aggregate analysis of which date format best fits the incoming stream, and the route the logic to the respective separate pipeline branches.