OPENROWSET only accepting VARCHAR/NVARCHAR for a Date column in a CSV, not DATE (Synapse Serverless)

172 Views Asked by At

A CSV file is being dumped into an Azure Datalake. When I try to make an OPENROWSET view in Synapse Serverless, the date column is only working as a VARCHAR/NVARCHAR, not any sort of DATE.

Here's the CSV file with the offending column:

The Date Column of the CSV file, if I download it.

My Synapse Serverless VIEW/SELECT statements works, if I put the column as a VARCHAR/NVARCHAR:

CREATE TestView AS

SELECT  * FROM OPENROWSET(BULK 'Orders Forecast.csv'
 ,DATA_SOURCE = 'workday'
 ,FORMAT = 'CSV'
 ,PARSER_VERSION = '2.0'
 ,HEADER_ROW = TRUE

) 

WITH (
     [Period] varchar(50)
    )
AS t

But then it's not easily usable as a date when querying the VIEW.

If I try to assign it as DateTime2, it errors:

CREATE VIEW TestView AS

SELECT  * FROM OPENROWSET(BULK 'Orders Forecast.csv'
, DATA_SOURCE = 'workday'
, FORMAT = 'CSV'
, PARSER_VERSION = '2.0'
--, ROWSET_OPTIONS = '{"READ_OPTIONS":["ALLOW_INCONSISTENT_READS"]}'
, HEADER_ROW = TRUE

) 

WITH (
     [Period] datetime2
    )

ERROR: Msg 13812, Level 16, State 1, Line 25 Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row starting at byte offset 138, column 8 (Period) in Orders Forecast.csv.

I've also tried going down to PARSER_VERSION 1.0, and doing the then-compatible DATETIME. And DATE. Same results.

2

There are 2 best solutions below

2
On BEST ANSWER

You can also try the code below for getting it in datetime format.

When using *openrowset, you should explicitly specify the collation when declaring the column type inside the WITH clause.

    WITH (
        [C1] VARCHAR(20) COLLATE Latin1_General_100_BIN2_UTF8
    ) 

Next, use the TRY_PARSE function.

SELECT TRY_PARSE(c1 AS datetime USING 'en-US') AS result
FROM cte;

Here is the initial data.

Enter image description here

Output:

Enter image description here

0
On

I can wrap the whole thing in a CTE, then run a DATEFROMPARTS on it. This absolutely works. But, is there not a better way?

CREATE Test View AS

WITH CTE AS (

SELECT  * FROM OPENROWSET(BULK 'Orders Forecast.csv'
, DATA_SOURCE = 'workday'
, FORMAT = 'CSV'
, PARSER_VERSION = '2.0'
--, ROWSET_OPTIONS = '{"READ_OPTIONS":["ALLOW_INCONSISTENT_READS"]}'
, HEADER_ROW = TRUE

) 

WITH (
     [Period] varchar(50)
    )
AS t

    )


SELECT

,DATEFROMPARTS(RIGHT(CTE.Period,4),LEFT(CTE.PERIOD,2),1) as PeriodFixed


FROM CTE

GO