How to query Athena string to date in hadoop.hive.serde2.OpenCSVSerde format

4.4k Views Asked by At

Hi I have created following schema in AWS Athena to read csv file from AWS S3

CREATE EXTERNAL TABLE IF NOT EXISTS axlargetable.mine (
  createdate string
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
   'separatorChar' = ',',
   'quoteChar' = '\"',
   'escapeChar' = '\\'
   ) 
LOCATION 's3://ax-large-table/test/'
TBLPROPERTIES ('has_encrypted_data'='false')

"createdate" column holding following values

7/01/2017 5:40:05 PM
7/03/2017 5:40:05 PM
7/04/2017 5:40:05 PM
7/05/2017 5:40:05 PM
7/06/2017 5:40:05 PM
7/31/2017 5:40:05 PM
7/31/2017 5:40:05 PM
7/31/2017 5:40:05 PM
7/31/2017 5:40:05 PM

How can I query "createdate" column value as date format? sample query like

Select * from axlargetable.mine 
where createdate between '7/03/2017' and '7/31/2017'

Thanks

1

There are 1 best solutions below

5
On

I'm afraid the OpenCSVSerDe does not support parsing date types with some non-ISO formatting. The only way you can circumvent this behavior is, that you translate the string into a date within your query.

In that case, you would need to use the parse_date function.

select * from axlargetable.mine
where date_parse(createdate, '%m/%d/%Y %h:%i:%s %p') 
          between DATE'2017-03-07' and DATE'2017-07-31';