I have a dataset that looks like this:
datetime count
18:28:20.602 UTC DEC 08 2016 1
20:42:32.017 UTC DEC 08 2016 5
15:33:40.691 UTC DEC 08 2016 1
17:11:54.008 UTC DEC 08 2016 3
20:28:57.861 UTC DEC 08 2016 0
.
.
.
.
The datetime column is in the string format. I'm having difficulty in converting it to a timestamp.
How do I write a Impala/Hive query so that I get the data between '18:28:00.000 UTC DEC 08 2016' to '18:33:00.000 UTC DEC 08 2016'
With Hive:
...will translate your shitty String format into a UNIX timestamp, then into String standard format (in local timezone because that's the Hive convention), then into a Timestamp.
There is no easier way, unfortunately. And you may have some edge cases because of the 1h overlap in summer/winter times.
Source: the Hive documentation, of course...
With Impala (which does not support the
zzz
format modifier):...will translate your shitty String format into a UNIX timestamp, assuming that all your inputs are in UTC, then into String standard format (in UTC timezone because that's the Impala convention), then into a Timestamp.