R:Converting from Character to POSIXct loses hours and minutes

2.8k Views Asked by At

Good morning,

I'm trying to convert from character to POSIXct, but when I do, I lose the hours and minutes from the data.

hourlyData (dataframe)

    Login   Expo    EquityUSD   Period                    UnrealizedProfitUSD
1   252957  0.00    7.187185    2014-02-03 00:00:00.000     0.00000
2   252957  0.00    7.187772    2014-02-03 01:00:00.000     0.00000
3   252957  0.00    7.188198    2014-02-03 02:00:00.000     0.00000
4   252957  0.00    7.187825    2014-02-03 03:00:00.000     0.00000
5   252957  0.00    7.187079    2014-02-03 04:00:00.000     0.00000
6   252957  0.00    7.187079    2014-02-03 05:00:00.000     0.00000
7   252957  0.00    7.188731    2014-02-03 06:00:00.000     0.00000
8   252957  0.00    7.186279    2014-02-03 07:00:00.000     0.00000
9   252957  0.00    7.187185    2014-02-03 08:00:00.000     0.00000

when I type class(hourlyData$Period) I get Character. When I try and convert this column as such hourlyData$Period = as.POSIXct(hourlyData$Period), however, I get the following output:
hourlyData

    Login   Expo    EquityUSD   Period       UnrealizedProfitUSD
1   252957  0.00    7.187185    2014-02-03   0.00000
2   252957  0.00    7.187772    2014-02-03   0.00000
3   252957  0.00    7.188198    2014-02-03   0.00000
4   252957  0.00    7.187825    2014-02-03   0.00000
5   252957  0.00    7.187079    2014-02-03   0.00000
6   252957  0.00    7.187079    2014-02-03   0.00000
7   252957  0.00    7.188731    2014-02-03   0.00000
8   252957  0.00    7.186279    2014-02-03   0.00000
9   252957  0.00    7.187185    2014-02-03   0.00000

Where the hours and minutes have been removed from the Period column. Does anyone know why this happens, or how to prevent it?

Thanks

Mike

3

There are 3 best solutions below

1
On

I'd use strptime:

char_data = c('2014-02-03 00:00:00.000', '2014-02-03 01:00:00.000', '2014-02-03 02:00:00.000')
strptime(char_data, '%Y-%m-%d %H:%M:%S')
[1] "2014-02-03 00:00:00 CET" "2014-02-03 01:00:00 CET"
[3] "2014-02-03 02:00:00 CET"
0
On

The other answers hint at the problem but don't really address it. as.POSIXct(...) has bizarre behaviour when a character vector is passed with an invalid time: rather than returning NA for those elements with invalid time, as.POSIXct(...) removes the time part for all elements.

You can "fix" this by explicitly providing the format specification, even though you are using the default spec (see last line below).

x <- sprintf('%02d:00:00', 20:25)          # 25:00:00 is not a valid time spec.
y <- sprintf('%s %s', '2018-01-01',x)      # last element has invalid time
as.POSIXct(head(y,-1))                     # works fine
## [1] "2018-01-01 20:00:00 HST" "2018-01-01 21:00:00 HST" "2018-01-01 22:00:00 HST" "2018-01-01 23:00:00 HST" "2018-01-02 00:00:00 HST"
as.POSIXct(y)                              # fails miserably
## [1] "2018-01-01 HST" "2018-01-01 HST" "2018-01-01 HST" "2018-01-01 HST" "2018-01-01 HST" "2018-01-01 HST"
as.POSIXct(y, tz='UTC')                    # tz does not fix this...
## [1] "2018-01-01 UTC" "2018-01-01 UTC" "2018-01-01 UTC" "2018-01-01 UTC" "2018-01-01 UTC" "2018-01-01 UTC"
as.POSIXct(y, format='%Y-%m-%d %H:%M:%S')  # but this does...
## [1] "2018-01-01 20:00:00 HST" "2018-01-01 21:00:00 HST" "2018-01-01 22:00:00 HST" "2018-01-01 23:00:00 HST" "2018-01-02 00:00:00 HST" NA 

Running R 3.4.0 on Win 7 x64.

0
On

"2014-03-31 03:00:00" is not a valid datetime, where you live, as "2014-03-31 02:59:59" was immediately followed by "2014-03-31 04:00:00" due to switching to daylight saving time.