I have a data frame that contains one column that is a series of dates, collected via a Google form. The date and time were collected separately. The data was entered by selecting a day from a calendar, and the date was entered manually - should have been a 24-hour clock, but the field appears to have just checked that the hour and minute were in the correct range.
I've read the file in from .csv
. I converted the date time character field (as read in from the .csv
) to a date time format in a new variable by using as.POSIXct(foo$When, tz="NZ", format="%Y-%m-%d %H:%M")
. The dates and times were correctly constructed.
Except: I have some incorrect date/time entries in the original data. These have all been set to NA
in the new field, as you expect. For those that do include a time, I have been trying to fix them while still retaining a POSIXct
format.
I have been unsuccessful.
Here is an example of the data I have, and what I have tried to do:
TestDataForHelp <- data.frame(OldDateTime =
c("2013-12-04 21:10", "2013-12-15 09:07", "2014-01-01 06:27",
"2014-11-02 21:15", "2014-11-07 23:00", "2015-01-04 21:42",
"201508-11-02 20:15", "201508-11-02 20:15", "2017-11-02"))
TestDataForHelp$ActualDateTime <-
as.POSIXct(TestDataForHelp$OldDateTime, tz="NZ", format="%Y-%m-%d %H:%M")
TestDataForHelp$FixedDateTime <-
ifelse(TestDataForHelp$OldDateTime=="201508-11-02 20:15",
as.POSIXct("2015-11-02 20:15", tz="NZ", format="%Y-%m-%d %H:%M"),
TestDataForHelp$ActualDateTime)
The new variable, FixedDateTime
, does not have a POSIXct type
. It has been implicitly converted to a numeric
type. How can I retain the POSIXct
format from ActualDateTime
and not have the implicit type conversion?
I would like to not have FixedDateTime
but, rather, put the corrected data into ActualDateTime
. The ifelse()
seems to be the part of the code causing the format to shift from POSIXct
to numeric
. If I do:
TestDataForHelp$CopiedDateTime <- TestDataForHelp$ActualDateTime
The new variable, that is simply a copy of the original, retains the POSIXct
type.
The previous question linked in the comments relates to date values only, not date time values. The data manipulation becomes more complicated with dealing with date time values, given that mine also do not include seconds. The other difference is that the original variable contains a mix of date, date-time, and incorrect date-time values, whereas that previous question had values that were all the same. It was unclear whether the non-uniform content of the variable was causing the problem.
Edit: I fixed the problem by fixing the strings before I converted them to dates. This removed the need to try to loop through the dates.
I can replicate the numeric answer, but not explain it. It is however calculating the results correctly for you. I'm not sure why it's returning as a numeric. However, the conversion from numeric to date is easy enough if you know the origin, which should be 1970-01-01. So I believe the following does the trick:
(Note, the first block is just what you already have)