My data is currently in two separate columns

T1 example looks like 2020-05-01 09:00:00 PM

T2 example looks like 2020-05-02 04:00:00 AM

This data also contains NAs

The mode and class of both columns is character.

I would like to know the difference between these in hours.

I tried a loop like this:

df$TimeElapsedhours <- NA
for (i in 1:nrow(df)) {
  if (!is.na(df$T1[i])) {
    df$TimeElapsedDays[i] <- difftime(df$T2[i], df$T1[i], units = c("hours"))
  }
}

but the values don't seem right. It gives me absurd answers, much higher than the true differences, so something is going wrong.

I also tried something like this:

df$DIF <- as.numeric(difftime(strptime(df$T1 , "%I:%M %p" ),strptime(df$T2 , "%I:%M %p" ),units='hours'))

and it returns all NAs.

So, how can I find the time difference in hours between two time points when the data includes NAs?

We don't need a for loop here. We can use lubridate to parse the date-times and mutate from dplyr to compute the difference for every row at once.

library(dplyr)
library(lubridate)

df <- data.frame(T1 = parse_date_time(c("2020-05-01 09:00:00 PM",
                                        "2020-05-01 10:48:00 PM"), 
                                      '%Y-%m-%d %I:%M:%S %p'),
                 T2 = parse_date_time(c("2020-05-02 04:00:00 AM",
                                        "2020-05-02 05:45:00 AM"), 
                                      '%Y-%m-%d %I:%M:%S %p'))

# Inside mutate() the columns can be referenced directly, without df$
df <- df %>% mutate(TimeElapsedDays = difftime(T2, T1, units = "hours"))

This gives us

> df
                   T1                  T2 TimeElapsedDays
1 2020-05-01 21:00:00 2020-05-02 04:00:00      7.00 hours
2 2020-05-01 22:48:00 2020-05-02 05:45:00      6.95 hours
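
As for why the two attempts in the question misbehave, here is a minimal base R sketch of what I believe is going on. It assumes the strings look exactly like the examples in the question; the variable names (t1, t2, naive, no_date, fixed, fmt) are made up for illustration, and tz = "UTC" is only there to keep daylight saving out of the picture. Note that %p depends on the locale, so this assumes an English or C locale.

```r
# Example strings copied from the question.
t1 <- "2020-05-01 09:00:00 PM"
t2 <- "2020-05-02 04:00:00 AM"

# Attempt 1: difftime() on raw character strings. The default conversion
# reads the clock time as 24-hour and ignores the trailing "PM"/"AM",
# so 09:00:00 PM is treated as 09:00 in the morning.
naive <- difftime(t2, t1, tz = "UTC", units = "hours")

# Attempt 2: the format has no date part, so it cannot match a string
# that starts with "2020-05-01" and the result is NA.
no_date <- strptime(t1, "%I:%M %p", tz = "UTC")

# An explicit format covering the whole string, with %I (12-hour clock)
# and %p (AM/PM indicator), parses the times as intended.
fmt <- "%Y-%m-%d %I:%M:%S %p"
fixed <- difftime(as.POSIXct(t2, format = fmt, tz = "UTC"),
                  as.POSIXct(t1, format = fmt, tz = "UTC"),
                  units = "hours")

print(as.numeric(naive))  # too large, because PM was ignored
print(is.na(no_date))     # TRUE, the format did not match
print(as.numeric(fixed))  # the intended difference in hours
```

If this reading is right, the first attempt comes out 12 hours too large for this pair, which would match the "absurd answers" described in the question, and the second attempt is NA for every row because the format never matches.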

If either time is NA, the corresponding value in the TimeElapsedDays column is NA as well:

df1 <- data.frame(T1 = parse_date_time(c("2020-05-01 09:00:00 PM",NA), '%Y-%m-%d %I:%M:%S %p'),
                 T2 = parse_date_time(c("2020-05-02 04:00:00 AM","2020-05-02 04:00:00 AM"), '%Y-%m-%d %I:%M:%S %p'))

df2 <- df1 %>% mutate(TimeElapsedDays = difftime(T2, T1, units = "hours"))

> df2
                   T1                  T2 TimeElapsedDays
1 2020-05-01 21:00:00 2020-05-02 04:00:00         7 hours
2                <NA> 2020-05-02 04:00:00        NA hours

If those rows are not needed, we can drop them with na.omit():

df2 <- na.omit(df2)

> df2
                   T1                  T2 TimeElapsedDays
1 2020-05-01 21:00:00 2020-05-02 04:00:00         7 hours
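
The examples above build the data already parsed, whereas the question starts from character columns. As a rough sketch of the same idea applied directly to character data in base R (the data frame below is made up to mirror the question, and TimeElapsedHours and fmt are names I chose for illustration), something like this should work without a loop. It again assumes a locale in which %p recognises AM/PM.

```r
# Hypothetical data shaped like the question: character columns with an NA.
df <- data.frame(
  T1 = c("2020-05-01 09:00:00 PM", NA),
  T2 = c("2020-05-02 04:00:00 AM", "2020-05-02 05:45:00 AM"),
  stringsAsFactors = FALSE
)

# Parse both columns with an explicit 12-hour format; NA strings stay NA.
fmt <- "%Y-%m-%d %I:%M:%S %p"
t1 <- as.POSIXct(df$T1, format = fmt, tz = "UTC")
t2 <- as.POSIXct(df$T2, format = fmt, tz = "UTC")

# difftime() is vectorised, so no loop or NA check is required.
# as.numeric() drops the "hours" label and leaves a plain number.
df$TimeElapsedHours <- as.numeric(difftime(t2, t1, units = "hours"))

# Optionally keep only the rows where a difference could be computed.
complete <- df[!is.na(df$TimeElapsedHours), ]
print(complete)
```

Filtering on the computed column rather than calling na.omit() on the whole data frame keeps rows that have NAs in other, unrelated columns, which may or may not be what you want.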