R - subset dataframe by Time only

2.9k Views Asked by At

I have been looking around but I still couldn't find a way to subset my dataframe by time, here is the sample data:

 Duration            End Date          Start Date
      228 2013-01-03 09:10:00 2013-01-03 09:06:00
     1675 2013-01-04 17:34:00 2013-01-04 17:06:00
      393 2013-01-04 17:54:00 2013-01-04 17:48:00
      426 2013-01-04 11:10:00 2013-01-04 11:03:00
      827 2013-01-01 16:13:00 2013-01-01 15:59:00
      780 2013-01-01 16:13:00 2013-01-01 16:00:00

The End Date and Start Date are in POSIXct format, and here is what I have tried if I only what times between 8:00 to 9:30.

tm1 <- as.POSIXct("08:00", format = "%H:%M")
tm2 <- as.POSIXct("09:30", format = "%H:%M")
df.time <- with(df, df[format('Start Date', '%H:%M')>= tm1 & format('End Date', '%H:%M')< tm2, ])

but this returns an error. I have also tried this, but it didn't work as well.

df.time <- subset(df, format('Start Date', '%H:%M') >= '8:00' & format('End Date', '%H:%M') < '9:30'))

if anybody tell me what am I doing wrong? Thanks

1

There are 1 best solutions below

0
On BEST ANSWER

Assuming that the start and end dates are always the same and only the times differ and you want those rows for which the time starts at or after 8:00 and ends before 9:30, convert the date/time values to characters strings of the form HH:MM and compare:

subset(DF, format(`Start Date`, "%H:%M") >= "08:00" & 
           format(`End Date`, "%H:%M") < "09:30")

giving:

  Duration            End Date          Start Date
1      228 2013-01-03 09:10:00 2013-01-03 09:06:00

Note: We used the following for DF. (Next time please use dput to provide your data in reproducible form.)

DF <- structure(list(Duration = c(228L, 1675L, 393L, 426L, 827L, 780L
), `End Date` = structure(c(1357222200, 1357338840, 1357340040, 
1357315800, 1357074780, 1357074780), class = c("POSIXct", "POSIXt"
), tzone = ""), `Start Date` = structure(c(1357221960, 1357337160, 
1357339680, 1357315380, 1357073940, 1357074000), class = c("POSIXct", 
"POSIXt"), tzone = "")), .Names = c("Duration", "End Date", "Start Date"
), row.names = c(NA, -6L), class = "data.frame")