Disclaimer: I am going to come out of this looking silly.
I have a data frame containing a date column of class POSIXct. I am trying to remove the rows for some specific dates (public holidays). I tried to do that using this:
modelset.nonholiday <- modelset[!modelset$date == as.POSIXct("2013-12-31") |
                                !modelset$date == as.POSIXct("2013-07-04") |
                                !modelset$date == as.POSIXct("2014-07-04") |
                                !modelset$date == as.POSIXct("2013-11-28") |
                                !modelset$date == as.POSIXct("2013-11-29") |
                                !modelset$date == as.POSIXct("2013-12-24") |
                                !modelset$date == as.POSIXct("2013-12-25") |
                                !modelset$date == as.POSIXct("2014-02-14") |
                                !modelset$date == as.POSIXct("2014-04-20") |
                                !modelset$date == as.POSIXct("2014-05-26"), ]
The above didn't work. It returns the data frame removing only the first. So I tried:
modelset[!modelset$date %in% c("2013-12-31", "2013-07-04", "2014-07-04",
"2013-11-28", "2013-11-29", "2013-12-24", "2013-12-25", "2014-02-14",
"2014-04-20", "2014-05-26"), ]
This didn't work either. I also tried:
`%notin%` <- function(x,y) !(x %in% y)
modelset[modelset$date %notin% as.POSIXct(c("2013-12-31", "2013-07-04", "2014-07-04",
"2013-11-28", "2013-11-29", "2013-12-24", "2013-12-25", "2014-02-14",
"2014-04-20", "2014-05-26")), ]
I've referred to "Remove Rows From Data Frame where a Row match a String", "R remove rows containing a certain value", and "Standard way to remove multiple elements from a dataframe", but I can't seem to find what I am doing wrong.
> head(modelset)
date spot.volume.loc spot.volume.nat nat.imp.a loc.imp.a nat.imp.m loc.imp.m branded.leads esi.leads
1 2013-07-01 2988 215 13931 4155.3 5770 1853.7 331 363
2 2013-07-02 3200 218 12589 4651.3 5374 2207.8 293 428
3 2013-07-03 3066 203 10305 3921.0 4754 1759.2 273 325
4 2013-07-04 3153 83 2353 4135.6 999 1912.2 172 184
5 2013-07-05 2959 59 1553 3573.4 815 1662.3 193 246
6 2013-07-06 667 53 2219 456.7 889 214.8 161 203
tv.leads callin.leads total.leads total.imp.a total.imp.m day week quarter on.off
1 195 41 930 18086.3 7623.7 Monday 26 Q3 1.25
2 192 50 963 17240.3 7581.8 Tuesday 26 Q3 1.00
3 149 38 785 14226.0 6513.2 Wednesday 26 Q3 1.00
4 34 0 390 6488.6 2911.2 Thursday 26 Q3 1.00
5 50 18 507 5126.4 2477.3 Friday 26 Q3 0.75
6 14 9 387 2675.7 1103.8 Saturday 26 Q3 0.50
For an answer using dplyr and your %notin% approach, you also have:
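Something along these lines might do it. This is only a sketch: the three-row modelset below is a made-up stand-in for your data, and the time zone of your date column isn't shown in the question, so I am assuming the values sit at midnight in whatever zone the column carries. To sidestep any time zone mismatch between the column and as.POSIXct(), the filter compares calendar dates as text via format() rather than comparing POSIXct values directly.

```r
library(dplyr)

# Toy stand-in for the question's data frame (illustrative values only;
# the UTC time zone here is an assumption, not taken from the question).
modelset <- data.frame(
  date = as.POSIXct(c("2013-07-03", "2013-07-04", "2013-07-05"), tz = "UTC"),
  total.leads = c(785, 390, 507)
)

# The helper from the question: TRUE where x is not found in y.
`%notin%` <- function(x, y) !(x %in% y)

# Holidays kept as plain "YYYY-MM-DD" strings.
holidays <- c("2013-12-31", "2013-07-04", "2014-07-04", "2013-11-28",
              "2013-11-29", "2013-12-24", "2013-12-25", "2014-02-14",
              "2014-04-20", "2014-05-26")

# Keep only the rows whose calendar date is not a holiday.
# format() renders each POSIXct value in the column's own time zone.
modelset.nonholiday <- modelset %>%
  filter(format(date, "%Y-%m-%d") %notin% holidays)
```

With the toy data, the 2013-07-04 row is dropped and the other two are kept. As a side note on your first attempt: chaining the negated comparisons with | keeps a row whenever it differs from at least one holiday, which is true of every row; combining them with & (or negating a single %in%) is what expresses "not any of these dates".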