How to find nearest week in R


For each date in df$Upload_Date, I want to find the nearest earlier week by looking it up in a vector of week start dates called 'weeks'.

head(df$Upload_Date)
#[1] "2014-09-25" "2014-09-25" "2014-09-25" "2014-11-06" "2014-09-25" "2014-09-25"

I also have a list of weeks that I will be using in a report.

> weeks
 [1] "2014-08-01" "2014-08-08" "2014-08-15" "2014-08-22" "2014-08-29" "2014-09-05" "2014-09-12" "2014-09-19" "2014-09-26"
[10] "2014-10-03" "2014-10-10" "2014-10-17" "2014-10-24" "2014-10-31" "2014-11-07" "2014-11-14"

The first value in df$Upload_Date is "2014-09-25". The nearest earlier week in "weeks" is "2014-09-19", so I want to create a new column, df$report_week, that holds "2014-09-19" for the row with "2014-09-25".

I've tried setting the following variables (not sure if this will be helpful or not):

upload_day <- as.POSIXlt(df$Upload_Date[1])$yday
> upload_day
[1] 267

report_days <- as.POSIXlt(weeks)$yday
> report_days
 [1] 212 219 226 233 240 247 254 261 268 275 282 289 296 303 310 317

Any ideas here?


There are 3 best solutions below


You can use cut() with your "weeks" vector as the breaks:

date <- as.Date(c("2014-09-25", "2014-10-06"))
weeks <- as.Date("2014-08-01") + 7 * 0:10

cut(date, breaks = weeks)
# [1] 2014-09-19 2014-10-03

Also note the built-in breaks = "week" option, where weeks start on Monday (the default, start.on.monday = TRUE), in contrast to your weeks, which start on Friday:

cut(date, breaks = "week")
# [1] 2014-09-22 2014-10-06
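
To tie this back to the question, here is a minimal sketch of filling df$report_week with cut(). The data frame below is a made-up stand-in for the asker's df, and the 'weeks' vector is rebuilt with seq() rather than taken from the report. Since cut() returns a factor, the labels are converted back to Date at the end.

```r
# Hypothetical stand-in for the asker's data frame
df <- data.frame(Upload_Date = c("2014-09-25", "2014-11-06", "2014-09-25"),
                 stringsAsFactors = FALSE)

# Friday-based week starts, 2014-08-01 through 2014-11-14
weeks <- seq(as.Date("2014-08-01"), as.Date("2014-11-14"), by = "week")

# cut() labels each interval with its lower break and returns a factor
wk <- cut(as.Date(df$Upload_Date), breaks = weeks)

# Convert the factor labels back to Date for the new column
df$report_week <- as.Date(as.character(wk))
df
```

One thing to watch: intervals are closed on the left and open on the right, so any date on or after the last break ("2014-11-14" here) comes back as NA unless the breaks vector is extended by one more week.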

You can use findInterval(). It looks like you want to assign each upload date to the week starting on the preceding Friday. Both vectors need to be Date (or numeric) rather than character, and weeks must be sorted in increasing order.

data.frame(date=Upload_Date,week.of=weeks[findInterval(Upload_Date, weeks)])
        date    week.of
1 2014-09-25 2014-09-19
2 2014-11-06 2014-10-31
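
A self-contained version of the same idea might look like the sketch below. The third upload date is an invented one that falls before the first week, added to show an edge case: findInterval() returns 0 for it, and indexing with 0 silently drops the element, so the sketch maps 0 to NA to keep the lengths aligned.

```r
# Sample dates; the last one is made up and precedes every week start
upload <- as.Date(c("2014-09-25", "2014-11-06", "2014-07-15"))

# Friday-based week starts, sorted in increasing order
weeks <- seq(as.Date("2014-08-01"), as.Date("2014-11-14"), by = "week")

# Index of the last week start that is <= each upload date (0 if none)
idx <- findInterval(upload, weeks)

# Turn 0 into NA so that weeks[idx] keeps one element per upload date
idx[idx == 0] <- NA
report_week <- weeks[idx]

data.frame(date = upload, week.of = report_week)
```

Unlike cut(), findInterval() assigns dates after the last week start to that last week instead of returning NA, which may or may not be what the report needs.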


You could use floor_date() from lubridate, e.g. floor_date(x, unit = "week"). The other answers using cut and findInterval are very nice for the general case where you have a handy vector of weeks (or of any other time interval). floor_date() works for most standard units; the choices include c("second", "minute", "hour", "day", "week", "month", "year").
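
As a rough sketch of how this could be adapted to the Friday-based weeks in the question: lubridate weeks start on Sunday by default, so the example below passes week_start = 5 (Friday). This assumes a reasonably recent lubridate version in which floor_date() accepts the week_start argument; the sample dates are taken from the question.

```r
library(lubridate)

x <- as.Date(c("2014-09-25", "2014-11-06"))

# Default behaviour: floor to the preceding Sunday (week_start = 7)
sunday_week <- floor_date(x, unit = "week")

# Floor to the preceding Friday to match the 'weeks' vector in the question
report_week <- floor_date(x, unit = "week", week_start = 5)

data.frame(date = x, sunday_week = sunday_week, report_week = report_week)
```

This avoids building the 'weeks' vector at all, at the cost of assuming the report weeks are regular seven-day periods that always begin on a Friday.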