Round date to next weekday in R

2.1k Views Asked by At

I'm currently struggling with some date transformations in R. I have a large dataset of financial data with a date column. Since securities are not traded on the weekend, I need to have only weekdays in my dataset. How can I round the dates in this column to the previous weekday? So each Saturday and Sunday should be transformed to the previous Friday. in the extract below, the the first date is a Saturday and the second a Sunday. Now I would like to transform these into 2007-03-02 and leave the other rows as they are.

# A tibble: 6 x 5
Ticker Date       mean_PX_ASK mean_PX_BID Agency 
<chr>    <date>           <dbl>       <dbl> <chr>  
1 ABNANV 2007-03-03       102.        102.  Moody's
2 ABNANV 2007-03-04       102.        102.  Moody's
3 ABNANV 2007-03-12       102.        102.  Moody's
4 ABNANV 2007-03-12       102.        102.  Moody's
5 ABNANV 2008-09-17        88.9        88.4 Fitch  
6 ABNANV 2008-09-17        88.9        88.4 Fitch  

Glad for any kind of help!

4

There are 4 best solutions below

0
On BEST ANSWER

A simple solution could be using case_when from dplyr to check if weekday for that day is "Saturday" or "Sunday" and subtract the days accordingly.

library(dplyr)

df %>%
  mutate(Day = weekdays(Date), 
         Date = case_when(Day == "Saturday" ~ Date - 1, 
                          Day == "Sunday" ~ Date - 2, 
                          TRUE ~ Date)) %>%
   select(-Day)


#  Ticker       Date mean_PX_ASK mean_PX_BID  Agency
#1 ABNANV 2007-03-02       102.0       102.0 Moody's
#2 ABNANV 2007-03-02       102.0       102.0 Moody's
#3 ABNANV 2007-03-12       102.0       102.0 Moody's
#4 ABNANV 2007-03-12       102.0       102.0 Moody's
#5 ABNANV 2008-09-17        88.9        88.4   Fitch
#6 ABNANV 2008-09-17        88.9        88.4   Fitch

With bizdays we need to create a calendar using create.calendar and default weekdays. We can then use adjust.previous to get the previous working day.

library(bizdays)
cal <- create.calendar("Actual", weekdays=c("saturday", "sunday"))
adjust.previous(df$Date, cal)

#[1] "2007-03-02" "2007-03-02" "2007-03-12" "2007-03-12" "2008-09-17" "2008-09-17"
0
On

It can be done in a single line without any packages or ifelse if we use a named vector

df$Date <- with(df,  Date - setNames(rep(0:2, c(5, 1, 1)), 1:7)[format(Date, "%u")])
df
#  Ticker       Date mean_PX_ASK mean_PX_BID  Agency
#1 ABNANV 2007-03-02       102.0       102.0 Moody's
#2 ABNANV 2007-03-02       102.0       102.0 Moody's
#3 ABNANV 2007-03-12       102.0       102.0 Moody's
#4 ABNANV 2007-03-12       102.0       102.0 Moody's
#5 ABNANV 2008-09-17        88.9        88.4   Fitch
#6 ABNANV 2008-09-17        88.9        88.4   Fitch

Benchmarks

Using a bigger dataset

df1 <- df[rep(seq_len(nrow(df)), 1e7), ]

system.time({
df1 %>%
  mutate(Day = weekdays(Date), 
         Date = case_when(Day == "Saturday" ~ Date - 1, 
                          Day == "Sunday" ~ Date - 2, 
                          TRUE ~ Date)) %>%
   select(-Day)

})
# user  system elapsed 
# 41.468   6.881  49.588 
system.time({

with(df1,  Date - setNames(rep(0:2, c(5, 1, 1)), 1:7)[format(Date, "%u")])

})
# user  system elapsed 
# 27.456   2.785  30.490 

with microbenchmark,

library(microbenchmark)
microbenchmark(
   rs = df1 %>%
         mutate(Day = weekdays(Date), 
         Date = case_when(Day == "Saturday" ~ Date - 1, 
                          Day == "Sunday" ~ Date - 2, 
                          TRUE ~ Date)) %>%
   select(-Day),
ak = with(df1,  Date - setNames(rep(0:2, c(5, 1, 1)), 1:7)[format(Date, "%u")]), 
          times = 10L, unit = "relative")
#Unit: relative
# expr      min       lq     mean   median       uq      max neval cld
#   rs 1.401658 1.437164 1.446403 1.421731 1.512451 1.467511    10   b
#   ak 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000    10  a 

data

df <- structure(list(Ticker = c("ABNANV", "ABNANV", "ABNANV", "ABNANV", 
"ABNANV", "ABNANV"), Date = structure(c(13575, 13576, 13584, 
13584, 14139, 14139), class = "Date"), mean_PX_ASK = c(102, 102, 
102, 102, 88.9, 88.9), mean_PX_BID = c(102, 102, 102, 102, 88.4, 
88.4), Agency = c("Moody's", "Moody's", "Moody's", "Moody's", 
"Fitch", "Fitch")), row.names = c("1", "2", "3", "4", "5", "6"
), class = "data.frame")
0
On

In base R you can use format.Date with format string %u.

dates <- as.Date(c('2007-03-02', '2007-03-03', '2007-03-04'))
wd <- as.integer(format(dates, '%u'))
as.Date(ifelse(wd >= 6, dates + 5 - wd, dates), origin = '1970-01-01')
#[1] "2007-03-02" "2007-03-02" "2007-03-02"
0
On

Using wday from lubridate:

library(lubridate)
# Generate some data
dfdate <- seq.Date(from = as.Date("2019-04-26"), to = as.Date("2019-04-28"), by = "day")

dfdate
[1] "2019-04-26" "2019-04-27" "2019-04-28"

wday starts on a Sunday where wday = 1

# Change all values to a Friday
dfdate[wday(dfdate) == 7] <-  dfdate[wday(dfdate) == 7] - 1 # Saturdays to Fri
dfdate[wday(dfdate) == 1] <-  dfdate[wday(dfdate) == 1] - 2 # Sundays to Fri

dfdate
[1] "2019-04-26" "2019-04-26" "2019-04-26"