I have a data frame that looks like this:
w<-read.table(header=TRUE, text="
start.date end.date manager
2006-05-01 2007-04-30 a
2006-09-30 2007-12-31 b
1999-09-30 2007-12-31 c
2008-01-01 2012-04-30 d
2008-01-01 2020-02-28 e
2009-05-01 2016-04-08 f")
I'd like to obtain a data frame that lists which managers were working during each month of the period, for example:
df<-read.table(header=TRUE, text="
month manager1 manager2 manager3 manager4
01-2006 a b c NA
02-2006 a b c d
03-2006 b c d NA
04-2006 b d NA NA")
I started by defining a function datseq that returns the months between the start.date and the end.date
# Return every month from t1 to t2, formatted as "mm/YYYY"
datseq <- function(t1, t2) {
  format(seq.Date(from = as.Date(t1, "%Y-%m-%d"),
                  to = as.Date(t2, "%Y-%m-%d"), by = "month"),
         "%m/%Y")
}
but I cannot work out how to write a loop that produces the desired result. Thank you in advance to everyone replying!
Since you only need the overlap at the month level, not the day level, you can treat each manager as having started on the first day of their start month and left on the last day of their end month. This can be achieved using floor_date and ceiling_date from the lubridate package. You can then use a %within% b, also from lubridate, which checks whether a date falls within a list of intervals. Apply this function to each of your months with the intervals you provided.
Raw data: