I have one csv that contains 2 columns, one is weekly dates and the other one is daily dates for each day of every week.
In another csv I have weekly data and I want to make a copy of the csv, with daily data and duplicating the same value for each day of that week, getting something like the below:
OBS DailyDate column1 column2
1 2008-01-07 2008-01-07 5 5
2 2008-01-07 2008-01-08 5 5
3 2008-01-07 2008-01-09 5 5
4 2008-01-07 2008-01-10 5 5
5 2008-01-07 2008-01-11 5 5
6 2008-01-07 2008-01-12 5 5
7 2008-01-07 2008-01-13 5 5
8 2008-01-14 2008-01-14 20 20
9 2008-01-14 2008-01-15 20 20
10 2008-01-14 2008-01-16 20 20
11 2008-01-14 2008-01-17 20 20
12 2008-01-14 2008-01-18 20 20
13 2008-01-14 2008-01-19 20 20
14 2008-01-14 2008-01-20 20 20
The problem I am getting is that when trying to do so with a left_join, I get a dataframe with all NA's, instead of each value duplicated for everyday of the week.
I believe the problem is something about the date formats (for other csv's it is all working as expected).
To solve that, I've tried converting to date formats with different functions but even if the date columns seem to be in the same date formats after reading the csv, the join fails.
str(dates_master)
'data.frame': 4053 obs. of 2 variables:
$ OBS : Date, format: "2008-01-07" "2008-01-07" "2008-01-07" "2008-01-07" ...
$ DailyDate: Date, format: "2008-01-07" "2008-01-08" "2008-01-09" "2008-01-10" ...
str(data_test)
'data.frame': 164 obs. of 3 variables:
$ OBS : Date, format: "2015-12-21" "2015-12-28" "2016-01-04" "2016-01-11" ...
$ column1: num 323 974 528 375 351 ...
$ column2: num 240 660 412 316 300 ...
Therefore, I thought that if I saved the dates in the csv's with a non-date format (e.g. 42464) and then I use the janitor package to convert to date it would solve the issue, but it doesn't.
The code I am using is this one:
data_test <- read.csv("C:/Users/me/Documents/monthly_and_daily_data/test.csv")
data_test$OBS <- janitor::excel_numeric_to_date(data_test$OBS)
dates_master <- read.csv("C:/Users/me/Documents/monthly_and_daily_data/dates_master.csv")
dates_master$OBS <- janitor::excel_numeric_to_date(dates_master$OBS)
dates_master$DailyDate <- janitor::excel_numeric_to_date(dates_master$DailyDate)
data_test_daily <- dates_master %>%
left_join(data_test, by = "OBS", relationship = "many-to-many") %>%
# left_join(data_test, by = "OBS", suffix = c(".daily", ".monthly")) %>%
# left_join(data_test, by = c("OBS" = "OBS"), suffix = c(".daily", ".monthly")) %>%
select(-OBS) %>%
rename(OBS = DailyDate)
What am I missing?
EDIT As suggested in the comments, please find below and example of both of my datasets:
> dput(head(dates_master))
structure(list(OBS = c(39454L, 39454L, 39454L, 39454L, 39454L,
39454L), DailyDate = 39454:39459), row.names = c(NA, 6L), class =
"data.frame")
> dput(head(data_test))
structure(list(OBS = c(39454L, 39454L, 39454L, 39454L, 39454L,
39454L), column1 = c(323.1428571, 974.3571429, 527.9285714, 375.0714286,
351.3571429, 359.2142857), column2 = c(239.9880952, 660.202381,
412.047619, 315.744186, 300.2325581, 280.3837209)), row.names = c(NA,
6L), class = "data.frame")