I have a range of dates:
date_rng <- seq( as.Date("2008-01-01"), as.Date("2008-12-31"), by="+1 day")
I have some helper functions that aren't necessarily relevant to the question, so I'll try to leave them out.
I start with the first date and make a call to this function:
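library(rvest)      # read_html(), html_nodes(), html_table(), %>%
library(lubridate)  # month(), day() (assumed to be where these helpers come from)

# Get the weather table for an airport code and date, returned as a data frame
get_table <- function(code, date){
  # Build the history URL for the given airport and day (the year is fixed at 2008)
  adv <- sprintf(
    "https://www.wunderground.com/history/airport/K%s/2008/%s/%s/DailyHistory.html",
    code, month(date), day(date)
  )
  h <- adv %>% read_html()
  # Pull out the observations table by its id
  t <- h %>%
    html_nodes(xpath = '//*[@id="obsTable"]') %>%
    html_table()
  df <- data.frame(t)
  return(df)
}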
atl_weather <- get_table("ATL", date_rng[1])
Now I iterate over the rest of the dates, creating a new df for each one, which I then try to append to the original:
# Loop through remaining dates and bind dfs
for(d in as.list(date_rng[2:4])){
rbind(atl_weather, get_table("ATL", d), d)
}
But the binding doesn't happen and I'm left with the original dataframe for the first date in the range, created before the for loop.
This works though:
atl_weather <- get_table("ATL", date_rng[1])
new_df <- get_table("ATL", date_rng[2])
new_df <- scraped_data_formatter(new_df, date_rng[2])
rbind(atl_weather, new_df)
How can I get rbind() to work in the for loop (so that I iteratively build up the dataframe to include all the data from the full date range)?
It does work. The problem is that you are throwing away the result, because you don't assign the output of rbind() to anything. Change the call inside the loop from
rbind(atl_weather, get_table("ATL", d), d)
to this
atl_weather <- rbind(atl_weather, get_table("ATL", d), d)
assuming atl_weather is the data frame you want to incrementally add to.
That said, you don't want to do this in R; each time you add a row or column to an object, R needs to do a lot of copying of data. There's a lot of overhead in incrementally growing objects this way, and doing so is a sure-fire way to bog your code down.
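To see the effect of the assignment without hitting the network, here is a minimal sketch. fake_get_table() is a hypothetical stand-in for get_table() that just returns a one-row data frame, and the extra d argument from the question's rbind() call is left out for simplicity.

```r
# Hypothetical stand-in for get_table(): returns a small data frame, no network access.
fake_get_table <- function(code, date) {
  data.frame(code = code, date = format(date), stringsAsFactors = FALSE)
}

date_rng <- seq(as.Date("2008-01-01"), as.Date("2008-01-04"), by = "day")

atl_weather <- fake_get_table("ATL", date_rng[1])
for (i in 2:4) {
  # rbind() returns a new data frame and never modifies its arguments,
  # so the result has to be assigned back to atl_weather.
  atl_weather <- rbind(atl_weather, fake_get_table("ATL", date_rng[i]))
}

nrow(atl_weather)  # 4: one row per date
```

Without the `atl_weather <-` on the left-hand side, the loop would run and atl_weather would still have a single row afterwards.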
Ideally, you'd allocate enough space first (i.e. enough rows) so that you can index the ith row when you assign: new_atl_weather[i, ] <- c(....)
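A minimal sketch of that preallocation idea follows. It assumes each date contributes exactly one row; the column names and the temperature values are made-up placeholders, not real weather data, and the real get_table() output would need its own column layout.

```r
date_rng <- seq(as.Date("2008-01-01"), as.Date("2008-01-04"), by = "day")
n <- length(date_rng)

# Allocate all n rows up front, with placeholder column types.
new_atl_weather <- data.frame(
  date        = character(n),
  mean_temp_f = numeric(n),
  stringsAsFactors = FALSE
)

for (i in seq_len(n)) {
  # Fill the i-th row in place; a list keeps each column's type intact.
  # 40 + i is a placeholder value standing in for scraped data.
  new_atl_weather[i, ] <- list(format(date_rng[i]), 40 + i)
}
```

If each call returns several rows, as the scraped table probably does, a common variant is to store each piece in a preallocated list (vector("list", n)) and combine them once at the end with do.call(rbind, pieces).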