foreach / forvalues loop in R

528 Views Asked by At

I am pretty ok with loops in Stata and I am trying to program something similar in R, but I keep getting error messages- any idea why this loop doesnt work?

I have an Excel sheet with data with separate tabs for each month (as in monthlist). I want to import each separate tab and add year and month column to it.

monthlist = list("Jan", "Feb", "Mar", "Apr", "May", "June", "Jul", "Aug", 
"Sep", "Oct", "Nov", "Dec")

for (k in seq_along(monthlist)){
   infile <- paste(i, " GP",".xlsx",sep=""); name<- 
   paste("X",i,"_GP",sep="")
   name.[k]<- read_excel(infile, sheet = [k])
   month=[k]
   name.[k] = cbind(year, month, name.[k])
  }

I have tried and tried researching the correct reference to k values, but I cant figure it out. Please help.

2

There are 2 best solutions below

1
On

If I understand correctly following the comments, I think you want to take data from different sheets across several workbooks, and assign each sheet's data to its own separate dataframe, so that you have many dataframes; and you don't want to do this combining inside the loop because each could be structured differently. If that's the case then you could modify your code to this:

monthlist = list("Jan", "Feb", "Mar", "Apr", "May", "June", "Jul", "Aug", 
"Sep", "Oct", "Nov", "Dec")

for (k in seq_along(monthlist)){
   infile = paste(i, " GP",".xlsx",sep="")
   name = paste("X",i,"_GP",sep="")
   d_temp = read_excel(infile, sheet = k)
   assign(paste0(year, month), d_temp)
  }

The key points there are to drop the square brackets around k when reading in the file, and secondly to use assign() to dynamically create a variable name. d_temp temporarily stores the data frame before assign creates a new distinct variable from it.

I note references to year and i in your code which I assume you have set beforehand, so I'm not addressing those.

0
On

Without your data, it is a little hard to tell you if this is going to work or not, but here are a few options. I would personally create a nested dataframe of dataframes if I wasn't sure that all dataframes had the same number of rows.

library(tidyverse)
library(readxl)

df <- data_frame(year = rep(c(2014, 2015, 2016), each = 12), 
                 month = rep(c("Jan", "Feb", "Mar", "Apr", "May", "June", "Jul", "Aug", 
                              "Sep", "Oct", "Nov", "Dec"), 3)) %>%
  mutate(infile = map_chr(year, ~paste(.x, " GP",".xlsx",sep="")),
         data = map2(infile, month, ~read_excel(.x, sheet = .y)),
         data = map2(data, month, ~mutate(.x, month = .y)),
         data = map2(data, year, ~mutate(.y, year = .y)))

However, if you want to use a loop, you could also do that and I would recommend a list of dataframes in that case.

year <- c(2014:2016)
month = c("Jan", "Feb", "Mar", "Apr", "May", "June", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec") 
df.list <- list()
for(i in seq_along(year)){
  for (k in seq_along(month)){ 
  infile <- paste(year[[i]], " GP",".xlsx",sep="") 
  name<- paste("X",year[[i]],"_GP",sep="") 
  df.list[[i]] <- read_excel(infile, sheet = month[[k]]) 
  }
}