Solving "Error in file(file, "rt") : cannot open the connection" in my R code


I have many CSV files in 10 folders, and I need to merge all the CSV files in each folder into one data frame, so that I end up with 10 data frames, each the result of merging the CSV files in its folder. To reduce manual work, I want to avoid running the following code (which works perfectly) for each folder separately:

setwd("C:/Users/test/2011")
path <- file.path(getwd())

v.filename <- list.files(path, pattern="\\.(csv|txt)$", 
                         ignore.case = TRUE, 
                         full.names = FALSE)

df.all = do.call(rbind, lapply(v.filename, 
                               function(x) read.csv(x)))

I would prefer to do this in a for loop. I wrote the following code, in which each iteration reads the contents of one folder, creates and saves a data frame, and then moves on to the next folder. When I run it, however, I get "Error in file(file, "rt") : cannot open the connection". Note: everything runs fine up to the line df.all = do.call(rbind, lapply(v.filename, function(x)read.csv(x))), so the problem is most probably in that line. I would appreciate any comments on how to solve it.

setwd("C:/Users/test")

path <- file.path(getwd())

years <- as.vector(dir(path))

for (i in 1:length(years)) {
  path <- file.path(paste0("C:/Users/test/", years[i]))
  
  v.filename <- list.files(path, pattern="\\.(csv|txt)$", 
                           ignore.case = TRUE,
                           full.names = FALSE)
  
  df.all = do.call(rbind, lapply(v.filename, function(x)read.csv(x)))
  
  save.image(file=paste0("df.all_", years[i]))
}

There is 1 best solution below


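The error occurs because, with full.names = FALSE, list.files returns bare file names, and read.csv then looks for them in the working directory rather than in the year folder. You could set recursive = TRUE and full.names = TRUE so that list.files returns the full path of every CSV file in all subfolders of path (this only works if those folders contain only the CSV files you want to read in), and then use map_df to read them all into a single combined data frame, e.g.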

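library(purrr)

# path is the top-level folder that contains the year folders
v.filename <- list.files(path, pattern = "\\.(csv|txt)$",
                         ignore.case = TRUE,
                         recursive = TRUE,   # descend into every subfolder
                         full.names = TRUE)  # return paths that read.csv can open

# Read every file and row-bind the results (map_df needs dplyr installed)
df.all <- map_df(v.filename, read.csv)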
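
If you want one data frame per folder, as in the question, rather than a single combined one, something along these lines might work. This is only a sketch in base R: merge_by_folder is a hypothetical helper name, the demo builds its own small folders under tempdir() instead of using C:/Users/test, and saveRDS is swapped in for save.image because save.image writes the whole workspace rather than just df.all.

```r
# Hypothetical helper: merge all CSV/TXT files found in each immediate
# subfolder of `root` and return a named list with one data frame per folder.
merge_by_folder <- function(root) {
  folders <- list.dirs(root, full.names = TRUE, recursive = FALSE)
  result <- lapply(folders, function(folder) {
    # full.names = TRUE is the key change: read.csv receives a complete path,
    # so it no longer depends on the current working directory.
    files <- list.files(folder, pattern = "\\.(csv|txt)$",
                        ignore.case = TRUE, full.names = TRUE)
    do.call(rbind, lapply(files, read.csv))
  })
  names(result) <- basename(folders)
  result
}

# Self-contained demo: two made-up year folders with two small CSV files each.
root <- file.path(tempdir(), "merge_demo")
for (year in c("2011", "2012")) {
  dir.create(file.path(root, year), recursive = TRUE, showWarnings = FALSE)
  write.csv(data.frame(id = 1:2, year = year),
            file.path(root, year, "a.csv"), row.names = FALSE)
  write.csv(data.frame(id = 3:4, year = year),
            file.path(root, year, "b.csv"), row.names = FALSE)
}

merged <- merge_by_folder(root)

# Save each merged data frame to its own file (assumed naming scheme).
for (year in names(merged)) {
  saveRDS(merged[[year]], file.path(root, paste0("df.all_", year, ".rds")))
}
```

With your own data you would call merge_by_folder("C:/Users/test") and then pick out a year with, e.g., merged[["2011"]]. Like the original do.call(rbind, ...) approach, this assumes all files within a folder have the same columns.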