Changing the variable name for multiple csv files in a loop in R


I am new to R, so I am confused by its data structures. I have 10 csv files covering 2011-2021, each named kbr_<year>.csv (for example, kbr_2021.csv). Each file holds a data frame, and I only need one column from it. I want to use a loop to read all of the files and append that column's contents as character data. However, I ran into a problem: the loop does not seem to run over all of the files.

Here is how I'd read the files one by one:

text2021 <- read.csv2("kbr_2021.csv", header=TRUE, sep=";")
text2021_2 <- text2021[!duplicated(text2021$Full.context), ]
text2021_chr <- as.character(text2021_2["Full.context"])

And here is how I run the loop, using the paste function so that the file name changes on each iteration:

for (i in 2011:2021)
{
  k <- c()
  lk <- paste("kbr_", i, sep="")
  lk2 <- paste(lk, ".csv", sep="")
  t <- paste("text", i, sep="")
  t <- read.csv2(lk2, header=TRUE, sep=";")
  t <- t[!duplicated(t$Full.context), ]
  k <- as.character(t["Full.context"])
}

However, I only get 1 file read, and that is the file from 2021.

How can I make sure that the loop runs for every year from 2011 to 2021?

Thanks

I tried to use the paste() function inside a loop (the code above), but it did not work.


There are 2 best solutions below

stefan_aus_hannover (best answer)

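Your loop resets k to an empty vector at the start of every iteration and then overwrites it, so only the result for the last file (2021) is left when the loop ends. Create the result vector once, before the loop, and append to it inside the loop, like this: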

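# Create the result vector once, outside the loop, so it is not reset on every iteration
kList <- c()
# Find the yearly files in the working directory (kbr_2011.csv, ..., kbr_2021.csv)
fileList <- list.files(pattern = "^kbr_[0-9]{4}\\.csv$")
for (file in fileList)
{
  dfT <- read.csv2(file)  # read.csv2 already defaults to header=TRUE, sep=";"
  dfT <- dfT[!duplicated(dfT$Full.context), ]
  # Append this file's values to everything collected so far
  kList <- c(kList, as.character(dfT$Full.context))
}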
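
A related sketch, not part of the original answer: if the goal is to keep each year's text separate (the question tries to build names such as text2011 with paste()), one common pattern is to store the results in a named list rather than in separately named variables. The demo files, the temporary folder, and the names demo_dir, texts, and all_text are assumptions made up for illustration; only the kbr_<year>.csv naming and the Full.context column come from the question.

```r
# Hypothetical demo data: three small files in a temporary folder,
# standing in for the real kbr_<year>.csv files.
demo_dir <- tempfile("kbr_demo_")
dir.create(demo_dir)
years <- 2011:2013
for (yr in years) {
  demo <- data.frame(Full.context = c("a", "a", paste("text", yr)))
  write.csv2(demo, file.path(demo_dir, sprintf("kbr_%d.csv", yr)),
             row.names = FALSE)
}

# Create the container once, before the loop, so nothing is overwritten.
texts <- list()
for (yr in years) {
  path <- file.path(demo_dir, sprintf("kbr_%d.csv", yr))
  df <- read.csv2(path)  # header = TRUE and sep = ";" are the defaults
  # Keep the first occurrence of each value; drop = FALSE keeps a data frame
  # even when the file has only one column.
  df <- df[!duplicated(df$Full.context), , drop = FALSE]
  # Store under a name built from the year instead of creating
  # separate variables text2011, text2012, ...
  texts[[paste0("text", yr)]] <- as.character(df$Full.context)
}

names(texts)    # "text2011" "text2012" "text2013"
texts$text2012  # "a" "text 2012"

# Flatten to a single character vector if one combined vector is wanted.
all_text <- unlist(texts, use.names = FALSE)
```

With the real data, years would be 2011:2021 and demo_dir would be the folder that holds the csv files. Individual years then stay accessible as texts$text2015 and so on, without generating variable names on the fly.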

r2evans

My guess.

First, some reproducible sample data. (No seed required, just generating stuff.)

# make fake data
for (yr in 2011:2021) {
  dat <- data.frame(Full.context = yr + sample(10), ignored = runif(10))
  write.csv2(dat, sprintf("kbr_%d.csv", yr))
}

Now that we have a bunch of .csv files, we can read them in like this:

list.files(pattern="kbr.*\\.csv$", full.names=TRUE) |>
  lapply(function(fn) {
    yr <- sub(".*_([0-9]+)\\.csv", "year_\\1", fn)
    x <- read.csv2(fn)[, "Full.context", drop=FALSE]
    names(x) <- yr
    x
  }) |>
  do.call(cbind, args = _)
#    year_2011 year_2012 year_2013 year_2014 year_2015 year_2016 year_2017 year_2018 year_2019 year_2020 year_2021
# 1       2021      2017      2020      2018      2022      2018      2020      2020      2021      2027      2024
# 2       2013      2014      2018      2015      2020      2025      2023      2027      2025      2029      2025
# 3       2020      2013      2016      2019      2021      2022      2021      2024      2029      2025      2022
# 4       2015      2015      2015      2016      2017      2026      2026      2026      2020      2028      2031
# 5       2014      2021      2023      2022      2024      2021      2024      2025      2023      2024      2026
# 6       2018      2022      2017      2017      2019      2024      2022      2022      2028      2021      2030
# 7       2017      2018      2022      2024      2025      2017      2025      2028      2026      2030      2028
# 8       2016      2016      2021      2021      2023      2020      2027      2021      2022      2023      2023
# 9       2019      2020      2014      2020      2018      2023      2019      2023      2024      2022      2027
# 10      2012      2019      2019      2023      2016      2019      2018      2019      2027      2026      2029

Notes: