Automatically calculating the average and standard deviation for multiple data frames, excluding the first column in each

I am essentially brand new to R but need to produce some plots for my thesis.

I am working with hyperspectral data, which I have exported as CSV files containing groups of scans. I have taken hundreds of readings, and even when the spectra are grouped this produces many separate CSV files.

In each CSV the x-axis is in the first column and each following column contains the intensities for an independent spectrum. The number of columns varies depending on how many scans were taken in that group.

I am trying to create new columns in each data frame that hold the row-wise average and standard deviation of the spectra.

I would also like to pull the new mean and standard deviation columns out and place them into a new data frame.

There is 1 best solution below

Without a reproducible example I can't test the code below, but I would do something along these lines.

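old_dir <- setwd("path/to/dir")
filenames <- list.files(pattern = "\\.csv$")
df_list <- lapply(filenames, read.csv)
names(df_list) <- filenames

# Add the row-wise mean and standard deviation, excluding the first (x-axis) column
df_list <- lapply(df_list, function(DF){
  spectra <- DF[-1]  # select the scan columns before any new column is added
  DF[["Mean"]] <- rowMeans(spectra, na.rm = TRUE)
  DF[["SD"]] <- apply(spectra, 1, sd, na.rm = TRUE)
  DF
})

# Keep only the new columns, one data.frame per file
new_list <- lapply(df_list, function(DF){
  data.frame(Mean = DF[["Mean"]], SD = DF[["SD"]])
})

setwd(old_dir)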

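Note that new_list is an object of class "list", holding one data.frame of means and sd's per file. You ask for a single new data.frame with the means and sd's. Binding the list elements side by side is only possible if all the original files have the same number of rows; the number of scan columns does not matter at that point, since each element has just the two summary columns.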
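If the files do have different numbers of rows, one option is to stack the summaries in long format with a column identifying the source file. Below is a minimal, self-contained sketch of that idea. The toy files, the temporary directory, and the column names (wavelength, scan1, File, X) are all made up for illustration and are not from the question.

```r
# Hypothetical toy data: two CSV files with an x-axis in the first column
# and a different number of scan columns (and rows) in each file.
tmp_dir <- tempfile("spectra_demo")
dir.create(tmp_dir)
write.csv(data.frame(wavelength = c(400, 410, 420),
                     scan1 = c(1, 2, 3),
                     scan2 = c(3, 4, 5)),
          file.path(tmp_dir, "group_a.csv"), row.names = FALSE)
write.csv(data.frame(wavelength = c(400, 410),
                     scan1 = c(2, 4),
                     scan2 = c(4, 6),
                     scan3 = c(6, 8)),
          file.path(tmp_dir, "group_b.csv"), row.names = FALSE)

# Read every CSV and name each list element after its file.
filenames <- list.files(tmp_dir, pattern = "\\.csv$", full.names = TRUE)
df_list <- lapply(filenames, read.csv)
names(df_list) <- basename(filenames)

# Build one summary data.frame per file: file name, x-axis,
# and the row-wise mean and standard deviation of the scan columns.
summary_list <- lapply(names(df_list), function(nm) {
  DF <- df_list[[nm]]
  spectra <- DF[-1]  # everything except the first (x-axis) column
  data.frame(File = nm,
             X = DF[[1]],
             Mean = rowMeans(spectra, na.rm = TRUE),
             SD = apply(spectra, 1, sd, na.rm = TRUE),
             row.names = NULL)
})

# Stack the summaries; rbind only needs matching column names,
# so files with different numbers of rows are fine.
summary_df <- do.call(rbind, summary_list)
print(summary_df)
```

The File column keeps track of which group each row came from, so the combined data frame can be split or plotted by group later.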