Replace certain colnames in a list of data frames


I have a list of data frames and I want to replace column names that start with "..." with the name of the column before.

So the starting point is as follows:

df1 <- data.frame(Tree=c(1:3), Sea=c(4:6), ...3=c(2:4), Beach=c(1:3), ...5=c(2:4))
df1

df2 <- data.frame(Tree=c(1:3), Sea=c(4:6), ...3=c(2:4), Beach=c(1:3), ...5=c(2:4))
df2

df3 <- data.frame(Tree=c(1:3), Sea=c(4:6), ...3=c(2:4), Beach=c(1:3), ...5=c(2:4))
df3

df.list<-list(df1, df2, df3)

And I want the columns of the data frames in the list to look as follows:

df1 <- data.frame(Tree=c(1:3), Sea=c(4:6), Sea=c(2:4), Beach=c(1:3), Beach=c(2:4), check.names=FALSE)
df1

df2 <- data.frame(Tree=c(1:3), Sea=c(4:6), Sea=c(2:4), Beach=c(1:3), Beach=c(2:4), check.names=FALSE)
df2

df3 <- data.frame(Tree=c(1:3), Sea=c(4:6), Sea=c(2:4), Beach=c(1:3), Beach=c(2:4), check.names=FALSE)
df3

The problem arose because I imported several data frames from Excel into a list, and in the Excel files each column name spans two columns. I could not manage to label both columns with the same name when importing the data.

I would really appreciate your help. Thanks!

Best answer

I would suggest the following approach. You can use a function together with lapply() to apply the change to every data frame in the list. The function myname() detects the names that start with "..." and sets them to NA. After that, the zoo function na.locf() fills each NA with the previous non-missing name. Also, duplicated names in a data frame can cause trouble in R (for example, df$Sea returns only the first matching column), so I left a commented-out line that uses make.unique() to avoid that. If you need unique names, uncomment that line and comment out the line right after it. Here is the code:

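library(zoo)
# Data
df1 <- data.frame(Tree=c(1:3), Sea=c(4:6), ...3=c(2:4), Beach=c(1:3), ...5=c(2:4))
df2 <- data.frame(Tree=c(1:3), Sea=c(4:6), ...3=c(2:4), Beach=c(1:3), ...5=c(2:4))
df3 <- data.frame(Tree=c(1:3), Sea=c(4:6), ...3=c(2:4), Beach=c(1:3), ...5=c(2:4))
df.list <- list(df1, df2, df3)
# Replace placeholder names ("...3", "...5") with the preceding column name
myname <- function(x)
{
  # Names of the data frame
  v1 <- names(x)
  # Detect names that start with "..." and set them to NA
  index <- startsWith(v1, "...")
  v1[index] <- NA
  # Fill each NA with the last non-NA name
  # (na.rm = FALSE keeps the vector length even if the first name is NA)
  # v1 <- make.unique(na.locf(v1, na.rm = FALSE))
  v1 <- na.locf(v1, na.rm = FALSE)
  # Assign the new names
  names(x) <- v1
  # Return
  return(x)
}
# Apply to every data frame in the list
df.list2 <- lapply(df.list, myname)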

Output:

df.list2
[[1]]
  Tree Sea Sea Beach Beach
1    1   4   2     1     2
2    2   5   3     2     3
3    3   6   4     3     4

[[2]]
  Tree Sea Sea Beach Beach
1    1   4   2     1     2
2    2   5   3     2     3
3    3   6   4     3     4

[[3]]
  Tree Sea Sea Beach Beach
1    1   4   2     1     2
2    2   5   3     2     3
3    3   6   4     3     4
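
If you would rather not depend on zoo, the same fill-forward idea can be sketched in base R with cumsum(). This is only a minimal sketch under some assumptions: fill_names() and its dedupe argument are hypothetical names made up for this illustration, placeholder columns are assumed to be exactly those whose names start with "...", and a placeholder with no real name before it is left unchanged.

```r
# Hypothetical helper (not part of the answer above): replace "..."-style
# placeholder names with the nearest preceding real name, base R only.
fill_names <- function(x, dedupe = FALSE) {
  nm <- names(x)
  placeholder <- startsWith(nm, "...")
  # cumsum() counts the real names seen so far, so every placeholder
  # receives the position of the last real name before it
  idx <- cumsum(!placeholder)
  real <- nm[!placeholder]
  # idx == 0 marks a leading placeholder with nothing to copy; leave it as is
  nm[idx > 0] <- real[idx[idx > 0]]
  # Optionally make the names unique (Sea, Sea.1, ...)
  if (dedupe) nm <- make.unique(nm)
  names(x) <- nm
  x
}

# Same example data as in the question, built with setNames()
df <- setNames(data.frame(1:3, 4:6, 2:4, 1:3, 2:4),
               c("Tree", "Sea", "...3", "Beach", "...5"))
df.list <- list(df, df, df)

res <- lapply(df.list, fill_names)
names(res[[1]])

res_unique <- lapply(df.list, fill_names, dedupe = TRUE)
names(res_unique[[1]])
```

With dedupe = TRUE the repeated names get make.unique() suffixes (Sea.1, Beach.1), which is usually easier to work with afterwards because each column can then be selected by name.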