Why does R colnames ignore the vector with specified names and only pay attention to the index?


I came across some weird behavior in R. Imagine that I have this data frame, and I want to change the names of the columns by using colnames and a vector:

df <- data.frame(a = seq(1,10), b = seq(11,20), c = seq(31,40))
colnames(df) <- c(`b` = 'B1', `a` = 'A2', `c` = 'C3')

This gives the following df:

   B1 A2 C3
1   1 11 31
2   2 12 32
3   3 13 33
4   4 14 34
5   5 15 35
6   6 16 36
7   7 17 37
8   8 18 38
9   9 19 39
10 10 20 40

Notice that the data is still in the same order: a has been renamed to 'B1' and b to 'A2'. If colnames paid attention to the names in the vector, the result would look like this instead:

   A2 B1 C3
1   1 11 31
2   2 12 32
3   3 13 33
4   4 14 34
5   5 15 35
6   6 16 36
7   7 17 37
8   8 18 38
9   9 19 39
10 10 20 40

It seems that colnames is completely ignoring the name specification.

I understand the simple solution is to just list the names in the order of the existing columns. However, I want to know if there is a way to specify them in a named vector and have that vector match each old name to its new name. This is just a simple example; what I actually have is more complicated.

What I want is to ensure that whenever I rename columns, each old name is replaced by the intended new name, so that when I combine data frames the data under each column name is matched as intended.


1
Clemsang On BEST ANSWER

You can use the match function in base R:

df <- data.frame(a = seq(1,10), b = seq(11,20), c = seq(31,40))
new_names <- c(`b` = 'B1', `a` = 'A2', `c` = 'C3')
colnames(df)[match(names(new_names), colnames(df))] <- new_names
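
To see what match() contributes here, the sketch below prints the positions it returns and adds a guard for names that are missing from the data frame. The entry `z = "Z9"` is a made-up name used only to trigger that case; as far as I know, an NA index would otherwise make the assignment fail.

```r
df <- data.frame(a = seq(1, 10), b = seq(11, 20), c = seq(31, 40))

# Lookup in the question's old = new form; "z" is a hypothetical
# column name that does not exist in df.
new_names <- c(b = "B1", a = "A2", z = "Z9")

# Position of each old name among the current column names
# (NA when the old name is absent): here c(2, 1, NA).
idx <- match(names(new_names), colnames(df))

# Keep only the entries that were found, then rename by position.
found <- !is.na(idx)
colnames(df)[idx[found]] <- new_names[found]

# Column a is now "A2", b is "B1", and c keeps its old name.
print(colnames(df))
```

Because the assignment is still positional, the only job of match() is to translate each old name into the position it currently occupies.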
0
Julian On

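You could use relocate for that. Note that relocate also moves the listed columns to the front, in the order given, so the column order becomes B1, A2, C3: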

df |> 
  dplyr::relocate(B1 = b, 
                  A2 = a,
                  C3 = c)

or

df |> 
  dplyr::relocate(c("B1" = "b", "A2" = "a", "C3" = "c"))
1
DaveArmstrong On

You could write a function that would do this for you using match().

df <- data.frame(a = seq(1,10), b = seq(11,20), c = seq(31,40))

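mycolnames <- function(data, newnames, ...){
  oldnames <- colnames(data)
  # look up each old name in the names of newnames (NA if not listed)
  nn <- newnames[match(oldnames, names(newnames))]
  # keep the old name wherever no new name was supplied
  nn <- ifelse(is.na(nn), oldnames, nn)
  colnames(data) <- nn
  data
}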

df <- mycolnames(df, c(`b` = 'B1', `a` = 'A2', `c` = 'C3'))
head(df, n=2)
#>   A2 B1 C3
#> 1  1 11 31
#> 2  2 12 32

The benefit of this function is that if you have columns that are not renamed, it simply keeps their old names:

df <- data.frame(a = seq(1,10), b = seq(11,20), c = seq(31,40))
df <- mycolnames(df, c(`b` = 'B1', `c` = 'C3'))
head(df, n=2)
#>   a B1 C3
#> 1 1 11 31
#> 2 2 12 32

Created on 2024-01-19 with reprex v2.0.2
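
Since the question's end goal is combining data frames, here is a minimal sketch of how a name-based rename fits into that step. The helper `rename_by_lookup` and the two small data frames are hypothetical, written only for this illustration; it relies on rbind() matching data frame columns by name.

```r
# Hypothetical helper: rename columns via an old = new lookup vector,
# leaving columns that are not in the lookup untouched.
rename_by_lookup <- function(data, lookup) {
  hit <- colnames(data) %in% names(lookup)
  colnames(data)[hit] <- unname(lookup[colnames(data)[hit]])
  data
}

lookup <- c(b = "B1", a = "A2", c = "C3")

# Two made-up data frames with the same variables in different column orders.
df1 <- data.frame(a = 1:2, b = 11:12, c = 31:32)
df2 <- data.frame(c = 33:34, a = 3:4, b = 13:14)

df1 <- rename_by_lookup(df1, lookup)
df2 <- rename_by_lookup(df2, lookup)

# rbind() on data frames matches columns by name, so the values stay
# aligned even though df2 stores its columns in a different order.
combined <- rbind(df1, df2)
print(combined)
```

Renaming by name rather than by position means the same lookup vector can be reused for every data frame, whatever order its columns arrive in.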

0
Ronak Shah On

If you write the vector as newname = oldname (the reverse of the order used in the question), then you can use rename from dplyr.

library(dplyr)
vec <- c('B1' = "b", 'A2' = "a", 'C3' = "c")

df %>% rename(any_of(vec))

   A2 B1 C3
1   1 11 31
2   2 12 32
3   3 13 33
4   4 14 34
5   5 15 35
6   6 16 36
7   7 17 37
8   8 18 38
9   9 19 39
10 10 20 40
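
If the lookup already exists in the question's old = new form, it does not need to be retyped. A small base R sketch, assuming every old name and every new name is unique, flips it into the new = old form used above:

```r
# Lookup in the question's old = new form.
new_names <- c(b = "B1", a = "A2", c = "C3")

# Swap names and values: the old names become the values and the
# new names become the names, giving the new = old form.
vec <- setNames(names(new_names), unname(new_names))

# vec is now c(B1 = "b", A2 = "a", C3 = "c")
print(vec)
```

The resulting `vec` has the same contents as the hand-written vector in this answer, so it can be passed to `rename(any_of(vec))` in the same way.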