Remove attributes from dataframe

I have the following dataframe (converted from a tax_table object from the phyloseq package).

How can I remove the attributes?

 str(DT2_mat)
'data.frame':   5120 obs. of  7 variables:
 $ : Factor w/ 2 levels "Archaea","Bacteria": 2 2 2 2 2 2 2 2 2 2 ...
  ..- attr(*, "names")= chr  "P11_16513" "P193_8942" "P187_9526" "P11_4543" ...
 $ : Factor w/ 28 levels "Acidobacteria",..: 2 2 2 2 2 2 2 2 2 2 ...
  ..- attr(*, "names")= chr  "P11_16513" "P193_8942" "P187_9526" "P11_4543" ...
 $ : Factor w/ 60 levels "Acidimicrobiia",..: 3 3 3 3 3 3 3 3 3 3 ...
  ..- attr(*, "names")= chr  "P11_16513" "P193_8942" "P187_9526" "P11_4543" ...
 $ : Factor w/ 108 levels "Acholeplasmatales",..: 29 29 29 29 29 29 29 29 29 29 ...
  ..- attr(*, "names")= chr  "P11_16513" "P193_8942" "P187_9526" "P11_4543" ...
 $ : Factor w/ 216 levels "0319-6A21","0319-6G20",..: 58 58 58 58 58 58 58 58 58 58 ...
  ..- attr(*, "names")= chr  "P11_16513" "P193_8942" "P187_9526" "P11_4543" ...
 $ : Factor w/ 699 levels "Abiotrophia",..: 173 173 173 173 173 173 173 173 173 173 ...
  ..- attr(*, "names")= chr  "P11_16513" "P193_8942" "P187_9526" "P11_4543" ...
 $ : Factor w/ 4964 levels "Abiotrophia defectiva Score:0.87",..: 1613 1529 1449 1448 1565 1438 1563 1532 1623 1605 ...
  ..- attr(*, "names")= chr  "P11_16513" "P193_8942" "P187_9526" "P11_4543" ...

There are 3 best solutions below

Actually, dropping levels removed all the attributes.

> str(droplevels.data.frame(DT2_mat))
'data.frame':   5120 obs. of  7 variables:
 $ : Factor w/ 2 levels "Archaea","Bacteria": 2 2 2 2 2 2 2 2 2 2 ...
 $ : Factor w/ 28 levels "Acidobacteria",..: 2 2 2 2 2 2 2 2 2 2 ...
 $ : Factor w/ 60 levels "Acidimicrobiia",..: 3 3 3 3 3 3 3 3 3 3 ...
 $ : Factor w/ 108 levels "Acholeplasmatales",..: 29 29 29 29 29 29 29 29 29 29 ...
 $ : Factor w/ 216 levels "0319-6A21","0319-6G20",..: 58 58 58 58 58 58 58 58 58 58 ...
 $ : Factor w/ 699 levels "Abiotrophia",..: 173 173 173 173 173 173 173 173 173 173 ...
 $ : Factor w/ 4964 levels "Abiotrophia defectiva Score:0.87",..: 1613 1529 1449 1448 1565 1438 1563 1532 1623 1605 ...
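
If dropping levels does not clear the per-column "names" attribute in your case, a more direct option may be to strip the names from each column with unname(). A minimal sketch, assuming a made-up data frame `toy` in place of DT2_mat (the column name and values are hypothetical):

```r
# Hypothetical stand-in for DT2_mat: one factor column that carries a
# per-element "names" attribute, as in the str() output of the question.
f <- factor(c("Bacteria", "Bacteria", "Archaea"))
names(f) <- c("P11_16513", "P193_8942", "P187_9526")

# Built directly with structure() so the column keeps its names for the demo.
toy <- structure(list(Kingdom = f), class = "data.frame", row.names = 1:3)

# Replace every column with an unnamed copy; `toy[] <-` assigns into the
# existing data frame, so column names, row names and class stay as they are.
toy[] <- lapply(toy, unname)

str(toy)
```

unname() only drops names (and dimnames), so the factor levels and the factor class of each column are left alone.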

I just had this issue and solved it by calling data.frame() on the old data frame. Note that this removes every custom attribute set on the data frame, not just one.

my_dataframe <- iris

attr(my_dataframe, "test") <- 1:10

str(my_dataframe) # See the attr
#> 'data.frame':    150 obs. of  5 variables:
#>  $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
#>  $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
#>  $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
#>  $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
#>  $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
#>  - attr(*, "test")= int [1:10] 1 2 3 4 5 6 7 8 9 10

my_dataframe |>  #The attr is gone
  data.frame() |> 
  str()
#> 'data.frame':    150 obs. of  5 variables:
#>  $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
#>  $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
#>  $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
#>  $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
#>  $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...

In general, you can remove attributes with the attr function by specifying the attribute you want to remove and setting it to NULL.

Suppose you get the following:

> str(my_df)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   107 obs. of  3 variables:
 $ case_no      : chr  "stuff" "more stuff" "other stuff" "residual stuff" ...
 $ region       : chr  "01" "02" "03" "04" ...
 $ petition     : chr  "RC" "RD" "RM" "RC" ...
 - attr(*, "label")= chr "NLRB7799"

You can remove the label with attr(my_df, "label") <- NULL

And get rid of unneeded extra classes by specifying the one you want with attr(my_df, "class") <- "data.frame"
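
Put together, the two steps might look like the sketch below. The data frame here is a small made-up one (its contents and the extra "source" attribute are hypothetical), and the loop at the end is just one possible way to clear every non-standard attribute in a single pass:

```r
# Hypothetical stand-in for my_df, with a label, an extra attribute and
# tibble-style classes attached by hand.
my_df <- data.frame(case_no = c("a", "b"), region = c("01", "02"))
attr(my_df, "label") <- "NLRB7799"
attr(my_df, "source") <- "imported"
class(my_df) <- c("tbl_df", "tbl", "data.frame")

# Remove one named attribute by setting it to NULL.
attr(my_df, "label") <- NULL

# Keep only the class you want.
attr(my_df, "class") <- "data.frame"

# Optionally drop every remaining attribute a plain data frame does not need.
keep <- c("names", "row.names", "class")
for (a in setdiff(names(attributes(my_df)), keep)) {
  attr(my_df, a) <- NULL
}

str(my_df)
```

Unlike wrapping the object in data.frame(), this lets you choose which attributes to keep by adding them to `keep`.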

This has worked well for me. Attributes often tag along when importing data from other software such as SAS or Stata. I dislike them because they cause trouble when merging or binding with other data frames. Hopefully others will find this method useful.