How to show the original variable names when using aggregate in R


Is there a parameter I can set so that, when I aggregate as below, the result shows the original column names instead of the generic "Group.1" and "Group.2"?

data1 <- aggregate(mtcars[1:4], list(mtcars$am, mtcars$gear),mean)
data1
   Group.1 Group.2      mpg      cyl     disp       hp
1       0       3 16.10667 7.466667 326.3000 176.1333
2       0       4 21.05000 5.000000 155.6750 100.7500
3       1       4 26.27500 4.500000 106.6875  83.8750
4       1       5 21.38000 6.000000 202.4800 195.6000

Thank you so much.

By the way, I know I could rename the columns afterwards with names(x).

Best answer

You can try the formula method, which keeps the original column names:

aggregate(cbind(mpg,cyl,disp,hp)~am+gear, mtcars, mean)
#  am gear      mpg      cyl     disp       hp
#1  0    3 16.10667 7.466667 326.3000 176.1333
#2  0    4 21.05000 5.000000 155.6750 100.7500
#3  1    4 26.27500 4.500000 106.6875  83.8750
#4  1    5 21.38000 6.000000 202.4800 195.6000
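
When most columns of the data frame are being aggregated, the left-hand side of the formula can be abbreviated with a dot. A minimal sketch, assuming the data is first subset to the columns of interest (the names `cols` and `res` are only for illustration):

```r
# Keep only the value columns and the grouping columns
cols <- c("mpg", "cyl", "disp", "hp", "am", "gear")

# "." on the left-hand side stands for every column not named on the right
res <- aggregate(. ~ am + gear, data = mtcars[cols], FUN = mean)
names(res)
```

Note that the formula method drops rows containing missing values by default (na.action = na.omit), so on data with NAs its results can differ from the list-based call. mtcars has no missing values, so that does not matter here.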

Or name the elements within the list:

aggregate(mtcars[1:4], list(am=mtcars$am, gear=mtcars$gear),mean)
#   am gear      mpg      cyl     disp       hp
#1  0    3 16.10667 7.466667 326.3000 176.1333
#2  0    4 21.05000 5.000000 155.6750 100.7500
#3  1    4 26.27500 4.500000 106.6875  83.8750
#4  1    5 21.38000 6.000000 202.4800 195.6000

If there are many grouping variables, use setNames to assign the names in one step:

 aggregate(mtcars[1:4], setNames(list(mtcars$am, mtcars$gear), 
                  names(mtcars)[9:10]),mean)
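
If the aggregation has already been run with unnamed groups, the generic names can also be replaced afterwards, which is the names(x) route the question mentions. A minimal sketch (the names `grp` and `res` are illustrative, not from the original post):

```r
# Grouping column names, in the same order as the list passed to aggregate()
grp <- c("am", "gear")

# Unnamed list, so the result starts with Group.1 and Group.2
res <- aggregate(mtcars[1:4], list(mtcars$am, mtcars$gear), mean)

# Overwrite only the leading grouping columns
names(res)[seq_along(grp)] <- grp
names(res)
```

This relies on aggregate() placing the grouping columns first, in the order they were supplied.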

If you decide to use dplyr, data.table, or sqldf, the equivalent code is as follows. Using dplyr:

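  library(dplyr)
  # summarise_each() and funs() are deprecated in current dplyr;
  # across() (dplyr >= 1.0.0) is the replacement
  mtcars %>%
       group_by(am, gear) %>%
       summarise(across(mpg:hp, mean), .groups = "drop")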

Using data.table

  library(data.table)  # v1.9.5+
  as.data.table(mtcars)[, lapply(.SD, mean), by=.(am, gear), .SDcols=1:4]

Using sqldf

  library(sqldf)
  nm1 <- toString(sprintf("avg(%s) as %s", 
                  names(mtcars)[1:4], names(mtcars)[1:4]))
  fn$sqldf("select am, gear, $nm1 from mtcars group by am, gear")
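
To see what the sqldf call sends to the database, the generated column list can be inspected with base R alone. A small sketch, where `query` is an illustrative name and paste() stands in for the `$nm1` substitution that fn$ performs:

```r
# Build "avg(col) as col" for each of the first four columns
nm1 <- toString(sprintf("avg(%s) as %s",
                names(mtcars)[1:4], names(mtcars)[1:4]))

# Assemble the full statement by hand to inspect it
query <- paste("select am, gear,", nm1, "from mtcars group by am, gear")
cat(query, "\n")
```

The `as` aliases are what give the result its original column names; without them the columns would be labelled with the avg(...) expressions.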

Second answer

Since a data frame is also a list, you can pass a data frame as the second argument, and its column names are used for the groups:

aggregate(mtcars[1:4], mtcars[c("am", "gear")], mean)

giving:

  am gear      mpg      cyl     disp       hp
1  0    3 16.10667 7.466667 326.3000 176.1333
2  0    4 21.05000 5.000000 155.6750 100.7500
3  1    4 26.27500 4.500000 106.6875  83.8750
4  1    5 21.38000 6.000000 202.4800 195.6000
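
The named-list, data-frame, and formula variants should agree with one another on this data. A quick way to check, as a sketch (the object names `by_list`, `by_df`, and `by_form` are illustrative):

```r
# Three ways of getting named grouping columns
by_list <- aggregate(mtcars[1:4], list(am = mtcars$am, gear = mtcars$gear), mean)
by_df   <- aggregate(mtcars[1:4], mtcars[c("am", "gear")], mean)
by_form <- aggregate(cbind(mpg, cyl, disp, hp) ~ am + gear, mtcars, mean)

# Compare column names and aggregated values
identical(names(by_list), names(by_df))
all.equal(by_list$mpg, by_df$mpg)
all.equal(by_list$mpg, by_form$mpg)
```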