When I apply a function to one or more columns of a data.table and, in the same call, wrap the column I group by in a function, the grouping column in the result is always named after the applied function.
Code example:
library(data.table)
dt <- data.table(value=rnorm(100), class=sample(1:3, 100, replace=TRUE))
dt[, .(class_mean=mean(value)), by=factor(class)]
Output:
factor class_mean
1: 2 0.007297291
2: 3 -0.122847460
3: 1 0.103293676
What I expected instead is the original column name in the result, like this:
class class_mean
1: 2 0.007297291
2: 3 -0.122847460
3: 1 0.103293676
As far as I can tell, this happens regardless of which function is applied to the grouping column(s). When grouping by a column whose name is stored in another variable, I usually use by=get(variable_that_stores_the_column_name), which likewise names the grouping column get in the result.
How can I change my data.table grouping call to get the result I want without having to rename the columns of the result afterwards?
EDIT:
Thanks for the responses and answers in the comments. This works in most cases. However, if I want to refer to the grouping column by a name stored in another variable (and keep that name in the result), the same problem arises:
var_name <- "class"
dt[, .(class_mean=mean(value)), by=.(var_name = factor(get(var_name)))]
names the resulting column var_name. And
var_name <- "class"
dt[, .(class_mean=mean(value)), by=.(get(var_name) = factor(get(var_name)))]
leads to an error:
Error: unexpected '=' in "dt[, .(class_mean=mean(value)), by=.(get(var_name) ="
for your edited question,
results in the desired output
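As a sketch of one way to get there (not necessarily the exact call meant above): naming the expression inside .() covers the case where the column name is known up front, and since by= also accepts an expression that evaluates to a list, the list can be built with setNames() so its name comes from var_name. The seed and the names res_fixed and res_var are only for illustration.

```r
library(data.table)

set.seed(1)  # fixed seed only so the sketch is reproducible
dt <- data.table(value = rnorm(100), class = sample(1:3, 100, replace = TRUE))

# Case 1: the column name is known when writing the call.
# Naming the expression inside .() sets the name of the grouping column.
res_fixed <- dt[, .(class_mean = mean(value)), by = .(class = factor(class))]
names(res_fixed)

# Case 2: the column name is stored in a variable.
# Build the grouping list by hand and name it from var_name,
# instead of writing a literal name on the left of '='.
var_name <- "class"
res_var <- dt[, .(class_mean = mean(value)),
              by = setNames(list(factor(get(var_name))), var_name)]
names(res_var)
```

This avoids the syntax error from by=.(get(var_name) = ...), because the name is supplied as a value to setNames() rather than as an argument name. If this does not behave as expected on your data.table version, renaming the result with setnames() afterwards remains the fallback.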