I have a dataset like the one below. I read it from a CSV file and loaded it into a data frame called df.
Name Value1 Value2
A 2 5
A 1 5
B 3 4
B 1 4
C 0 3
C 5 3
C 1 3
If I do the following command in R,
out <- ddply(df, .(Name), summarize, Value1 = mean(Value1), Value2 = mean(Value2))
I am getting an output like this,
Name Value1_mean Value2_mean
A 1.5 5
B 2 4
C 2 3
But I need to compute the per-group mean of Value1 and Value2 and store the results in separate columns, say value1_mean and value2_mean, repeated for every row, like this:
Name Value1 Value2 value1_mean value2_mean
A 2 5 1.5 5
A 1 5 1.5 5
B 3 4 2 4
B 1 4 2 4
C 0 3 2 3
C 5 3 2 3
C 1 3 2 3
How can I get the output above?
We can do this efficiently with data.table. Convert the 'data.frame' to a 'data.table' (setDT(df)), group by 'Name', specify the columns to take the mean of with .SDcols, loop through the subset of the data.table (.SD), get the mean, and assign (:=) it to new columns.
Or with dplyr, we use mutate_each.
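The code that went with this answer is not shown here, so what follows is only a minimal sketch of both approaches, not necessarily the exact code intended. It assumes the third column of the sample data is named Value2 (as the ddply call implies), and the helper names cols, dt and res are made up for illustration. The data.table part works on a copy made with as.data.table(df); setDT(df) would do the same conversion in place. For dplyr, mutate_each is deprecated in recent versions, so the sketch swaps in mutate() with across(), which should behave the same way here.

```r
library(data.table)
library(dplyr)

# Sample data from the question; the third column is assumed to be Value2
df <- data.frame(
  Name   = c("A", "A", "B", "B", "C", "C", "C"),
  Value1 = c(2, 1, 3, 1, 0, 5, 1),
  Value2 = c(5, 5, 4, 4, 3, 3, 3)
)

# Columns to average (hypothetical helper variable)
cols <- c("Value1", "Value2")

# data.table: work on a copy so df stays a plain data.frame,
# then add the per-group means as new columns by reference with :=
dt <- as.data.table(df)
dt[, paste0(cols, "_mean") := lapply(.SD, mean), by = Name, .SDcols = cols]

# dplyr: group_by() + mutate() keeps every row;
# across() is used here in place of the deprecated mutate_each()
res <- df %>%
  group_by(Name) %>%
  mutate(across(all_of(cols), mean, .names = "{.col}_mean")) %>%
  ungroup()
```

Both dt and res keep all seven rows and gain the columns Value1_mean and Value2_mean. The names are built from the original column names, so they come out capitalized; rename them afterwards if value1_mean and value2_mean are needed exactly.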