I have a data frame that I would like to aggregate by adding certain values. Say I have six clusters. I then feed data from each cluster into some function that generates a value x which is then put into the output data frame.
   cluster year      lambda           v            e   x
1        1    1 -0.12160997 -0.31105287 -0.253391178  15
2        1    2 -0.12160997 -1.06313732 -0.300349972  10
3        1    3 -0.12160997 -0.06704185  0.754397069  40
4        2    1 -0.07378295 -0.31105287 -1.331764904   4
5        2    2 -0.07378295 -1.06313732  0.279413039  19
6        2    3 -0.07378295 -0.06704185 -0.004581941  23
7        3    1 -0.02809310 -0.31105287  0.239647063  28
8        3    2 -0.02809310 -1.06313732  1.284568047  38
9        3    3 -0.02809310 -0.06704185 -0.294881283  18
10       4    1  0.33479251 -0.31105287 -0.480496125  15
11       4    2  0.33479251 -1.06313732 -0.380251626  12
12       4    3  0.33479251 -0.06704185 -0.078851036  34
13       5    1  0.27953088 -0.31105287  1.435456851 100
14       5    2  0.27953088 -1.06313732 -0.795435607   0
15       5    3  0.27953088 -0.06704185 -0.166848530   0
16       6    1  0.29409366 -0.31105287  0.126647655  44
17       6    2  0.29409366 -1.06313732  0.162961658  18
18       6    3  0.29409366 -0.06704185 -0.812316265  13
To aggregate, I add up the x values for cluster 1 across all three years with seroconv.cluster1 = sum(data.all[c(1:3), 6]) and repeat this for each cluster.
Right now, every time I change the number of clusters I have to rewrite the sums of the x's by hand. I would like to be able to say n.vec <- seq(6, 12, by=2), feed n.vec into the functions to get x, and have R add up the x values for each cluster no matter how many clusters there are. So it would run 6 clusters and add up the x's per cluster, then 8 clusters and add up the x's, and so on.
To get the sum of x for each cluster as a vector, you can use tapply. If you instead want the output as a data frame, you can use aggregate.
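As a minimal sketch of both calls, the code below rebuilds only the cluster, year, and x columns of the data frame from the question (lambda, v, and e are left out because the sums do not need them). The names sums.vec, sums.df, make.data, and sums.by.n are made up for illustration, and make.data is only a hypothetical stand-in for whatever functions generate x for a given number of clusters.

```r
# cluster, year, and x columns copied from the question's data frame
data.all <- data.frame(
  cluster = rep(1:6, each = 3),
  year    = rep(1:3, times = 6),
  x       = c(15, 10, 40, 4, 19, 23, 28, 38, 18,
              15, 12, 34, 100, 0, 0, 44, 18, 13)
)

# Sum of x per cluster, returned as a vector named by cluster
sums.vec <- tapply(data.all$x, data.all$cluster, sum)

# Same sums, returned as a data frame with columns cluster and x
sums.df <- aggregate(x ~ cluster, data = data.all, FUN = sum)

# Hypothetical stand-in for the functions that build the data for n clusters;
# the x values here are placeholders, not real model output.
make.data <- function(n) {
  data.frame(cluster = rep(seq_len(n), each = 3),
             year    = rep(1:3, times = n),
             x       = seq_len(3 * n))
}

# Repeat the per-cluster sum for each number of clusters in n.vec
n.vec <- seq(6, 12, by = 2)
sums.by.n <- lapply(n.vec, function(n) {
  d <- make.data(n)
  tapply(d$x, d$cluster, sum)
})
names(sums.by.n) <- n.vec
```

Both calls group on the cluster column rather than on row positions such as c(1:3), so nothing has to be edited by hand when the number of clusters changes. For the data in the question, the per-cluster sums come out as 65, 46, 84, 61, 100, and 75.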