How do I calculate the mean square of all 2019_Preston_STD, 2019_Preston_V1, 2019_Preston_V2, etc. using the Value column, and then the adjmth1 and adjmth3 columns?
structure(list(IDX = c("2019_Preston_STD", "2019_Preston_V1",
"2019_Preston_V2", "2019_Preston_V3", "2019_Preston_W1", "2019_Preston_W2"
), Value = c(3L, 2L, 3L, 2L, 3L, 5L), adjmth1 = c(2.87777777777778,
1.85555555555556, 2.01111111111111, 1.77777777777778, 3.62222222222222,
4.45555555555556), adjmth3 = c(2.9328763348507, 2.08651828334684,
2.80282946626847, 2.15028039284054, 2.68766916156347, 4.51425274916654
), adjmth13 = c(2.81065411262847, 1.82585524933201, 1.81394057737959,
1.40785681078568, 3.30989138378569, 4.7301083495049)), row.names = 29:34, class = "data.frame")
This task can be done in many ways, as shown in the link that @r2evans pointed out. My favorite is dplyr with summarize(across()), because to me its syntax is easy to understand and easy to apply to many columns. It also presents the resulting numbers in a nice format.
For example, from the iris data I want to get the arithmetic mean of Sepal.Length, Petal.Length, and Petal.Width for each species: setosa, versicolor, and virginica. Here is the head of the data, followed by how to get the mean for each species:
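A minimal sketch of that iris example, assuming dplyr 1.0.0 or later is installed (the result name species_means is only illustrative):

```r
library(dplyr)

# First rows of the built-in iris data
head(iris)

# Mean of the three selected columns within each species
species_means <- iris %>%
  group_by(Species) %>%
  summarize(across(c(Sepal.Length, Petal.Length, Petal.Width), mean))

species_means
```

The result has one row per species and one column per selected variable.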
As for your task, first you need to define the function for the mean square (because its definition varies slightly across references). Then you apply it to your data frame using summarize(across()). For example, you can define the mean square function as follows:
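One possible definition, sketched under the assumption that the mean square here means the sum of squared deviations from the mean divided by n - 1 (the function name mean_square is illustrative, and other references may divide by n or skip the centering):

```r
# Assumed definition: sum of squared deviations from the mean,
# divided by (n - 1). Other references define the mean square differently.
mean_square <- function(x) {
  sum((x - mean(x))^2) / (length(x) - 1)
}

# Example with the Value column from the question
mean_square(c(3, 2, 3, 2, 3, 5))
```

With this definition the result coincides with what var() returns for the same vector.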
Note: This definition requires that length(x) doesn't equal 1, or otherwise NaN will be produced.
You can apply it to your data frame newdata as follows:
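A sketch of that step, assuming newdata holds the data frame posted in the question, that the mean square is taken over all rows of each column (no grouping), and that mean_square uses the n - 1 definition (both the definition and the name ms_result are assumptions for illustration):

```r
library(dplyr)

# Data frame from the question
newdata <- structure(list(
  IDX = c("2019_Preston_STD", "2019_Preston_V1", "2019_Preston_V2",
          "2019_Preston_V3", "2019_Preston_W1", "2019_Preston_W2"),
  Value = c(3L, 2L, 3L, 2L, 3L, 5L),
  adjmth1 = c(2.87777777777778, 1.85555555555556, 2.01111111111111,
              1.77777777777778, 3.62222222222222, 4.45555555555556),
  adjmth3 = c(2.9328763348507, 2.08651828334684, 2.80282946626847,
              2.15028039284054, 2.68766916156347, 4.51425274916654),
  adjmth13 = c(2.81065411262847, 1.82585524933201, 1.81394057737959,
               1.40785681078568, 3.30989138378569, 4.7301083495049)
), row.names = 29:34, class = "data.frame")

# Assumed definition of the mean square (divides by n - 1)
mean_square <- function(x) sum((x - mean(x))^2) / (length(x) - 1)

# Apply the function to each requested column across all rows
ms_result <- newdata %>%
  summarize(across(c(Value, adjmth1, adjmth3), mean_square))

ms_result
```

Grouping by IDX first would leave one row per group in this data, which gives NaN under this definition, so the sketch summarizes the columns without grouping.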