Calculate the standard deviation by group in R


I want to calculate the standard deviation in R, but the standard function sd(x) is not quite what I need. I'm looking for a way to calculate the standard deviation of one variable grouped by another variable in my data frame, so that I can add a new column holding the sd for each value of the grouping variable (image). Like this:

image   answer    sd
a       1         0,70
a       2         0,70
b       2         2,12
b       5         2,12

There are 2 best solutions below


As I understand it, you want the standard deviation of answer for each image. With dplyr, you can group the data frame by image and then call sd inside mutate, which computes it separately for each group and repeats the value on every row of that group.

library(dplyr)

df <- data.frame(image = c('a', 'a', 'b', 'b'),
                 answer = c(1, 2, 2, 5))

# Add the per-image standard deviation as a new column
df %>%
  group_by(image) %>%
  mutate(sd = sd(answer)) %>%
  ungroup()
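
As a quick sanity check on the numbers, the value for image a can be reproduced by hand. This is only a sketch in base R; the names x and manual_sd are made up for illustration. It relies on the fact that sd computes the sample standard deviation, with n - 1 in the denominator.

```r
# Answers belonging to image "a" in the question's data
x <- c(1, 2)

# Sample standard deviation written out: sqrt(sum of squared deviations / (n - 1))
manual_sd <- sqrt(sum((x - mean(x))^2) / (length(x) - 1))

manual_sd                    # 0.7071068
all.equal(manual_sd, sd(x))  # TRUE
```

The same calculation with c(2, 5) gives 2.1213203, the value expected for image b.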

The base R function ave is well suited to this: it applies FUN within each group and returns a vector as long as its input.

dat <- read.table(text = "
image   answer    sd
a       1         0,70
a       2         0,70
b       2         2,12
b       5         2,12
", header = TRUE, dec = ",")

ave(dat$answer, dat$image, FUN = sd)
#[1] 0.7071068 0.7071068 2.1213203 2.1213203
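
To see why ave fits the "new column" requirement, it may help to compare it with tapply, which returns one value per group rather than one per row. A minimal sketch, rebuilding the question's data under the illustrative name dat_demo:

```r
# Same data as in the question, without the sd column
dat_demo <- data.frame(image = c("a", "a", "b", "b"),
                       answer = c(1, 2, 2, 5))

# tapply: one result per group (named by image)
per_group <- tapply(dat_demo$answer, dat_demo$image, sd)

# ave: one result per row, so it can be assigned directly as a column
per_row <- ave(dat_demo$answer, dat_demo$image, FUN = sd)

length(per_group)  # 2
length(per_row)    # 4
```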

EDIT.
Following the discussion with Henry in the comments, I have decided to edit the answer. Fortunately so, because in the meantime I realized that the original dataset uses the comma as the decimal separator.
So, the first change is to include the argument dec = "," in the read.table call above.
The second change is to show a complete solution, with the column sd created by the ave call.

dat2 <- dat[-3]  # start with the OP's data without the 3rd column
dat2$sd <- ave(dat2$answer, dat2$image, FUN = sd)
dat2
#  image answer        sd
#1     a      1 0.7071068
#2     a      2 0.7071068
#3     b      2 2.1213203
#4     b      5 2.1213203
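
The effect of dec = "," can be checked by reading the same text with and without it. This is a small sketch; txt, raw and ok are illustrative names. Without the argument, values such as 0,70 cannot be parsed as numbers, so the column is not numeric (character in R 4.0.0 and later, factor in older versions).

```r
txt <- "
image   answer    sd
a       1         0,70
a       2         0,70
b       2         2,12
b       5         2,12
"

# Default decimal separator ".": the sd column is not parsed as numeric
raw <- read.table(text = txt, header = TRUE)
is.numeric(raw$sd)  # FALSE

# Comma as decimal separator: the sd column is numeric
ok <- read.table(text = txt, header = TRUE, dec = ",")
is.numeric(ok$sd)   # TRUE
```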