I need to add mean, variance and standard deviation columns (per subject), but my data is a little complex:
I have subject IDs, dates and times, weeks of the year, an overall attendance count and attendances per week. What I need are three more columns giving the mean visits per week, the variance of attendance and the standard deviation.
To make this clearer, here is a snapshot of my data set (the columns marked with * are the ones I want to add):
df <- data.frame(Contact.ID, Date.Time, Week, Attendance, WeeklyAT)
Contact.ID Date Time Week Attendance WeeklyAT *Mean *v *sd
1 A 2012-10-06 18:54:48 40 3 2 *0.214 *0.335 *0.579
2 A 2012-10-08 20:50:18 40 3 2 *0.214 *0.335 *0.579
3 A 2012-11-24 20:18:44 47 3 1 *0.214 *0.335 *0.579
4 B 2012-11-15 16:58:15 46 4 1
5 B 2013-01-09 10:57:02 2 4 3
6 B 2013-01-11 17:31:22 2 4 3
7 B 2013-01-14 18:37:00 2 4 3
8 C 2013-02-22 17:46:07 8 2 1
9 C 2013-02-27 11:21:00 9 2 1
10 D 2012-10-28 14:48:33 43 1 1
To calculate the mean attendance, it must be taken into account that the time frame I am looking at is 14 weeks and that the weekly attendance value is repeated on every row of the same week, so it must be counted only once per week number. For example, the means for subjects A and B would be:
meanA = (2+1+0+0+0+0+0+0+0+0+0+0+0+0)/14=0.214
meanB = (1+3+0+0+0+0+0+0+0+0+0+0+0+0)/14=0.286
Here the 14 weeks don't matter too much, but for the variance and sd they do:
varianceA = ∑(x-µ)^2/(n-1) = [(2-0.214)^2+(1-0.214)^2+(0-0.214)^2+(0-0.214)^2+(0-0.214)^2+(0-0.214)^2+(0-0.214)^2+(0-0.214)^2+(0-0.214)^2+(0-0.214)^2+(0-0.214)^2+(0-0.214)^2+(0-0.214)^2+(0-0.214)^2]/(14-1) = 4.357/13 = 0.335
sdA= √varianceA = √0.335 = 0.579
I cannot figure out how to do this in code. I have tried ifelse functions and the general var and mean functions, and tried to create new columns with them, but failed at defining them per subject (Contact.ID) and for my n = 14.
I greatly appreciate the help. Many Thanks!
Data
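The snapshot in the question can be rebuilt as a data frame roughly like this. The values are copied from the table above; parsing the timestamps as UTC is an assumption, since the question does not state a time zone.

```r
# Sample data copied from the table in the question.
# tz = "UTC" is an assumption; the question gives no time zone.
df <- data.frame(
  Contact.ID = c("A", "A", "A", "B", "B", "B", "B", "C", "C", "D"),
  Date.Time  = as.POSIXct(c(
    "2012-10-06 18:54:48", "2012-10-08 20:50:18", "2012-11-24 20:18:44",
    "2012-11-15 16:58:15", "2013-01-09 10:57:02", "2013-01-11 17:31:22",
    "2013-01-14 18:37:00", "2013-02-22 17:46:07", "2013-02-27 11:21:00",
    "2012-10-28 14:48:33"
  ), tz = "UTC"),
  Week       = c(40, 40, 47, 46, 2, 2, 2, 8, 9, 43),
  Attendance = c(3, 3, 3, 4, 4, 4, 4, 2, 2, 1),
  WeeklyAT   = c(2, 2, 1, 1, 3, 3, 3, 1, 1, 1)
)
```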
tidyverse solution
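One possible approach with dplyr, offered as a sketch rather than the only way to do it: keep one row per subject and week, pad each subject's weekly counts with zeros up to the 14 weeks, then compute the statistics on the padded vector and join them back. The helper `pad_weeks` and the names `n_weeks`, `stats` and `result` are made up for this example, and the small `df` below repeats only the columns the calculation needs. The sketch assumes no subject has more than 14 distinct weeks.

```r
library(dplyr)

n_weeks <- 14  # length of the observation window, as stated in the question

# Only the columns needed for the calculation, copied from the question.
df <- tibble(
  Contact.ID = c("A", "A", "A", "B", "B", "B", "B", "C", "C", "D"),
  Week       = c(40, 40, 47, 46, 2, 2, 2, 8, 9, 43),
  WeeklyAT   = c(2, 2, 1, 1, 3, 3, 3, 1, 1, 1)
)

# Hypothetical helper: append zeros for the weeks without any visit.
# Assumes length(x) <= n.
pad_weeks <- function(x, n) c(x, rep(0, n - length(x)))

stats <- df %>%
  distinct(Contact.ID, Week, WeeklyAT) %>%   # count each week only once
  group_by(Contact.ID) %>%
  summarise(
    Mean = mean(pad_weeks(WeeklyAT, n_weeks)),
    v    = var(pad_weeks(WeeklyAT, n_weeks)),  # sample variance, divisor n - 1
    sd   = sqrt(v),
    .groups = "drop"
  )

# Attach the per-subject statistics to every original row.
result <- left_join(df, stats, by = "Contact.ID")
```

Padding with zeros keeps the code close to the hand calculation in the question, and `var()` already uses the n - 1 divisor, so no manual formula is needed.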
Output
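As a cross-check on the expected values, the same statistics can be reproduced in base R from the distinct weekly counts per subject. The counts below are read off the table in the question, and the names `weekly`, `padded` and `out` are illustrative only.

```r
n_weeks <- 14

# Distinct weekly attendance counts per subject, read off the question's table.
weekly <- list(A = c(2, 1), B = c(1, 3), C = c(1, 1), D = 1)

# Pad each subject's counts with zeros for the weeks without visits.
padded <- lapply(weekly, function(x) c(x, rep(0, n_weeks - length(x))))

out <- data.frame(
  Contact.ID = names(padded),
  Mean = round(sapply(padded, mean), 3),
  v    = round(sapply(padded, var), 3),
  sd   = round(sapply(padded, sd), 3),
  row.names = NULL
)
print(out)
# Rounded values (Mean, v, sd):
#   A: 0.214, 0.335, 0.579
#   B: 0.286, 0.681, 0.825
#   C: 0.143, 0.132, 0.363
#   D: 0.071, 0.071, 0.267
```

The values for subject A match the hand calculation in the question.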