I have a question about calculating a mean for each subject (subj).
I have a dataframe as follows:
subj entropy n_gambles trial response rt
1 0 high 2 0 sample 4205
2 0 high 2 0 sample 676
3 0 high 2 0 skip 0
4 0 high 2 1 sample 883
5 0 high 2 1 sample 697
6 0 high 2 1 skip 0
7 0 high 2 2 sample 1493
8 0 high 2 2 sample 507
9 0 high 2 2 skip 0
10 0 high 2 3 sample 1016
and I want to work out the mean number of samples per trial for each subj.
I have got as far as the table below, but I don't know what code to write next.
Note: the proportion of sampling is different for each subj.
subj trial n_gambles entropy response n_sample
2497 0 0 2 high sample 2
2498 1 0 2 high sample 0
2499 2 0 2 high sample 0
2500 3 0 2 high sample 0
2501 4 0 2 high sample 27
2502 5 0 2 high sample 0
2503 6 0 2 high sample 0
2504 7 0 2 high sample 0
2505 8 0 2 high sample 19
2506 9 0 2 high sample 0
2507 10 0 2 high sample 0
Below is the code I have so far.
rm(list=ls())
# Import 'subj.csv' data file into a dataframe
data_subj <- read.csv('subj.csv')
head(data_subj)
# Import 'response.csv' data file into a dataframe
data_response <- read.csv('response.csv')
head(data_response)
# Merge the subject and response data on 'subj'
data <- merge(data_subj, data_response, by = 'subj')
head(data)
# Count rows for every subj x trial x n_gambles x entropy x response combination
data <- as.data.frame(table(data$subj, data$trial, data$n_gambles, data$entropy, data$response))
colnames(data) <- c('subj', 'trial', 'n_gambles', 'entropy', 'response', 'n_sample')
# Keep only the rows where response is "sample"
data <- data[data$response == "sample", ]
head(data)
Could someone please help me out?
I'd expect the output to look something like this:
subj trial n_gambles entropy response n_sample mean_sample/trials
0 0 2 high sample 2
1 0 2 high sample 0
2 0 2 high sample 0
3 0 2 high sample 0
This is similar to the answer to your earlier question:
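As a minimal sketch of the per-subject mean in base R, assuming a counted table shaped like the one in the question: the toy data frame below is made up for illustration (it is not the poster's data), and the column name mean_sample is a hypothetical stand-in for the "mean_sample/trials" column in the expected output.

```r
# Toy stand-in for the counted table; values are invented for illustration
data <- data.frame(
  subj     = factor(c(0, 0, 0, 1, 1, 1)),
  trial    = factor(c(0, 1, 2, 0, 1, 2)),
  n_sample = c(2, 0, 4, 27, 0, 3)
)

# ave() returns, for every row, the mean of n_sample within that row's subj,
# so the result can be attached as a new column (hypothetical name: mean_sample)
data$mean_sample <- ave(data$n_sample, data$subj, FUN = mean)

# If one row per subject is preferred, aggregate() gives a summary table
subj_means <- aggregate(n_sample ~ subj, data = data, FUN = mean)

print(data)
print(subj_means)
```

Note that table() also creates zero-count rows for combinations that never occurred in the raw data, so it may be worth checking whether those zeros should count toward each subject's mean.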