I am relatively new to R and need some help with my data analysis. In the attached table, the Master Protein Accession column lists proteins that are increased or decreased in the cortex (C) under three conditions: control (C), dehydration (D) and rehydration (R). Each condition has 5 samples: CC (1, 2, 3, 4 and 5), CD (1, 2, 3, 4 and 5) and CR (1, 2, 3, 4 and 5). I need to run a t-test comparing the Cortex Control samples (CC1 to CC5) against the Cortex Dehydration samples (CD1 to CD5) for every protein, so that when I run the code, the row 1 CC1 value is tested against the row 1 CD1 value, the row 2 CC1 value against the row 2 CD1 value, and so on.
I tried
apply(allcor1, function(x){t.test(x[2:12],x[4:14], nchar)})
but it gives me
Error in match.fun(FUN) : argument "FUN" is missing, with no default
The challenge you have is that the data is too "wide": you are representing each protein as one row, when each protein is really a set of separate observations (5 samples per condition).
The problem gets easier if you reshape it. Here I'll use tidyr's pivot functions, as well as extract.
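A minimal sketch of that reshaping step follows. The toy `allcor1` is made up for illustration, and I am assuming the accession column has been renamed to `protein` and that the sample columns are named `CC1` to `CC5`, `CD1` to `CD5` and `CR1` to `CR5`; adjust the column selection and the regex to match your real table.

```r
library(tidyr)
library(dplyr)

# Hypothetical stand-in for allcor1: one row per protein, one column per sample.
set.seed(1)
sample_cols <- paste0(rep(c("CC", "CD", "CR"), each = 5), 1:5)
allcor1 <- data.frame(
  protein = c("P001", "P002", "P003"),
  matrix(rnorm(45, mean = 10), nrow = 3, dimnames = list(NULL, sample_cols))
)

reshaped <- allcor1 %>%
  # Stack every sample column into (condition_sample, value) pairs.
  pivot_longer(-protein, names_to = "condition_sample", values_to = "value") %>%
  # Split e.g. "CD3" into condition = "CD" and sample = 3.
  tidyr::extract(condition_sample, into = c("condition", "sample"),
                 regex = "^(C[CDR])(\\d)$", convert = TRUE)
```

With `convert = TRUE`, `sample` comes out as an integer rather than a character string.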
I can't test without a reproducible example, but this should give you a table with columns `protein`, `condition`, `sample` (1-5) and `value`.

At this point, the data is more flexible to be used for statistical modeling, such as a paired t-test. I use dplyr here to do grouped t-tests of CC against CD, and the broom package to tidy it.
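A sketch of the grouped test, again on made-up data in the long format described above (`protein`, `condition`, `sample`, `value`). Using `paired = TRUE` assumes that sample 1 in CC corresponds to sample 1 in CD, and so on; if the samples are independent, drop that argument and `t.test` falls back to its default Welch two-sample test.

```r
library(tidyr)
library(dplyr)
library(broom)

# Hypothetical long-format data standing in for the reshaped table.
set.seed(1)
reshaped <- expand.grid(
  protein = c("P001", "P002", "P003"),
  condition = c("CC", "CD", "CR"),
  sample = 1:5,
  stringsAsFactors = FALSE
)
reshaped$value <- rnorm(nrow(reshaped), mean = 10)

results <- reshaped %>%
  # Keep only the two conditions being compared.
  filter(condition %in% c("CC", "CD")) %>%
  # One row per protein and sample, with CC and CD side by side.
  pivot_wider(names_from = condition, values_from = value) %>%
  group_by(protein) %>%
  # Run one t-test per protein and flatten the result into a one-row table.
  group_modify(~ tidy(t.test(.x$CC, .x$CD, paired = TRUE))) %>%
  ungroup()
```

Because you are running one test per protein, you will probably want to adjust the p-values for multiple testing afterwards, for example with `p.adjust(results$p.value, method = "BH")`.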
This would give you columns `estimate`, `statistic`, and `p.value`, among others (confidence intervals, etc.) for each protein.