I recently ran into a question when running pairwise t-tests on grouped data with rstatix, as shown below:
library(dplyr)
library(rstatix)
data(ToothGrowth)
ToothGrowth %>% group_by(supp) %>% t_test(len ~ dose)
supp .y. group1 group2 n1 n2 statistic df p p.adj
* <fct> <chr> <chr> <chr> <int> <int> <dbl> <dbl> <dbl> <dbl>
1 OJ len 0.5 1 10 10 -5.05 17.7 0.0000878 0.000176
2 OJ len 0.5 2 10 10 -7.82 14.7 0.00000132 0.00000396
3 OJ len 1 2 10 10 -2.25 15.8 0.039 0.039
4 VC len 0.5 1 10 10 -7.46 17.9 0.000000681 0.00000136
5 VC len 0.5 2 10 10 -10.4 14.3 0.0000000468 0.00000014
6 VC len 1 2 10 10 -5.47 13.6 0.0000916 0.0000916
Note that the adjusted p-values (p.adj) are calculated separately within each "supp" group, i.e. over the three dose comparisons of one group at a time.
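To check that reading of the output, here is a minimal sketch that redoes the adjustment in base R. It assumes the default method is Holm, and it uses the rounded p-values copied from the table above rather than the exact ones, so it is only a sanity check. The data frame `res` and the column `p_adj_within` are names made up for this illustration.

```r
# p-values copied from the p column of the output above (already rounded)
res <- data.frame(
  supp = rep(c("OJ", "VC"), each = 3),
  p    = c(8.78e-05, 1.32e-06, 3.90e-02, 6.81e-07, 4.68e-08, 9.16e-05)
)

# Holm adjustment applied separately within each supp level (3 tests each)
res$p_adj_within <- ave(res$p, res$supp,
                        FUN = function(x) p.adjust(x, method = "holm"))

print(res)
```

Up to rounding, the values in `p_adj_within` agree with the p.adj column, which is consistent with each group being treated as its own family of three tests.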
Alternatively, I can ungroup the result and adjust all six p-values together with the p.adjust function:
ToothGrowth %>%
  group_by(supp) %>%
  t_test(len ~ dose) %>%
  ungroup() %>%
  mutate(padj = p.adjust(p, method = "holm")) %>%
  as.data.frame()
supp .y. group1 group2 n1 n2 statistic df p p.adj
1 OJ len 0.5 1 10 10 -5.048635 17.69835 8.78e-05 1.76e-04
2 OJ len 0.5 2 10 10 -7.817021 14.66780 1.32e-06 3.96e-06
3 OJ len 1 2 10 10 -2.247761 15.84238 3.90e-02 3.90e-02
4 VC len 0.5 1 10 10 -7.463430 17.86244 6.81e-07 1.36e-06
5 VC len 0.5 2 10 10 -10.387795 14.32712 4.68e-08 1.40e-07
6 VC len 1 2 10 10 -5.469814 13.59996 9.16e-05 9.16e-05
p.adj.signif padj
1 *** 2.634e-04
2 **** 5.280e-06
3 * 3.900e-02
4 **** 3.405e-06
5 **** 2.808e-07
6 **** 2.634e-04
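To see where the padj values come from when all six tests are pooled, the Holm step-down procedure can be written out by hand. This is a rough sketch in base R using the rounded p-values from the output above; `holm_manual` and `holm_builtin` are names chosen just for this example.

```r
# The six p-values from the p column, in row order
p <- c(8.78e-05, 1.32e-06, 3.90e-02, 6.81e-07, 4.68e-08, 9.16e-05)
m <- length(p)

# Holm: sort ascending, multiply the i-th smallest by (m - i + 1),
# enforce monotonicity with a running maximum, and cap at 1
o <- order(p)
stepped <- p[o] * (m - seq_len(m) + 1)
stepped <- pmin(1, cummax(stepped))

# Put the adjusted values back in the original row order
holm_manual <- numeric(m)
holm_manual[o] <- stepped

# Compare with the built-in implementation
holm_builtin <- p.adjust(p, method = "holm")
print(data.frame(p, holm_manual, holm_builtin))
```

The multipliers here run from 6 down to 1 because the family contains six tests, whereas within a single supp group they run from 3 down to 1. That difference in family size is the only reason the two sets of adjusted p-values differ.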
So I am not sure which approach is correct. I think the first one (adjusting within each group) is right, but could someone confirm this and explain why? Thanks!
I have also searched for some resources to get a clearer idea, but I still need to understand this better.