I am using pairwise_wilcox_test from the rstatix package on my data.frame.
data1 # shortened data.frame
Firmicutes Proteobacteria Verrucomicrobiota cls
1 9916885 83115.37 0.0000 1
10 9923240 76759.73 0.0000 1
13 9897778 102222.14 0.0000 1
16 9887923 112077.44 0.0000 1
19 9832122 167423.55 454.1326 1
11 9717375 235007.98 47616.9546 2
14 9820485 150719.87 28794.7347 2
17 9805007 54276.39 140716.5721 2
2 9676859 320811.45 2329.3241 2
20 9636967 363032.82 0.0000 2
12 9581184 400989.93 17825.6204 3
15 9908333 87339.68 4327.6418 3
18 9624107 147003.76 228889.5762 3
21 9899086 67276.26 33638.1295 3
24 9827215 165133.37 7651.6540 3
When I apply it to a specific column, it works fine:
WIL <- rstatix::pairwise_wilcox_test(Firmicutes ~ cls, data=data1,exact = TRUE, p.adjust.method="bonferron")
Output:
# A tibble: 3 × 9
.y. group1 group2 n1 n2 statistic p p.adj p.adj.signif
* <chr> <chr> <chr> <int> <int> <dbl> <dbl> <dbl> <chr>
1 Firmicutes 1 2 12 12 86 0.443 1 ns
2 Firmicutes 1 3 12 12 71 0.977 1 ns
3 Firmicutes 2 3 12 12 43 0.101 0.303 ns
Now I want to use apply() to run the test on every column of the table as follows (the table is originally longer), but I have a problem with the apply() function:
WIL <- apply(as.matrix(data1),2, function(x){rstatix::pairwise_wilcox_test(x ~ cls, data=data1,exact = TRUE, p.adjust.method="bonferron")})
Output:
ℹ In index: 1.
ℹ With name: V1.
Caused by error in `pull()`:
! Can't extract columns that don't exist.
✖ Column `x` doesn't exist.
Run `rlang::last_trace()` to see where the error occurred.
Called from: signal_abort(cnd, .file)
I understand that the column "x" is not present, but I thought that x is defined by function(x).
Can somebody give me a hint as to what I'm doing wrong?
I am fairly new to R and Stack Overflow, so maybe there is an obvious solution for this. I apologise in advance.
Thank you!
You can't use apply here, because the x is the actual vector of values from your data frame, not the name of the column that you wish to test. In any case, the variable x inside the formula x ~ cls does not get substituted (this is always the case with formulas in R), so the function is literally looking for a column called x that doesn't exist.
Instead, you can use the column names of interest, and turn each into a correct formula inside lapply. You can then simply bind the results together into a single data frame.
Created on 2023-09-11 with reprex v2.0.2
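The approach described above could look roughly like the following. This is a minimal sketch rather than the original answer's code: the data frame is rebuilt from the shortened table in the question, cls is assumed to be a grouping factor, and the helper names responses and results are made up for illustration. Each column name is pasted into a formula string such as "Firmicutes ~ cls", so the formula refers to a real column instead of a placeholder x.

```r
library(rstatix)

# Shortened example data copied from the question.
# Assumption: cls is treated as a grouping factor with three levels.
data1 <- data.frame(
  Firmicutes = c(9916885, 9923240, 9897778, 9887923, 9832122,
                 9717375, 9820485, 9805007, 9676859, 9636967,
                 9581184, 9908333, 9624107, 9899086, 9827215),
  Proteobacteria = c(83115.37, 76759.73, 102222.14, 112077.44, 167423.55,
                     235007.98, 150719.87, 54276.39, 320811.45, 363032.82,
                     400989.93, 87339.68, 147003.76, 67276.26, 165133.37),
  Verrucomicrobiota = c(0, 0, 0, 0, 454.1326,
                        47616.9546, 28794.7347, 140716.5721, 2329.3241, 0,
                        17825.6204, 4327.6418, 228889.5762, 33638.1295,
                        7651.6540),
  cls = factor(rep(1:3, each = 5))
)

# Names of the columns to test: everything except the grouping column
responses <- setdiff(names(data1), "cls")

# Build a formula from each column NAME and run the pairwise tests
results <- lapply(responses, function(col) {
  f <- as.formula(paste(col, "~ cls"))
  res <- pairwise_wilcox_test(data1, f, p.adjust.method = "bonferroni")
  as.data.frame(res)  # plain data frames are simple to stack with rbind
})

# Stack the per-column results into one data frame
WIL <- do.call(rbind, results)
WIL
```

The .y. column of the combined result records which column each row of tests belongs to. Because Verrucomicrobiota contains tied zero values, R may warn that exact p-values cannot be computed for that column.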
Data from question in reproducible format