Imagine I have the following data:
| ID | Drug1 | Drug2 | Drug3 | Drug4 |
|---|---|---|---|---|
| 1 | 1 | 0 | 0 | 0 |
| 2 | 0 | 0 | 0 | 1 |
| 3 | 0 | 1 | 0 | 0 |
| 4 | 0 | 0 | 1 | 0 |
| 5 | 1 | 0 | 0 | 0 |
Here ID is the number given to each patient, and each Drug variable is binary: 1 indicates that the patient had a certain condition on that drug and 0 indicates he/she didn't.
To compare the rate of the condition between drugs, I want to perform chi-square tests for every pair: Drug1 vs Drug2, Drug1 vs Drug3, Drug1 vs Drug4, Drug2 vs Drug3, Drug2 vs Drug4, etc.
How can I do this in R in one line of code? Btw, is it necessary to implement correction for multiple comparisons (e.g., Bonferroni)?
Below is a tidyverse approach using {dplyr}. I first generate some data to run real tests with meaningful results. Then we can use the `colnames` of `mydat` with `combn` to get all pairs of drugs. Then we can use `rowwise` and `mutate` and apply `chisq.test()` to each row. Here we use the strings in `V1` and `V2` to subset the variables in `mydat`. Since we are in a `data.frame` we have to wrap the result in `list` if it's a non-atomic vector. We can subset `chisq_test` with `$p.value` to get the p values.

Created on 2022-04-04 by the reprex package (v2.0.1)
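
The steps described above can be sketched roughly as follows. This is a minimal reconstruction, not necessarily the code originally posted: the simulated data (the `rbinom` probabilities, `n = 200`, the seed) and the `pairs` and `p_adj` names are assumptions made for illustration. `chisq.test(x, y)` cross-tabulates the two columns and tests them for independence. The last step applies a Bonferroni adjustment with `p.adjust()`, as one possible way to handle the multiple-comparisons part of the question.

```r
library(dplyr)

set.seed(123)  # assumption: simulated data, not the OP's real data

n <- 200
mydat <- tibble(
  ID    = seq_len(n),
  Drug1 = rbinom(n, 1, 0.2),
  Drug2 = rbinom(n, 1, 0.4),
  Drug3 = rbinom(n, 1, 0.5),
  Drug4 = rbinom(n, 1, 0.7)
)

pairs <- mydat %>%
  select(-ID) %>%
  colnames() %>%
  combn(2) %>%          # 2 x 6 character matrix: one column per pair
  t() %>%               # one row per pair
  as.data.frame() %>%   # unnamed matrix columns become V1 and V2
  rowwise() %>%
  # chisq.test() returns an "htest" object (a list), so wrap it in list()
  mutate(chisq_test = list(chisq.test(mydat[[V1]], mydat[[V2]]))) %>%
  # under rowwise(), a list-column is unwrapped, so $p.value works directly
  mutate(p_value = chisq_test$p.value) %>%
  ungroup() %>%
  # hedged addition: Bonferroni-adjusted p values across the 6 tests
  mutate(p_adj = p.adjust(p_value, method = "bonferroni"))

pairs %>% select(V1, V2, p_value, p_adj)
```

With four drug columns this produces six rows, one per pair. Splitting the work across two `mutate()` calls keeps the list-column creation and the `$p.value` extraction clearly separate.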
Below is my old answer, which compares the distribution of `0` and `1` within each drug, which is not what the OP asked for, as @KU99 correctly pointed out in the comments.

Created on 2022-03-29 by the reprex package (v0.3.0)
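
For reference, a test of the `0`/`1` distribution within each drug could look something like the sketch below. This is only an illustration of that idea, not the code from the old answer: the simulated data and the `within_drug` name are assumptions. Calling `chisq.test()` on a one-dimensional table runs a goodness-of-fit test against equal proportions of `0` and `1`.

```r
library(dplyr)

set.seed(123)  # assumption: simulated data, not the OP's real data

n <- 200
mydat <- tibble(
  ID    = seq_len(n),
  Drug1 = rbinom(n, 1, 0.2),
  Drug2 = rbinom(n, 1, 0.4),
  Drug3 = rbinom(n, 1, 0.5),
  Drug4 = rbinom(n, 1, 0.7)
)

# One goodness-of-fit p value per drug column (H0: 0 and 1 equally likely)
within_drug <- mydat %>%
  summarise(across(starts_with("Drug"), ~ chisq.test(table(.x))$p.value))

within_drug
```

Each p value here concerns a single drug on its own, so it says nothing about how two drugs compare with each other.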