data <- data.frame(
  sex = factor(c("M", "F", "M")),
  ageid = factor(c(8, 6, 7)),
  married = factor(c(2, 1, 2)),
  cagv_typ = factor(c("non-primary", "primary", "non-primary")),
  sq5_1 = factor(c(1, 1, 1)),
  sq5_2 = factor(c(0, 1, 0))
)
In this data frame, sex and married are the subgroup variables, and the rest are outcomes. In my real data I have more than 10 outcome variables and 5 subgroup variables.
At first, I wrote the following code:
chisq_test <- function(data, var1, var2) {
  contingency_table <- table(data[[var1]], data[[var2]])
  test_result <- chisq.test(contingency_table)
  return(test_result)
}
chisq_test(data = data, var1 = "sex", var2 = "cagv_typ")
However, this is still very time-consuming if I have to enter each subgroup variable and outcome by hand. Is there a better approach that runs the chi-square test on every pair without this manual work?
Thank you in advance.
Best wishes
You can use expand.grid() to get all the combinations you are looking for. We can then use apply() to iterate down the rows of that data frame and apply your chisq_test function to each combination of variables. For the example data this returns a list of 8 chi-square tests. The same code will easily scale up to five x variables and 10 y variables.
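A minimal sketch of these steps, assuming the example data frame and the chisq_test function from the question (both repeated so the block runs on its own). The names x_vars, y_vars, combos and results are illustrative, and suppressWarnings() is only there because the three-row example triggers small-sample warnings:

```r
# Example data and helper function from the question
data <- data.frame(
  sex = factor(c("M", "F", "M")),
  ageid = factor(c(8, 6, 7)),
  married = factor(c(2, 1, 2)),
  cagv_typ = factor(c("non-primary", "primary", "non-primary")),
  sq5_1 = factor(c(1, 1, 1)),
  sq5_2 = factor(c(0, 1, 0))
)

chisq_test <- function(data, var1, var2) {
  contingency_table <- table(data[[var1]], data[[var2]])
  chisq.test(contingency_table)
}

# Illustrative names: subgroup (x) and outcome (y) columns
x_vars <- c("sex", "married")
y_vars <- c("ageid", "cagv_typ", "sq5_1", "sq5_2")

# One row per (x, y) pair: 2 x 4 = 8 combinations
combos <- expand.grid(x = x_vars, y = y_vars, stringsAsFactors = FALSE)

# Run the test for each row; the result is a list of "htest" objects
results <- apply(combos, 1, function(row) {
  suppressWarnings(chisq_test(data, row[["x"]], row[["y"]]))
})

# Label each test by the pair of variables it used
names(results) <- paste(combos$x, combos$y, sep = " ~ ")
```

Individual tests can then be looked up by name, for example results[["sex ~ cagv_typ"]]. To scale up, only x_vars and y_vars need to change.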
Please remember that if you are carrying out 50 chi-square tests, the unadjusted p-values cannot be taken at face value because of multiple hypothesis testing. You will need a Bonferroni correction or similar, since at the 0.05 level you would expect 2 or 3 "significant" results (50 × 0.05 = 2.5) purely by chance with this many significance tests.
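As a rough illustration of such a correction, base R's p.adjust() can be applied to the collected p-values. The p-values below are made up purely for demonstration and are not results from the data in the question:

```r
# Hypothetical p-values, for illustration only (not real results)
p_values <- c(test_a = 0.004, test_b = 0.03, test_c = 0.2, test_d = 0.6)

# Bonferroni: multiply each p-value by the number of tests, capped at 1
p_bonf <- p.adjust(p_values, method = "bonferroni")

# Holm: a step-down alternative that is never less powerful than Bonferroni
p_holm <- p.adjust(p_values, method = "holm")

# Side-by-side comparison of raw and adjusted values
adjusted <- data.frame(raw = p_values, bonferroni = p_bonf, holm = p_holm)
print(adjusted)
```

If the tests are stored in a list of chisq.test() results (here called results, a hypothetical name), the raw p-values could be pulled out with sapply(results, function(res) res$p.value) before adjusting.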
Created on 2023-09-12 with reprex v2.0.2