Estimate the needed sample size for a Chi Squared test


I want to estimate the sample size needed for a chi-squared test of homogeneity on discrete data using Python, and I need a hint on how to do it.

Specifically, I want to test whether the failure rates of two production processes differ significantly (alpha = 5%).

I have only found the statsmodels.stats.gof.chisquare_effectsize() function, but it seems to work only for a goodness-of-fit test.

Is there a way to determine the needed sample size?

I appreciate every answer.

Best answer

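You can use statsmodels.stats.power.GofChisquarePower().solve_power(). However, you need to set the degrees of freedom (df) to match the dimensions of your contingency table, which you control through the n_bins parameter. statsmodels takes df = n_bins - 1, so for a table with r rows and c columns, where df = (r - 1) * (c - 1), the value to pass is n_bins = (r - 1) * (c - 1) + 1. The first argument is the effect size (Cohen's w).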

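>>> import statsmodels.stats.power as smp
>>> n_levels_variable_a = 2
>>> n_levels_variable_b = 3
>>> # statsmodels uses df = n_bins - 1, so pass the table's df plus one
>>> df = (n_levels_variable_a - 1) * (n_levels_variable_b - 1)
>>> smp.GofChisquarePower().solve_power(effect_size=0.346, power=.8, n_bins=df + 1, alpha=0.05)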

115.94688728433769
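
For the two-process case in the question (a 2x2 table of process by pass/fail, so df = 1), here is a minimal sketch of how the effect size could be derived from the failure rates you expect. The failure rates 0.05 and 0.10, the equal group sizes, and the helper names cohens_w and total_sample_size are assumptions made for illustration; they are not part of statsmodels or of the question.

```python
import math

import numpy as np
from statsmodels.stats.power import GofChisquarePower


def cohens_w(p_fail_a, p_fail_b):
    """Cohen's w for a 2x2 table (process x outcome), assuming equal group sizes."""
    # Joint cell probabilities under the alternative: each process gets half the units.
    observed = 0.5 * np.array([[p_fail_a, 1 - p_fail_a],
                               [p_fail_b, 1 - p_fail_b]])
    # Cell probabilities expected under homogeneity: product of the margins.
    expected = np.outer(observed.sum(axis=1), observed.sum(axis=0))
    return float(np.sqrt(((observed - expected) ** 2 / expected).sum()))


def total_sample_size(p_fail_a, p_fail_b, alpha=0.05, power=0.8):
    """Total number of units (both processes combined), rounded up."""
    w = cohens_w(p_fail_a, p_fail_b)
    df = (2 - 1) * (2 - 1)  # 2 processes x 2 outcomes
    n = GofChisquarePower().solve_power(
        effect_size=w, alpha=alpha, power=power, n_bins=df + 1
    )
    return math.ceil(float(n))


# Hypothetical failure rates; replace them with the rates you want to detect.
n_total = total_sample_size(0.05, 0.10)
print(n_total)
```

The result is the total sample size across both processes, so with equal allocation each process needs about half of it. The closer the two failure rates are, the smaller w becomes and the larger the required sample.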