I am interested in running an equivalence test between two paired samples, group1 and group2. I use statsmodels.stats.weightstats.ttost_paired (https://www.statsmodels.org/stable/generated/statsmodels.stats.weightstats.ttost_paired.html) as the equivalence test.

The test’s null and alternative hypotheses are:

  • H0: The difference between the means is outside the equivalence interval
  • H1: The difference between the means is inside the equivalence interval

Problem: For a specific dataset, the test turns out non-significant even though the difference between the two tested samples is so small that the test should turn out significant (p < 0.05), given the bounds I chose (plus/minus 10% of the difference between the two sample means).

I am not sure whether my "problem" is related to my code or to the actual data. Therefore, I wrote a small reproducible example below with two samples (group1 and group2). The samples are almost identical, as can be seen in the script, and therefore have almost the same mean. The difference between the two sample means is md = 0.1428571428571428.

I set the maximum difference to be 10% via the following part of the code:

# Max percent change
percent = (md/100) * 10
# md minus and plus threshold
lower = md - percent
upper = md + percent

As far as my understanding goes, the equivalence test should turn out significant (p < 0.05) for this dummy dataset.

However, the resulting p-value is 0.46180097594939595, so H0 is not rejected.

import statsmodels as sm
from statsmodels.stats import weightstats
import numpy as np

group1 = [0, 1, 2, 3, 4, 5, 7]
group2 = [0, 1, 2, 3, 4, 5, 6]

mean1 = np.mean(group1)
mean2 = np.mean(group2)

md = mean1-mean2

# Max percent change
percent = (md/100) * 10
# md minus and plus threshold
lower = md - percent
upper = md + percent
# Test
test = sm.stats.weightstats.ttost_paired(
    x1=np.array(group1),
    x2=np.array(group2),
    low=lower,
    upp=upper,
    weights=None,
    transform=None)

print(f"md = {md}")
print(test)
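
For reference, the p-value reported by the script above can be cross-checked by hand. The following is a minimal sketch, assuming that a paired TOST amounts to two one-sided one-sample t-tests on the pairwise differences and that the reported p-value is the larger of the two. The helper name tost_paired_sketch is hypothetical and not part of statsmodels; the t-distribution functions come from scipy.stats.

```python
import numpy as np
from scipy import stats


def tost_paired_sketch(x1, x2, low, upp):
    """Hypothetical helper: paired TOST written as two one-sided t-tests."""
    # Work on the pairwise differences, as in any paired test.
    d = np.asarray(x1, dtype=float) - np.asarray(x2, dtype=float)
    n = d.size
    df = n - 1
    se = d.std(ddof=1) / np.sqrt(n)  # standard error of the mean difference

    # Test 1, H0: mean difference <= low (reject for large t).
    t_low = (d.mean() - low) / se
    p_low = stats.t.sf(t_low, df)

    # Test 2, H0: mean difference >= upp (reject for small t).
    t_upp = (d.mean() - upp) / se
    p_upp = stats.t.cdf(t_upp, df)

    # Equivalence is claimed only if both one-sided tests reject,
    # so the overall p-value is the larger of the two.
    return max(p_low, p_upp), (t_low, p_low), (t_upp, p_upp)


group1 = [0, 1, 2, 3, 4, 5, 7]
group2 = [0, 1, 2, 3, 4, 5, 6]

md = np.mean(group1) - np.mean(group2)
percent = (md / 100) * 10  # same 10% margin as in the script above
lower = md - percent
upper = md + percent

p, low_test, upp_test = tost_paired_sketch(group1, group2, lower, upper)
print(f"standard error of the mean difference = {np.std(np.subtract(group1, group2), ddof=1) / np.sqrt(len(group1))}")
print(f"lower bound test: t = {low_test[0]}, p = {low_test[1]}")
print(f"upper bound test: t = {upp_test[0]}, p = {upp_test[1]}")
print(f"overall p = {p}")
```

Printing the two t statistics next to the standard error shows how far each bound lies from the observed mean difference, measured in standard errors.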

Question: I wonder if I am doing something wrong in the code itself. For example, the lower and upper bounds should correspond to the difference md between group1 and group2, minus and plus 10%, respectively. I cannot understand why the test still turns out non-significant. Is there a mistake in my code?
