I'm using the pingouin library in Python, which in turn uses scipy.stats for its implementation of the Mann-Whitney U test.
Looking at the example code, we see two independent data sets x and y where the distribution underlying x is less than that of y.
My question is: why is the p-value of MWU with the alternative hypothesis 'less' half that of the 'two-sided' alternative hypothesis? I am seeing this in my use case as well.
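For reference, here is a minimal sketch that shows the same pattern. It assumes made-up data (not the data from the pingouin example) and calls scipy.stats.mannwhitneyu directly rather than going through pingouin; the variable names are mine.

```python
import numpy as np
from scipy.stats import mannwhitneyu

# Made-up data for illustration only: y is shifted upward relative to x.
rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=30)
y = rng.normal(loc=1.0, scale=1.0, size=30)

# Same data, two different alternative hypotheses.
two_sided = mannwhitneyu(x, y, alternative="two-sided")
less = mannwhitneyu(x, y, alternative="less")

print("two-sided p:", two_sided.pvalue)
print("less p:     ", less.pvalue)

# When x really does sit below y, the one-sided p-value is half the two-sided one.
ratio = less.pvalue / two_sided.pvalue
print("ratio:", ratio)
```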
My confusion: if the alternative hypothesis is two-sided, Dist(x) =/= Dist(y), then H0 is Dist(x) = Dist(y). So far so good, and the p-value tells me there is a 0.5% chance of H0 being true. Cool.
If I run MWU again with the alternative hypothesis being 'less', then that is Dist(x) < Dist(y). So I'd imagine the null hypothesis for that, H0', would be Dist(x) >= Dist(y), which would be the same as Dist(x) > Dist(y) or Dist(x) = Dist(y).
The result tells me there is a 0.2% chance of H0' being true. How can the probability of H0' be less than the probability of H0 when H0' is the disjunction of H0 and something else?
I looked at the scipy.stats documentation, but the flip of the > and < signs in the i.e. part of the docs confused me.
Does this mean the null hypotheses for the 'less' and 'greater' alternatives do not include the equality part? (That would actually explain it, but I don't know if that is the case.) I also don't know if the i.e. text in this documentation is a typo, because I thought F and G referred to the alternative hypothesis, not the null hypothesis, and the signs would then need to be flipped.
I believe I understand the theory underlying MWU. I guess this is more of a documentation question on this particular function in case anyone else has used it. Looking at the source did not help me.
I misunderstood p-values, again, and so:
In the two-sided test, H0 is Dist(X) = Dist(Y), and the p-value tells me P(X,Y or more extreme data | H0) = 0.5% (I had the dependency backwards).
In the one-sided test, H0' is Dist(X) = Dist(Y) or Dist(X) > Dist(Y), and the p-value tells me P(X,Y or more extreme data | H0') = 0.2%. In plain language, because we have expanded the hypothesis to cover more values, the probability of my data occurring is smaller since it does not fit that hypothesis, so we can reject it.
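To check the "P(data or more extreme | H0)" reading, here is a rough permutation sketch. It is an illustration under my own assumptions, not how scipy computes the p-value: the data are made up, the helper u_of_first is hypothetical, and only the boundary case Dist(X) = Dist(Y) is simulated by shuffling group labels. As far as I can tell, the halving comes from "more extreme" counting only one tail in the 'less' test and both tails in the two-sided test.

```python
import numpy as np
from scipy.stats import rankdata

# Made-up data for illustration only (no ties); x tends to sit below y.
x = np.array([1.2, 2.5, 3.1, 3.8, 4.4, 5.0, 5.9, 6.3, 7.7, 8.1])
y = np.array([4.9, 6.0, 6.8, 7.2, 8.5, 9.0, 9.6, 10.3, 11.1, 12.4])
n_x, n_y = len(x), len(y)

# Rank the pooled sample once; shuffling ranks is the same as shuffling labels.
ranks = rankdata(np.concatenate([x, y]))

def u_of_first(r):
    """U statistic of the first n_x entries (hypothetical helper)."""
    return r[:n_x].sum() - n_x * (n_x + 1) / 2

observed = u_of_first(ranks)
center = n_x * n_y / 2  # mean of U when both distributions are equal

rng = np.random.default_rng(0)
n_perm = 20000
perm_u = np.array([u_of_first(rng.permutation(ranks)) for _ in range(n_perm)])

# 'less': only U values at least as small as the observed one count as extreme.
p_less = np.mean(perm_u <= observed)
# 'two-sided': U values at least as far from the center in either direction count.
p_two_sided = np.mean(np.abs(perm_u - center) >= abs(observed - center))

print("observed U:", observed)
print("permutation p (less):     ", p_less)
print("permutation p (two-sided):", p_two_sided)
```

Because the shuffled U values are spread symmetrically around the center, the two-sided estimate should come out at roughly twice the one-sided one, up to simulation noise.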