I have a dataframe:
A | B |
---|---|
0.5 | 69.26 |
1 | 74.91 |
1.5 | 77.21 |
2 | 77.52 |
I run
cor, pval = pearsonr( data["A"], data["B"])
The result is: correlation coefficient = 0.91 and p-value = 0.09
My significance level is alpha = 0.05. How do I interpret this result? Is the data correlated?
I am asking because I have found several sources that contradict each other. One says the p-value should be small, and another says it should be high, in order to conclude that a correlation exists.
Also, I would like to know which coefficient values let me say for sure that the data is correlated. For example, if a coefficient of 0.9 is quite high, can I still assert a correlation with a coefficient of 0.4?
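In your case: no, the correlation is not statistically significant. The p-value is the probability of observing a correlation at least as strong as yours if the null hypothesis were true, namely if the two variables were in fact uncorrelated and the apparent connection were the result of random fluctuations. Your p-value is 9 %, which is higher than your chosen significance level of 5 %, so you cannot reject the null hypothesis. It is a small p-value, not a large one, that counts as evidence for a correlation.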
Note that this is not a SciPy issue; it is a matter of interpreting the results that your statistical analysis has given you.
Also note that it would be methodologically wrong to change the significance threshold to 10 % now. Under that threshold the results would indeed be statistically significant, but proper methodology demands that you set the applicable threshold beforehand and then compare the results against it, not that you adjust afterwards to make your results fit a desired outcome.
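As a minimal sketch of the procedure described above, the following fixes alpha first and only then compares the p-value against it. It uses plain lists instead of your dataframe, and the variable names `a`, `b` and `significant` are just illustrative.

```python
from scipy.stats import pearsonr

# Fix the significance level before looking at the result.
alpha = 0.05

# The four observations from the question (columns A and B).
a = [0.5, 1.0, 1.5, 2.0]
b = [69.26, 74.91, 77.21, 77.52]

# pearsonr returns the correlation coefficient and the two-sided p-value.
cor, pval = pearsonr(a, b)

# Reject the null hypothesis (no correlation) only if p is below alpha.
significant = bool(pval < alpha)

print(f"r = {cor:.2f}, p = {pval:.2f}, significant at alpha = {alpha}: {significant}")
```

With the values from the question, the p-value stays above 0.05, so `significant` comes out as `False` even though the coefficient itself is high.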