I'm looking to calculate the 1st and 3rd quartile of a small data set to determine the outliers:
6000 13500 15000 15000 17948
While the calculation is fairly simple in theory, I find that Python (pandas) uses a different approach from the one I want, which is the one the Excel function QUARTILE.EXC uses. The difference is that Python includes the median in the quartile calculation. So for the 1st quartile Python outputs 13500 and for the 3rd 15000. What I want is 9750 and 16474. I haven't found an option that would allow me to do that.
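To make the difference concrete, here is a minimal sketch using only the standard library's statistics.quantiles (Python 3.8+), not pandas. The variable names are illustrative. Its "inclusive" method reproduces the 13500 / 15000 result, while "exclusive" (the default) places each quartile at position p * (n + 1) in the sorted data, which for these five values gives the QUARTILE.EXC numbers.

```python
from statistics import quantiles

values = [6000, 13500, 15000, 15000, 17948]

# "inclusive": the sample's min and max are treated as the 0th and 100th
# percentiles, so the quartiles land on existing data points here.
q1_inc, _, q3_inc = quantiles(values, n=4, method="inclusive")

# "exclusive" (the default): quartile p sits at 1-based position p * (n + 1),
# i.e. 1.5 and 4.5 for five values, so it is interpolated between neighbours.
q1_exc, _, q3_exc = quantiles(values, n=4, method="exclusive")

print(q1_inc, q3_inc)  # the values pandas returned in the question
print(q1_exc, q3_exc)  # the values the question is asking for
```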
I have tried several approaches to get to that result. My current one for the 1st quartile is q1 = df.NSOT.quantile(0.25, interpolation='midpoint'), with df being the data frame and NSOT the column with the given values.
https://www.mathwords.com/o/outlier.htm has an example of how to calculate outliers the way I want, with the required 1st and 3rd quartiles.
Any suggestions?
Sorry if anything about this question is not according to the rules. I just created this account and needed to get an answer quickly :/
I think this does the trick. When there is an even number of values, it should include one of the middle values when calculating the quartiles. Although I would have liked to simply have an option that does this for me.
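The code this answer refers to is not shown, so what follows is only a sketch of one way to do it, not the original. It assumes the "median of each half" approach described on the mathwords page: for an odd number of values the median is left out of both halves, and for an even number the data is split down the middle so each half keeps one of the two middle values. The function name quartiles_excluding_median is made up for illustration, and the 1.5 * IQR fences are the usual outlier rule.

```python
from statistics import median

def quartiles_excluding_median(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Hypothetical helper: for an odd count the overall median is dropped
    from both halves; for an even count the data is split in the middle.
    Needs at least two values.
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    # Skip the middle element only when the count is odd.
    upper = data[half + 1:] if len(data) % 2 else data[half:]
    return median(lower), median(upper)

values = [6000, 13500, 15000, 15000, 17948]
q1, q3 = quartiles_excluding_median(values)

# Standard 1.5 * IQR fences for flagging outliers.
iqr = q3 - q1
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [v for v in values if v < low_fence or v > high_fence]

print(q1, q3, outliers)
```

For this data set the result matches QUARTILE.EXC, but the two methods can disagree for other sample sizes (for six values, for example), so it is worth checking which definition is actually needed.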