I have two datasets (https://github.com/badal01/precipi_data/blob/main/Data1.xlsx) whose ranges are roughly similar. Fitting a gamma distribution to the first dataset (the first row in Data1.xlsx) gives an alpha value of 5.002. However, the alpha value for the second row is 396111157.0657891, which is abnormal. I checked whether the data contains any absurd, very high, or negative values, but it looks completely normal and very similar to the first row.
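import pandas as pd
import scipy.stats as stats
# Each row of the spreadsheet is one dataset.
io = pd.read_excel('Data1.xlsx', header=None).to_numpy()
# gamma.fit returns (shape, loc, scale), so fit_beta holds the scale.
fit_alpha, fit_loc, fit_beta = stats.gamma.fit(io[0, :])
fit_alpha1, fit_loc1, fit_beta1 = stats.gamma.fit(io[1, :])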
I tried it with MATLAB, and it is working quite well. However, I do not understand what's happening in Python. Any help would be appreciated.
I also tried a different library, but the results were the same.
Here is complete code to go along with my comment above. (In the future, please include any data in plain text.)
Here's the fit you were happy with:
For the other data, you get an unexpectedly large shape parameter.
Keep in mind that `fit` provides the unconstrained maximum likelihood estimate of the three-parameter gamma distribution; it does not consider any other constraints or desires you might have if you don't specify them.
If you are happy with a two-parameter gamma distribution, you can fix the location to zero with `floc=0`.
If you want to put bounds on the fitted parameters, try `scipy.stats.fit`.
That said, you might also consider whether your data is more consistent with a different distribution.
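As a rough illustration of the two options mentioned above (fixing the location and bounding the parameters), here is a minimal sketch. It uses synthetic gamma-distributed data rather than the spreadsheet from the question, and the bounds passed to `scipy.stats.fit` are arbitrary values picked for the example, so treat both as assumptions to adjust for your own data. `scipy.stats.fit` needs a reasonably recent SciPy (1.9 or later, as far as I know).

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for one row of data (assumption: NOT the data from the question).
rng = np.random.default_rng(0)
data = stats.gamma.rvs(a=5.0, scale=2.0, size=1000, random_state=rng)

# Option 1: two-parameter gamma, with the location fixed at zero.
a_fixed, loc_fixed, scale_fixed = stats.gamma.fit(data, floc=0)
print("floc=0: ", a_fixed, loc_fixed, scale_fixed)

# Option 2: bounded fit. Parameters left out of `bounds` keep their defaults,
# so loc stays fixed at 0 here. The bound values are illustrative only.
bounds = {"a": (0.1, 50.0), "scale": (0.01, 50.0)}
res = stats.fit(stats.gamma, data, bounds=bounds)
print("bounded:", res.params.a, res.params.loc, res.params.scale)
```

Whether fixing the location or bounding the shape is appropriate depends on what you know about the data. Note also that `scipy.stats.fit` uses a stochastic global optimizer by default, so repeated runs can differ slightly in the last digits.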