How to estimate maximum likelihood with GEV in python

Question

How to estimate maximum likelihood with GEV in python

1.4k Views Asked by huon At 17 August 2025 at 16:54

I'm trying to match the generalized extreme value (GEV) distribution's probability density function (pdf) to the data' pdf. This histogram is function of bin. As adjust this bin, the result of the function fitting also changes. And curve_fit(func, x, y) is playing this role properly. but this function uses a "least squares estimation". What I want is to use maximum likelihood estimation (MLE). And it has good results with the stats.genextreme.fit(data)function. However, this function does not represent histogram shape changes according to bin. Just use the original data.

I'm trying to use MLE. And I succeeded in estimating the parameters of the standard normal distribution using MLE. However, it is based on the original data and does not change according to the bin. Even the parameters of the GEV could not be estimated with the original data.

I checked the source code of genextreme_gen, rv_continuous, etc. But, this code is too complicated. I couldn't accept the source code with my Python skills.

I would like to estimate the parameters of the GEV distribution through MLE. And I want to get the result that the estimate changes according to bin.

What should I do?

I am sorry for my poor English, and thank you for your help.

+)

h = 0.5  # bin width
dat = h105[1]  # data
b = np.arange(min(dat)-h/2, max(dat), h)  # bin range
n, bins = np.histogram(dat, bins=b, density=True)  # histogram
x = 0.5*(bins[1:]+bins[:-1])  # x-value of histogram

popt,_ = curve_fit(fg, x, n)  # curve_fit(GEV's pdf, x-value of histogram, pdf value)
popt = -popt[0], popt[1], popt[2]  # estimated paramter (Least squares estimation, LSE)
x1 = np.linspace((popt[1]-popt[2])/popt[0], dat.max(), 1000)
a1 = stats.genextreme.pdf(x1, *popt)  # pdf

popt = stats.genextreme.fit(dat) # estimated parameter (Maximum likelihood estimation, MLE)
x2 = np.linspace((popt[1]-popt[2])/popt[0], dat.max(), 1000)
a2 = stats.genextreme.pdf(x2, *popt)

bin width = 2

bin width = 0.5

Original Q&A

There are 1 best solutions below

**fdermishin** · Accepted Answer

One way to do this is to convert bins to data. You can do so by counting number of data points in each bin and then repeating center of the bin this number of times.

I have also tried to sample uniform values from each bin, but using center of the bin and then repeating it seems to provide parameters with higher likelihood.

import scipy.stats as stats
from scipy.optimize import curve_fit
import numpy as np
import matplotlib.pyplot as plt

ground_truth_params = (0.001, 0.5, 0.999)
count = 50

h = 0.2  # bin width
dat = stats.genextreme.rvs(*ground_truth_params, count)  # data
b = np.arange(np.min(dat)-h/2, np.max(dat), h)  # bin range
n, bins = np.histogram(dat, bins=b, density=True)  # histogram
bin_counts, _ = np.histogram(dat, bins=b, density=False)  # histogram
x = 0.5*(bins[1:]+bins[:-1])  # x-value of histogram
    
def flatten(l):
    return [item for sublist in l for item in sublist]

popt,_ = curve_fit(stats.genextreme.pdf, x, n, p0=[0,1,1])  # curve_fit(GEV's pdf, x-value of histogram, pdf value)
popt_lse = -popt[0], popt[1], popt[2]  # estimated paramter (Least squares estimation, LSE)
popt_mle = stats.genextreme.fit(dat) # estimated parameter (Maximum likelihood estimation, MLE)

uniform_dat_from_bins = flatten((np.linspace(x - h/2, x + h/2, n) for n, x in zip(bin_counts, x)))
popt_uniform_mle = stats.genextreme.fit(uniform_dat_from_bins) # estimated parameter (Maximum likelihood estimation, MLE)

centered_dat_from_bins = flatten(([x] * n for n, x in zip(bin_counts, x)))
popt_centered_mle = stats.genextreme.fit(centered_dat_from_bins) # estimated parameter (Maximum likelihood estimation, MLE)

plot_params = {
    ground_truth_params: 'tab:green',
    popt_lse: 'tab:red',
    popt_mle: 'tab:orange',
    popt_centered_mle: 'tab:blue',
    popt_uniform_mle: 'tab:purple'
}
param_names = ['GT', 'LSE', 'MLE', 'bin centered MLE', 'bin uniform MLE']

plt.figure(figsize=(10,5))

plt.bar(x, n, width=h, color='lightgrey')
plt.ylim(0, 0.5)
plt.xlim(-2,10)

for params, color in plot_params.items():
    x_pdf = np.linspace(-2, 10, 1000)
    y_pdf = stats.genextreme.pdf(x_pdf, *params) # the normal pdf
    plt.plot(x_pdf, y_pdf, label='pdf', color=color)
plt.legend(param_names)

plt.figure(figsize=(10,5))
for params, color in plot_params.items():
    plt.plot(np.sum(stats.genextreme.logpdf(dat, *params)), 'o', color=color)

This plot shows PDFs that are estimated using different methods along with ground truth PDF

And the next plot shows of likelihoods of estimated parameters given original data.

PDF that is estimated by MLE on original data has the maximum value as expected. Then follow PDFs that are estimated using histogram bin (centered and uniform). After them there is ground truth PDF. And finally comes PDF with the lowest likelihood, which is estimated using least squares.

How to estimate maximum likelihood with GEV in python

There are 1 best solutions below

Related Questions in PYTHON

Related Questions in DATA-FITTING

Related Questions in PROBABILITY-DENSITY

Related Questions in MLE

Trending Questions

Popular # Hahtags

Popular Questions