How to fit a discrete distribution (Boltzmann) to a large dataset?

I have a NumPy array with a large number of data points (53,046,323). The data represent durations and follow a discrete distribution; after some searching, I believe a Boltzmann distribution could fit them. I tried several parameter values, and the best fit was with lambda = 1.

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import boltzmann

sa1 = np.load('all_4_daily_consec_count_baseline_list.npy', allow_pickle=True)
nn = sa1.tolist()
data = np.concatenate(nn)
plt.hist(data, bins=int(np.max(data)), density=True, alpha=0.5)
plt.plot(data, boltzmann.pmf(data, 1, 53046322), 'go', markersize=9)

But it does not fit all values well, as shown in the figure (test_boltzmann_full).

So I tried fitting a subset of the data, which has the same distribution shape and 585 data points, as shown in the figure (test2):

sa1 = np.load('all_4_daily_consec_count_baseline_list.npy', allow_pickle=True)
xx = np.reshape(sa1, (607, 484))
noov_h = xx[306, 250]  # select one element of the 607 x 484 array
noov_hh = noov_h.astype('float')
data = noov_hh[~np.isnan(noov_hh)]  # drop NaN entries
plt.hist(data, bins=int(np.max(data)), density=True, alpha=0.5)
plt.plot(data, boltzmann.pmf(data, 1, 584), 'go', markersize=9)

How can I find the best-fitting parameters for my data in both cases? Is there an adaptive way to get a better fit?

The data are available at the link, since the file is too large to upload here: data

There is 1 best solution below

scipy.stats.fit can be used to fit distribution parameters to data. Here's an example of fitting parameters of the Boltzmann distribution to data sampled from a Boltzmann distribution.

import numpy as np
from scipy import stats
rng = np.random.default_rng()

# generate data distributed according to a Boltzmann distribution
dist = stats.boltzmann
shapes = (1.4, 19)
data = dist.rvs(*shapes, size=1000, random_state=rng)

# fit parameters of the distribution to data
bounds = dict(lambda_=(0.1, 10), N=(1, 100))  # bounds on parameter values
res = stats.fit(dist, data, bounds=bounds)

# compare histogram against fitted PMF
res.plot()
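
For a dataset with tens of millions of points, a global search over every observation may be slow. One possible alternative, sketched here under the assumption that N is fixed in advance (for instance to max(data) + 1) and that the data start at 0, is to maximize the likelihood over lambda_ alone. With the Boltzmann PMF (1 - exp(-lambda_)) * exp(-lambda_ * k) / (1 - exp(-lambda_ * N)), the log-likelihood depends on the data only through their mean, so the array needs to be scanned just once. The helper name fit_boltzmann_lambda and the default search bounds are hypothetical choices for illustration, not part of SciPy.

```python
import numpy as np
from scipy import optimize, stats


def fit_boltzmann_lambda(data, N, bounds=(1e-6, 20.0)):
    """Maximum-likelihood estimate of lambda_ for stats.boltzmann with N held fixed.

    Assumes the observations are integers in 0..N-1 (that is, loc=0).
    """
    data = np.asarray(data, dtype=float)
    if data.min() < 0 or data.max() > N - 1:
        raise ValueError("data must lie in 0..N-1")
    mean_k = data.mean()  # the only statistic of the data the likelihood needs

    def neg_mean_log_likelihood(lam):
        # log pmf(k) = log(1 - e^-lam) - lam*k - log(1 - e^(-lam*N)), averaged over k
        return -(np.log(-np.expm1(-lam)) - lam * mean_k - np.log(-np.expm1(-lam * N)))

    result = optimize.minimize_scalar(
        neg_mean_log_likelihood, bounds=bounds, method="bounded"
    )
    return result.x


# Synthetic check: try to recover a known lambda_ from simulated data
rng = np.random.default_rng(0)
sample = stats.boltzmann.rvs(1.4, 19, size=200_000, random_state=rng)
lambda_hat = fit_boltzmann_lambda(sample, N=19)
print(f"estimated lambda_: {lambda_hat:.3f}")
```

If the durations start at 1 rather than 0, subtracting the minimum before fitting plays the role of the distribution's loc parameter.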

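
To see where a fitted distribution departs from the data, it can help to compare empirical frequencies with the fitted PMF value by value rather than judging from a histogram alone. A minimal sketch, assuming non-negative integer data and parameter values already obtained from a fit; the helper name empirical_vs_fitted is hypothetical:

```python
import numpy as np
from scipy import stats


def empirical_vs_fitted(data, lambda_, N):
    """Return support values, empirical frequencies, and Boltzmann PMF values."""
    data = np.asarray(data).astype(np.int64)
    counts = np.bincount(data, minlength=N)  # counts[k] = number of observations equal to k
    k = np.arange(counts.size)
    empirical = counts / counts.sum()
    fitted = stats.boltzmann.pmf(k, lambda_, N)
    return k, empirical, fitted


# Synthetic example: the data are drawn from the same distribution being compared
rng = np.random.default_rng(0)
sample = stats.boltzmann.rvs(1.4, 19, size=100_000, random_state=rng)
k, emp, fit = empirical_vs_fitted(sample, 1.4, 19)
max_gap = np.max(np.abs(emp - fit))
print(f"largest gap between empirical and fitted probabilities: {max_gap:.4f}")
```

Large gaps concentrated at particular values of k (for example in the tail) would suggest that a single Boltzmann distribution may not describe the whole dataset.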