Optimization block in python (scipy) - a histogram with a histogram

I need to fit an experimental histogram with a simulated one, in order to determine the parameters of the simulation that give the best fit. I tried curve_fit from scipy.optimize, but it does not work in this case: it returns an error "... is not a python function". Is it possible to do this automatically in scipy or some other Python module? If not, could you please point me to some algorithms that I could adapt myself?

There is 1 solution below

From what you have said, I think the following should help. It seems you're using curve_fit in the wrong way: its first argument must be a Python function of the form f(x, *params), not an array of values.

You need to define the distribution you are trying to fit as a function. For example, if I have some data that looks normally distributed and I want to know how well a normal distribution fits it, and which parameters give the best fit, I would do the following:

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

# Create fake data and run it through `np.histogram` to get the experimental distribution
experimental = np.random.normal(10.0, 0.4, size=10000)
# density=True replaces the older `normed` argument, which newer NumPy releases no longer accept
n, bins = np.histogram(experimental, bins=100, density=True)

# Mid points of the bins: the average of each pair of neighbouring edges
bins_mid_points = 0.5 * (bins[:-1] + bins[1:])

# Define the normal distribution as a function
def normal(x, sigma, mu):
    return np.exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * np.sqrt(2*np.pi))

# Fit the experimental data. A starting guess taken from the sample helps,
# because the default guess (all parameters equal to 1) is far from this data
p0 = [experimental.std(), experimental.mean()]
popt, pcov = curve_fit(normal, xdata=bins_mid_points, ydata=n, p0=p0)

# Plot both; align="edge" makes each bar start at its left bin edge
plt.bar(bins[:-1], n, width=np.diff(bins), align="edge")
plt.plot(bins_mid_points, normal(bins_mid_points, *popt), color="r", lw=3)
plt.show()

The red line shows the fitted curve; if required, you could also plot it as a histogram. popt is an array [sigma, mu] holding the parameters that best fit the data, while pcov is their estimated covariance matrix, which can be used to judge how well those parameters are determined.
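
As a rough illustration of how pcov can be used, the square roots of its diagonal give one-standard-deviation uncertainties on the fitted parameters. The sketch below repeats the fit on fake data with a fixed seed (the seed and the starting guess p0 are my own assumptions, not part of the code above). Since no per-bin errors are passed to curve_fit, the uncertainties are scaled from the fit residuals, so treat them as indicative only.

```python
import numpy as np
from scipy.optimize import curve_fit

def normal(x, sigma, mu):
    # Normal probability density with width sigma and mean mu
    return np.exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

# Fake data with a fixed seed (an assumption, only for reproducibility)
rng = np.random.default_rng(0)
experimental = rng.normal(10.0, 0.4, size=10000)
n, bins = np.histogram(experimental, bins=100, density=True)
bins_mid_points = 0.5 * (bins[:-1] + bins[1:])

# Starting guess from the sample itself (an assumption; any sensible guess works)
p0 = [experimental.std(), experimental.mean()]
popt, pcov = curve_fit(normal, bins_mid_points, n, p0=p0)

# One-standard-deviation uncertainties: square roots of the covariance diagonal
perr = np.sqrt(np.diag(pcov))
for name, value, err in zip(["sigma", "mu"], popt, perr):
    print(f"{name} = {value:.3f} +/- {err:.3f}")
```

If the off-diagonal entries of pcov are large compared to the diagonal ones, the two parameters are strongly correlated and the individual uncertainties should be read with care.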

Note that I normalised the histogram. This is because the function I defined is the normal probability density, which integrates to 1, so the histogram must have unit area to be comparable with it.
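
To make the normalisation concrete, here is a small check (a sketch on fake data with an arbitrary seed) that a density histogram has unit area, and how raw counts can be converted by hand. The variable names are only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary fixed seed, an assumption
experimental = rng.normal(10.0, 0.4, size=10000)

# Raw counts and the density version on the same bin edges
counts, bins = np.histogram(experimental, bins=100)
density, _ = np.histogram(experimental, bins=bins, density=True)
widths = np.diff(bins)

# Area under each histogram: sum of height * bin width
area_counts = np.sum(counts * widths)    # depends on sample size and bin width
area_density = np.sum(density * widths)  # 1.0 up to rounding

# Converting counts to a density by hand gives the same heights
manual_density = counts / (counts.sum() * widths)

print(area_counts, area_density)
```

If you prefer to fit the raw counts instead, the model function would need an extra amplitude parameter to absorb the factor of sample size times bin width.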

You need to think carefully about what distribution you expect and what statistic you're looking to get from it.
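
If your simulated histogram has no closed-form expression, one possible approach (a sketch under my own assumptions, not something curve_fit does for you) is to wrap the simulation in a function of the parameters and minimise the squared difference between the two histograms with a derivative-free method such as Nelder-Mead from scipy.optimize.minimize. Here simulate_histogram is a hypothetical stand-in for your simulation; it reuses the same random draws on every call so that the objective does not jitter between evaluations. The bin edges, seeds, and sample sizes are arbitrary choices.

```python
import numpy as np
from scipy.optimize import minimize

# Stand-in "experimental" sample; in practice this is your measured data
rng = np.random.default_rng(2)
experimental = rng.normal(10.0, 0.4, size=10000)

# Fixed bin edges shared by the experimental and simulated histograms
edges = np.linspace(8.0, 12.0, 41)
observed, _ = np.histogram(experimental, bins=edges, density=True)

# Random draws generated once and reused for every parameter set
base_draws = np.random.default_rng(3).standard_normal(200_000)

def simulate_histogram(params):
    """Hypothetical simulation: replace the body with your own model."""
    mu, sigma = params
    samples = mu + abs(sigma) * base_draws
    hist, _ = np.histogram(samples, bins=edges, density=True)
    return hist

def mismatch(params):
    # Sum of squared differences between simulated and observed bin heights
    return np.sum((simulate_histogram(params) - observed) ** 2)

# Start from rough estimates taken from the data
start = [experimental.mean(), experimental.std()]
result = minimize(mismatch, start, method="Nelder-Mead")
mu_fit, sigma_fit = result.x
print(mu_fit, abs(sigma_fit))
```

A derivative-free method is used because a histogram changes in small jumps as the parameters move, which tends to confuse the finite-difference gradients that curve_fit relies on. A chi-square or likelihood-based mismatch may suit your data better than the plain sum of squares.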