Curve fitting a Zipf distribution in Python (matplotlib)

I tried to fit the data plotted below (red dots) with the Zipf distribution PDF, F ~ x^(-a), in Python. I simply chose a = 0.56 and plotted y = x^(-0.56), which gave the curve shown below.

The curve is obviously wrong. I don't know how to do the curve fitting.

[Figure: the data (red dots) together with the curve y = x^(-0.56)]

There are 2 best solutions below

Not sure exactly what you are looking for, but if you want to fit a model (function) to data, use scipy.optimize.curve_fit:

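from scipy.optimize import curve_fit
from scipy.special import zeta


def f(x, a):
    # Zipf model x**(-a) / zeta(a). Note that zetac(a) is zeta(a) - 1,
    # so it is not the right normalizer. zeta(a) is finite only for a > 1.
    return x**(-a) / zeta(a)


# x, y: 1-D arrays holding the ranks and the observed frequencies
# p0 starts above 1 because the normalization requires a > 1
popt, pcov = curve_fit(f, x, y, p0=[1.5])

print(popt)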

If you don't trust the normalization, add a second parameter b and fit that as well.
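A minimal sketch of that two-parameter fit, assuming b is meant as a free multiplicative scale factor that replaces the fixed normalization. The function name power_law and the synthetic data are illustrative stand-ins, not the asker's data:

```python
import numpy as np
from scipy.optimize import curve_fit


def power_law(x, a, b):
    # b is a free scale factor used instead of a fixed normalization
    return b * x ** (-a)


# Synthetic stand-in for the real data: ranks 1..50 with known parameters
# (a = 0.56, b = 3.0) and 2% multiplicative noise
rng = np.random.default_rng(0)
x = np.arange(1, 51, dtype=float)
y = power_law(x, 0.56, 3.0) * (1 + 0.02 * rng.standard_normal(x.size))

# Fit both the exponent and the scale, starting from a rough guess
popt, pcov = curve_fit(power_law, x, y, p0=[0.5, 1.0])
a_fit, b_fit = popt
print(a_fit, b_fit)
```

With a free scale the exponent is no longer restricted to a > 1, which matters if the data really do fall off as slowly as x^(-0.56).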

You need an intercept in your log-log plot; right now it is 0.

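Saying that the frequency follows the inverse rank means that there is a constant ratio k between the frequency and the inverse rank raised to the power a, so the model you need to fit is:

F ~ x^(-a)  =>  F = k / x^a

Taking logarithms gives log F = log k - a*log x, so log k is the intercept of the log-log line. Plotting y = x^(-0.56) fixes k = 1, which forces that intercept to 0.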
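As a rough sketch of estimating k and a together, one option is to fit a straight line to (log x, log F): the slope is then -a and the intercept is log k. The data below are synthetic, with k = 5 and a = 0.56 chosen purely for illustration:

```python
import numpy as np

# Synthetic stand-in for the real data (assumed values, not the asker's)
x = np.arange(1, 101, dtype=float)   # ranks
F = 5.0 * x ** (-0.56)               # frequencies with k = 5, a = 0.56

# Straight-line fit in log-log space: log F = intercept + slope * log x
slope, intercept = np.polyfit(np.log(x), np.log(F), 1)
a = -slope
k = np.exp(intercept)
print(a, k)
```

A log-space fit weights relative errors equally across all ranks, so on noisy data it can give somewhat different estimates than a least-squares fit on the raw frequencies.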