I am trying to estimate a self-defined likelihood function
import numpy as np
import pandas as pd
from scipy.optimize import minimize
def lik(params):
p_s = params[0]
lamda_s=params[1]
lamda_bl=params[2]
phi_bl=params[3]
p_hat=params[4]
a1_bl = 4 * df["s1"] + 1 * df["s2"] + 6 * df["s3"]
a2_bl = 0 * df["s1"] + 2 * df["s2"] + 8 * df["s3"]
a3_bl = 10 * df["s1"] + 4 * df["s2"] + 0 * df["s3"]
p1_bl = np.exp(lamda_bl*(a1_bl-a2_bl))/(np.exp(lamda_bl * (a1_bl - a2_bl))+1 + np.exp(lamda_bl*(a3_bl-a2_bl)))
p2_bl = 1 / (np.exp(lamda_bl*(a1_bl-a2_bl))+1+np.exp(lamda_bl * (a3_bl-a2_bl)))
p3_bl = np.exp(lamda_bl*(a3_bl-a2_bl))/(np.exp(lamda_bl * (a1_bl - a2_bl))+1 + np.exp(lamda_bl*(a3_bl-a2_bl)))
df['f_bl'] = p1_bl*y1+p2_bl*y2+p3_bl*y3
s1_s = p1_s * p_hat + p1_bl * (1- p_hat)
s2_s = p2_s * p_hat + p2_bl * (1 - p_hat)
s3_s = p3_s * p_hat + p3_bl * (1 - p_hat)
a1_s = 4 * s1_s + 1 * s2_s + 6 * s3_s
a2_s = 0 * s1_s + 2 * s2_s + 8 * s3_s
a3_s = 10 * s1_s + 4 * s2_s + 0 * s3_s
p1_s = np.exp(lamda_s*(a1_s-a2_s))/(np.exp(lamda_s * (a1_s - a2_s)) + 1 + np.exp(lamda_s*(a3_s-a2_s)))
p2_s = 1 / (np.exp(lamda_s*(a1_s-a2_s))+1+np.exp(lamda_s * (a3_s-a2_s)))
p3_s = np.exp(lamda_s*(a3_s-a2_s))/(np.exp(lamda_s * (a1_s - a2_s))+1 + np.exp(lamda_s*(a3_s-a2_s)))
df['f_s'] = p1_s*y1+p2_s*y2+p3_s*y3
pcodes = df["pcode"].unique()
ff_bl = np.array([np.prod(df['f_bl'][df['pcode'] == i]) for i in pcodes])
ff_s = np.array([np.prod(df['f_s'][df['pcode'] == i]) for i in pcodes])
pp = p_s * ff_s + (1-p_s)*ff_bl
li=np.log(pp)
LL=np.sum(li)
return -LL
results = minimize(lik,np.array([0.6, 0.6, 0.1, 0.3,0.2]),method= 'L-BFGS-B')
print(results)
s1_s, a1_s and p1_s are interdependent, and so I got an error UnboundLocalError: local variable 'p1_s' referenced before assignment.
Is there a way to write such likelihood function without solving it myself?
You need to add one of the variable series in your cyclic dependency as degrees of freedom, and then add a nonlinear constraint that tells the solver to optimize while holding the equality true.
There's a lot of code in your program that is either under-vectorised or mis-vectorised but I cannot help with most of it, because you're missing a pile of variables in your question and so I've had to fill in numerous ("bogus") gaps to make this reproducible. It's critical that you add reasonable bounds.