AttributeError: 'ModelData' object has no attribute 'design_info'


I am unable to perform ANOVA on a linear regression model. I have attached simplified code below. Please let me know how to fix it.

import statsmodels.api as sm
import numpy as np

# define the data
x1 = np.random.rand(100)
x2 = np.random.rand(100)
y = 2*x1 + 3*x2 + np.random.normal(size=100)

# build the model with all independent variables
X = sm.add_constant(np.column_stack((x1, x2)))
model = sm.OLS(y, X).fit()

# perform the F-test
f_value, p_value, _ = sm.stats.anova_lm(model, typ=1)


From the anova_lm documentation (Notes section):

"Model statistics are given in the order of args. Models must have been fit using the formula api."

You need to use the formula API when defining your model:

import statsmodels.api as sm
import statsmodels.formula.api as smf
import numpy as np

x1 = np.random.rand(100)
x2 = np.random.rand(100)
y = 2*x1 + 3*x2 + np.random.normal(size=100)
X = sm.add_constant(np.column_stack((x1, x2)))

model = smf.ols("y ~ X", data={"y": y, "X": X}).fit()

print(sm.stats.anova_lm(model, typ=1))

Additionally, there is no need to add a constant to X here: the formula API includes an intercept automatically, so with add_constant the fitted model ends up with two intercept columns. What you were probably trying to achieve is:

import numpy as np
import statsmodels.formula.api as smf

X = np.random.rand(100, 2)
y = 2 * X[:, 0] + 3 * X[:, 1] + np.random.normal(size=100)

model = smf.ols("y ~ X", data={"y": y, "X": X}).fit()

print(model.params)