I have found two methods for calculating VIF, but the resulting VIF values are very different. Why?
First method -
X contains the independent variables.
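import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

def compute_vif(considered_features):
    # No constant column is added here; VIF is computed on the features as-is
    features = X[considered_features]
    vif = pd.DataFrame()
    vif["Variable"] = features.columns
    vif["VIF"] = [variance_inflation_factor(features.values, i) for i in range(features.shape[1])]
    return vif

considered_features = list(X.columns)
compute_vif(considered_features).sort_values('VIF', ascending=False)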
Second method -
considered_features contains the names of the independent variables.
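def compute_vif(considered_features):
    # Copy so that adding the constant column does not modify df
    X = df[considered_features].copy()
    # Add a constant column so each auxiliary regression includes an intercept
    X['intercept'] = 1
    vif = pd.DataFrame()
    vif["Variable"] = X.columns
    vif["VIF"] = [variance_inflation_factor(X.values, i) for i in range(X.shape[1])]
    # Drop the constant's own row from the result
    vif = vif[vif['Variable'] != 'intercept']
    return vif

considered_features = list(df.columns)
compute_vif(considered_features).sort_values('VIF', ascending=False)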
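
To make the discrepancy reproducible, here is a minimal sketch on synthetic data. The names `demo` and `vif_table` and the data itself are made up for illustration and are not my real dataset. As far as I can tell, the only difference between the two methods is the constant column: `variance_inflation_factor` regresses each column on the others exactly as given and does not add an intercept itself.

```python
import numpy as np
import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)

# Hypothetical data: two independent predictors with large, nonzero means
demo = pd.DataFrame({
    "a": rng.normal(loc=50.0, scale=1.0, size=200),
    "b": rng.normal(loc=30.0, scale=1.0, size=200),
})

def vif_table(frame, add_intercept):
    """Compute VIF per column, optionally adding a constant column first."""
    X = frame.copy()
    if add_intercept:
        X["intercept"] = 1.0
    values = [variance_inflation_factor(X.values, i) for i in range(X.shape[1])]
    out = pd.DataFrame({"Variable": X.columns, "VIF": values})
    # Report only the real predictors, not the constant column
    return out[out["Variable"] != "intercept"].reset_index(drop=True)

vif_without = vif_table(demo, add_intercept=False)  # corresponds to the first method
vif_with = vif_table(demo, add_intercept=True)      # corresponds to the second method

print(vif_without)
print(vif_with)
```

On data like this the first table shows far larger VIFs than the second, even though `a` and `b` are generated independently. Without a constant column each auxiliary regression is fit through the origin and statsmodels reports an uncentered R², so the shared nonzero means alone are enough to inflate the values.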