Inconsistent ATE Estimation - EconML


I tried the use case from the DoubleML example on the impact of 401(k) on financial assets. In DoubleML, they conclude that participation in the pension scheme has a significant positive effect on financial assets. I got results similar to theirs.

But when I tried to reproduce the same example using EconML, I got a negative treatment effect:

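from doubleml.datasets import fetch_401K
from econml.dml import CausalForestDML
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.model_selection import train_test_split

# Load the 401(k) data set through DoubleML
data = fetch_401K(return_type='DataFrame')
# Covariates, outcome (net financial assets) and treatment (401(k) eligibility)
X = data[['age', 'p401', 'educ', 'fsize', 'marr', 'twoearn', 'db', 'pira', 'hown']].values
Y = data['net_tfa'].values
T = data['e401'].values
X_train, X_test, y_train, y_test, T_train, T_test = train_test_split(X, Y, T, test_size=0.2, random_state=42)
# Causal forest with random-forest nuisance models and 3-fold cross-fitting
est_401 = CausalForestDML(model_y=RandomForestRegressor(),
                          model_t=RandomForestClassifier(min_samples_leaf=10),
                          discrete_treatment=True, cv=3)
est_401.fit(y_train, T_train, X=X_train, W=None, cache_values=True)
# ATE and its confidence interval, evaluated on the full sample
print(est_401.ate(X, T0=0, T1=1))
print(est_401.ate_interval(X, T0=0, T1=1))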

Output: -3561.8794269648424 (-55233.03792924924, 48109.27907531955)

Could anyone please point out what I am doing wrong here?
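
One thing that may be worth checking (a guess, not a confirmed diagnosis): the covariate list above contains `p401`, which in this data set is the participation indicator, while the treatment is eligibility (`e401`). Participation is decided after eligibility, so controlling for it can absorb the effect that runs through it. It is also worth comparing the covariate list against the one used in the DoubleML example. The sketch below shows this effect on synthetic data only. The data-generating process, the variable names (`x`, `elig`, `part`), the helper `ols_coef`, and the use of plain OLS in place of `CausalForestDML` are all assumptions made for illustration, and it does not by itself explain a negative sign.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical data-generating process (NOT the 401(k) data):
x = rng.normal(size=n)                                   # pre-treatment confounder
p_elig = 1.0 / (1.0 + np.exp(-x))                        # eligibility depends on x
elig = (rng.uniform(size=n) < p_elig).astype(float)      # treatment
part = elig * (rng.uniform(size=n) < 0.7)                # post-treatment: only the eligible can participate
y = 5.0 * x + 10.0 * part + rng.normal(size=n)           # outcome is affected only via participation

def ols_coef(columns, target):
    """Hypothetical helper: OLS coefficients of target on [1, columns...]."""
    design = np.column_stack([np.ones(len(target))] + columns)
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef

# Effect of eligibility, adjusting only for the pre-treatment confounder.
effect_ok = ols_coef([elig, x], y)[1]
# Same regression with the post-treatment variable added as a control.
effect_blocked = ols_coef([elig, x, part], y)[1]

print("adjusting for x only:     ", effect_ok)
print("adjusting for x and part: ", effect_blocked)
```

In this setup the total effect of eligibility is 10 * 0.7 = 7 by construction. The first regression should land near that value, while the second should land near zero because `part` carries the whole effect.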
