I was curious whether there is an as_formula specifier (like in statsmodels) for sklearn.tree.DecisionTreeClassifier in Python, or some way to hack one in. Currently, I must use
from sklearn import tree
clf = tree.DecisionTreeClassifier()
clf = clf.fit(X, Y)
but I would prefer to have something like
clf = clf.fit(formula='Y ~ X', data=df)
The reason is that I would like to specify more than one X without having to do a lot of array shaping. Thanks.
It's currently not possible, but it would be great to have a patsy interface for scikit-learn. I don't think anyone is working on it at the moment, though.
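In the meantime, one possible workaround is to let patsy build the arrays from a formula and pass them to the estimator yourself. The sketch below is only an illustration, not a scikit-learn feature: `fit_from_formula` is a hypothetical helper, and the column names `X1`, `X2`, `Y` and the toy data are made up. It assumes the labels are numeric, and it uses `- 1` in the formula to drop the intercept column, which a tree has no use for.

```python
import numpy as np
import pandas as pd
from patsy import build_design_matrices, dmatrices
from sklearn.tree import DecisionTreeClassifier


def fit_from_formula(estimator, formula, data):
    """Hypothetical helper: fit a scikit-learn estimator from a patsy formula."""
    # patsy returns the left-hand side (y) and right-hand side (X) as DataFrames.
    y, X = dmatrices(formula, data, return_type="dataframe")
    # scikit-learn expects a 2-D feature array and a 1-D label array.
    estimator.fit(X.to_numpy(), y.to_numpy().ravel())
    # Keep the design info so new data can be encoded the same way later.
    return estimator, X.design_info


# Toy data, invented for this example.
df = pd.DataFrame({
    "X1": [0, 1, 2, 3, 4, 5],
    "X2": [1, 0, 1, 0, 1, 0],
    "Y": [0, 0, 0, 1, 1, 1],
})

clf, info = fit_from_formula(
    DecisionTreeClassifier(random_state=0), "Y ~ X1 + X2 - 1", df
)

# Encode new rows with the same design info before predicting.
new_df = pd.DataFrame({"X1": [0, 5], "X2": [1, 0]})
X_new = np.asarray(build_design_matrices([info], new_df)[0])
pred = clf.predict(X_new)
```

Adding another predictor then only means extending the formula string, for example `"Y ~ X1 + X2 + X3 - 1"`, with no manual array reshaping. Note that patsy converts the numeric labels to floats, so the predictions come back as floats as well.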