I'm reading through tests/usersplits.R in rpart, trying to get my head around how to pass data to the split function.
I'm trying to implement several goodness measures that compare treatment vs. control within each side of a split.
A simple example is the difference-in-differences estimate for a candidate split,
(Y_t - Y_c)_L - (Y_t - Y_c)_R, i.e. the treatment-minus-control difference on the left side minus the treatment-minus-control difference on the right side.
To compute this, I need to know whether each Y value is a treatment Y or a control Y. The documentation in usersplits.R says that Y is provided in the sort order of X, so I'm not sure how to pass a vector indicating treatment vs. control that will stay aligned with Y after that sorting.
rpart.poisson takes a two-column matrix as input, and I've been trying to mimic that, but there's no documentation in rpart.poisson (I can't figure out where it creates the 'vector of goodness').
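
For concreteness, here is a minimal sketch of the computation I have in mind. It assumes, without my having confirmed it, that a two-column response such as cbind(Y, treat) reaches the split function as a matrix whose rows are already in the sort order of X. The name did_goodness, the column layout, and the sign convention used for direction are all hypothetical; nothing here is taken from rpart itself, and the function is plain base R that does not call rpart.

```r
# Hypothetical helper: difference-in-differences score for every cut point.
# y: two-column matrix, column 1 = outcome, column 2 = treatment flag (1/0).
#    ASSUMPTION: rows are already sorted by the candidate split variable X.
did_goodness <- function(y) {
  out   <- y[, 1]
  treat <- y[, 2]
  n     <- nrow(y)
  idx   <- seq_len(n - 1)          # cut i puts rows 1..i on the left

  # Running sums and counts for the left side of each cut
  sum_t_L <- cumsum(out * treat)[idx]
  n_t_L   <- cumsum(treat)[idx]
  sum_c_L <- cumsum(out * (1 - treat))[idx]
  n_c_L   <- cumsum(1 - treat)[idx]

  # Right side = node totals minus left side
  sum_t_R <- sum(out * treat) - sum_t_L
  n_t_R   <- sum(treat) - n_t_L
  sum_c_R <- sum(out * (1 - treat)) - sum_c_L
  n_c_R   <- sum(1 - treat) - n_c_L

  # (Y_t - Y_c)_L - (Y_t - Y_c)_R, using group means
  effect_L <- sum_t_L / n_t_L - sum_c_L / n_c_L
  effect_R <- sum_t_R / n_t_R - sum_c_R / n_c_R
  did      <- effect_L - effect_R

  # A cut that leaves either side without both groups is scored 0
  valid <- n_t_L > 0 & n_c_L > 0 & n_t_R > 0 & n_c_R > 0
  did[!valid] <- 0

  # ASSUMPTION: larger goodness is better, and -1 / 1 encode left / right
  list(goodness = abs(did), direction = ifelse(did >= 0, -1, 1))
}

# Toy data, already in the sort order of X
y   <- cbind(out = c(5, 1, 3, 2, 4, 4), treat = c(1, 0, 1, 0, 1, 0))
res <- did_goodness(y)
res$goodness   # one score per cut point, i.e. nrow(y) - 1 values
```

If the assumption about row order holds, keeping the treatment flag as a second column of the response means it is reordered together with Y, which is what I'm hoping a user-written split function would let me rely on.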