I'm trying to use the bartCause package to build an uplift model in R. Unfortunately, I'm having trouble adding the predictions to the data frame correctly. The error message is:
Error in `$<-.data.frame`(`*tmp*`, "lift", value = c(0.159231848781688, :
replacement has 160 rows, data has 2595
Code used:
x = as.matrix(calibration[,-c(1:3)])
y = calibration$churn
z = calibration$treatment
bart = bartc(y, z, x,
             method.trt = "bart",
             method.rsp = "bart",
             estimand = "att", # average treatment effect on the treated
             n.samples = 20L,
             n.chains = 8L, # Integer specifying how many independent tree sets and fits should be calculated.
             n.burn = 10L,
             n.threads = 4L, # Integer specifying how many threads to use for parallelization
             n.trees = 1000L,
             keepTrees = TRUE, # necessary for prediction!
             verbose = FALSE)
pred_uplift <- predict(bart, validation[,-c(1:3)], combineChains = TRUE)
pred <- pred_uplift
validation$lift <- - pred[,1] + pred[,2]
calibration data: 2595 obs. of 15 variables
validation data: 2595 obs. of 15 variables
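Here is what I found about your code and the error message. The "predict" method in the package produces a 160 x N matrix, where N is the number of rows in your validation dataset. The 160 matches n.samples x n.chains = 20 x 8 from your call, combined into one matrix because combineChains = TRUE. Each column of this matrix therefore corresponds to a row in your validation dataset, so pred[,1] has 160 elements rather than 2595, which is exactly what the error message reports. You first need to transpose the matrix: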
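A minimal sketch of that transpose step. It uses a simulated matrix as a stand-in for the real predict() output, so it runs without bartCause; the names n_draws and n_obs and the random values are assumptions for illustration only, not what the model returns.

```r
# Stand-in for predict(bart, validation[,-c(1:3)], combineChains = TRUE):
# one row per posterior draw, one column per validation row.
set.seed(1)
n_draws <- 20L * 8L   # n.samples * n.chains from the bartc() call
n_obs   <- 2595L      # number of rows in the validation data
pred_uplift <- matrix(rnorm(n_draws * n_obs), nrow = n_draws, ncol = n_obs)

dim(pred_uplift)      # 160 2595

# Transpose so that each row lines up with one validation row.
pred <- t(pred_uplift)
dim(pred)             # 2595 160
```

With the real objects, the same idea would be pred <- t(pred_uplift) in place of pred <- pred_uplift.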
Then, you can calculate the "lift" variable using whatever column you need from the "pred" matrix:
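A sketch of that assignment, again with simulated stand-ins: the validation data frame and the random values here are hypothetical, not the asker's data. After transposing, every column has one value per validation row, so the assignment no longer fails on the row count.

```r
set.seed(1)
# Hypothetical stand-ins for the real validation data and transposed predictions.
validation <- data.frame(id = seq_len(2595L))
pred <- t(matrix(rnorm(160L * 2595L), nrow = 160L))  # 2595 x 160

# Same expression as in the question; both columns now have 2595 elements.
validation$lift <- -pred[, 1] + pred[, 2]

nrow(validation)          # 2595
length(validation$lift)   # 2595
```

If, as the 160 = 20 x 8 count suggests, each column of the transposed matrix is one posterior draw, then columns 1 and 2 are simply the first two draws; a summary over all draws such as rowMeans(pred) may be closer to what is wanted, but that is an assumption about the intent rather than something the question states.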
BTW: I don't know what each column of the "pred" matrix represents, or why you use the first two columns (I assume that you do), but the above code works.