Using bartCause package to predict uplift


I'm trying to use the bartCause package to build an uplift model in R. Unfortunately, I'm having trouble adding the predictions to the data frame correctly. The error message is:

Error in `$<-.data.frame`(`*tmp*`, "lift", value = c(0.159231848781688,  : 
  replacement has 160 rows, data has 2595

Code used:

x = as.matrix(calibration[,-c(1:3)])
y = calibration$churn
z = calibration$treatment

bart = bartc(y, z, x,
             method.trt = "bart",
             method.rsp = "bart",
             estimand = "att",  # average treatment effect on the treated
             n.samples = 20L,
             n.chains = 8L,     # how many independent tree sets and fits to calculate
             n.burn = 10L,
             n.threads = 4L,    # how many threads to use for parallelization
             n.trees = 1000L,
             keepTrees = TRUE,  # necessary for prediction!
             verbose = FALSE)

pred_uplift <- predict(bart, validation[,-c(1:3)], combineChains = TRUE)
pred <- pred_uplift
validation$lift <- -pred[,1] + pred[,2]
  

calibration data: 2595 obs. of 15 variables

validation data: 2595 obs. of 15 variables

There are 1 best solutions below

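Here is what I found about your code and the error message. The "predict" method in the package returns a 160 x N matrix, where N is the number of rows in your validation dataset and 160 is the number of posterior draws (n.samples = 20 times n.chains = 8). Each column of this matrix therefore corresponds to one row of your validation dataset, so pred[,1] has 160 elements instead of N, which is what the error message complains about. You first need to transpose the matrix: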

pred = t(pred_uplift)

Then, you can calculate the "lift" variable using whatever column you need from the "pred" matrix:

validation$lift = -pred[,1] + pred[,2]
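
If you want to check the shapes without refitting the model, here is a minimal sketch in base R. It uses a random matrix as a stand-in for the output of "predict" (so the values themselves are meaningless), and it assumes 160 draws from n.samples = 20 and n.chains = 8 and a validation set of 2595 rows, as in the question:

```r
set.seed(1)

n_obs   <- 2595L       # rows in the validation data
n_draws <- 20L * 8L    # n.samples * n.chains = 160

# Stand-in for predict(bart, ..., combineChains = TRUE): draws x observations.
# Random numbers only; this is NOT real model output.
pred_uplift <- matrix(rnorm(n_draws * n_obs), nrow = n_draws, ncol = n_obs)
validation  <- data.frame(id = seq_len(n_obs))

dim(pred_uplift)           # rows = draws, columns = observations
length(pred_uplift[, 1])   # one value per draw, not one per observation

# Assigning a length-160 vector to a 2595-row data frame fails;
# capture the message instead of stopping the script.
err <- tryCatch({
  validation$lift <- pred_uplift[, 1]
  NULL
}, error = function(e) conditionMessage(e))
print(err)

# After transposing, every column has one value per observation.
pred <- t(pred_uplift)
validation$lift <- -pred[, 1] + pred[, 2]
nrow(validation) == length(validation$lift)
```

The assignment only succeeds once the vector on the right-hand side has one element per row of "validation", which is what the transpose arranges.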

BTW: I don't know what each column of the "pred" matrix represents or why you use the first two columns (I assume you do), but the above code runs without the error.
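
On the meaning of the columns: since the 160 values per observation are posterior draws, the first two columns of the transposed matrix are probably just two individual draws, and summarizing across all draws may be closer to what you want. The sketch below is an assumption on my part, not something taken from your code. It uses random stand-in matrices for the draws of the outcome under control and under treatment. As far as I can tell from the bartCause documentation, "predict" has a "type" argument with options such as "mu.0", "mu.1" and "icate" that should give you such draws, but please verify that against your installed version:

```r
set.seed(42)

n_obs   <- 5L
n_draws <- 160L

# Hypothetical stand-ins for posterior draws (draws x observations).
# With a real fit these might come from something like
#   predict(bart, newdata, type = "mu.0") and type = "mu.1"
# (assumed usage; check the package documentation).
mu0_draws <- matrix(rnorm(n_draws * n_obs, mean = 0.30, sd = 0.05), nrow = n_draws)
mu1_draws <- matrix(rnorm(n_draws * n_obs, mean = 0.20, sd = 0.05), nrow = n_draws)

# Per-draw treatment effect for every observation.
effect_draws <- mu1_draws - mu0_draws

# Posterior mean effect per observation: one number per validation row.
lift <- colMeans(effect_draws)

# 95% interval per observation, as a 2 x n_obs matrix.
lift_interval <- apply(effect_draws, 2, quantile, probs = c(0.025, 0.975))

print(lift)
print(lift_interval)
```

Here "lift" has one element per observation, so it can be assigned to a data frame column directly, and the interval gives a rough idea of how uncertain each estimate is.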