R party conditional varimp error


I have a data set with ~3500 observations, 6 predictor variables (all categorical), a response variable and a column of weights. Each predictor has between 2 and 7 levels.

I have defined indicator variables for the levels of each predictor variable, for example

retail <- Trade == "RETAIL"

where Trade is one of the "main" variables and "RETAIL" is one of the values it can take.

I run into problems when trying to calculate the conditional variable importance using:

rf <- cforest(Actual ~ comp + tpft + abi1 + abi2 + 
              abi3 + abi4 + abi5 + abi6 + abio + builders + 
              clerical + manufacturing + othertrade + retail + 
              tradeunk + wholesale + firstrenewal + newbusiness + 
              renewedtwice + MTyes + MTno + ly9 + ly10 + ly11 + ly12 + ly13, 
              data=table, weights=Expected, controls=data.controls)

imp <- varimp(rf, conditional=TRUE)

where comp, tpft, etc. are the indicator variables for the categories that the main variables can take.

This returns the error:

Error in names(blocks) <- cond : 
'names' attribute [24] must be the same length as the vector [12]

And I have no idea how to fix it! traceback gives:

> traceback()
2: conditional_perm(ccl, xnames, input, tree, oob)
1: varimp(rf, conditional = TRUE)

This method works when I only test the 6 main variables, and setting conditional=FALSE with the indicator variables also works. So I'm fairly sure the problem is that the number of indicator variables doesn't match the number of something else. Any help would be hugely appreciated.


There is 1 best solution below


I had the same error and, after some experimenting with my data, found that it only happened when logical predictor variables were included. Converting the logical variables to numeric solved the problem for me. You don't say that your predictors are logical variables, but perhaps it's a direction to look.
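
For what it's worth, a comparison such as Trade == "RETAIL" does produce a logical vector in R, so the indicators in the question may well be logical. Below is a minimal sketch of the conversion, using a made-up data frame `dat` in place of your `table` (the column values are hypothetical, and I haven't confirmed that this resolves the error for your data):

```r
# Toy data standing in for the asker's `table`; names and values are hypothetical.
set.seed(1)
dat <- data.frame(
  Actual = rnorm(20),
  Trade  = sample(c("RETAIL", "WHOLESALE", "BUILDERS"), 20, replace = TRUE)
)

# Indicators built as in the question: these are logical (TRUE/FALSE) columns.
dat$retail    <- dat$Trade == "RETAIL"
dat$wholesale <- dat$Trade == "WHOLESALE"

# Find the logical columns and recode them as 0/1 numerics.
is_lgl <- vapply(dat, is.logical, logical(1))
dat[is_lgl] <- lapply(dat[is_lgl], as.numeric)

# Check the column types before refitting.
str(dat)

# Then refit and retry, e.g. (using the asker's own calls, not run here):
# rf  <- cforest(Actual ~ retail + wholesale, data = dat, controls = data.controls)
# imp <- varimp(rf, conditional = TRUE)
```

Running `sapply(table, class)` on the real data first would show whether any predictor is actually logical.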