I'm having some problems with the predict function when using bayesglm. I've read some posts saying this problem can arise when the out-of-sample data has more factor levels than the in-sample data, but I'm using the same data for both fitting and predicting. predict works fine with a regular glm, but not with bayesglm. Example:
library(arm)  # bayesglm() comes from the arm package
control <- y ~ x1 + x2
# this works fine:
glmObject <- glm(control, data = myData, family = binomial())
predicted1 <- predict.glm(glmObject, myData, type = "response")
# this gives an error:
bayesglmObject <- bayesglm(control, data = myData, family = binomial())
predicted2 <- predict.bayesglm(bayesglmObject, myData, type = "response")
Error in X[, piv, drop = FALSE] : subscript out of bounds
# Edit... I just discovered this works.
# Should I be concerned about using these results?
# Not sure why it fails when I specify the dataset
predicted3 <- predict(bayesglmObject, type = "response")
Can't figure out how to predict with a bayesglm object. Any ideas? Thanks!
One possible reason has to do with the default setting of the "drop.unused.levels" argument in the bayesglm call. By default it is set to TRUE, so any unused factor levels are dropped while the model is being built. The predict function, however, still uses the original data, where the unused levels are present in the factor variable. This creates a mismatch in factor levels between the data used for model building and the data used for prediction, even though it is the same data frame (in your case, myData). An example is sketched below:
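(The toy myData here is hypothetical; the question's actual data is not shown. The key ingredient is a factor, x2, that declares a level "c" which never occurs in the data.)

library(arm)
# hypothetical data: x2 has an unused level "c"
myData <- data.frame(
  y  = c(0, 1, 0, 1, 1, 0),
  x1 = c(1.2, 3.4, 0.7, 2.2, 1.9, 0.3),
  x2 = factor(c("a", "a", "b", "b", "a", "b"), levels = c("a", "b", "c"))
)
# with the default drop.unused.levels = TRUE, level "c" is dropped during fitting,
# but predict() rebuilds the model matrix from myData, where "c" is still a level:
bayesglmObject <- bayesglm(y ~ x1 + x2, data = myData, family = binomial())
# predict(bayesglmObject, myData, type = "response")  # reportedly fails with "subscript out of bounds"

(Calling predict(bayesglmObject, type = "response") with no newdata, as in your predicted3, works because it simply returns the fitted values stored in the object, so no model matrix is rebuilt from myData.)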
I think a better way would be to use droplevels to drop the unused levels from the data frame beforehand and use it for both model building and prediction.
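For instance, continuing the hypothetical myData from the sketch above:

myData <- droplevels(myData)  # removes the unused level "c" from x2
bayesglmObject <- bayesglm(y ~ x1 + x2, data = myData, family = binomial())
predicted <- predict(bayesglmObject, myData, type = "response")  # levels now match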