How to pass a formula object to DirichReg (setting up for function)


I am trying to pass a formula object to a Dirichlet regression, using the DirichReg() function from the DirichletReg package in R. As shown below, the function does not seem to accept a formula stored in a variable, but nothing in the documentation notes this limitation. The reason for this workflow is that I am trying to set up a cross-validation function that can be applied over a list of different formulas (i.e., with different covariates) and return the out-of-sample predictive ability to help with model selection.

library(DirichletReg)

df <- ArcticLake  # plug-in your data here
df$Y <- DR_data(df[,1:3])  # prepare the Y's
Warning in DR_data(df[, 1:3]) :
  not all rows sum up to 1 => normalization forced

formula <- reformulate(termlabels = "depth", response="Y")

mod <- DirichReg(formula, df)

Error: object of type 'symbol' is not subsettable
Error during wrapup: 

mod <- DirichReg(Y~depth, df)

str(Y~depth)

Class 'formula'  language Y ~ depth
  ..- attr(*, ".Environment")=<environment: R_GlobalEnv> 

str(formula)

Class 'formula'  language Y ~ depth
  ..- attr(*, ".Environment")=<environment: R_GlobalEnv> 

formula <- as.formula("Y~depth")
mod <- DirichReg(formula, df)

Error: object of type 'symbol' is not subsettable
Error during wrapup: 

There doesn't seem to be any difference between my 'formula' object and the formula as specified in the working DirichReg call.

My guess is that it has something to do with the way the response variable is formatted by the DR_data command, but I can't figure out a way around this so that I can specify formulas on the fly in a function.

> str(df$Y)
 DirichletRegData [1:39, 1:3] 0.775 0.719 0.507 0.524 0.7 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:39] "1" "2" "3" "4" ...
  ..$ : chr [1:3] "sand" "silt" "clay"
 - attr(*, "Y.original")='data.frame':  39 obs. of  3 variables:
  ..$ sand: num [1:39] 0.775 0.719 0.507 0.522 0.7 0.665 0.431 0.534 0.155 0.317 ...
  ..$ silt: num [1:39] 0.195 0.249 0.361 0.409 0.265 0.322 0.553 0.368 0.544 0.415 ...
  ..$ clay: num [1:39] 0.03 0.032 0.132 0.066 0.035 0.013 0.016 0.098 0.301 0.268 ...
 - attr(*, "dims")= int 3
 - attr(*, "dim.names")= chr [1:3] "sand" "silt" "clay"
 - attr(*, "obs")= int 39
 - attr(*, "valid_obs")= int 39
 - attr(*, "normalized")= logi TRUE
 - attr(*, "transformed")= logi FALSE
 - attr(*, "base")= num 1

There are 3 best solutions below

0
On BEST ANSWER

@Smiley Bcc may have been hinting at this, but it appears that you have to call as.formula() inside the DirichReg() call itself. From your example data above:

> f <- as.formula('Y~depth')
> mod <- DirichReg(f, df)
Error: object of type 'symbol' is not subsettable

> f <- 'Y~depth'
> mod <- DirichReg(as.formula(f), df)

Interestingly, it doesn't work (probably for different reasons) when you literally name the object "formula":

> formula <- 'Y~depth'
> mod <- DirichReg(as.formula(formula), df)
Error: object of type 'closure' is not subsettable

I assume there's some kind of direct reference to an object called formula inside the DirichReg() function, so avoid giving your object that name.
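One way to see why a bare variable might fail while an inline call succeeds is a toy function that inspects its own unevaluated call. The sketch below is a hypothetical stand-in (`toy_fit` is not part of DirichletReg, and this is not its actual source). It only shows that subsetting the captured argument works for a literal formula or an as.formula() call, but raises the same 'symbol' error for a variable name.

```r
# Hypothetical stand-in for a fitting function that inspects its call.
# This is NOT DirichReg's source, only an illustration of the error.
toy_fit <- function(formula, data = NULL) {
  cl  <- match.call()
  arg <- cl$formula   # the argument as typed: a call or a bare symbol
  arg[[2L]]           # fine for a call, fails for a symbol
}

f <- Y ~ depth

toy_fit(Y ~ depth)       # argument is the call `Y ~ depth`
toy_fit(as.formula(f))   # argument is the call `as.formula(f)`

# Here the argument is the symbol `f`, so the subsetting step errors
res <- tryCatch(toy_fit(f), error = function(e) conditionMessage(e))
print(res)
```

If something like this happens inside DirichReg(), it would be consistent with the first error in the question. It does not by itself explain the 'closure' error.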

0
On

Additionally, if trying to use @dmp's workaround in a function, one needs to assign the formula object to the global environment.

See issue:

library(DirichletReg)
library(magrittr)  # provides %>%

df <- ArcticLake  # plug-in your data here
df$Y <- DR_data(df[,1:3])  # prepare the Y's

f <- reformulate(termlabels = "depth", response="Y")

mod <- DirichReg(f %>% as.formula, df)

runReg <- function(this.formula, data) {

  message(this.formula)

  mod <- DirichReg(as.formula(this.formula), data)

  return(mod)

}

res <- runReg("Y~depth", df)

Y~depth
Error in as.formula(this.formula) : object 'this.formula' not found

And solution:

runReg <- function(this.formula, data) {

  message(this.formula)

  f <<- this.formula  # assign in the global environment so DirichReg can find it

  mod <- DirichReg(as.formula(f), data)

  return(mod)

}

res <- runReg("Y~depth", df)

This is a pretty hacky way to go about it, and it could silently overwrite an existing global object named f, so I would be interested to see if anyone else has ideas for other solutions.
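A possible alternative to the global assignment, sketched under assumptions: build the formula inside the wrapper and hand the evaluated object to the fitting function with do.call(), so that the call contains the formula itself rather than a variable name. `fit_with_formula` is a hypothetical helper, and the pattern is demonstrated here with lm() only. Whether it avoids the error in DirichReg() for a given package version is an assumption that would need to be verified.

```r
# Hypothetical helper: splice an evaluated formula into the call.
fit_with_formula <- function(fit_fun, formula_text, data) {
  f <- as.formula(formula_text)
  # do.call() passes the formula object itself, not the symbol `f`
  do.call(fit_fun, list(f, data))
}

# Demonstration with lm() on a built-in data set
m <- fit_with_formula(lm, "mpg ~ wt", mtcars)
print(coef(m))

# Assumed (untested) usage with the objects from the question:
# res <- fit_with_formula(DirichReg, "Y ~ depth", df)
```

Because nothing is written outside the function's own frame, this would sidestep the risk of clobbering a global object.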

1
On

You can try passing the formula as a character string instead, then converting it to a formula with as.formula():

as.formula("z ~ x + y")
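For reference, a small base-R sketch showing that a string converted with as.formula() gives the same formula as one built with reformulate(). The variable names `z`, `x`, and `y` are placeholders. Note that the question reports a converted formula stored in a variable still failing, so the conversion may need to happen inside the DirichReg() call, as the other answers describe.

```r
# Two ways to build the same formula from strings
f1 <- as.formula("z ~ x + y")
f2 <- reformulate(c("x", "y"), response = "z")

class(f1)      # both are objects of class "formula"
all.vars(f1)   # variable names used in the formula

# Compare the text of the two formulas
same <- identical(deparse(f1), deparse(f2))
print(same)
```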