How can you carry out ordinal regression/classification using tidymodels or caret in R?

I know that the R packages `tidymodels` and `caret` both support multinomial classification. However, does either package support ordinal regression/classification? Here is what I mean by "ordinal regression/classification":

  • Regular regression: estimate numbers, e.g., numeric values ranging from 1 to 250
  • Binary classification: estimate binary values, e.g., 0 or 1; positive or negative; yes or no
  • Multinomial classification: estimate one of more than two categories, e.g., rock|paper|scissors; red|white|green. Crucially, there is no natural order among these categories.
  • Ordinal regression/classification: estimate ordered categories, e.g., 1st|2nd|3rd; gold|silver|bronze|no-medal; very-low|low|medium|high|very-high. Crucially, there IS a natural order among the categories.

Ordered categories are usually treated as either regression or classification tasks, and neither treatment is really appropriate. They are often treated as regression problems by converting the categories to numbers (numeric type in R), but this is questionable when there is no consistently meaningful interval distance between categories, or when decimal fractions between the categories are meaningless in the real world. (E.g., if gold|silver|bronze|no-medal is converted to 3|2|1|0, then what would an estimated score of 2.5 mean? Is the distance between gold and silver the same as the distance between bronze and no medal?) Ordered categories are also often treated as classification problems by converting them to an unordered factor in R and then simply carrying out multinomial classification. But this is also unsatisfactory because it completely ignores the order among the categories, thus discarding real-world information that should help improve the performance of the classification.

So, the appropriate technique for estimating such outcomes is to encode them as ordered factors in R and then to apply ordinal regression (also called ordinal classification), a form of logistic regression that is adapted to ordinal categorical outcomes. R implements ordinal regression in functions such as MASS::polr and rms::orm, among others. This is a classification technique that preserves and uses the order information of the outcome variable to improve the estimates beyond what regular regression or multinomial classification can do.
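To make that concrete, here is a minimal sketch of the approach with `MASS::polr`. It is only an illustration: the `medal` vector is made up, and the `housing` data set bundled with MASS (whose outcome `Sat` is already an ordered factor) stands in for real data.

```r
library(MASS)  # recommended package bundled with standard R installations

# Hypothetical outcome: encode medal categories as an ordered factor
medal <- factor(c("gold", "bronze", "no-medal", "silver"),
                levels = c("no-medal", "bronze", "silver", "gold"),
                ordered = TRUE)
medal_order_ok <- medal[1] > medal[2]  # comparisons respect the level order

# Stand-in data from MASS: Sat is an ordered factor (Low < Medium < High)
data(housing, package = "MASS")

# Proportional-odds logistic regression; Freq holds the cell counts
fit <- polr(Sat ~ Infl + Type + Cont, data = housing,
            weights = Freq, Hess = TRUE)

# Most likely category and per-category probabilities for each row
pred_class <- predict(fit, newdata = housing, type = "class")
pred_prob  <- predict(fit, newdata = housing, type = "probs")
```

The key point is that the outcome is an ordered factor, so the model estimates a single set of coefficients plus ordered cut points instead of a separate set of coefficients per category.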

However, these functions are normally used for statistical analyses rather than for machine learning. Machine learning frameworks in R like tidymodels and caret do not seem to have interfaces for them. So, is there any built-in support for ordinal regression/classification in `tidymodels` and `caret`, or in another machine learning framework in R?

Best answer

There is an entire section of this book devoted to ordinal outcomes in caret. Have a look there.
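As a rough sketch of what this can look like, the example below assumes the `"polr"` method from caret's model list, which wraps `MASS::polr`. The `housing` data from MASS is only a stand-in data set, and `housing_long` is a name made up for this example.

```r
library(caret)
library(MASS)

data(housing, package = "MASS")

# housing is a frequency table; expand it to one row per respondent
# so that resampling works on individual observations
rows <- rep(seq_len(nrow(housing)), times = housing$Freq)
housing_long <- housing[rows, c("Sat", "Infl", "Type", "Cont")]

set.seed(123)
fit <- train(
  Sat ~ Infl + Type + Cont,
  data = housing_long,
  method = "polr",                             # caret wrapper around MASS::polr
  tuneGrid = data.frame(method = "logistic"),  # link function is the tuning parameter
  trControl = trainControl(method = "cv", number = 5)
)

# Predicted satisfaction category for each respondent
pred <- predict(fit, newdata = housing_long)
```

The default resampling metrics here are still accuracy and kappa, which do not account for how far off a wrong category is, so you may want to supply your own summary function if that distance matters.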