Require exactly one set of multiple arguments in R


I'm in the process of developing a framework in R and would like one of my functions to be multi-purposed. I would like to achieve this by requiring that exactly one set of arguments be passed. In other words, I would like to write a function foo which requires either the arguments x and y OR the argument a. If neither set is provided, if a set is incomplete, or if both sets are provided, an error should be thrown.

One way to achieve this is using only optional arguments, followed by if statements. This is demonstrated below. However, I would like to do this more elegantly.

foo <- function(x, y, a){
  if (!missing(a) && missing(x) && missing(y)){
    a
    return("Only a provided")
  } else if (missing(a) && !missing(x) && !missing(y)){
    x; y
    return("x and y provided")
  } else {
    stop("No complete and/or distinct argument set provided")
  }
}

The function should work as follows:

> foo(a = 1)
[1] "Only a provided"

> foo(x = 1, y = 2)
[1] "x and y provided"

> foo()
Error in foo() : No complete and/or distinct argument set provided

> foo(x = 1)
Error in foo(x = 1) : No complete and/or distinct argument set provided

> foo(x = 1, y = 2, a = 3)
Error in foo(x = 1, y = 2, a = 3) : 
  No complete and/or distinct argument set provided

Extra credit for also including a generalized answer that can handle any number of argument sets of any size.

Aside: the above example uses missing() and no argument defaults, but this is by no means a requirement. I am flexible on using various formats so long as they offer a good solution to the question at hand.

There are 2 best solutions below

On BEST ANSWER

From my comment, two thoughts.

missing

This is essentially your approach, slightly modified (mostly for style and/or readability, just aesthetics):

foo1 <- function(x, y, z, a, b) {
  # first argument set
  allFirst <- ! any(missing(x), missing(y), missing(z))
  anyFirst <- any(! missing(x), ! missing(y), ! missing(z))
  # second argument set
  allSecond <- ! any(missing(a), missing(b))
  anySecond <- any(! missing(a), ! missing(b))

  if ( (allFirst && anySecond) ||
         (allSecond && anyFirst))
    stop("provide either arguments x,y,z or a,b", call. = FALSE)
  if ( (anyFirst && ! allFirst) ||
         (anySecond && ! allSecond) )
    stop("no complete and/or distinct argument set provided", call. = FALSE)

  if (allFirst) {
    return("x,y,z provided")
  } else if (allSecond) {
    return("a,b provided")
  } else {
    stop("nothing provided", call. = FALSE)
  }  
}
foo1(a = 1, b = 2)
# [1] "a,b provided"
foo1(x = 1, y = 2, z = 3)
# [1] "x,y,z provided"
foo1()
# Error: nothing provided
foo1(x = 1)
# Error: no complete and/or distinct argument set provided
foo1(a = 1)
# Error: no complete and/or distinct argument set provided
foo1(x = 1, b = 2)
# Error: no complete and/or distinct argument set provided
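
For the generalized case the question asks about (any number of argument sets of any size), the same all/any checks can be driven by a list of argument names. The following is only a sketch: which_arg_set and foo_general are hypothetical names, it assumes the sets do not share any argument names, and it uses match.call() instead of missing() to find out which arguments the caller supplied.

```r
# Hypothetical helper: returns the index of the single argument set that was
# supplied completely, and errors in every other case.
# Assumption: the sets are disjoint (no argument name appears in two sets).
which_arg_set <- function(supplied, sets) {
  complete <- vapply(sets, function(s) all(s %in% supplied), logical(1))
  touched  <- vapply(sets, function(s) any(s %in% supplied), logical(1))
  # exactly one set may be touched, and that set must be complete
  if (sum(complete) != 1L || sum(touched) != 1L)
    stop("provide exactly one complete argument set", call. = FALSE)
  which(complete)
}

foo_general <- function(x, y, z, a, b) {
  # names of the arguments actually supplied (positional ones are matched too)
  supplied <- names(as.list(match.call())[-1])
  set <- which_arg_set(supplied, list(c("x", "y", "z"), c("a", "b")))
  if (set == 1L) "x,y,z provided" else "a,b provided"
}

foo_general(a = 1, b = 2)
foo_general(1, 2, 3)
```

Adding a third argument set then only means adding another character vector to the list, at the cost of keeping the argument names in sync with the function signature by hand.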

S3 method dispatch

This only works if the argument sets can be distinguished by the class of their first argument, for example if x is a data.frame and a is a list.

Note that the first definition (which enables the others) sets the common arguments, so all functions need to use x as the first argument:

foo2 <- function(x, ...) UseMethod("foo2", x)
foo2.data.frame <- function(x, y, z) {
  if (missing(y) || missing(z)) stop("no complete and/or distinct argument set provided for 'x'", call. = FALSE)
  return("x,y,z provided")
}
foo2.list <- function(x, b, ...) {
  if (missing(b)) stop("no complete and/or distinct argument set provided for 'a'", call. = FALSE)
  return("a,b provided")
}

... so we cannot use function(a, b) in the formal definition.

foo2(x = data.frame(), y = 1, z = 2)
# [1] "x,y,z provided"
foo2(x = list(), b = 1)
# [1] "a,b provided"
foo2(data.frame())
# Error: no complete and/or distinct argument set provided for 'x'
foo2(x = list())
# Error: no complete and/or distinct argument set provided for 'a'
foo2()
# Error in foo2() (from #1) : argument "x" is missing, with no default
foo2(x=data.frame(), b=2)
# Error in foo2.data.frame(x = data.frame(), b = 2) (from #1) : 
#   unused argument (b = 2)

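The ellipsis ... in the first function (the generic) is required. In the other two functions it is optional and largely a matter of style; including it allows extra arguments to be passed on to companion/dependent functions. Here only foo2.list includes it, which is why the last call above fails with an "unused argument" error in foo2.data.frame.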

The error messages should be a little more descriptive here, since (at a minimum) all of the functions will be assuming a first argument of x (instead of a).

This option relies on polymorphism: the function behaves significantly differently depending on the class of the data provided. The surprise is reduced a little if it always returns the same type of object, but even then some find it undesirable.

Note that many standard R functions use this dispatch, including c, print, and str.

Another option to the approaches shown by @r2evans would be to use ... to dispatch:

foo <- function(...) {
    args <- list(...)
    nms <- names(args)
    # reject calls that mix 'a' with 'x' or 'y'
    if ("a" %in% nms && any(c("x", "y") %in% nms))
        stop("need either 'a' or 'x' and 'y' arguments.")
    if ("a" %in% nms) return(foo_a(a = args[["a"]]))
    if (all(c("x", "y") %in% nms))
        return(foo_xy(x = args[["x"]], y = args[["y"]]))
    stop("need either 'a' or 'x' and 'y' arguments.")
}

You would need to define foo_a and foo_xy to do the actual computing. The drawback of this approach is that it only works with named arguments: calling foo(2, 3) instead of foo(x = 2, y = 3) results in an error. This could be resolved by looking at the length of args in the code above, but that quickly gets messy as the number of parameters grows.
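
To give an idea of what the length-based workaround might look like, here is a rough sketch. Everything in it is an assumption for illustration: foo_dots is a hypothetical name, foo_a and foo_xy are stand-in stubs, and purely positional calls are told apart only by how many arguments they contain (one means a, two means x and y). Calls that mix named and unnamed arguments are rejected.

```r
# Stand-in workers (hypothetical; replace with the real computations)
foo_a  <- function(a) "Only a provided"
foo_xy <- function(x, y) "x and y provided"

foo_dots <- function(...) {
  args <- list(...)
  nms <- names(args)
  if (is.null(nms)) nms <- rep("", length(args))  # no names at all
  unnamed <- nms == ""

  if (all(unnamed)) {
    # assumption: positional calls are distinguished by argument count
    if (length(args) == 1L) return(foo_a(args[[1]]))
    if (length(args) == 2L) return(foo_xy(args[[1]], args[[2]]))
  } else if (!any(unnamed)) {
    # fully named calls must match one set exactly
    if (identical(nms, "a")) return(foo_a(a = args[["a"]]))
    if (identical(sort(nms), c("x", "y")))
      return(foo_xy(x = args[["x"]], y = args[["y"]]))
  }
  stop("need either 'a' or 'x' and 'y' arguments.", call. = FALSE)
}

foo_dots(2, 3)
foo_dots(y = 3, x = 2)
```

With only two sets of different sizes this stays manageable; as soon as two sets have the same number of arguments, the count no longer identifies the set and positional calls become ambiguous.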

Yet another option would be to collect the argument sets into (S3 or S4) objects and dispatch on the argument set classes, like this:

xy_arg <- function(x,y) {
    ans <- list(x=x, y=y)
    class(ans) <- "xy_arg"
    return(ans)
}
a_arg <- function(a) {
    ans <- list(a=a)
    class(ans) <- "a_arg"
    return(ans)
}
foo <- function(x, ...) UseMethod("foo", x)
foo.xy_arg <- function(x, ...) {
    # compute for the argument set where x and y are given
    print(c(x[["x"]], x[["y"]]))
}
foo.a_arg <- function(x, ...) {
    # compute for the argument set where only a is given
    print(x[["a"]])
}
foo(xy_arg(x=1, y=2))
foo(a_arg(a=3))

This looks more complicated at first, but it allows more argument sets to be defined in a systematic way.

It would also be conceivable to define foo to work only on one parameter set and use xy_arg and a_arg to build a normalized interface object, i.e. perform a problem transformation from (x,y) or (a) to a canonical problem.
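
A minimal sketch of that last idea follows. The reduction a = x + y is purely an illustrative assumption (the real transformation depends on the problem), and as_canonical and foo_canonical are hypothetical names. The constructors are repeated in compact form so the block runs on its own.

```r
# Constructors for the two argument sets (same idea as above, compact form)
xy_arg <- function(x, y) structure(list(x = x, y = y), class = "xy_arg")
a_arg  <- function(a)    structure(list(a = a), class = "a_arg")

# Convert any supported argument set to the canonical one (here: a_arg)
as_canonical <- function(args, ...) UseMethod("as_canonical")
as_canonical.a_arg <- function(args, ...) args
as_canonical.xy_arg <- function(args, ...) {
  # illustrative assumption only: (x, y) reduces to a = x + y
  a_arg(a = args[["x"]] + args[["y"]])
}

# The actual computation is written once, against the canonical form
foo_canonical <- function(args) {
  canon <- as_canonical(args)
  canon[["a"]]
}

foo_canonical(xy_arg(x = 1, y = 2))
foo_canonical(a_arg(a = 3))
```

Anything that is neither an xy_arg nor an a_arg fails at the as_canonical() call, because no method applies, so the "exactly one argument set" requirement is enforced by the constructors and the dispatch.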