Subsetting a custom S4 class using the "subset" function within another function

163 Views Asked by At

I'm trying to define a subset method for my custom S4 class. While subsetting works as intended when I provide the subsetting critearia directly to subset, the method fails whenever I call it within another function, where the subsetting criteria is passed on to subset from that function.

The S4 class myClass consists of a single data.frame:

# Define class
setClass("myClass", slots = c(data = "data.frame"))

# Initiate a myClass object
dat <- new("myClass", data = data.frame(Letter = c("A", "A", "B"), Number = c(1, 2, 3)))

To be able to subset the class based on the content of the data.frame in the slot data, I defined the following subsetmethod:


setMethod("subset", signature(x = "myClass"), function(x, ...) {
  x@data <- subset(x@data, ...)
  return(x)
})

The method works as expected when called as follows:

# Assume we only want to retain entries containing the letter "A"
whichletter <- "A"

# Subset (does work)
subset(dat, Letter %in% whichletter)
An object of class "myClass"
Slot "data":
  Letter Number
1      A      1
2      A      2

However, when I try to run subset within another function, where the subset criteria is provided through that function's arguments, the subsetting won't work:

# Random function that takes a letter `let`as argument
randomFunction <- function(object, let) {
  object_subsetted <- subset(object, Letter %in% let)
  return(object_subsetted)
}

# Subset (does not work)
randomFunction(object = dat, let = whichletter)
Error in Letter %in% let: object 'let' not found

This appears to be an issue with environments but I can't figure out what exactly is going wrong. Does anyone have a suggestion how to avoid this error?

4

There are 4 best solutions below

1
DoDx9 On BEST ANSWER

I just found this and this which together seem to answer my question. The issue is unrelated to the use of S4 classes, but caused by how subset scopes variables. Initially, my definition of subset was looking for let inside the x@dataobject. By specifically defining to evaluate the expression in the scope of x@data but allowing to expand the scope to the parent.frame() (to which the variable letbelongs to) solves the error and allows me to subset as desired.

Here is the full code:

# Define class
setClass("myClass", slots = c(data = "data.frame"))

# Initiate a myClass object
dat <- new("myClass", data = data.frame(Letter = c("A", "A", "B"), Number = c(1, 2, 3)))

# Define a subset method
setMethod("subset", signature(x = "myClass"), function(x, ...) {
  condition <- substitute(...)
  indices <- eval(condition, x@data, parent.frame())
  x@data <- x@data[indices, ]
  return(x)
})

# Suppose we want to subset to only retain entries with "A"
whichletter <- "A"

# What if we have a function that should pass the subsetting value to the subset
# function?
randomFunction <- function(object, let) {
  object_subsetted <- subset(object, Letter %in% let)
  return(object_subsetted)
}

# Test it (works now)
randomFunction(object = dat, let = "A")
randomFunction(object = dat, let = "B")
An object of class "myClass"
Slot "data":
  Letter Number
1      A      1
2      A      2

An object of class "myClass"
Slot "data":
  Letter Number
3      B      3

However, as highlighted by @JDL it's probably wiser to define a [ method instead.

0
Carl Witthoft On

Here is part of the answer. I find this strange but clearly mucking with the unnamed arguments allow eval when called from subset to work as desired.

Inside debug mode, I entered two commands. Here are the results:

Browse[2]> subset(object@data,Letter %in% let, envir='CurrentEnvironment')
  Letter Number
1      A      1
2      A      2
Warning message:
In subset.data.frame(object@data, Letter %in% let, envir = "CurrentEnvironment") :
 extra argument ‘envir’ will be disregarded 
Browse[2]> subset(object, Letter %in% let)
Error in eval(e, x, parent.frame()) : object 'let' not found

I did this because I ran traceback when your function failed:

Rgames> randomFunction(object = dat, let = whichletter)
Error in eval(e, x, parent.frame()) : object 'let' not found
Rgames> traceback()
9: Letter %in% let
8: eval(e, x, parent.frame())
7: eval(e, x, parent.frame())
6: subset.data.frame(x@data, ...)
5: subset(x@data, ...)
4: subset(x@data, ...) at #2
3: subset(object, Letter %in% let)
2: subset(object, Letter %in% let) at #4
1: randomFunction(object = dat, let = whichletter)

So this means that somehow subset.data.frame is not properly passing let to the eval call -- or that maybe you need to pass an arg like

subset(object, Letter %in% let, parent.frame(2))

I haven't tested that; I'll let folks who are much more familiar with environment passing than I comment further. My guess is that the method subset.data.frame isn't "happy" with passing extra arguments?

2
JDL On

As other answers have alluded to, there are problems caused by subset.data.frame using non-standard evaluation on its argument. You can't use NSE on an argument which is part of a signature, since it is evaluated during method dispatch.

The help page for subset also says:

This is a convenience function intended for use interactively. For programming it is better to use the standard subsetting functions like [, and in particular the non-standard evaluation of argument subset can have unanticipated consequences.

It may be better to write an appropriate method for [, and rely on subset.default which is defined as x[subset & !is.na(subset)].

0
Hieu Nguyen On

OP's answer is basically defining a brand new calculation based on subset.data.frame. The following approach focus on the non-standard evaluation (NSE) aspect and use the normal S4 dispatching mechanism to reuse existing subset function:

# Same as OP
setClass("myClass", slots = c(data = "data.frame"))
dat <- new("myClass", data = data.frame(Letter = c("A", "A", "B"), Number = c(1, 2, 3)))
setMethod("subset", signature(x = "myClass"), function(x, ...) {
    x@data <- subset(x@data, ...)
    return(x)
})

# Use substitute, bquote or other means to construct the call then evaluate
randomFunction <- function(object, let) {
    ee <- substitute(subset(object, Letter %in% let))
    # ee <- bquote(subset(object, Letter %in% .(let)))
    
    print(deparse(ee))
    object_subsetted <- eval(ee)
    return(object_subsetted)
}

whichletter <- "A"
randomFunction(object = dat, let = whichletter) # For ee <- substitute(subset(object, Letter %in% let))
[1] "subset(dat, Letter %in% whichletter)"
An object of class "myClass"
Slot "data":
  Letter Number
1      A      1
2      A      2

randomFunction(object = dat, let = whichletter) # For ee <- bquote(subset(object, Letter %in% .(let)))
[1] "subset(object, Letter %in% \"A\")"
An object of class "myClass"
Slot "data":
  Letter Number
1      A      1
2      A      2

The problem you encountered with subset here was discussed in this answer