I would like to subset an ffdf object by index, returning another ffdf object.
The help file on subset.ffdf indicates that you can pass a range index (ri) object as an argument, but when I tried:
data_subset <- subset.ffdf(data, ri(1, 1e5))
I got this error:
Error in which(eval(e, nl, envir)) : argument to 'which' is not logical
Per You-Leee's suggestion, I tried passing a logical vector of the index of interest with this code:
n <- length(data[[1]])  # 10.5 million rows
logical_index <- c(1, 1e5) == seq.int(1, n)
data_subset <- subset(data, logical_index)
I ran it twice, and each time RStudio crashed with the message "R encountered a fatal error. The session was terminated." At first I suspected a memory constraint, but according to Activity Monitor I still had 4 GB free out of 8 GB. Besides, this shouldn't load much into memory anyway.
The argument has to be a logical vector, so it must be TRUE at the desired indices and FALSE everywhere else:
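As a minimal sketch of what such a logical index looks like, the base R snippet below builds one for the first `k` rows and contrasts it with the `c(1, 1e5) == seq.int(1, n)` expression from the question. The sizes `n` and `k` are toy values chosen for illustration, and the snippet does not touch ff at all, so whether this alone avoids the crash is not something it demonstrates.

```r
# Toy sizes for illustration; the data in the question has ~10.5 million rows.
n <- 20
k <- 5  # keep rows 1..k

# TRUE for the first k positions, FALSE for the rest
logical_index <- seq_len(n) <= k

# By contrast, c(1, k) == seq_len(n) recycles the two-element vector and
# compares it element by element, so it does not mark the range 1..k.
recycled <- c(1, k) == seq_len(n)

sum(logical_index)  # number of rows that would be selected
sum(recycled)       # far fewer positions are TRUE here

# The vector would then be passed as in the question:
# data_subset <- subset(data, logical_index)
```

The comparison `seq_len(n) <= k` yields a vector of length `n`, which is what a row filter needs; the recycled comparison also has length `n`, but it is TRUE only where a position happens to coincide with the recycled value.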