Create indicator variables within a list


I have a list of numeric vectors. I want to create a list that flags all non-zero elements up to and including the first element that equals a defined limit. I also want a second list that flags all non-zero elements after the first element that equals the limit.

I prefer a base R solution. Presumably the solution will use lapply, but I have not been able to come up with a simple solution.

Below is a minimal reproducible example in which the limit is 2:

my.limit <- 2
my.samples  <- list(0,c(1,2),0,c(0,1,1),0,0,0,0,0,c(1,1,2,2,3,4),c(0,1,2),0,c(0,0,1,1,2,2,3))

Here are the two desired lists:

within.limit  <- list(0,c(1,1),0,c(0,1,1),0,0,0,0,0,c(1,1,1,0,0,0),c(0,1,1),0,c(0,0,1,1,1,0,0))
outside.limit <- list(0,c(0,0),0,c(0,0,0),0,0,0,0,0,c(0,0,0,1,1,1),c(0,0,0),0,c(0,0,0,0,0,1,1))

Best answer:

We can use match, setting its nomatch argument to a very large number (larger than the length of any vector in the list). Inf does not work here because nomatch is coerced to integer, which turns Inf into NA.

within.limit1 <- lapply(my.samples, function(x) 
                 +(x > 0 & seq_along(x) <= match(my.limit, x, nomatch = 1000)))

outside.limit1 <- lapply(my.samples, function(x) 
                    +(seq_along(x) > match(my.limit, x, nomatch = 1000)))

Checking that the output matches the desired lists:

all(mapply(function(x, y) all(x == y), within.limit, within.limit1))
#[1] TRUE
all(mapply(function(x, y) all(x == y), outside.limit, outside.limit1))
#[1] TRUE
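
To see why comparing positions against the match position works, it may help to trace a single vector by hand. The sketch below uses one of the sample vectors from the question. The names first.hit, within.x and outside.x are introduced here for illustration only, and length(x) + 1L is an assumed alternative to the large nomatch constant, not part of the answer above.

```r
my.limit <- 2
x <- c(1, 1, 2, 2, 3, 4)

# Position of the first element equal to the limit. If the limit never
# occurs, fall back to one past the end so every position counts as "within".
first.hit <- match(my.limit, x, nomatch = length(x) + 1L)
first.hit
# [1] 3

# Non-zero elements up to and including the first hit
within.x <- +(x > 0 & seq_along(x) <= first.hit)
within.x
# [1] 1 1 1 0 0 0

# Elements strictly after the first hit
outside.x <- +(seq_along(x) > first.hit)
outside.x
# [1] 0 0 0 1 1 1

# nomatch is coerced to integer, so Inf becomes NA (with a warning)
no.hit <- suppressWarnings(match(my.limit, 0, nomatch = Inf))
no.hit
# [1] NA
```

Using length(x) + 1L keeps the fallback tied to each vector, so no guess about the maximum vector length is needed.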
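
A single function can produce either list by switching the comparison operator:

foo <- function(samples, limit, within = TRUE) {
  # Pick the comparison: positions up to the first match, or strictly after it
  `%cp%` <- if (within) `<=` else `>`
  # pmin() caps non-zero values at 1 where the comparison holds and at 0 elsewhere
  lapply(samples, function(x) pmin(x, seq_along(x) %cp% match(limit, x, nomatch = 1e8)))
}

all.equal(foo(my.samples, my.limit, FALSE), outside.limit)
# [1] TRUE
all.equal(foo(my.samples, my.limit, TRUE), within.limit)
# [1] TRUE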
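
The pmin step may not be obvious at first glance, so here is a small trace on one sample vector. It is only a sketch: inside and capped are illustrative names, and the trick assumes the values are non-negative, as they are in the question.

```r
x <- c(0, 0, 1, 1, 2, 2, 3)

# TRUE for positions up to and including the first 2
inside <- seq_along(x) <= match(2, x, nomatch = length(x) + 1L)
inside
# [1]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE

# pmin() treats TRUE as 1 and FALSE as 0, so zeros stay 0,
# positive values inside the range become 1, and the rest become 0
capped <- pmin(x, inside)
capped
# [1] 0 0 1 1 1 0 0
```

With negative values in x, pmin would pass them through unchanged, so an explicit x > 0 test would be the safer choice there.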

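We can use findInterval, provided each vector is sorted in non-decreasing order (as in the example). With left.open = TRUE it returns the number of elements strictly below the limit, so adding 1 gives the position of the first element at or above the limit:

lapply(my.samples, function(x) 
          +(x > 0 & seq_along(x) <= findInterval(my.limit, x, left.open = TRUE) + 1))

and

lapply(my.samples, function(x)  +(seq_along(x) > findInterval(my.limit, x, left.open = TRUE) + 1))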
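
For anyone unsure what findInterval returns when the vector contains ties, the following sketch compares its default behaviour with left.open = TRUE on one sample vector. The variable names are illustrative. The last lines show that unsorted input is rejected, which is why this approach depends on the vectors being sorted.

```r
x <- c(1, 1, 2, 2, 3, 4)

# Default: number of elements less than or equal to 2
n.le <- findInterval(2, x)
n.le
# [1] 4

# left.open = TRUE: number of elements strictly below 2
n.lt <- findInterval(2, x, left.open = TRUE)
n.lt
# [1] 2

# One past that count is the position of the first 2, same as match()
n.lt + 1L == match(2, x)
# [1] TRUE

# findInterval() refuses a vector that is not sorted non-decreasingly
unsorted.ok <- !inherits(try(findInterval(2, c(3, 1, 2)), silent = TRUE), "try-error")
unsorted.ok
# [1] FALSE
```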

I would do

within.limit <- lapply(my.samples, function(x) 
                           +(x != 0 & (x < my.limit | cumsum(x == my.limit) == 1)))
outside.limit <- lapply(my.samples, function(x) 
                           +(x != 0 & (x > my.limit | cumsum(x == my.limit) > 1)))
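
A short trace of the cumsum idea on one sample vector, as a sketch. The names hits, within.x and outside.x are illustrative. Note that the x < limit and x > limit tests assume each vector is sorted, as in the question: in an unsorted vector, a small value appearing after the first hit would still be flagged as within.

```r
my.limit <- 2
x <- c(0, 0, 1, 1, 2, 2, 3)

# Running count of elements equal to the limit:
# 0 before the first hit, 1 at the first hit, more than 1 from the second hit on
hits <- cumsum(x == my.limit)
hits
# [1] 0 0 0 0 1 2 2

# Non-zero and either below the limit or exactly the first hit
within.x <- +(x != 0 & (x < my.limit | hits == 1))
within.x
# [1] 0 0 1 1 1 0 0

# Non-zero and either above the limit or a repeated hit
outside.x <- +(x != 0 & (x > my.limit | hits > 1))
outside.x
# [1] 0 0 0 0 0 1 1
```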