Assigning categorical values to multiple variables in R


Say I have the following data input into R

x <- c(1,1,0,0,0,0)
y <- c(1,0,1,0,0,0)
z <- c(0,0,0,0,1,1)
p <- c(0,0,0,1,1,0)

data <- data.frame(x,y,z,p)

Now I want to introduce a new variable in data called 'cat'.

Within cat I want to assign the value 'a' to any observation where 1 appears in either x or y, or in both. I want to assign the value 'b' to observations where 1 appears in either or both of z and p.


There are 2 best solutions below

1
On BEST ANSWER
c("b", "a")[(!!rowSums(data[,1:2])) +0 + (!!rowSums(data[,3:4])+1)]
#[1] "a" "a" "a" "b" "b" "b"
  • This assumes that I have understood the condition correctly, and that there are no overlapping cases, i.e. no row has a 1 in x or y and also a 1 in z or p.
  • As a first step, I applied rowSums to columns x and y:

    rowSums(data[,1:2])
    #[1] 2 1 1 0 0 0
    
  • Double negation (!!) turns the above result into a logical vector (TRUE for any non-zero sum), and adding 0 converts it back to numeric 0/1:

    (!!rowSums(data[,1:2]))+0
     #[1] 1 1 1 0 0 0
    
  • The same thing applied to columns z and p, but adding 1 instead of 0, gives:

    (!!rowSums(data[,3:4]))+1
    #[1] 1 1 1 2 2 2
    
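  • In the full expression the second term is written as (!!rowSums(data[,3:4])+1), without the inner parentheses used above. Because ! has lower precedence than + in R, this is evaluated as !!(rowSums(data[,3:4])+1), which is TRUE (counted as 1) for every row. The sum is therefore the x/y result plus 1: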

    (!!rowSums(data[,1:2])) +0 + (!!rowSums(data[,3:4])+1)
     #[1] 2 2 2 1 1 1
    
  • This can be used as a numeric index: in c("b", "a")[!!rowSums..], index 2 selects "a" and index 1 selects "b".
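
The same indexing idea can be spelled out step by step. The following is a minimal sketch rather than the answer's exact code: it assumes the `data` frame from the question and no rows with a 1 in both groups, and the names `in_xy` and `idx` are introduced here only for illustration.

```r
x <- c(1,1,0,0,0,0)
y <- c(1,0,1,0,0,0)
z <- c(0,0,0,0,1,1)
p <- c(0,0,0,1,1,0)
data <- data.frame(x, y, z, p)

# TRUE where x or y contains a 1 (illustrative helper name)
in_xy <- rowSums(data[, c("x", "y")]) > 0

# TRUE + 1 is 2 and FALSE + 1 is 1, so idx is a valid 1-based index
idx <- in_xy + 1

# index 1 picks "b", index 2 picks "a"; store the result as the new column
data$cat <- c("b", "a")[idx]
data$cat
#[1] "a" "a" "a" "b" "b" "b"
```

Comparing with `> 0` makes the logical step explicit and avoids depending on how `!` and `+` are grouped.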

2
On

This line returns "a" if either x or y is non-zero, and "b" otherwise.

ifelse(data$x | data$y, "a", "b")
# [1] "a" "a" "a" "b" "b" "b"

If you need to handle the case where all four columns are zero, you could use:

ifelse(data$x | data$y,
       "a",
       ifelse(data$z | data$p, "b", "neither a nor b"))
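
To see the fallback branch in action, here is a small sketch that stores the result in a new `cat` column, as the question asks. The data frame `d2` is hypothetical (not from the question): it is the question's data with one extra all-zero row appended.

```r
# hypothetical data: the question's six rows plus one all-zero row
d2 <- data.frame(x = c(1,1,0,0,0,0,0),
                 y = c(1,0,1,0,0,0,0),
                 z = c(0,0,0,0,1,1,0),
                 p = c(0,0,0,1,1,0,0))

# "a" takes priority; "b" is only checked where neither x nor y is non-zero
d2$cat <- ifelse(d2$x | d2$y,
                 "a",
                 ifelse(d2$z | d2$p, "b", "neither a nor b"))

# the appended all-zero row falls through to the last branch
d2$cat[7]
#[1] "neither a nor b"
```

Because the x/y test comes first, a row with a 1 in both groups would be labelled "a" under this ordering.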