I want to filter group id's specific conditions meeting on both column and some row value in r

78 Views Asked by At

I have sample data and I want to filter the number of id's never had sup status while on type ==N, meaning I only choose id with status == unsup while before switching type, and then the number id's who switched from N to P.

  • for example id==1 never had status==sup while on type==N, So I need to count id 1. Then I want to check this id also whether switching to P. But id 2 is not eligible for selected because it have sup status while on type==N.

  • id's 2, 5, and id 7 will not be eligible for as they had status == sup, while on status N and id 7 was on NA only while on N.

data <- data.frame(id=c(1,1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4,5,5,5,5,6,6,6,6,6,7,7,7),
                   type=c('N','N','N', 'N', 'P','P','N','N','N', 'I', 'I','N','N','N',
 'N', 'N','N','N','N', 'O', 'O','N','N','N', 'O','N','N','N', 'P', 'P', 'N','N','P'), 
status=c(NA,'unsup',NA,'unsup',NA,'sup',NA,NA,'sup',NA,'sup','unsup',NA,'unsup',NA,
'unsup','unsup',NA,'unsup',NA,'sup','sup',NA,NA,'unsup',NA,'unsup','unsup','unsup','sup', NA, NA, 'sup'))

Expected output

1.

   id type status
1   1    N   <NA>
2   1    N  unsup
3   1    N   <NA>
4   1    N  unsup
5   1    P   <NA>
6   1    P    sup
7   3    N  unsup
8   3    N   <NA>
9   3    N  unsup
10  3    N   <NA>
11  3    N  unsup
12  4    N  unsup
13  4    N   <NA>
14  4    N  unsup
15  4    O   <NA>
16  4    O    sup
17  6    N   <NA>
18  6    N  unsup
19  6    N  unsup
20  6    P  unsup
21  6    P    sup

Then of which, id's switched to P are:

   id type status
1   1    N   <NA>
2   1    N  unsup
3   1    N   <NA>
4   1    N  unsup
5   1    P   <NA>
6   1    P    sup
7   6    N   <NA>
8   6    N  unsup
9   6    N  unsup
10  6    P  unsup
11  6    P    sup
1

There are 1 best solutions below

0
akrun On BEST ANSWER

For the first case, after grouping by 'id', filter any 'id's not having status value as 'sup' and type as 'N' and those ids having any non-NA value for status where type is 'N'

library(dplyr)
data1 <- data %>% 
  group_by(id) %>%
  filter((!any((status %in% 'sup' & type == 'N'), na.rm = TRUE))& 
      any(!is.na(status[type == "N"]))) %>% 
  ungroup

-output

data1
# A tibble: 21 × 3
      id type  status
   <dbl> <chr> <chr> 
 1     1 N     <NA>  
 2     1 N     unsup 
 3     1 N     <NA>  
 4     1 N     unsup 
 5     1 P     <NA>  
 6     1 P     sup   
 7     3 N     unsup 
 8     3 N     <NA>  
 9     3 N     unsup 
10     3 N     <NA>  
# … with 11 more rows

From the subset data, we can filter again after checking for any case where there is a type value of 'N' and the next value (lead) is 'P' for each 'id'

data1 %>% 
  group_by(id) %>%
  filter(any(type== "N" & lead(type) == "P", na.rm = TRUE)) %>% 
  ungroup
# A tibble: 11 × 3
      id type  status
   <dbl> <chr> <chr> 
 1     1 N     <NA>  
 2     1 N     unsup 
 3     1 N     <NA>  
 4     1 N     unsup 
 5     1 P     <NA>  
 6     1 P     sup   
 7     6 N     <NA>  
 8     6 N     unsup 
 9     6 N     unsup 
10     6 P     unsup 
11     6 P     sup