cumsum by participant and reset on 0 in R

I have a data frame that looks like the one below. I need to count the correct trials cumulatively for each participant, and reset the counter whenever Correct is 0.

Participant TrialNumber Correct 
      118           1       1     
      118           2       1     
      118           3       1     
      118           4       1     
      118           5       1     
      118           6       1     
      118           7       1     
      118           8       0     
      118           9       1     
      118          10       1     
      120           1       1     
      120           2       1     
      120           3       1     
      120           4       1     
      120           5       0     
      120           6       1     
      120           7       0     
      120           8       1     
      120           9       1     
      120          10       1     

I've tried using splitstackshape:

df$Count <- getanID(cbind(df$Participant, cumsum(df$Correct)))[,.id]

But this does not give a running count per participant. Count is 2 on the rows where Correct is 0 and 1 everywhere else:

Participant TrialNumber Correct Count
      118           1       1     1
      118           2       1     1
      118           3       1     1
      118           4       1     1
      118           5       1     1
      118           6       1     1
      118           7       1     1
      118           8       0     2
      118           9       1     1
      118          10       1     1
      120           1       1     1
      120           2       1     1
      120           3       1     1
      120           4       1     1
      120           5       0     2
      120           6       1     1
      120           7       0     2
      120           8       1     1
      120           9       1     1
      120          10       1     1

I then tried using dplyr:

df %>% 
  group_by(Participant) %>%
  mutate(Count=cumsum(Correct)) %>%
  ungroup() %>% 
  as.data.frame()
Participant TrialNumber Correct Count
      118           1       1     1
      118           2       1     2
      118           3       1     3
      118           4       1     4
      118           5       1     5
      118           6       1     6
      118           7       1     7
      118           8       0     7
      118           9       1     8
      118          10       1     9
      120           1       1     1
      120           2       1     2
      120           3       1     3
      120           4       1     4
      120           5       0     4
      120           6       1     5
      120           7       0     5
      120           8       1     6
      120           9       1     7
      120          10       1     8

This gets me closer, but it still doesn't reset the counter when it reaches a 0. Any suggestions would be greatly appreciated, thank you.

Best answer:

Does this work?

library(dplyr)
library(data.table)  # for rleid()
df %>% 
  mutate(grp = rleid(Correct)) %>%    # new id each time Correct changes value
  group_by(Participant, grp) %>%
  mutate(Count = cumsum(Correct)) %>% # running count within each run
  ungroup() %>%                       # ungroup first, otherwise select() keeps grp
  select(-grp)
# A tibble: 10 x 3
   Participant Correct Count
   <chr>         <dbl> <dbl>
 1 A                 1     1
 2 A                 1     2
 3 A                 1     3
 4 A                 0     0
 5 A                 1     1
 6 B                 1     1
 7 B                 1     2
 8 B                 0     0
 9 B                 1     1
10 B                 1     2
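
To see why grouping on the run id resets the counter, it may help to look at the ids on their own. The sketch below is a base R stand-in for what rleid() returns on a single vector, built on rle(); the helper name run_id is made up for illustration and is not part of data.table.

```r
# Hypothetical helper: base R stand-in for data.table::rleid() on one vector
run_id <- function(x) {
  r <- rle(x)                                    # lengths of consecutive equal values
  rep(seq_along(r$lengths), times = r$lengths)   # repeat each run's index over its length
}

Correct <- c(1, 1, 1, 0, 1, 1, 1, 0, 1, 1)
ids <- run_id(Correct)
print(ids)
# [1] 1 1 1 2 3 3 3 4 5 5
```

Every 0 gets a run id of its own, and the 1s after it start a new run, so cumsum(Correct) within a run starts again from 1. Grouping by Participant as well keeps a run of 1s from carrying over from one participant to the next.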

Toy data:

df <- data.frame(
  Participant = c(rep("A", 5), rep("B", 5)),
  Correct = c(1,1,1,0,1,1,1,0,1,1)
)
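
If you would rather avoid the package dependencies, something along these lines should also work in base R with ave(). This is only a sketch of an alternative, not the approach used above, and it assumes the rows are already ordered by trial within each participant. The column name block is a temporary helper introduced here.

```r
# Toy data, same as above
df <- data.frame(
  Participant = c(rep("A", 5), rep("B", 5)),
  Correct = c(1, 1, 1, 0, 1, 1, 1, 0, 1, 1)
)

# Block id within each participant: goes up by one at every 0, so each 0
# opens a new block that also holds the 1s following it
df$block <- ave(as.integer(df$Correct == 0), df$Participant, FUN = cumsum)

# Running count of correct trials inside each participant/block pair;
# a block's leading 0 adds nothing, so the count restarts there
df$Count <- ave(df$Correct, df$Participant, df$block, FUN = cumsum)

df$block <- NULL  # helper column no longer needed
print(df)
```

On the toy data this gives Count = 1, 2, 3, 0, 1 for participant A and 1, 2, 0, 1, 2 for participant B, the same as the dplyr version.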