Cumsum by group and deal with NAs

1.3k Views Asked by At

I have data in long format which looks like:

ID<-c("A","A","B","B","C","C","D","D")
CW<-c(1,2,1,2,1,2,1,2)
Value<-c(1,1,3,3,NA,NA,6,6)
dt<-cbind(ID,CW,Value)

I want to calculate cumsum() for dt$Value, the result should look like:

cum<-c(1,1,4,4,NA,NA,10,10)
dt.cum<-cbind(dt,cum)

I formatted dt as datatable and tried something like:

dt[, cum := cumsum(ifelse(is.na(dt$Value), 0, dt$Value)) +dt$Value*0, by="CW"]

However, this did not work.

1

There are 1 best solutions below

0
On

First of all, don't use cbind to create a data frame. cbind returns a matrix, and, in this case, all its elements are coerced to class character.

dt <- data.frame(ID,CW,Value)

Now, here is a base R solution. Function aveis meant for this kind of problem.

dt$cum <- ave(dt$Value, dt$CW, FUN = function(x) cumsum(ifelse(is.na(x), 0, x)))
is.na(dt$cum) <- is.na(dt$Value)
dt$cum
[1]  1  1  4  4 NA NA 10 10