Reversing cumulative sum to determine actuals in R


I have a data frame in which each column holds a cumulative sum: every row is the running total of that column's original values up to and including that row. Is there a way to recover the original values from this data frame?

   X1 X2
1  1  5
2  3  9
3  6 12
4 10 14
5 15 15

Desired output:

   X1 X2
1  1  5
2  2  4
3  3  3
4  4  2
5  5  1

Thanks


There are 2 answers below

Best answer

Just use diff. Assuming your dataset is called "mydf" and you want to do this for all columns, try:

mydf[] <- lapply(mydf, function(x) diff(c(0, x)))
mydf
#   X1 X2
# 1  1  5
# 2  2  4
# 3  3  3
# 4  4  2
# 5  5  1

Since diff returns a vector one element shorter than its input, you need to pad the input with a leading 0. The first difference is then x[1] - 0, which also retains the original first value of each column.
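To make the padding step concrete, here is a minimal sketch on a single vector. The names x, cs and recovered are made up for illustration; the values match column X1 in the question.

```r
x  <- c(1, 2, 3, 4, 5)        # hypothetical original values
cs <- cumsum(x)               # 1 3 6 10 15, same as column X1

length(diff(cs))              # 4: one shorter than the input
recovered <- diff(c(0, cs))   # pad with 0 so the first value is kept

stopifnot(identical(recovered, x))
```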


As @DavidArenburg mentions, you could easily adapt this to "data.table" code, like this:

library(data.table)
as.data.table(mydf)[, lapply(.SD, function(x) diff(c(0, x)))]
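
A quick way to check the result is to run cumsum over the recovered values and compare with the input. The sketch below rebuilds the question's data by hand (the object names orig and check are assumptions for illustration) and uses only base R.

```r
# Cumulative data from the question
mydf <- data.frame(X1 = c(1, 3, 6, 10, 15), X2 = c(5, 9, 12, 14, 15))

# Recover the per-row values with the padded diff shown above
orig <- mydf
orig[] <- lapply(mydf, function(x) diff(c(0, x)))

# Re-accumulating should reproduce the input
check <- orig
check[] <- lapply(orig, cumsum)
stopifnot(isTRUE(all.equal(check, mydf)))
```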

Using dplyr

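library(dplyr)
# Older dplyr syntax: mutate_each() and funs() are deprecated in later releases.
# lag(.)[-1] drops the leading NA, and c(0, ...) pads the first position with 0.
df %>%
    mutate_each(funs(. - c(0, lag(.)[-1])))
#   X1 X2
# 1  1  5
# 2  2  4
# 3  3  3
# 4  4  2
# 5  5  1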

Or, as @BrennanBeal mentioned in the comments, in more recent versions of dplyr:

df %>%
    mutate_at(vars(X1:X2), ~ . - c(0, lag(.)[-1]))

Or, from dplyr 1.0.0 onward:

df %>%
    mutate(across(X1:X2, ~ . - c(0, lag(.)[-1])))
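
As a possible simplification (not part of the answer above), dplyr's lag() accepts a default argument, which can stand in for the manual c(0, ...) padding. A minimal sketch, assuming dplyr 1.0.0 or later and rebuilding the question's data by hand:

```r
library(dplyr)

# Cumulative data from the question
df <- data.frame(X1 = c(1, 3, 6, 10, 15), X2 = c(5, 9, 12, 14, 15))

# lag(.x, default = 0) shifts each column down by one row and
# fills the first slot with 0, so no manual padding is needed
res <- df %>%
    mutate(across(everything(), ~ .x - lag(.x, default = 0)))
```

Here everything() selects all columns; X1:X2 would work equally well for this data.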