Filtering an xts object from one year to the next

If I have a 10-year daily xts time series, starting on 1985-01-01 and ending on 1994-12-31, how can I calculate the sum of the values in each interval that starts on Nov 1 and ends on Mar 31 of the following year, throughout the series?

library(xts)

dates <- seq(from = as.Date("1985-01-01"), to = as.Date("1994-12-31"), by = 1)
data <- rnorm(length(dates))
my.ts <- xts(x=data, order.by=dates)

I need to get

my.ts2[1] = sum(my.ts[1985-11-01 to 1986-03-31])
my.ts2[2] = sum(my.ts[1986-11-01 to 1987-03-31])
my.ts2[3] = sum(my.ts[1987-11-01 to 1988-03-31])
and so on

Note: The original time series contains no Feb 29 values.

There are 2 best solutions below

AkselA On BEST ANSWER

This aggregates over successive groups, split at each Nov 1 and Apr 1. The first row of the result is an incomplete "in" group (Jan 1 to Mar 31, 1985), followed by an "out" group, then an "in" group, and so on.

library(xts)

set.seed(1)
dates <- seq(from = as.Date("1985-01-01"), to = as.Date("1994-12-31"), by = 1)
data <- round(rnorm(length(dates)), 2)
my.ts <- xts(x=data, order.by=dates)

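# POSIXlt months are zero-based: 10 is November, 3 is April
d <- as.POSIXlt(index(my.ts))
nov1 <- d$mon == 10 & d$mday == 1
apr1 <- d$mon == 3 & d$mday == 1
# The group id increases by one at every Nov 1 and every Apr 1
id <- cumsum(nov1+apr1)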

agg <- cbind(
  aggregate(index(my.ts), list(id), function(x) as.character(range(x))),
  aggregate(my.ts, id, sum)
)

agg <- data.frame(
  id=agg[[1]], 
  start=agg[[2]][,1], 
  end=agg[[2]][,2], 
  sum=agg[[3]]
)

# To filter for only the "in" groups
agg[endsWith(agg$end, "-03-31") | endsWith(agg$start, "-11-01"),]
#    id      start        end    sum
# 1   0 1985-01-01 1985-03-31   9.77
# 3   2 1985-11-01 1986-03-31  -5.76
# 5   4 1986-11-01 1987-03-31 -10.54
# 7   6 1987-11-01 1988-03-31  -5.81
# 9   8 1988-11-01 1989-03-31   9.26
# 11 10 1989-11-01 1990-03-31  -6.42
# 13 12 1990-11-01 1991-03-31  -5.78
# 15 14 1991-11-01 1992-03-31 -17.57
# 17 16 1992-11-01 1993-03-31  20.03
# 19 18 1993-11-01 1994-03-31  16.06
# 21 20 1994-11-01 1994-12-31  -5.76
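
The output above still contains the two partial windows at the edges of the series (Jan to Mar 1985 and Nov to Dec 1994). If only complete Nov 1 to Mar 31 windows are wanted, as in the question, one possible variant is to label every observation with the year in which its season starts and then drop seasons that do not span the full window. This is a minimal sketch under that assumption, not the method used above; the names df, season, res and complete are illustrative, and my.ts2 is borrowed from the question.

```r
library(xts)

set.seed(1)
dates <- seq(from = as.Date("1985-01-01"), to = as.Date("1994-12-31"), by = 1)
my.ts <- xts(x = round(rnorm(length(dates)), 2), order.by = dates)

df <- data.frame(date = index(my.ts), value = as.numeric(coredata(my.ts)))
lt <- as.POSIXlt(df$date)
month <- lt$mon + 1                      # 1 = January, ..., 12 = December

# Label each row with the year in which its Nov-Mar season starts
# (illustrative convention: Jan-Mar belong to the previous year's season)
df$season <- ifelse(month >= 11, lt$year + 1900, lt$year + 1899)

# Keep only November to March
df <- df[month >= 11 | month <= 3, ]

# One row per season: first date, last date, and sum of values
res <- do.call(rbind, lapply(split(df, df$season), function(g) {
  data.frame(start = min(g$date), end = max(g$date), sum = sum(g$value))
}))

# Drop seasons that are cut off by the start or end of the series
complete <- format(res$start, "%m-%d") == "11-01" &
  format(res$end, "%m-%d") == "03-31"
res <- res[complete, ]

# Result as an xts series indexed by the last day of each season
my.ts2 <- xts(res$sum, order.by = res$end)
```

Any single value can be cross-checked against xts range subsetting, for example sum(my.ts["1985-11-01/1986-03-31"]).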

G. Grothendieck On

Here is a two-liner that does this, using the reproducible input in the Note at the end.

Let ok be TRUE for the dates that fall within November to March and FALSE otherwise. Then use collapse::groupid (or data.table::rleid or dplyr::consecutive_id) to uniquely identify each run, and use [ok] to drop the dates outside those intervals. Use ave to find the last date in each November to March interval, and sum over each interval using aggregate.zoo, which gives the zoo series res. (If desired, use as.xts(res) or fortify.zoo(res) to convert that to an xts object or a data frame, respectively.)

library(xts)
library(collapse) # groupid

ok <- cycle(as.yearmon(time(my.ts))) %in% c(11:12, 1:3)
res <- aggregate(my.ts[ok], ave(time(my.ts[ok]), groupid(ok)[ok], FUN = max), sum)
res

giving

1985-03-31   8.4324478
1986-03-31   9.4125674
1987-03-31  17.7676264
1988-03-31 -16.3924959
1989-03-31 -22.1016964
1990-03-31 -11.3095770
1991-03-31   2.8736790
1992-03-31  -8.0894314
1993-03-31   6.2141083
1994-03-31   2.8035838
1994-12-31   0.9928396
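
To make the run-id step more concrete, here is a small base R sketch on a toy logical vector. It is only an illustration of the idea, not the collapse implementation: the ids it produces should correspond to what groupid returns for the same input, but that equivalence is an assumption here. The name run.id is made up for the example.

```r
# Toy stand-in for the ok vector: TRUE marks "in season" dates
ok <- c(TRUE, TRUE, FALSE, FALSE, FALSE, TRUE, TRUE, TRUE, FALSE, TRUE)

# A new run starts at the first element and wherever the value changes
run.id <- cumsum(c(TRUE, ok[-1] != ok[-length(ok)]))
run.id
# 1 1 2 2 2 3 3 3 4 5

# Keeping only the TRUE positions leaves one distinct id per "in" run
run.id[ok]
# 1 1 3 3 3 5
```

Because the FALSE runs are dropped after the ids are assigned, consecutive seasons never share an id, which is what lets ave and aggregate treat each November to March stretch separately.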

Note

We used this input with set.seed to make it reproducible.

set.seed(123)
dates <- seq(from = as.Date("1985-01-01"), to = as.Date("1994-12-31"), by = 1)
data <- rnorm(length(dates))
my.ts <- xts(x=data, order.by=dates)