How to calculate rolling correlation between rows in an xts?


I have an xts of yearly data. I am trying to get the rank correlation between each year. For example, this is a subset of my xts:

> yearlyRanks[16:20,45:55]
           35881 35880 42261 33445 46087 31486 8981 7687 8203 8202 41383
2009-12-31     8     9    19     8    18    18   16    4   16   16    20
2010-12-31     4     3    20     6    19     2   17   17   17   17    21
2011-12-31     3     4    21     3    20     1   18   18   18   18    22
2012-12-31     6     6    22     5    21    19   19   19   19   19     4
2013-12-31     7     7     3     4    22    20   20   20   20   20     2

I would like to know the correlation between the ranks in each year with the preceding year. (Trying to tell how well this year's rank was predicted by last year's.)

I am trying to use this:

yearlyCors <- rollapplyr(coredata(yearlyRanks), width = 2, function(x) cor(x[1], x[2], use = 'n'))

But it takes FOREVER, and it doesn't seem to work. I think it is because I am passing it a set of 2 rows, so it wants to return 2 values, but I am only expecting 1. (Does that make sense?)

Any ideas on how I would do this?

EDIT:

Just to be clear, this is what I would want from that subset:

> test <- yearlyRanks[16:20,45:55]
> c(cor(test[1,], test[2,]), cor(test[2,], test[3,]), cor(test[3,], test[4,]), cor(test[4,], test[5,]))
[1] 0.4679246 0.9930253 0.4854528 0.7193598

EDIT:

What I want is the diag() + 1 of the correlation matrix, i.e. the first superdiagonal (the entries one step to the right of the main diagonal). Here is the correlation matrix (of the transpose):

> cor(t(test))
           2009-12-31 2010-12-31 2011-12-31 2012-12-31 2013-12-31
2009-12-31 1.00000000  *0.4679246*  0.4716995  0.3722922 0.08786426
2010-12-31 0.46792463  1.0000000  *0.9930253*  0.4654688 0.17192856
2011-12-31 0.47169948  0.9930253  1.0000000  *0.4854528* 0.20237689
2012-12-31 0.37229225  0.4654688  0.4854528  1.0000000 *0.71935975*
2013-12-31 0.08786426  0.1719286  0.2023769  0.7193598 1.00000000

The starred values are the ones I want. Is there a way to extract that offset diagonal (the diag + 1, if you follow)?


Best answer:

This is one way you could get your desired result:

library(xts)

data <- "35881 35880 42261 33445 46087 31486 8981 7687 8203 8202 41383
2009-12-31     8     9    19     8    18    18   16    4   16   16    20
2010-12-31     4     3    20     6    19     2   17   17   17   17    21
2011-12-31     3     4    21     3    20     1   18   18   18   18    22
2012-12-31     6     6    22     5    21    19   19   19   19   19     4
2013-12-31     7     7     3     4    22    20   20   20   20   20     2"
dat <- read.table(text = data)
yearlyRanks <- xts(dat, order.by = as.POSIXct(row.names(dat)))

m_yearlyRanks <- t(coredata(yearlyRanks))
unlist(lapply(1:(NCOL(m_yearlyRanks) -1), function(i, x) cor(x[,i], x[, i + 1]), x = m_yearlyRanks))
# > unlist(lapply(1:(NCOL(m_yearlyRanks) -1), function(i, x) cor(x[,i], x[, i + 1]), x = m_yearlyRanks))
# [1] 0.4679246 0.9930253 0.4854528 0.7193598

That last line of code might be a bit tricky to read. It could be expressed more verbosely as (the result is identical):

res <- vector("numeric", length = NCOL(m_yearlyRanks) -1)
for (i in 1:(NCOL(m_yearlyRanks) -1)) {
    res[i] <- cor(m_yearlyRanks[,i], m_yearlyRanks[, i + 1])
}
# > res
# [1] 0.4679246 0.9930253 0.4854528 0.7193598
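
The same loop can also be wrapped in a small reusable function. This is only a sketch in base R: the helper name adjacent_row_cor is made up for illustration, and it works on a plain matrix with one row per year (what coredata(yearlyRanks) would give you), so the sample values are retyped here rather than read from the xts object.

```r
# Ranks from the question's subset: one row per year, one column per series.
ranks <- matrix(
  c(8, 9, 19, 8, 18, 18, 16,  4, 16, 16, 20,
    4, 3, 20, 6, 19,  2, 17, 17, 17, 17, 21,
    3, 4, 21, 3, 20,  1, 18, 18, 18, 18, 22,
    6, 6, 22, 5, 21, 19, 19, 19, 19, 19,  4,
    7, 7,  3, 4, 22, 20, 20, 20, 20, 20,  2),
  nrow = 5, byrow = TRUE,
  dimnames = list(c("2009-12-31", "2010-12-31", "2011-12-31",
                    "2012-12-31", "2013-12-31"), NULL)
)

# Hypothetical helper: correlate each row with the row directly above it.
# Returns nrow(m) - 1 values, named after the later row of each pair.
adjacent_row_cor <- function(m, use = "everything") {
  stopifnot(is.matrix(m), nrow(m) >= 2)
  out <- vapply(
    seq_len(nrow(m) - 1),
    function(i) cor(m[i, ], m[i + 1, ], use = use),
    numeric(1)
  )
  names(out) <- rownames(m)[-1]
  out
}

adj <- adjacent_row_cor(ranks)
print(round(adj, 7))
```

vapply is used instead of unlist(lapply(...)) so that the result is guaranteed to be a numeric vector with one value per pair of years.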

Your error in this code:

yearlyCors <- rollapplyr(coredata(yearlyRanks), width = 2, function(x) cor(x[1], x[2], use = 'n'))

arises because rollapplyr applies the function to each column separately by default (by.column = TRUE). On every call x is therefore a numeric vector of length 2 taken from a single column, and x[1] and x[2] are just its first and second elements. cor expects two vectors of data, but it gets two scalars each time the rolling function is called. Try debugging the function with browser() and the problem becomes obvious immediately, e.g. try calling:

yearlyCors <- rollapplyr(coredata(yearlyRanks), width = 2, function(x) {
    browser()  # pauses on every call so you can inspect x
    cor(x[1], x[2], use = 'n')
})
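
To see the difference without stepping through the debugger, here is a rough base R sketch of what the function receives in each case. It does not call rollapplyr at all; it only mimics a single window over the first two years of the sample data, and the variable names are illustrative.

```r
# One window of width 2: the 2009 and 2010 rows of the sample data.
window <- rbind(
  c(8, 9, 19, 8, 18, 18, 16,  4, 16, 16, 20),
  c(4, 3, 20, 6, 19,  2, 17, 17, 17, 17, 21)
)

# Column-by-column: the function sees one column of the window at a time,
# so x[1] and x[2] are single numbers and no correlation can be computed.
x <- window[, 1]
by_column_result <- cor(x[1], x[2])
print(is.na(by_column_result))

# Whole window at once: the function sees a 2 x 11 matrix, so the two
# rows can be correlated with each other.
by_row_result <- cor(window[1, ], window[2, ])
print(round(by_row_result, 7))
```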

Answer:

Use by.column=FALSE and be sure the function refers to the rows:

cor2 <- function(x) cor(x[1,], x[2,])
rollapplyr(coredata(yearlyRanks), 2, cor2, by.column = FALSE)
## [1] 0.4679246 0.9930253 0.4854528 0.7193598

We could also do this:

z <- rollapplyr(as.zoo(yearlyRanks), 2, cor2, by.column = FALSE)
as.xts(z)

giving:

                [,1]
2010-12-31 0.4679246
2011-12-31 0.9930253
2012-12-31 0.4854528
2013-12-31 0.7193598

Answer:

I think I figured it out. I just took the first column off the correlation matrix of the transpose, and then took the diag:

> test <- yearlyRanks[16:20,45:55]
> tester <- cor(t(test), use = 'p')
> tester
           2009-12-31 2010-12-31 2011-12-31 2012-12-31 2013-12-31
2009-12-31  1.0000000  0.6309825  0.6167215  0.7106686  0.6076932
2010-12-31  0.6309825  1.0000000  0.9799418  0.4088352  0.2449624
2011-12-31  0.6167215  0.9799418  1.0000000  0.3973902  0.2471984
2012-12-31  0.7106686  0.4088352  0.3973902  1.0000000  0.7315524
2013-12-31  0.6076932  0.2449624  0.2471984  0.7315524  1.0000000
> xts(diag(tester[,-1]), order.by = as.Date(rownames(test))[-1])
                [,1]
2010-12-31 0.6309825
2011-12-31 0.9799418
2012-12-31 0.3973902
2013-12-31 0.7315524

However, I do not believe this is the best way to do this, as it seems like it might be inefficient: I am calculating a BUNCH of correlations I do not need. It is plenty quick, but if anyone wants to post a more efficient solution, please do!

(Apologies that the values changed. I had done something wrong before, but no bother! You all should get the gist!)
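
For what it's worth, here is a sketch of two ways to get just the offset diagonal. Both assume a plain numeric matrix with no missing values, so they reproduce plain cor() rather than use = 'p', and they use the sample subset from the question rather than my real data. The first indexes the full correlation matrix with a two-column matrix of (row, column) positions; the second standardises each year's row once and multiplies adjacent rows, so the correlations that are not needed are never computed.

```r
# Sample subset from the question: one row per year, one column per series.
ranks <- matrix(
  c(8, 9, 19, 8, 18, 18, 16,  4, 16, 16, 20,
    4, 3, 20, 6, 19,  2, 17, 17, 17, 17, 21,
    3, 4, 21, 3, 20,  1, 18, 18, 18, 18, 22,
    6, 6, 22, 5, 21, 19, 19, 19, 19, 19,  4,
    7, 7,  3, 4, 22, 20, 20, 20, 20, 20,  2),
  nrow = 5, byrow = TRUE,
  dimnames = list(c("2009-12-31", "2010-12-31", "2011-12-31",
                    "2012-12-31", "2013-12-31"), NULL)
)
n <- nrow(ranks)
k <- ncol(ranks)

# Option 1: full correlation matrix, then pick entries (1,2), (2,3), ...
cm <- cor(t(ranks))
idx <- cbind(seq_len(n - 1), seq_len(n - 1) + 1)
superdiag <- cm[idx]

# Option 2: centre and scale every row (scale() works on columns, hence
# the transposes), then the correlation of two rows is the sum of their
# elementwise products divided by (k - 1).
z <- t(scale(t(ranks)))
adjacent <- rowSums(z[-n, , drop = FALSE] * z[-1, , drop = FALSE]) / (k - 1)
names(adjacent) <- rownames(ranks)[-1]

print(round(superdiag, 7))
print(round(adjacent, 7))
```

Option 2 does work proportional to the number of years, while the full matrix grows with the square of it, which only matters once there are many rows.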