R - Identify correlation chains with decreasing correlation coefficients


I want to identify all chains of items that fulfill the following conditions:

  • The correlation coefficient between every pair of items in a chain should be positive and significant (p value < corrected α, let's say corrected α = 0.01).
  • The correlation coefficient of every pair of items in the chain should decrease with the number of items in between them, i.e. if we're looking at the i-th and the (i+k)-th item (with k being the distance between the items in the chain, either to the right or to the left), the correlation coefficient should be smaller than the one between the i-th and the (i+(k-1))-th item and greater than the one between the i-th and the (i+(k+1))-th item.
  • An item can occur at any position inside the chain (regardless of its position in the original data set) and should occur only once inside the chain.
  • I'm only interested in the longest chains, i.e. chains that are part of another, longer chain (possibly with additional items in between) should be removed.
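To make the first three conditions concrete, here is a minimal sketch of a check for a single candidate chain. The function name is_valid_chain and the toy matrices are my own assumptions for illustration; r and P are meant to have the same layout as the $r and $P matrices returned by rcorr.

```r
# Hypothetical helper (not from any package): TRUE if `chain` (item names or
# column indices) satisfies the positivity, significance and monotonicity rules.
is_valid_chain <- function(chain, r, P, alpha = 0.01) {
  m <- length(chain)
  if (m < 2 || anyDuplicated(chain) > 0) return(FALSE)
  rr <- r[chain, chain, drop = FALSE]
  pp <- P[chain, chain, drop = FALSE]
  off <- row(rr) != col(rr)  # ignore the diagonal (rcorr stores NA there in $P)
  bad <- is.na(rr[off]) | is.na(pp[off]) | rr[off] <= 0 | pp[off] >= alpha
  if (any(bad)) return(FALSE)
  for (i in seq_len(m)) {
    # correlations of item i with its neighbours, ordered by increasing distance
    right <- if (i < m) rr[i, (i + 1):m] else numeric(0)
    left  <- if (i > 1) rr[i, (i - 1):1] else numeric(0)
    # both sequences must be strictly decreasing
    if (any(diff(right) >= 0) || any(diff(left) >= 0)) return(FALSE)
  }
  TRUE
}

# Toy data (assumed): correlations decay with distance in the order a, b, c, d
items <- c("a", "b", "c", "d")
r <- 0.9^abs(outer(1:4, 1:4, "-"))
dimnames(r) <- list(items, items)
P <- matrix(0.001, 4, 4, dimnames = list(items, items))
diag(P) <- NA

is_valid_chain(c("a", "b", "c", "d"), r, P)  # TRUE
is_valid_chain(c("a", "c", "b", "d"), r, P)  # FALSE: r[a, b] > r[a, c]
```

With real data, the same call should work on the rcorr output, e.g. is_valid_chain(c("mpg", "cyl", "disp"), z$r, z$P).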

My first thought for identifying such "correlation chains" was to test all possible permutations of lengths from 3 up to n (the number of items in the data set). However, I doubt that this exhaustive search is the most efficient approach. Building up candidate chains item by item might be a better way. Nevertheless, I'm still a bit lost on how to do this efficiently in R, so I'd be very grateful if you could suggest a way!
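As a rough sketch of the "build up from scratch" idea (an assumption on my side, not a tested solution): every prefix of a valid chain is itself valid, so chains can be grown by appending one item at a time and abandoning a branch as soon as a condition fails. The names find_chains and is_subseq are hypothetical, and the worst case is still exponential in the number of items.

```r
# Hypothetical helper: is x contained in y in the same order (items unique)?
is_subseq <- function(x, y) {
  pos <- match(x, y)
  !anyNA(pos) && all(diff(pos) > 0)
}

# Sketch: depth-first search that extends chains at the right end only.
find_chains <- function(r, P, alpha = 0.01, min_len = 3) {
  n <- ncol(r)
  # ok[i, j]: pair is positively and significantly correlated
  ok <- !is.na(r) & !is.na(P) & r > 0 & P < alpha
  diag(ok) <- FALSE
  leaves <- list()
  extend <- function(chain) {
    m <- length(chain)
    last <- chain[m]
    extended <- FALSE
    for (x in setdiff(seq_len(n), chain)) {
      if (!all(ok[x, chain])) next
      # earlier items must correlate less with x than with the current last item
      if (m > 1 && any(r[chain[-m], x] >= r[chain[-m], last])) next
      # seen from x, correlations must decrease with distance (nearest first)
      if (any(diff(r[x, rev(chain)]) >= 0)) next
      extended <- TRUE
      extend(c(chain, x))
    }
    # store chains that cannot be extended any further to the right
    if (!extended && m >= min_len) leaves[[length(leaves) + 1]] <<- chain
  }
  for (s in seq_len(n)) extend(s)
  # drop chains that are part of a longer chain
  keep <- vapply(seq_along(leaves), function(i) {
    !any(vapply(leaves, function(d) {
      length(d) > length(leaves[[i]]) && is_subseq(leaves[[i]], d)
    }, logical(1)))
  }, logical(1))
  leaves <- leaves[keep]
  # every chain is found in both directions; keep one of them
  leaves <- Filter(function(ch) ch[1] < ch[length(ch)], leaves)
  if (is.null(colnames(r))) return(leaves)
  lapply(leaves, function(ch) colnames(r)[ch])
}

# Toy data (assumed): correlations decay with distance in the order a, b, c, d
items <- c("a", "b", "c", "d")
r <- 0.9^abs(outer(1:4, 1:4, "-"))
dimnames(r) <- list(items, items)
P <- matrix(0.001, 4, 4, dimnames = list(items, items))
diag(P) <- NA

res <- find_chains(r, P)
print(res)
```

On the toy matrices this should return the single chain a, b, c, d. For the mtcars example, the call would be find_chains(z$r, z$P, alpha = 0.01).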

Here's an example data set we could use:

library(Hmisc)

# pairwise Pearson correlations and their p values for all mtcars columns
z <- rcorr(as.matrix(mtcars))

z$r  # correlation coefficients
z$P  # p values of the correlation tests

I use the rcorr function from the Hmisc package to calculate the matrices of correlation coefficients and p values. Many thanks in advance for your suggestions!
