Simulate discordant effects on multiple responses in R

I would like to simulate the following in R. Let X be a random variable that takes values in {0, 1, 2}, and let Y and Z be two random variables with any continuous distribution:

  1. How can I generate X, Y and Z so that the Pearson correlation between Y and Z is very high (e.g. r = 0.8), while their respective correlations with X are very different?

  2. In other words, which X, Y and Z minimize cor(X,Y) and cor(X,Z), given that cor(Y,Z) = r, with r relatively large?

  3. How can I generate not just two variables (Y, Z) but k variables (Y_1, Y_2, ..., Y_k) that satisfy the above, i.e. their correlation matrix has all off-diagonal elements equal to r, with r very high, yet they have very different correlations with X?

There are 1 best solutions below

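For 1 and 2: let Y ~ N(0, 1), Z ~ N(Y, 1) and X = [Z < Y] + [Y < 0], where [.] is the Iverson bracket (1 if the condition holds, 0 otherwise), so X takes values in {0, 1, 2}. In R: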

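Y <- rnorm(100)
Z <- rnorm(100, mean = Y)   # Z given Y is N(Y, 1); sd defaults to 1
X <- (Z < Y) + (Y < 0)      # sum of two indicators, so X is in {0, 1, 2}
cor(cbind(Y, Z, X))         # base R only; no seed is set, so values vary between runs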

           Y          Z          X
Y  1.0000000  0.7545677 -0.6593067
Z  0.7545677  1.0000000 -0.8240605
X -0.6593067 -0.8240605  1.0000000
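
For what it's worth, the population correlations of this construction can be worked out by hand, assuming I have the algebra right. Z - Y is N(0, 1) and independent of Y, and E[Y * [Y < 0]] = -1/sqrt(2*pi), which gives cor(Y,Z) = 1/sqrt(2) (about 0.71, so somewhat below the 0.8 asked for), cor(Y,X) = -1/sqrt(pi) (about -0.56) and cor(Z,X) = -sqrt(2/pi) (about -0.80). A minimal sketch that compares a large sample against those values; the seed and sample size are arbitrary choices of mine:

```r
set.seed(42)                 # arbitrary seed, only for reproducibility
n <- 1e5                     # large n so sampling noise is small

Y <- rnorm(n)
Z <- rnorm(n, mean = Y)      # Z given Y is N(Y, 1)
X <- (Z < Y) + (Y < 0)       # values in {0, 1, 2}

empirical <- cor(cbind(Y, Z, X))

# Hand-derived population values (an assumption to be checked, not a given)
theoretical <- c(YZ = 1 / sqrt(2), YX = -1 / sqrt(pi), ZX = -sqrt(2 / pi))

# Side-by-side comparison of sample and derived correlations
comparison <- rbind(
  empirical   = c(YZ = empirical["Y", "Z"],
                  YX = empirical["Y", "X"],
                  ZX = empirical["Z", "X"]),
  theoretical = theoretical
)
print(round(comparison, 3))
```

With n = 100, as in the snippet above, individual runs can land noticeably away from these values, which would explain the 0.75 in the printed matrix.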

I'll leave 3. as an exercise.
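
One possible way to approach 3, which is my own sketch and not part of the answer above: give every Y_j a shared factor plus its own noise, Y_j = sqrt(r) * f + sqrt(1 - r) * e_j, so that all pairwise correlations equal r, and then build X from a weighted contrast of the noise terms e_j, cut at its tertiles. The correlation of Y_j with the uncut contrast is sqrt(1 - r) * w_j / ||w||, so the weights w control how the Y_j differ in their relation to X. Note that a high r limits how far apart those correlations can be: for standardized variables, |cor(Y_i, X) - cor(Y_j, X)| <= sqrt(2 * (1 - r)) by Cauchy-Schwarz. The function name `simulate_discordant` and the default weights are hypothetical choices for illustration.

```r
# Hypothetical helper (not from the original answer): k equicorrelated
# responses plus a three-level X driven by a contrast of their noise terms.
simulate_discordant <- function(n, k, r, w = seq(-1, 1, length.out = k)) {
  stopifnot(k >= 2, r > 0, r < 1, length(w) == k)

  f <- rnorm(n)                        # factor shared by all Y_j
  e <- matrix(rnorm(n * k), n, k)      # independent noise, one column per Y_j

  # f is recycled down each column, so row i of every column gets f[i];
  # this yields cor(Y_i, Y_j) = r for i != j
  Y <- sqrt(r) * f + sqrt(1 - r) * e
  colnames(Y) <- paste0("Y_", seq_len(k))

  # Weighted contrast of the noise terms; its correlation with Y_j is
  # proportional to w[j]
  latent <- drop(e %*% w)

  # Cut at the sample tertiles so that X takes values in {0, 1, 2}
  X <- findInterval(latent, quantile(latent, c(1 / 3, 2 / 3)))

  list(X = X, Y = Y)
}

set.seed(1)                            # arbitrary seed
sim <- simulate_discordant(n = 20000, k = 4, r = 0.8)
print(round(cor(sim$Y), 2))            # off-diagonal entries should be near 0.8
print(round(cor(sim$Y, sim$X), 2))     # should run from negative to positive
```

Cutting the contrast into three levels shrinks the correlations somewhat compared with the uncut contrast, but their ordering across the Y_j follows the weights.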