ggcorrplot, multiple columns, grouped by factor

My data looks like this:

data = data.frame(
Group = sapply(1:5, function(x) paste0("G", x)),
Y = runif(100, min=1, max=3),
X1 = rnorm(100, 100, 10),
X2 = rnorm(100, 20, 8)
)

What I want is to plot the correlations between (X1, Y) and (X2, Y) with ggcorrplot for each of the groups G1 to G5, i.e., the correlation plot should be laid out like this:

        (X1,Y)   (X2, Y)
     G1
     G2
     G3
     G4
     G5

What I have tried:

library(dplyr)

# cor() of a two-column matrix against Y gives two rows per group
correlate <- data %>%
    group_by(Group) %>%
    summarise(r = cor(cbind(X1, X2), Y))


 #Groups:   Group [5]
  Group   r[,1]
  <chr>   <dbl>
  1 G1    -0.472 
  2 G1    -0.0375
  3 G2     0.0385
  4 G2     0.0248
  5 G3     0.0232
  6 G3     0.507 
  7 G4    -0.180 
  8 G4    -0.452 
  9 G5    -0.264 
 10 G5     0.297 

I get the values, but I have to remember myself that the first row in each group is for (X1, Y) and the second is for (X2, Y). Isn't there some nice way to label them automatically? And how could I plot those values with ggcorrplot?

There are 1 best solutions below

Provided I understood you correctly, how about the following (making use of patchwork::wrap_plots)? It draws one correlation plot per group and stacks them in a single column.

library(ggcorrplot)
library(patchwork)
library(tidyverse)

df <- data %>%
    group_by(Group) %>%
    nest() %>%
    # ggcorrplot() expects a correlation matrix, so compute cor() per group first
    mutate(plot = map2(data, Group, ~ggcorrplot(cor(.x)) + ggtitle(.y)))

df$plot %>%
    wrap_plots(ncol = 1)
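
On the labelling part of the question: one way to avoid having to remember which row belongs to which pair is to give each correlation its own named column in summarise(). This is only a sketch; the column names `(X1, Y)` and `(X2, Y)` are my own choice, and the data is rebuilt here so the snippet runs on its own.

```r
library(dplyr)

# Same shape as the sample data; rep() spells out the recycling of G1..G5
set.seed(2020)
data <- data.frame(
    Group = rep(paste0("G", 1:5), times = 20),
    Y = runif(100, min = 1, max = 3),
    X1 = rnorm(100, 100, 10),
    X2 = rnorm(100, 20, 8)
)

# One row per group, one explicitly named column per (X, Y) pair
correlate <- data %>%
    group_by(Group) %>%
    summarise(
        `(X1, Y)` = cor(X1, Y),
        `(X2, Y)` = cor(X2, Y)
    )

correlate
```

The result has five rows (G1 to G5) and the pair labels as column headers, which matches the layout sketched in the question.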


Sample data

set.seed(2020)
data = data.frame(
    Group = sapply(1:5, function(x) paste0("G", x)),
    Y = runif(100, min=1, max=3),
    X1 = rnorm(100, 100, 10),
    X2 = rnorm(100, 20, 8)
)
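
If a single plot with one cell per group and pair is what you are after, a possible sketch is to build a 5 x 2 matrix of the group-wise correlations and hand that to ggcorrplot. Note the assumptions: ggcorrplot is documented for square correlation matrices, so passing a rectangular matrix with the default type = "full" is something I am assuming works rather than a documented feature; the name corr_mat and the pair labels are made up for this example. The sample data is repeated so the block runs on its own.

```r
library(ggcorrplot)

set.seed(2020)
data <- data.frame(
    Group = rep(paste0("G", 1:5), times = 20),
    Y = runif(100, min = 1, max = 3),
    X1 = rnorm(100, 100, 10),
    X2 = rnorm(100, 20, 8)
)

# Split by group and correlate each X column with Y inside every group.
# sapply() returns pairs x groups, so t() gives groups as rows.
corr_mat <- t(sapply(split(data, data$Group), function(d) {
    c("(X1, Y)" = cor(d$X1, d$Y), "(X2, Y)" = cor(d$X2, d$Y))
}))

# Assumption: a non-square matrix is accepted when type = "full" (the default)
p <- ggcorrplot(corr_mat, lab = TRUE)
p
```

If the groups end up on the wrong axis, passing t(corr_mat) instead swaps rows and columns.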