I'm looking to replicate this correlation plot, or at least get as close as possible to it.

Specifically, I want:

  • the correlation values in the lower half, shaded on a greyscale according to their absolute value
  • the circles in the top half, with varying diameter and coloured on the colour scale
  • editable axis labels, so that full descriptions appear on the y-axis and numeric references on the x-axis

I have gotten relatively close, but have not managed a precise enough replication. My closest attempts are described below with reproducible code. The corrplot package has gotten me closest.

    # general preparation
    # mtcars ships with base R (datasets package), so no extra package is needed
    correlations = cor(mtcars)
  1. corrplot package
    library(corrplot)
    
    corrplot.mixed(correlations, 
                   upper = "number", #upper.col = ???
                   lower = "circle", #lower.col = ???
                   tl.pos = "lt", tl.col = "black", tl.cex = 0.5)

Notes:

  2. cormat package
    source("http://www.sthda.com/upload/rquery_cormat.r")

    rquery.cormat(mtcars)
  3. ggcorrplot
    library("ggcorrplot")
    
    # circles separate
    ggcorrplot(correlations, # correlation matrix
               method = "circle", # circles instead of squares
               type = "upper", # show only upper triangle
               show.diag = FALSE, # don't show diagonal values (1)
               lab = FALSE, # don't show cor coeffs
               outline.col = "white", # no outline of circles
               ggtheme = theme_bw, # theme
               colors = c("#440154FF","#238A8DFF","#FDE725FF"))
    
    # coefs separate
    ggcorrplot(correlations, # correlation matrix
               method = "circle", # circles instead of squares
               type = "upper", # show only upper triangle
               show.diag = FALSE, # don't show diagonal values (1)
               lab = TRUE, # show cor coeffs
               outline.col = NA, # don't show circles
               ggtheme = theme_bw, # theme
               colors = c("#440154FF","#238A8DFF","#FDE725FF"))
    
    # can't combine both plots?
  4. corrgram package
    library(corrgram)
    
    corrgram(correlations,
             labels = indices_all, # note: indices_all is not defined in this snippet
             lower.panel = "panel.fill",
             upper.panel = "panel.cor")

Some other notes:

  • It seems the halves of these plots tend to run along the opposite diagonal to the one in the example plot, but I guess that's not a big concern.

Best answer

Out-of-the-box options are quick and nice. However, when it comes to customizing, IMHO it may be worthwhile to build the plot from scratch using ggplot2. As a first step this involves some data wrangling to get your correlation matrix into the right shape, i.e. a long format with one row per pair of variables. In this step I also convert the categories to factors and add a numeric id for each. Based on the ids I split the data into the values on either side of the diagonal, which can then be plotted separately using geom_point and geom_text. Besides that, it's important to add drop = FALSE to the x and y scales to keep all factor levels in the right order. I also use label functions on the scales to get the desired axis labels:

EDIT: Following the suggestion by @AllanCameron, I added coord_equal() as the "final" touch to get a nice, square, matrix-like look. And thanks to @RichtieSacramento, the code now maps the absolute value to the size aesthetic.

    library(dplyr)
    library(tidyr)
    library(ggplot2)

    correlations = cor(mtcars)

    levels <- colnames(mtcars)

    corr_long <- correlations %>%
      data.frame() %>% 
      mutate(row = factor(rownames(.), levels = levels), 
             rowid = as.numeric(row)) %>%
      pivot_longer(-c(row, rowid), names_to = "col") %>%
      mutate(col = factor(col, levels = levels),
             colid = as.numeric(col))

    ggplot(corr_long, aes(col, row)) +
      geom_point(aes(size = abs(value), fill = value), 
                 data = ~filter(.x, rowid > colid), shape = 21) +
      geom_text(aes(label = scales::number(value, accuracy = .01), color = abs(value)), 
                data = ~filter(.x, rowid < colid), size = 8 / .pt) +
      scale_x_discrete(labels = ~ attr(.x, "pos"), drop = FALSE) +
      scale_y_discrete(labels = ~ paste0(.x, " (", attr(.x, "pos"), ")"), drop = FALSE) +
      scale_fill_viridis_c(limits = c(-1, 1)) +
      scale_color_gradient(low = grey(.8), high = grey(.2)) +
      coord_equal() +
      guides(size = "none", color = "none") +
      theme(legend.position = "bottom", 
            panel.grid = element_blank(), 
            axis.ticks = element_blank()) +
      labs(x = NULL, y = NULL, fill = NULL)
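
To make the wrangling step easier to follow, here is a small sketch of the same idea in base R only, on three arbitrarily chosen mtcars columns. The names `vars`, `cm`, `corr_small`, `circles` and `numbers` are made up for this illustration and are not part of the code above. It only shows how the numeric ids split the cells into the two sets that geom_point and geom_text receive.

```r
# Three columns picked arbitrarily to keep the example small
vars <- c("mpg", "cyl", "disp")
cm <- cor(mtcars[, vars])

# Long format: one row per (row, col) cell of the correlation matrix.
# row(cm) and col(cm) give the matrix indices in column-major order,
# which matches the order of as.vector(cm).
corr_small <- data.frame(
  row   = factor(rownames(cm)[row(cm)], levels = vars),
  col   = factor(colnames(cm)[col(cm)], levels = vars),
  value = as.vector(cm)
)

# Numeric ids follow the factor level order
corr_small$rowid <- as.numeric(corr_small$row)
corr_small$colid <- as.numeric(corr_small$col)

# Same conditions as in the ggplot2 code above
circles <- subset(corr_small, rowid > colid)  # cells passed to geom_point
numbers <- subset(corr_small, rowid < colid)  # cells passed to geom_text

nrow(circles)  # 3
nrow(numbers)  # 3
```

Each off-diagonal pair ends up in exactly one of the two subsets, and the diagonal (rowid equal to colid) is in neither, which is why it stays empty in the plot.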

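
The label functions above read a `pos` attribute from the breaks. As far as I can tell that attribute is not described in the ggplot2 documentation, so treat it as an assumption that may depend on your installed version. A possible alternative, sketched below, looks the position up directly from the level vector. `var_levels`, `label_x` and `label_y` are hypothetical names introduced only for this example.

```r
# Level order used for both axes (same as colnames(mtcars) in the code above)
var_levels <- colnames(mtcars)

# x-axis: numeric reference only
label_x <- function(x) as.character(match(x, var_levels))

# y-axis: full name followed by its numeric reference
label_y <- function(x) paste0(x, " (", match(x, var_levels), ")")

label_x(c("mpg", "cyl"))  # "1" "2"
label_y(c("mpg", "cyl"))  # "mpg (1)" "cyl (2)"

# Possible use in the plot (not run here, requires ggplot2):
# scale_x_discrete(labels = label_x, drop = FALSE) +
# scale_y_discrete(labels = label_y, drop = FALSE)
```

To show full descriptions on the y-axis, `label_y` could instead look up each name in a named character vector of descriptions that you supply.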