Changing Labels on vis_miss


I'm trying to create a missingness diagram of my data. I've run the following code:

library(visdat)
library(naniar)

vis_miss(data, sort_miss = TRUE, show_perc = TRUE)

However, the column labels appear as employment.factor (or a similar variation) instead of Employment. How do I change these labels?

Also, every variable in the dataset is included in the plot. How do I choose which variables appear in the missingness diagram?

There are 1 best solutions below


Rather than changing the labels after plotting, could you put the variables you want, under the names you want, into a new subset of the data set and THEN plot that? Using the dplyr package:

library(dplyr)
data_subset <- select(data, A, B, C)
vis_miss(data_subset)
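Since select() also accepts new_name = old_name pairs, the subsetting and the relabelling can happen in one step. A minimal sketch, assuming a made-up data frame (the columns employment.factor, age.numeric and id are hypothetical stand-ins for your own):

```r
library(dplyr)
library(visdat)

# Hypothetical stand-in for the asker's data; the column names are assumptions
data <- data.frame(
  employment.factor = factor(c("full", NA, "part", "full")),
  age.numeric       = c(34, 51, NA, NA),
  id                = 1:4
)

# Keep only the two columns of interest and rename them at the same time
data_subset <- select(data, Employment = employment.factor, Age = age.numeric)

names(data_subset)

# The plot now uses the new column names as labels
vis_miss(data_subset, sort_miss = TRUE, show_perc = TRUE)
```

The original data is left untouched, so the longer names stay available for the rest of the analysis.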

sort_miss = TRUE, which you have already included, arranges the variables along the x-axis with the most missingness first. vis_miss returns a ggplot object, so it should apparently be possible to change the labels afterwards as well. The visdat source on GitHub seems to include an example using vis_miss with R's airquality data set: https://github.com/ropensci/visdat/blob/master/R/vis-miss.R

You can get the order of columns with highest missingness:

na_sort <- order(colSums(is.na(data)), decreasing = TRUE)

Then get the names of those columns:

col_order_index <- names(data)[na_sort]
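To see what those two steps produce, here is a small sketch on R's built-in airquality data set, where only Ozone and Solar.R contain missing values. The names na_counts and airquality_sorted are introduced here purely for illustration:

```r
# airquality ships with base R (datasets package)
na_counts <- colSums(is.na(airquality))

# Column positions ordered from most to least missing
na_sort <- order(na_counts, decreasing = TRUE)

# Column names in that order; Ozone comes first, then Solar.R
col_order_index <- names(airquality)[na_sort]

# Reorder the data frame's columns accordingly
airquality_sorted <- airquality[, col_order_index]
head(airquality_sorted)
```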

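Then build the missingness indicators with the columns in that order. These are what get gathered together for plotting (a column for the row number, then the variable, then the contents of that variable):

data.na <- is.na(data)  # logical matrix: TRUE where a value is missing
dat_pre_vis <- as.data.frame(data.na[, col_order_index])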

Have you tried pulling up the help documentation with ?naniar, which lists the functions available in the package? There is also some explanation of naniar's visualisations here: https://cran.r-project.org/web/packages/naniar/vignettes/naniar-visualisation.html