I have run a PCA and I want to create a word cloud that represents the weights and the direction of the relationship (positive or negative) of the different dimensions of a PCA component. I have this code in R:
pca_loadings$colorRC1 <- ifelse(pca_loadings[,1] >= 0, 'red', 'blue')
pca_loadings$absRC1 <- abs(pca_loadings[,1])
wordcloud::wordcloud(words = pca_loadings$V5,
                     freq = pca_loadings$absRC1,
                     scale = c(3, 1),
                     min.freq = 0.3,
                     max.words = 20,
                     ordered.colors = TRUE,
                     colors = pca_loadings$colorRC1,
                     rot.per = 0)
It kind of works. pca_loadings$V5 is where the name of each dimension is stored, and absRC1 is the absolute value of the weight of each dimension on component 1. I want the cutoff to be 0.3, so anything smaller is not included in the word cloud.
To make the colour depend on whether the weight is positive or negative, I created a variable called colorRC1, which looks at the weight of each dimension: if the number is >= 0 it writes red in that row, otherwise it writes blue.
So far so good. However, because the minimum threshold is 0.3, whenever the wordcloud function reaches a row with an absolute value smaller than 0.3, it omits the word but does not skip the corresponding entry in the colour column, so the absRC1 column and the colour column no longer line up. In the output the sizes are fine, but the colour each word is printed in does not match the direction (positive or negative) of its weight.
Does anyone know how to solve this?
I don’t have access to your pca_loadings table, so I’m creating a data.frame that resembles the features necessary for this example. We can make the modifications for color and absolute values with dplyr::mutate() and then filter out rows with values below 0.3 with dplyr::filter().
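A minimal sketch of that approach could look like the following. The data.frame here is made up for illustration: the column name RC1, the word labels, the loading values, and the name cloud_data are all assumptions, not the real pca_loadings. The idea is to drop the small loadings before calling wordcloud(), so the words, sizes, and colours stay aligned row by row.

```r
library(dplyr)

# Hypothetical stand-in for pca_loadings; names and values are invented.
pca_loadings <- data.frame(
  RC1 = c(0.82, -0.65, 0.12, -0.41, 0.25, 0.57, -0.08, 0.33),
  V5  = c("alpha", "beta", "gamma", "delta", "epsilon", "zeta", "eta", "theta"),
  stringsAsFactors = FALSE
)

cloud_data <- pca_loadings %>%
  mutate(
    colorRC1 = ifelse(RC1 >= 0, "red", "blue"),  # sign of the loading -> colour
    absRC1   = abs(RC1)                          # magnitude -> word size
  ) %>%
  filter(absRC1 >= 0.3)                          # apply the 0.3 cutoff up front

# Plot only if the wordcloud package is installed.
if (requireNamespace("wordcloud", quietly = TRUE)) {
  wordcloud::wordcloud(
    words = cloud_data$V5,
    freq = cloud_data$absRC1,
    scale = c(3, 1),
    min.freq = 0.3,          # every remaining row already passes this cutoff
    max.words = 20,
    ordered.colors = TRUE,
    colors = cloud_data$colorRC1,
    rot.per = 0
  )
}
```

Because every row left in cloud_data already meets the cutoff, wordcloud() has nothing to drop, so each colour should stay attached to its own word.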