I have performed PCA using the prcomp function (part of the stats package that ships with R) on quite a substantial dataset of 3000 x 500.
I have tried plotting the principal components that cover up to 100% of the cumulative variance proportion with an fviz_eig plot (from the factoextra package). However, this is a very large plot because of the dimensions of the dataset. Is there any way in R to split a plot into multiple plots, using a for loop or any other way?
Here is my plot, which covers only 80% of the variance because the full plot is too large. Could I split this plot into 2 plots?
I have tried splitting the plot up using a for loop...
for(i in data[1:20]) {
fviz_eig(data, addlabels = TRUE, ylim = c(0, 30))
}
But this doesn't work.
Edited Reproducible example:
This is only a small reproducible example using an already available dataset in R but I used a similar method for my large dataset. It will show you how the plot actually works.
# Already existing data in R.
install.packages(c("boot", "factoextra"))
library(boot)
library(factoextra)  # provides fviz_eig
data(frets)
frets
dataset_pca <- prcomp(frets)
dataset_pca$x
fviz_eig(dataset_pca, addlabels = TRUE, ylim = c(0, 100))
However, my large dataset has a lot more PCs than this one (possibly 100 or more to cover up to 100% of the cumulative variance proportion), which is why I would like a way to split the single plot into multiple plots for better visualisation.
Update:
I have performed what was said by @G5W below...
data <- prcomp(data, scale = TRUE, center = TRUE)
POEV = data$sdev^2 / sum(data$sdev^2)
barplot(POEV, ylim=c(0,0.22))
lines(0.7+(0:10)*1.2, POEV, type="b", pch=20)
text(0.7+(0:10)*1.2, POEV, labels = round(100*POEV, 1), pos=3)
barplot(POEV[1:40], ylim=c(0,0.22), main="PCs 1 - 40")
text(0.7+(0:6)*1.2, POEV[1:40], labels = round(100*POEV[1:40], 1),
pos=3)
and I have now got a graph as follows...
But I am finding it difficult getting the labels to appear above each bar. Can someone help or suggest something for this please?
I am not 100% sure what you want as your result, but I am 100% sure that you need to take more control over what is being plotted, i.e. do more of it yourself. So let me show an example of doing that. The frets data that you used has only 4 dimensions, so it is hard to illustrate what to do with more dimensions; I will instead use the nuclear data, also available in the boot package. I am going to start by reproducing the type of graph that you displayed and then altering it.
The basic plot of a prcomp object is similar to the fviz_eig plot that you displayed but has three main differences. First, it shows the actual variances, not the percent of variance explained. Second, it does not contain the line that connects the tops of the bars. Third, it does not have the text labels that tell the heights of the bars.
Percent of Variance Explained. The return from prcomp contains the raw information. str(N_PCA), where N_PCA is the prcomp result for the nuclear data, shows that it has the standard deviations, not the variances, and we want the proportion of total variation. So we just create that (square the standard deviations and divide by their sum) and plot it with barplot.
This addresses the first difference from the fviz_eig plot. Regarding the line, you can easily add that if you feel you need it, but I recommend against it. What does that line tell you that you can't already see from the barplot? If you are concerned about too much clutter obscuring the information, get rid of the line. But just in case you really want it, you can add it with a call to lines(). However, I will leave it out as I just view it as clutter.
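The steps described so far might look roughly like the sketch below. It assumes the PCA is run on the raw, unscaled nuclear data (the text does not say whether scaling was used) and it sets the y-axis limit from the data instead of hard-coding it. POEV is the same name used in the question's update; bp is a name introduced here for illustration.

```r
library(boot)   # ships with R; provides the nuclear data set

# PCA on the nuclear data (assumption: no scaling); N_PCA is the name used in the text
N_PCA <- prcomp(nuclear)

# Default plot of a prcomp object: raw variances, no line, no labels
plot(N_PCA)

# Proportion of variance explained by each component
POEV <- N_PCA$sdev^2 / sum(N_PCA$sdev^2)

# barplot() returns the x positions of the bar midpoints, so keep them
bp <- barplot(POEV, names.arg = seq_along(POEV),
              ylim = c(0, max(POEV) * 1.15),
              xlab = "Principal component", ylab = "Proportion of variance")

# Optional: line connecting the tops of the bars
lines(bp, POEV, type = "b", pch = 20)
```

With the default bar width and spacing, the midpoints returned by barplot() are the same values as 0.7 + (0:10) * 1.2 for these 11 components, so using bp avoids computing the positions by hand.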
Finally, you can add the text labels with text(). This is also somewhat redundant, but particularly if you change scales (as I am about to do), it could be helpful for making comparisons.
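A minimal sketch of the labelling step, again assuming an unscaled PCA of the nuclear data. The labels are placed at the bar midpoints that barplot() returns (stored in bp, a name chosen here), so the x and y vectors passed to text() always have the same length.

```r
library(boot)   # provides the nuclear data set

N_PCA <- prcomp(nuclear)                      # assumption: no scaling
POEV  <- N_PCA$sdev^2 / sum(N_PCA$sdev^2)     # proportion of variance explained

# Leave some headroom above the tallest bar for its label
bp <- barplot(POEV, names.arg = seq_along(POEV),
              ylim = c(0, max(POEV) * 1.15))

# Percent of variance, one decimal place, written just above each bar
text(bp, POEV, labels = round(100 * POEV, 1), pos = 3)
```

In the question's update, text() receives 7 x positions (0.7 + (0:6) * 1.2) but 40 y values, which is a likely reason the labels do not line up with the bars; passing the value returned by barplot() sidesteps that mismatch.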
OK, now that we have the substance of your original graph, it is easy to separate it into several parts. For my data, the first two bars are big so the rest are hard to see. In fact, PCs 5-11 show up as zero. Let's separate out the first 4 and then the rest.
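One way to sketch that split is a small helper that plots any subset of components on its own y scale. plot_poev is a hypothetical helper written for this illustration, not a function from any package, and the PCA is again assumed to be unscaled.

```r
library(boot)   # provides the nuclear data set

N_PCA <- prcomp(nuclear)                      # assumption: no scaling
POEV  <- N_PCA$sdev^2 / sum(N_PCA$sdev^2)

# Hypothetical helper: bar plot of the components in idx with their own y range
plot_poev <- function(idx, digits = 1) {
  vals <- POEV[idx]
  bp <- barplot(vals, names.arg = idx,
                ylim = c(0, max(vals) * 1.2),
                main = sprintf("PCs %d - %d", min(idx), max(idx)),
                xlab = "Principal component", ylab = "Proportion of variance")
  text(bp, vals, labels = round(100 * vals, digits), pos = 3, cex = 0.8)
  invisible(bp)
}

bp_first <- plot_poev(1:4)                 # the large components
bp_rest  <- plot_poev(5:11, digits = 4)    # the small ones, zoomed in
```

Because each call rescales the y axis to the largest bar in its group, the small components are no longer flattened by the large ones. More digits are used for the second group since its percentages are tiny.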
Now we can see that even though PC 5 is much smaller than any of 1-4, it is a good bit bigger than 6-11.
I don't know what you want to show with your data, but if you can find an appropriate way to group your components, you can zoom in on whichever PCs you want.
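For a dataset with many more components, the same idea can be driven by a for loop over fixed-size groups. The sketch below uses made-up random data as a stand-in for the real 3000 x 500 dataset, and the group size of 20 is an arbitrary choice; adjust both to your own case.

```r
set.seed(1)
X <- matrix(rnorm(200 * 60), nrow = 200)   # made-up stand-in data: 200 rows, 60 columns

pca  <- prcomp(X, center = TRUE, scale. = TRUE)
POEV <- pca$sdev^2 / sum(pca$sdev^2)       # proportion of variance explained

chunk_size <- 20                           # arbitrary: components per plot
chunks <- split(seq_along(POEV), ceiling(seq_along(POEV) / chunk_size))

# One bar plot per group of components, each with its own y scale
for (idx in chunks) {
  vals <- POEV[idx]
  bp <- barplot(vals, names.arg = idx, las = 2, cex.names = 0.7,
                ylim = c(0, max(vals) * 1.2),
                main = sprintf("PCs %d - %d", min(idx), max(idx)),
                ylab = "Proportion of variance")
  text(bp, vals, labels = round(100 * vals, 1), pos = 3, cex = 0.6)
}
```

Each iteration produces a separate plot. Calling par(mfrow = c(1, 3)) before the loop would instead place the three panels side by side on one device.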