PCA: Can I reverse the axis of the first principal component in R?

Here is a reproducible example:

set.seed(10)
pick <- sample(nrow(iris),nrow(iris)/2)
iris.training <- iris[pick,] 
iris.testing <- iris[-pick,]
pca.training <- prcomp(iris.training[-5])
pca.testing <- prcomp(iris.testing[-5])
autoplot(pca.training,loadings.label=T,loadings=T)
autoplot(pca.testing,loadings.label=T,loadings=T)

This produces the following output (images: biplot_pca_training, biplot_pca_testing).

As one can see, PCA on iris.training and on iris.testing produces very similar biplots, but the first principal component has reversed its sign, so the two plots are mirrored. Is it possible to force a 180-degree rotation on the two components?

There are 2 best solutions below

Best answer

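Request the rotated variables explicitly with retx=TRUE; the changed code is below. Note that retx=TRUE is already the default for prcomp, so this call returns the same scores as the original one and is not expected to change the sign of a component by itself.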

set.seed(10)
pick <- sample(nrow(iris),nrow(iris)/2)
iris.training <- iris[pick,] 
iris.testing <- iris[-pick,]
pca.training <- prcomp(iris.training[-5], retx=TRUE)
pca.testing <- prcomp(iris.testing[-5], retx=TRUE)
autoplot(pca.training,loadings.label=TRUE,loadings=TRUE)
autoplot(pca.testing,loadings.label=TRUE,loadings=TRUE)

It produced the following outputs for training and testing (images: Training PCA plot, Testing PCA plot).
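
For reference, here is a quick check of what retx does. It is only a sketch using the full iris data rather than the training/testing split, and the names X, pca.default, pca.retx and pca.noretx are made up for illustration. It compares the scores returned with and without the argument:

```r
# Numeric columns of iris only (drop Species)
X <- iris[, -5]

# prcomp() with and without retx = TRUE; retx = TRUE is the documented default
pca.default <- prcomp(X)
pca.retx <- prcomp(X, retx = TRUE)

# TRUE if the scores (the x element) are the same in both fits
same_scores <- isTRUE(all.equal(pca.default$x, pca.retx$x))

# With retx = FALSE the scores are not returned at all
pca.noretx <- prcomp(X, retx = FALSE)
has_scores <- !is.null(pca.noretx$x)

same_scores
has_scores
```

If same_scores is TRUE, adding retx=TRUE cannot be what changes the orientation of a plot.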

Second answer

I'm assuming autoplot is the function from the ggfortify package. There are two ways to do this. The easiest is to reverse the x axis by adding scale_x_reverse() to the plot:

autoplot(pca.testing,loadings.label=TRUE,loadings=TRUE) + scale_x_reverse()

Note that this doesn't change any values: the x axis simply runs from positive to negative instead of the usual negative to positive.

The second is to modify the pca.testing object to swap the signs on the first component. This is statistically valid, since PCA doesn't determine the sign of any component, but it's a bit tricky because the signs show up in two places: the x element (the scores for the data points) and the rotation element (the loadings, drawn as arrows):

pca.testing$x[,1] <- -pca.testing$x[,1]
pca.testing$rotation[,1] <- -pca.testing$rotation[,1]
autoplot(pca.testing,loadings.label=TRUE,loadings=TRUE)
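
If you would rather not pick the components to flip by eye, the same sign swap can be automated. The following is only a sketch, assuming both PCAs were fitted on the same variables in the same order; align_pca_signs is a hypothetical helper name, not a function from stats or ggfortify. It flips any component whose loading vector points the opposite way to the corresponding loading vector in a reference fit, changing x and rotation together:

```r
# Hypothetical helper (not from any package): flip components of `pca`
# so their loadings point the same way as those of `reference`.
# Assumes both were fitted by prcomp() on the same variables, in the same order.
align_pca_signs <- function(pca, reference) {
  k <- min(ncol(pca$rotation), ncol(reference$rotation))
  for (j in seq_len(k)) {
    # A negative dot product means the two loading vectors point in opposite directions
    if (sum(pca$rotation[, j] * reference$rotation[, j]) < 0) {
      pca$rotation[, j] <- -pca$rotation[, j]
      pca$x[, j] <- -pca$x[, j]
    }
  }
  pca
}

set.seed(10)
pick <- sample(nrow(iris), nrow(iris) / 2)
pca.training <- prcomp(iris[pick, -5])
pca.testing <- prcomp(iris[-pick, -5])

pca.aligned <- align_pca_signs(pca.testing, pca.training)

# After alignment, corresponding loading vectors have non-negative dot products
colSums(pca.aligned$rotation * pca.training$rotation) >= 0

# Flipping x and rotation together leaves the centered data they represent unchanged
all.equal(pca.aligned$x %*% t(pca.aligned$rotation),
          pca.testing$x %*% t(pca.testing$rotation))
```

The aligned object can then be passed to autoplot in place of pca.testing. For components that explain very little variance, the loadings of the two fits may be nearly orthogonal, so the sign choice there is not very meaningful.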

Not related to your question, but some advice: don't use T, use TRUE. T is an ordinary variable that can be reassigned, so the next time you have temperature data stored in T, you may inadvertently change its value and cause havoc with your analysis.
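
To make the risk concrete, here is a small sketch. The temperature readings and the names T_before, T_after and assign_result are made up for illustration. T is a variable that base R binds to TRUE, so it can be masked by an assignment, whereas TRUE is a reserved word:

```r
# T starts out as TRUE, but it is just a variable defined in base R
T_before <- isTRUE(T)

# Made-up temperature readings, stored under the name T
T <- c(21.5, 19.0, 23.2)

# From here on, an argument written as `loadings = T` would receive
# the temperatures, not TRUE
T_after <- isTRUE(T)

# TRUE is a reserved word, so attempting to assign to it is an error
assign_result <- tryCatch(
  eval(parse(text = "TRUE <- 0")),
  error = function(e) "error"
)

# Remove the masking variable so that T refers to the base value again
rm(T)
```

Here T_before is TRUE and T_after is FALSE, while the attempted assignment to TRUE is caught as an error.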