In Principal Component Analysis
I was wondering why the data projected onto the principal component has variance given by the eigenvalue corresponding to the principal eigenvector?
I can't find the explanation in my textbook.
What you're doing in principal component analysis is "diagonalizing the covariance matrix": in the coordinate basis that diagonalizes the covariance, you can just read off the variance of each component.
Really understanding it requires learning the linear algebra that underlies the eigenvalue problem (things like "the eigenvalues of a Hermitian matrix are invariant under orthogonal transformations" and so on), but something you could try is this:
Generate a synthetic two-dimensional data set, drawing the $x$-values as zero-mean Gaussians with variance $\sigma_x^2$ and the $y$-values as zero-mean Gaussians with variance $\sigma_y^2 < \sigma_x^2$. The covariance matrix of this data is (approximately) diagonal, and the variance along each of $x$ and $y$ is the corresponding diagonal element of the covariance matrix. Also note that the two eigenvalues of this matrix are $\sigma_x^2$ and $\sigma_y^2$, and the eigenvectors are $[1,0]$ and $[0,1]$.

Now choose some orthogonal matrix $O$, and generate a rotated version of each $[x,y]$ sample. You'll find that the covariance matrix of this transformed data set has off-diagonal elements, i.e. a correlation between $x$ and $y$. But if you do the eigenvalue decomposition, the eigenvectors are just the columns of the orthogonal matrix used to rotate the data in the first place, and the eigenvalues are the original eigenvalues.

Principal components analysis, i.e. the eigenvalue decomposition of the covariance matrix, is running this process in reverse: starting with the correlated data set, and then deriving the coordinate basis that diagonalizes the covariance matrix.
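Here is a minimal NumPy sketch of that experiment (the variable names, sample size, and the particular rotation angle are just illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
sigma_x, sigma_y = 2.0, 1.0                     # sigma_y**2 < sigma_x**2

# Uncorrelated data: covariance ~ diag(sigma_x**2, sigma_y**2)
data = np.column_stack([rng.normal(0.0, sigma_x, n),
                        rng.normal(0.0, sigma_y, n)])

# Rotate every [x, y] sample by an orthogonal matrix O
theta = np.pi / 6
O = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
rotated = data @ O.T

# The rotated data has non-zero off-diagonal covariance (x and y are correlated)...
cov = np.cov(rotated, rowvar=False)
print(cov)

# ...but the eigendecomposition recovers the original picture:
# eigenvalues ~ (sigma_y**2, sigma_x**2) and eigenvectors ~ columns of O
# (np.linalg.eigh returns eigenvalues in ascending order, up to sign flips).
eigvals, eigvecs = np.linalg.eigh(cov)
print(eigvals)   # ~ [1.0, 4.0]
print(eigvecs)   # columns match the columns of O, up to order and sign
```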
Getting your head around it will probably take both learning the formal mathematics and some experience, maybe trying it out (and visualizing it) on 2 or 3 dimensional problems will help you to get a feel for it.
Good question. Please read CMU's 36-350 lecture notes. In short, the way the PCA optimization problem is framed leads to a constrained optimization problem, which via a Lagrange multiplier becomes an eigenproblem (pp. 2-5) that is solved by taking the eigenvectors of the sample covariance matrix.
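In outline (my sketch of that derivation, not the notes' exact notation): write $S$ for the sample covariance matrix and $w$ for a unit-length projection direction, so the variance of the data projected onto $w$ is $w^\top S w$. Maximizing that subject to $w^\top w = 1$ with a Lagrange multiplier $\lambda$ gives

$$
\begin{aligned}
\mathcal{L}(w, \lambda) &= w^\top S w - \lambda\,(w^\top w - 1), \\
\frac{\partial \mathcal{L}}{\partial w} &= 2 S w - 2 \lambda w = 0
  \quad\Longrightarrow\quad S w = \lambda w, \\
w^\top S w &= \lambda\, w^\top w = \lambda .
\end{aligned}
$$

So the optimal projection direction is an eigenvector of $S$, and the variance achieved along it is exactly the corresponding eigenvalue, which is the fact the question asks about.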
In Principal Components Analysis (PCA), you are calculating a rotation of the original coordinate system such that all non-diagonal elements of the new covariance matrix become zero (i.e., the new coordinates are uncorrelated). The eigenvectors define the directions of the new coordinate axes and the eigenvalues correspond to the diagonal elements of the new covariance matrix (the variance along the new axes). So the eigenvalues, by definition, define the variance along the corresponding eigenvectors.
Note that if you were to multiply all your original data values by some constant (with value greater than one), that would have the effect of increasing the variance (and covariance) of the data. If you then perform PCA on the modified data, the eigenvectors you compute would be the same (you still need the same rotation to uncorrelate your coordinates), but the eigenvalues would increase, because the variance of the data along the new coordinate axes will have increased.
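A quick numerical check of both points (a NumPy sketch; the particular covariance matrix and the constant $c$ are just illustrative). Since $\operatorname{Var}(cX) = c^2 \operatorname{Var}(X)$, scaling the data by $c$ should multiply every eigenvalue by $c^2$ while leaving the eigenvectors unchanged, and projecting the data onto each eigenvector should give a variance equal to the corresponding eigenvalue:

```python
import numpy as np

rng = np.random.default_rng(1)

# Some correlated 2-D data (this particular covariance is arbitrary)
data = rng.multivariate_normal(mean=[0.0, 0.0],
                               cov=[[3.0, 1.2],
                                    [1.2, 1.0]],
                               size=50_000)

S = np.cov(data, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(S)

# Variance of the data projected onto each eigenvector equals the eigenvalue
projected = data @ eigvecs
print(projected.var(axis=0, ddof=1))   # matches eigvals
print(eigvals)

# Scaling the data by c > 1 leaves the eigenvectors (the rotation) unchanged,
# but multiplies each eigenvalue (the variance along each new axis) by c**2
c = 3.0
eigvals_c, eigvecs_c = np.linalg.eigh(np.cov(c * data, rowvar=False))
print(eigvals_c / eigvals)                               # ~ [9.0, 9.0]
print(np.allclose(np.abs(eigvecs_c), np.abs(eigvecs)))   # True (up to sign)
```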