## Do the eigenvectors represent the original features?


I've got a test dataset with 4 features, and PCA produces a set of 4 eigenvectors, e.g.:

```
EigenVectors: [0.7549043055910286, 0.24177972266822534, -0.6095588015369825, -0.01000612689310429]
EigenVectors: [0.0363767549959317, -0.9435613299702559, -0.3290509434298886, -0.009706951562064631]
EigenVectors: [-0.001031816289317291, 0.004364438034564146, 0.016866154627905586, -0.999847698334029]
EigenVectors: [-0.654824523403971, 0.2263084929291885, -0.7210264051508555, -0.010499173877772439]
```


Do the eigenvector values represent the features from the original dataset? E.g., are features 1 & 2 explaining most of the variance in eigenvector 1?

Am I interpreting the results correctly if I say that features 1 and 2 are the most important in my dataset, since PC1 represents 90% of the variance?

I'm trying to map back to the original features but am unsure how to interpret the results.


The principal components (eigenvectors) correspond to directions (in the original n-dimensional space) in the data; the first principal component points in the direction of greatest variance.

The corresponding eigenvalue is a number that indicates how much variance there is in the data along that eigenvector (or principal component).

The entries of each eigenvector (the loadings) tell you how much each original feature contributes to that component. In your first eigenvector, features 1 and 3 have the largest absolute loadings (≈0.75 and ≈−0.61), so they contribute most to PC1; since PC1 explains 90% of the variance, those two features dominate your dataset. Features whose loadings are small on all of the top components have little impact and theoretically could be removed as part of your data reduction effort.
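Your output doesn't say which library you used, but here is a minimal sketch of the mapping with scikit-learn and synthetic data (the dataset and variable names are hypothetical stand-ins for your own): each row of `components_` is one eigenvector, and column *j* of that row is the loading of original feature *j*.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical 4-feature dataset (200 samples); substitute your own data.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
X[:, 2] = 2 * X[:, 0] + rng.normal(scale=0.1, size=200)  # correlate two features

pca = PCA(n_components=4).fit(X)

# Row i of components_ is eigenvector i; column j is the loading of feature j.
for i, (vec, ratio) in enumerate(zip(pca.components_,
                                     pca.explained_variance_ratio_)):
    print(f"PC{i+1} (explains {ratio:.1%} of variance): "
          f"loadings = {np.round(vec, 3)}")

# Features with the largest |loading| in PC1 contribute most to it.
ranked = np.argsort(-np.abs(pca.components_[0])) + 1  # 1-based feature numbers
print("Features ranked by contribution to PC1:", ranked)
```

Reading feature importance off the loadings only makes sense component by component, weighted by how much variance that component explains.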

Also, it is important to point out that

> When performing PCA, it is typically a good idea to normalize the data first. Because PCA seeks to identify the principal components with the highest variance, if the data are not properly normalized, attributes with large values and large variances (in absolute terms) will end up dominating the first principal component when they should not.

In other words, if you didn't normalize your data, then your PCA analysis is quite likely meaningless.
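A quick way to see this effect, sketched with scikit-learn on hypothetical data: one feature measured in large units swamps PC1 unless the data are standardized first.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 4))
X[:, 0] *= 1000.0  # feature 1 in large units dominates the raw variance

raw = PCA().fit(X)
scaled = PCA().fit(StandardScaler().fit_transform(X))

# Unscaled: PC1 is essentially just feature 1. Scaled: variance is shared.
print("Unscaled PC1 variance share:", round(raw.explained_variance_ratio_[0], 3))
print("Scaled   PC1 variance share:", round(scaled.explained_variance_ratio_[0], 3))
```

With the unscaled data, PC1 captures nearly all the variance simply because feature 1's units are large; after standardizing, the four (independent) features share the variance roughly equally.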

*The above quoted text is from http://www.lauradhamilton.com/introduction-to-principal-component-analysis-pca