Principal Component Analysis (PCA) is often used for dimensionality reduction. PCA finds linear combinations of the original variables (dimensions) that form new dimensions which capture the greatest variance, and these new dimensions are orthogonal to each other.
Steps:
Interesting behaviour: if you repeatedly multiply the covariance matrix by a vector (a matrix-vector product), the direction of the resulting vector converges to the direction of greatest variance, i.e. the top eigenvector.
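A minimal sketch of this behaviour (power iteration), assuming a small de-meaned data matrix `X` that is generated here purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data with one clearly dominant direction; de-mean it first.
X = rng.normal(size=(500, 2)) @ np.array([[3.0, 0.0], [0.0, 0.5]])
X = X - X.mean(axis=0)

cov = X.T @ X / len(X)          # sample covariance matrix

v = rng.normal(size=2)          # random starting vector
for _ in range(50):
    v = cov @ v                 # repeatedly multiply by the covariance matrix
    v = v / np.linalg.norm(v)   # renormalize so the vector does not blow up

print(v)  # converges to the direction of greatest variance (top eigenvector, up to sign)
```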
Eigenvalue & Eigenvector: \[Av = \lambda v\] where \(v\) is the eigenvector and \(\lambda\) is the eigenvalue
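As a quick sanity check (a sketch using NumPy's `eigh`, not part of the original notes), we can verify that each eigenvector of a symmetric covariance-like matrix satisfies this relation:

```python
import numpy as np

cov = np.array([[3.0, 1.0],
                [1.0, 2.0]])             # example symmetric (covariance-like) matrix

eigvals, eigvecs = np.linalg.eigh(cov)   # eigh: for symmetric matrices, eigenvalues ascending

for lam, v in zip(eigvals, eigvecs.T):   # columns of eigvecs are the eigenvectors
    print(np.allclose(cov @ v, lam * v)) # checks A v == lambda v for each pair
```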
Variance of projections: \[\frac{1}{n} \sum_{i=1}^{n} \left(\sum_{j=1}^d x_{i,j}e_j-\mu\right)^2 = \frac{1}{n} \sum_{i=1}^{n} \left(\sum_{j=1}^d x_{i,j}e_j\right)^2\] where \(\sum_{j=1}^d x_{i,j}e_j\) is the projection of the \(i\)-th point onto the direction \(e\), and \(\mu\) is the mean of the projections. Assuming we have already de-meaned \(X\), \(\mu = 0\) and the simpler right-hand form holds.
We want to maximize the variance of the projection subject to the constraint that the direction has unit length, \(||e||=1\). This becomes a constrained optimization problem, which we handle with a Lagrange multiplier \(\lambda\):
\[V = \frac{1}{n} \sum_{i=1}^{n} (\sum_{j=1}^d x_{i,j}e_j)^2 - \lambda((\sum_{j=1}^d e_j^2)-1)\]
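Setting the derivative of \(V\) with respect to \(e\) to zero gives \(\Sigma e = \lambda e\), where \(\Sigma\) is the covariance matrix of the de-meaned data. So the variance-maximizing direction is exactly the eigenvector of \(\Sigma\) with the largest eigenvalue, and that eigenvalue is the variance it captures. Putting the pieces together, here is a minimal end-to-end sketch (assuming a data matrix `X` with observations in rows; the function and variable names are illustrative, not from the original notes):

```python
import numpy as np

def pca(X, k):
    """Project X onto its top-k principal components (illustrative sketch)."""
    Xc = X - X.mean(axis=0)                    # de-mean so the projection mean is 0
    cov = Xc.T @ Xc / len(Xc)                  # covariance matrix (1/n convention as above)
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigen-decomposition of the symmetric matrix
    order = np.argsort(eigvals)[::-1]          # sort directions by variance, largest first
    components = eigvecs[:, order[:k]]         # top-k eigenvectors, one per column
    return Xc @ components, eigvals[order[:k]] # projections and the variance each captures

# Example usage on random data
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
Z, variances = pca(X, 2)
print(Z.shape, variances)   # (200, 2) and the two largest eigenvalues
```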