Principal Component Analysis
Intro
PCA is one of the most important and widely used dimension reduction algorithms. Other dimension reduction algorithms include LDA, LLE, and Laplacian Eigenmaps.
You can regard PCA as a one-layer neural network mapping an n-dimensional input to an m-dimensional output.
PCA projects n-dimensional data onto an m-dimensional subspace so as to preserve the maximum variance.
PCA
We want to reduce data from nD to mD.
$$
Y = AX \\
Y \in R^{m \times 1},\quad A \in R^{m \times n},\quad X \in R^{n \times 1} \\
A = \begin{bmatrix} -a_1- \\ -a_2- \\ \vdots \\ -a_m- \end{bmatrix},\quad
Y = \begin{bmatrix} a_1(x-\bar{x}) \\ a_2(x-\bar{x}) \\ \vdots \\ a_m(x-\bar{x}) \end{bmatrix} \\
\left\{\begin{array}{l}
Y = A(X - \bar{X}) \\
\bar x = E(X) = \frac1P\sum\limits_{p=1}^P x_p
\end{array}\right.
$$
Consider the first row $a_1$ and let $y_i = a_1(x_i - \bar x)$ for sample $i$; we maximize the variance of the projections:
$$
\max \sum\limits_{i=1}^P (y_i - \bar y)^2 \\
\bar y = \frac1P\sum\limits_{i=1}^P y_i = \frac{a_1}{P}\left(\sum\limits_{i=1}^P x_i - P\bar x\right) = 0 \\
\begin{aligned}
\sum\limits_{i=1}^P (y_i - \bar y)^2 &= \sum\limits_{i=1}^P \left[a_1(x_i - \bar x)\right]^2 \\
&= \sum\limits_{i=1}^P a_1(x_i - \bar x)(x_i - \bar x)^T a_1^T \\
&= a_1\left[\sum\limits_{i=1}^P (x_i - \bar x)(x_i - \bar x)^T\right] a_1^T \\
&= a_1 \Sigma a_1^T
\end{aligned}
$$
$\Sigma$ is the covariance matrix of the data (up to the constant factor $1/P$, which does not affect the maximization).
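As a minimal NumPy sketch (the toy data and variable names here are mine, not from the notes), $\Sigma$ can be computed directly from the centered samples:

```python
import numpy as np

# Toy data: P = 5 samples of n = 3 features, one sample per column.
X = np.array([[2.5, 0.5, 2.2, 1.9, 3.1],
              [2.4, 0.7, 2.9, 2.2, 3.0],
              [1.0, 0.3, 1.5, 1.1, 1.2]])

x_bar = X.mean(axis=1, keepdims=True)   # \bar x = (1/P) * sum of x_p
centered = X - x_bar                    # x_i - \bar x, column by column
Sigma = centered @ centered.T           # sum over i of (x_i - \bar x)(x_i - \bar x)^T
cov = Sigma / X.shape[1]                # dividing by P gives the covariance matrix proper
```

The matrix product `centered @ centered.T` is exactly the sum of outer products $\sum_i (x_i-\bar x)(x_i-\bar x)^T$, just expressed without an explicit loop.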
Our target:
$$
\left\{\begin{array}{l}
\max \ \ a_1\Sigma a_1^T \\
s.t. \ \ \ \ \ a_1a_1^T = ||a_1||^2 = 1
\end{array}\right.
$$
Apply the Lagrange multiplier method:
$$
E(a_1) = a_1\Sigma a_1^T - \lambda_1(a_1a_1^T - 1) \\
\frac{\partial E}{\partial a_1} = 2\left(\Sigma a_1^T - \lambda_1 a_1^T\right)^T = 0 \\
\Sigma a_1^T = \lambda_1 a_1^T
$$
So $a_1^T$ is an eigenvector of $\Sigma$, and the objective equals $a_1\Sigma a_1^T = \lambda_1$; to maximize it, $\lambda_1$ must be the largest eigenvalue of $\Sigma$, with $a_1$ the corresponding (unit) eigenvector.
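As a quick numerical sanity check (using a random symmetric positive semi-definite matrix in place of the data covariance, an assumption of mine), the eigenvector of the largest eigenvalue indeed maximizes $a_1\Sigma a_1^T$ over unit vectors:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
Sigma = M @ M.T                         # symmetric PSD stand-in for the covariance matrix

vals, vecs = np.linalg.eigh(Sigma)      # eigh returns eigenvalues in ascending order
a1 = vecs[:, -1]                        # unit eigenvector of the largest eigenvalue

# a1 Sigma a1^T equals lambda_1, and no random unit vector does better:
assert np.isclose(a1 @ Sigma @ a1, vals[-1])
for _ in range(100):
    a = rng.standard_normal(4)
    a /= np.linalg.norm(a)              # enforce the constraint ||a||^2 = 1
    assert a @ Sigma @ a <= vals[-1] + 1e-9
```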
$$
\left\{\begin{array}{l}
\max \ \ a_2\Sigma a_2^T \\
s.t. \ \ \ \ \ a_2a_2^T = ||a_2||^2 = 1,\quad a_2a_1^T = 0
\end{array}\right.
$$
By the same method, $\lambda_2$ is the second largest eigenvalue of $\Sigma$, and $a_2$ is the corresponding eigenvector.
Summary
① compute the covariance matrix $\Sigma$ of the data
② find the eigenvalues $\lambda_i$ and eigenvectors $a_i$ of $\Sigma$, and sort the eigenvalues in descending order
③ normalize each $a_i$ so that $a_ia_i^T = 1$
④ stack the top $m$ eigenvectors as rows: $A = \begin{bmatrix} -a_1- \\ -a_2- \\ \vdots \\ -a_m- \end{bmatrix}$
⑤ $Y = A(X - \bar{X})$
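The five steps above can be sketched in NumPy as follows (the function name, data layout, and toy data are my assumptions, not from the notes):

```python
import numpy as np

def pca(X, m):
    """Reduce n-D column samples in X (shape n x P) to m-D, following steps 1-5."""
    x_bar = X.mean(axis=1, keepdims=True)       # sample mean \bar x
    Sigma = (X - x_bar) @ (X - x_bar).T         # step 1: scatter/covariance matrix
    vals, vecs = np.linalg.eigh(Sigma)          # step 2: eigen-decomposition (ascending)
    order = np.argsort(vals)[::-1]              # sort eigenvalues in descending order
    A = vecs[:, order[:m]].T                    # steps 3-4: top-m unit eigenvectors as rows
    Y = A @ (X - x_bar)                         # step 5: project the centered data
    return Y, A, x_bar

# Usage: reduce 3-D samples to 2-D.
rng = np.random.default_rng(0)
X = rng.standard_normal((3, 50))
Y, A, x_bar = pca(X, 2)
```

Note that `np.linalg.eigh` already returns orthonormal eigenvectors, so the normalization in step ③ comes for free here.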