Principal Component Analysis (PCA)


Intro

PCA is one of the most important and widely used dimension reduction algorithms; other dimension reduction algorithms include LDA, LLE, and Laplacian Eigenmaps.
You can regard PCA as a one-layer neural network with an nD input and an mD output.
PCA projects nD data down to mD while preserving the maximum variance.
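
For orientation, here is a minimal sketch of this reduction using scikit-learn's PCA (the data shape and `n_components` are illustrative assumptions; note that scikit-learn puts one sample per row, whereas the derivation below uses column vectors):

```python
import numpy as np
from sklearn.decomposition import PCA

# Toy data: 200 samples in 5 dimensions (n = 5), reduced to m = 2.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))

pca = PCA(n_components=2)   # keep the 2 directions of maximum variance
Y = pca.fit_transform(X)    # projected data, shape (200, 2)

print(Y.shape)                   # (200, 2)
print(pca.explained_variance_)   # variance captured by each component
```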

PCA

We want to reduce data from nD to mD.
$$
Y = AX \\
Y \in \mathbb{R}^{m \times 1}, \quad A \in \mathbb{R}^{m \times n}, \quad X \in \mathbb{R}^{n \times 1} \\[2ex]
A = \begin{bmatrix} -\,a_1\,- \\ -\,a_2\,- \\ \vdots \\ -\,a_m\,- \end{bmatrix}, \quad
Y = \begin{bmatrix} a_1(x-\bar{x}) \\ a_2(x-\bar{x}) \\ \vdots \\ a_m(x-\bar{x}) \end{bmatrix} \\[2ex]
\left\{\begin{array}{l}
Y = A(X - \bar{X}) \\
\bar{x} = E(X) = \frac{1}{P}\sum\limits_{i=1}^{P}x_i
\end{array}\right. \\[2ex]
\max \sum\limits_{i=1}^{P}(y_i - \bar{y})^2 \\
\bar{y} = \frac{1}{P}\sum\limits_{i=1}^{P}y_i = \frac{a_1}{P}\Big(\sum\limits_{i=1}^{P}x_i - P\bar{x}\Big) = 0 \\
\begin{aligned}
\sum\limits_{i=1}^{P}(y_i - \bar{y})^2 &= \sum\limits_{i=1}^{P}\big[a_1(x_i - \bar{x})\big]^2 \\
&= \sum\limits_{i=1}^{P}a_1(x_i-\bar{x})(x_i - \bar{x})^{T}a_1^{T} \\
&= a_1\sum\limits_{i=1}^{P}\big[(x_i-\bar{x})(x_i - \bar{x})^{T}\big]a_1^{T} \\
&= a_1\Sigma a_1^{T}
\end{aligned}
$$

$\Sigma$ – the covariance matrix of the data (here written as the scatter matrix $\sum_i(x_i-\bar{x})(x_i-\bar{x})^T$; the omitted $\frac{1}{P}$ factor only scales the eigenvalues and leaves the eigenvectors unchanged)
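
A minimal NumPy sketch of this quantity, following the column-vector convention above (variable names are my own):

```python
import numpy as np

# X is n x P: n features, P samples, one sample per column.
rng = np.random.default_rng(0)
n, P = 5, 200
X = rng.normal(size=(n, P))

x_bar = X.mean(axis=1, keepdims=True)   # n x 1 sample mean
Xc = X - x_bar                          # centered data
Sigma = Xc @ Xc.T                       # n x n scatter matrix, sum_i (x_i - x̄)(x_i - x̄)^T

# For a row vector a1 (length n), the projected variance is a1 @ Sigma @ a1.
```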

Our target:
$$
\left\{\begin{array}{l}
\max \ \ a_1\Sigma a_1^T \\
s.t. \ \ \ \ a_1a_1^T = \|a_1\|^2 = 1
\end{array}\right.
$$
Apply the Lagrange multiplier method:
$$
E(a_1) = a_1\Sigma a_1^T - \lambda_1(a_1a_1^T - 1) \\
\frac{\partial E}{\partial a_1} = 2(\Sigma a_1^T - \lambda_1 a_1^T)^T = 0 \\
\Sigma a_1^T = \lambda_1 a_1^T
$$
Hence $a_1^T$ must be an eigenvector of $\Sigma$. The objective then becomes $a_1\Sigma a_1^T = \lambda_1a_1a_1^T = \lambda_1$, so the maximum is attained when $\lambda_1$ is the largest eigenvalue of $\Sigma$ and $a_1$ is the corresponding unit eigenvector.
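
A quick numerical check of this claim on the toy scatter matrix from above (`np.linalg.eigh` returns eigenvalues of a symmetric matrix in ascending order, with orthonormal eigenvectors):

```python
import numpy as np

rng = np.random.default_rng(0)
Xc = rng.normal(size=(5, 200))
Xc -= Xc.mean(axis=1, keepdims=True)        # centered data, n x P
Sigma = Xc @ Xc.T                           # scatter matrix from the derivation above

eigvals, eigvecs = np.linalg.eigh(Sigma)    # ascending eigenvalues, orthonormal columns
a1 = eigvecs[:, -1]                         # unit eigenvector of the largest eigenvalue

print(a1 @ Sigma @ a1, eigvals[-1])         # projected variance a1 Σ a1ᵀ equals λ1

# A random unit vector never does better than λ1:
v = rng.normal(size=5)
v /= np.linalg.norm(v)
assert v @ Sigma @ v <= eigvals[-1] + 1e-9
```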

For the second component we keep the unit-norm constraint and additionally require $a_2$ to be orthogonal to $a_1$ (otherwise the maximizer would simply be $a_1$ again):
$$
\left\{\begin{array}{l}
\max \ \ a_2\Sigma a_2^T \\
s.t. \ \ \ \ a_2a_2^T = \|a_2\|^2 = 1, \quad a_1a_2^T = 0
\end{array}\right.
$$

(Same method, with one extra Lagrange multiplier for the orthogonality constraint.)
$\lambda_2$ is the second largest eigenvalue of $\Sigma$, and $a_2$ is the corresponding eigenvector.
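
Since `eigh` already returns orthonormal eigenvectors, $a_2$ satisfies both constraints automatically; a short check on the same toy data:

```python
import numpy as np

rng = np.random.default_rng(0)
Xc = rng.normal(size=(5, 200))
Xc -= Xc.mean(axis=1, keepdims=True)
Sigma = Xc @ Xc.T

eigvals, eigvecs = np.linalg.eigh(Sigma)
a1, a2 = eigvecs[:, -1], eigvecs[:, -2]

print(a2 @ Sigma @ a2, eigvals[-2])   # a2 Σ a2ᵀ = λ2, the second-largest eigenvalue
print(abs(a1 @ a2))                   # ≈ 0: the constraint a1 a2ᵀ = 0 holds
```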

Summary

① compute the covariance matrix $\Sigma$ of the data
② find the eigenvalues $\lambda_i$ and eigenvectors $a_i$ of $\Sigma$, and sort them by eigenvalue in descending order
③ normalize each $a_i$ so that $a_ia_i^T = 1$
④ stack the top $m$ eigenvectors as the rows of $A = \begin{bmatrix} -\,a_1\,- \\ -\,a_2\,- \\ \vdots \\ -\,a_m\,- \end{bmatrix}$

⑤ $Y = A(X - \bar{X})$
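
Putting the five steps together, a minimal from-scratch sketch (function and variable names are my own; one sample per column, matching the derivation):

```python
import numpy as np

def pca(X, m):
    """Project n x P data X (one sample per column) onto its top-m principal components."""
    x_bar = X.mean(axis=1, keepdims=True)      # sample mean, n x 1
    Sigma = (X - x_bar) @ (X - x_bar).T        # ① covariance (scatter) matrix
    eigvals, eigvecs = np.linalg.eigh(Sigma)   # ② eigenvalues/eigenvectors, ascending
    order = np.argsort(eigvals)[::-1]          #    sort descending by eigenvalue
    A = eigvecs[:, order[:m]].T                # ③④ rows a_1..a_m (eigh already normalizes)
    Y = A @ (X - x_bar)                        # ⑤ Y = A(X - X̄), shape m x P
    return Y, A, x_bar

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 200))
Y, A, x_bar = pca(X, 2)
print(Y.shape)    # (2, 200)
print(A @ A.T)    # ≈ identity: rows of A are orthonormal
```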