Principal Components - Least Squares
Overview
Definition
Decomposition
The least squares derivation of principal components starts with the standard inner product decomposition of a vector.
Vector Decomposition.
Let {% e_1,e_2,...,e_n %} be an orthonormal basis for a vector space V. Then any vector {% v \in V %} can be written as
{% v = \langle v,e_1 \rangle e_1 + ... + \langle v, e_n \rangle e_n %}
Typically, this decomposition is written as a
column vector.
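As a quick illustration of the decomposition (a minimal sketch; the basis and the vector below are made up for the example), the coefficients {% \langle v, e_i \rangle %} are plain dot products:

```python
import numpy as np

# An orthonormal basis for R^2 (the standard basis rotated 45 degrees).
e1 = np.array([1.0, 1.0]) / np.sqrt(2)
e2 = np.array([1.0, -1.0]) / np.sqrt(2)

v = np.array([3.0, 1.0])

# Coefficients are the inner products <v, e_i>.
c1, c2 = v @ e1, v @ e2

# Reconstruction: v = <v, e1> e1 + <v, e2> e2
v_reconstructed = c1 * e1 + c2 * e2
print(v_reconstructed)  # [3. 1.]
```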
Compression
The information in the vector can be "compressed" by choosing an orthonormal set of vectors
{% e_1,e_2,...,e_k %} for the vector space V, with k < n, and then expressing
the vector as
{% v' = \langle v,e_1 \rangle e_1 + ... + \langle v, e_k \rangle e_k %}
{% v' \approx v %}
Of course, the new set {% e_1,e_2,...,e_k %} will not form a basis for V, and the above decomposition
can only be approximate.
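A minimal sketch of this compression step, assuming the data lives in {% \mathbb{R}^n %} and the k orthonormal vectors are stacked as the rows of a matrix:

```python
import numpy as np

def compress(v, basis_k):
    """Project v onto the span of the k orthonormal vectors in basis_k.

    basis_k is a (k, n) array whose rows are orthonormal; the result v'
    is the approximation sum_i <v, e_i> e_i, which lives in R^n but is
    described by only k coefficients.
    """
    coeffs = basis_k @ v          # the k inner products <v, e_i>
    return basis_k.T @ coeffs     # v' = sum_i <v, e_i> e_i

# Example: keep only the first of the two basis vectors from before.
e1 = np.array([1.0, 1.0]) / np.sqrt(2)
v = np.array([3.0, 1.0])
v_prime = compress(v, np.vstack([e1]))
print(v_prime)   # [2. 2.]  -- an approximation of [3. 1.]
```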
Optimization
In order to optimize the compression, we try to find the set {% e_1,e_2,...,e_k %}
that minimizes the total squared error over the data vectors {% v_i %},
{% \sum_i |v_i'-v_i|^2 %}
where {% v_i' %} is the compressed version of {% v_i %}.
It can be shown that this problem can be solved iteratively. That is, solve the problem for k=1,
then solve for k=2 by keeping the already calculated first vector fixed and solving for the second vector.
Proceed until you have k vectors.
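One concrete way to carry out that iterative solution (a sketch only; the text does not prescribe a particular algorithm) is power iteration on the covariance matrix with deflation after each step, which finds the directions one at a time exactly as described:

```python
import numpy as np

def principal_components(X, k, n_iter=500, seed=0):
    """Top-k principal directions of the rows of X, found one at a time.

    Each direction is computed by power iteration on the (deflated)
    covariance matrix; this is just one way to do the per-step
    minimization -- np.linalg.svd of the centered data gives the same
    directions in one shot.
    """
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)            # center the data
    C = Xc.T @ Xc / len(Xc)            # covariance matrix
    components = []
    for _ in range(k):
        e = rng.normal(size=C.shape[0])
        for _ in range(n_iter):
            e = C @ e
            e /= np.linalg.norm(e)
        components.append(e)
        C -= (e @ C @ e) * np.outer(e, e)   # deflate: remove the found direction
    return np.array(components)
```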
As Approximation
Principal components can be understood in the context of
approximation in normed spaces.
In particular, the approximating subspace K is the span of the k orthonormal vectors {% e_1,e_2,...,e_k %}.
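Stated as an equation (this is the standard projection theorem for inner product spaces, made explicit here), the compressed vector {% v' %} is the best approximation of {% v %} within K:
{%
v' = \sum_{i=1}^{k} \langle v, e_i \rangle e_i = \arg\min_{w \in K} \|v - w\|
%}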
Example
The graph shows a dataset of points which are 2-dimensional vectors. Each vector can be written as
{% v = \langle v,e_x \rangle e_x + \langle v, e_y \rangle e_y %}
or as a column vector
{%
\begin{bmatrix}
x \\
y \\
\end{bmatrix}
%}
The first principal component is the vector
{% e_1' = \frac{1}{\sqrt{2}}(e_x + e_y) %}
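A quick numerical check of this example (the dataset below is synthetic, generated only for this sketch): for points scattered mostly along the line y = x, the first principal direction comes out close to {% \frac{1}{\sqrt{2}}(e_x + e_y) %}:

```python
import numpy as np

# Synthetic 2-D data spread mostly along the line y = x.
rng = np.random.default_rng(1)
t = rng.normal(size=200)                    # spread along the diagonal
noise = 0.1 * rng.normal(size=(200, 2))     # small spread off the diagonal
X = np.column_stack([t, t]) + noise

# First principal direction = first right singular vector of the centered data.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
print(Vt[0])   # approximately [0.707, 0.707] (up to sign)
```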