Principal Components - Least Squares

Overview


Definition


Decomposition

The least squares derivation of principal components starts from the standard inner product vector decomposition.

Let {% e_1,e_2,...,e_n %} be an orthonormal basis for a vector space V. Then any vector {% v \in V %} can be written as
{% v = \langle v,e_1 \rangle e_1 + ... + \langle v, e_n \rangle e_n %}
Typically, this decomposition is written as a column vector of the coefficients {% \langle v, e_i \rangle %}.
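To make the decomposition concrete, the following sketch (Python with NumPy, an assumed choice of tooling rather than anything specified here) computes the coefficients {% \langle v, e_i \rangle %} against a small orthonormal basis and verifies that the expansion reproduces v exactly.

import numpy as np

# An orthonormal basis for R^3 (rows of E); E @ E.T is the identity.
E = np.array([
    [1/np.sqrt(2),  1/np.sqrt(2), 0.0],
    [1/np.sqrt(2), -1/np.sqrt(2), 0.0],
    [0.0,           0.0,          1.0],
])

v = np.array([2.0, -1.0, 3.0])

# Coefficients <v, e_i> for each basis vector.
coeffs = E @ v

# Reconstruct v as sum_i <v, e_i> e_i.
v_reconstructed = coeffs @ E

print(coeffs)                           # column-vector form of v in this basis
print(np.allclose(v, v_reconstructed))  # True: the decomposition is exact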



Compression

The information in the vector can be "compressed" by choosing an orthonormal set of vectors {% e_1,e_2,...,e_k %} in V with k < n, and then expressing the vector as
{% v' = \langle v,e_1 \rangle e_1 + ... + \langle v, e_k \rangle e_k %}
{% v' \approx v %}
Of course, the new set {% e_1,e_2,...,e_k %} does not form a basis for V, so the above decomposition can only be approximate.
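Continuing the same assumed NumPy sketch, keeping only the first k coefficients gives the compressed vector v'; the reconstruction error printed at the end is exactly the quantity the next section sets out to minimize.

import numpy as np

# Same orthonormal basis as before; keep only the first k rows.
E = np.array([
    [1/np.sqrt(2),  1/np.sqrt(2), 0.0],
    [1/np.sqrt(2), -1/np.sqrt(2), 0.0],
    [0.0,           0.0,          1.0],
])
k = 2
E_k = E[:k]               # orthonormal set, but no longer a basis of R^3

v = np.array([2.0, -1.0, 3.0])

# Compressed representation: k coefficients stored instead of n.
coeffs_k = E_k @ v
v_prime = coeffs_k @ E_k  # v' = sum_{i <= k} <v, e_i> e_i

print(coeffs_k)                     # the compressed coefficients
print(np.linalg.norm(v - v_prime))  # reconstruction error |v' - v|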



Optimization

In order to optimize the compression over a collection of vectors {% v_1,v_2,...,v_m %}, we try to find the set {% e_1,e_2,...,e_k %} that minimizes the total squared reconstruction error
{% \sum_i |v_i'-v_i|^2 %}
where {% v_i' %} is the compressed version of {% v_i %}. It can be shown that this problem can be solved iteratively: solve the problem for k=1, then solve for k=2 by keeping the already calculated first vector and solving only for the second, and proceed until you have k vectors.
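One standard way to realize this one-direction-at-a-time procedure (the text does not prescribe an algorithm, so treat the details below as an assumption) is power iteration with deflation on the mean-centered data: find the single best direction, subtract its contribution from every vector, and repeat.

import numpy as np

def principal_components(X, k, n_iter=200):
    """Greedy least-squares principal directions of the rows of X.

    Finds one unit vector at a time via power iteration on X^T X,
    then deflates the data and repeats for the next direction.
    """
    X = X - X.mean(axis=0)              # center the data
    components = []
    for _ in range(k):
        e = np.random.default_rng(0).normal(size=X.shape[1])
        e /= np.linalg.norm(e)
        for _ in range(n_iter):         # power iteration on X^T X
            e = X.T @ (X @ e)
            e /= np.linalg.norm(e)
        components.append(e)
        # Deflate: remove this direction's contribution before the next pass.
        X = X - np.outer(X @ e, e)
    return np.array(components)

# Reconstruction error drops as more components are kept.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5)) @ rng.normal(size=(5, 5))
E_k = principal_components(X, k=2)
Xc = X - X.mean(axis=0)
X_approx = (Xc @ E_k.T) @ E_k
print(np.sum((Xc - X_approx) ** 2))     # sum_i |v_i' - v_i|^2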

As Approximation


Principal components can be understood in the context of approximation in normed spaces. In particular, the approximating subspace K is the space spanned by the k orthonormal vectors {% e_1,e_2,...,e_k %}.
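Concretely, for the Euclidean norm the best approximation of v within K is its orthogonal projection onto K, which is exactly the compressed vector v' defined above:
{% v' = \operatorname{proj}_K(v) = \sum_{i=1}^{k} \langle v, e_i \rangle e_i, \qquad \|v - v'\| = \min_{w \in K} \|v - w\| %}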

Example


The graph shows a dataset of points which are 2-dimensional vectors. Each vector can be written as
{% v = \langle v,e_x \rangle e_x + \langle v, e_y \rangle e_y %}
or as a column vector
{% \begin{bmatrix} x \\ y \\ \end{bmatrix} %}


The first principal component is the vector
{% e_1' = \frac{1}{\sqrt{2}}(e_x + e_y) %}
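As a numerical check, the sketch below builds a hypothetical dataset scattered along the line y = x (standing in for the graph, which is not reproduced here) and recovers the stated first principal component up to sign.

import numpy as np

# Hypothetical 2-D dataset scattered along the line y = x.
rng = np.random.default_rng(42)
t = rng.normal(size=200)
noise = 0.1 * rng.normal(size=(200, 2))
points = np.column_stack([t, t]) + noise   # rows are the vectors [x, y]

# First principal component: top right singular vector of the centered data.
centered = points - points.mean(axis=0)
_, _, Vt = np.linalg.svd(centered, full_matrices=False)
e1 = Vt[0]

print(e1)              # approximately [0.707, 0.707] = (e_x + e_y)/sqrt(2), up to sign
print(1 / np.sqrt(2))  # 0.7071...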

