Kernel Regression

Overview


The kernel method is a way to take a linear regression and turn it into a method that can capture nonlinear relationships among the selected features, by first mapping the features into a new feature space.

The technique follows the same pattern as other kernel methods: the regression is first recast so that the data only enter through inner products, and those inner products are then evaluated with a kernel function of the modeller's choosing.
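
As a small illustration of such a mapping, take the standard quadratic feature map on {% \mathbb{R}^2 %},
{% \phi(\vec{x}) = (x_1^2, \; \sqrt{2}\, x_1 x_2, \; x_2^2) %}
Its inner products collapse to a simple function of the original vectors,
{% \langle \phi(\vec{x}), \phi(\vec{z}) \rangle = \langle \vec{x}, \vec{z} \rangle ^2 %}
so the mapped-space inner product can be evaluated without ever forming {% \phi(\vec{x}) %} explicitly.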

Ridge Regression


The {% m %} datapoints that the regression is built on are assumed to be vectors
{% \vec{x}_i \in \mathbb{R}^n %}
stacked as the rows of the data matrix {% X %}, with the corresponding targets collected in the vector {% \vec{y} %}.
The feature extraction map is denoted {% \phi %}.

The goal is to recast the regression formula so that the sample values only show up inside inner products, which can then be evaluated through a kernel function
{% \kappa(\vec{x_i}, \vec{x_j}) = \; \langle \phi(\vec{x_i}), \phi(\vec{x_j}) \rangle %}
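Any symmetric, positive semi-definite function can play the role of {% \kappa %}; two standard examples are the polynomial kernel
{% \kappa(\vec{x}_i, \vec{x}_j) = \left( \langle \vec{x}_i, \vec{x}_j \rangle + c \right)^d %}
and the Gaussian kernel
{% \kappa(\vec{x}_i, \vec{x}_j) = \exp \left( - \| \vec{x}_i - \vec{x}_j \| ^2 / 2\sigma^2 \right) %}
where {% c %}, {% d %} and {% \sigma %} are free parameters.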
Ridge regression seeks to minimize the following expression, which is the ordinary least-squares objective plus a squared penalty on the weight vector
{% \min_{\vec{w}} \; \left( \lambda \| \vec{w} \|^2 + \| \vec{y} - X\vec{w} \|^2 \right) %}
Differentiating the cost function with respect to {% \vec{w} %} and setting the result to zero gives
{% X^TX\vec{w} + \lambda \vec{w} = (X^TX + \lambda I_n) \vec{w} = X^T\vec{y} %}
which implies that
{% \vec{w} = \lambda ^{-1} X^T(\vec{y} - X\vec{w}) = X^T\alpha %}
where we define
{% \alpha = \lambda ^{-1} (\vec{y} - X\vec{w}) %}
Substituting {% \vec{w} = X^T\alpha %} back into this definition gives
{% \lambda \alpha = \vec{y} - XX^T \alpha %}
{% (XX^T + \lambda I_m) \alpha = \vec{y} %}
{% \alpha = (XX^T + \lambda I_m)^{-1} \vec{y} %}
where
{% (XX^T) _{ij} = \; \langle \vec{x}_i , \vec{x}_j \rangle %}
which we will refer to as the Gram matrix, and represent as {% G %}. Note that {% XX^T %} is an {% m \times m %} matrix, one row and column per datapoint.
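
As a concrete sketch of this step (plain JavaScript with no dependencies, and not the library loaded at the bottom of this page), {% \alpha %} can be computed by building {% G = XX^T %} and solving the regularized linear system directly. The helper names dot, gram, solve and fitDual are made up for this sketch.

const dot = (a, b) => a.reduce((s, ai, i) => s + ai * b[i], 0);

// Gram matrix G_ij = <x_i, x_j>
const gram = (X) => X.map((xi) => X.map((xj) => dot(xi, xj)));

// Solve A z = b by Gaussian elimination with partial pivoting.
function solve(A, b) {
  const m = b.length;
  const M = A.map((row, i) => [...row, b[i]]);
  for (let col = 0; col < m; col++) {
    // pick the largest remaining pivot in this column
    let piv = col;
    for (let r = col + 1; r < m; r++) {
      if (Math.abs(M[r][col]) > Math.abs(M[piv][col])) piv = r;
    }
    [M[col], M[piv]] = [M[piv], M[col]];
    // eliminate the entries below the pivot
    for (let r = col + 1; r < m; r++) {
      const f = M[r][col] / M[col][col];
      for (let c = col; c <= m; c++) M[r][c] -= f * M[col][c];
    }
  }
  // back substitution
  const z = new Array(m).fill(0);
  for (let r = m - 1; r >= 0; r--) {
    let s = M[r][m];
    for (let c = r + 1; c < m; c++) s -= M[r][c] * z[c];
    z[r] = s / M[r][r];
  }
  return z;
}

// Dual coefficients: alpha = (G + lambda I_m)^{-1} y
function fitDual(X, y, lambda) {
  const G = gram(X);
  const A = G.map((row, i) => row.map((v, j) => v + (i === j ? lambda : 0)));
  return solve(A, y);
}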

The prediction for a new datapoint would be
{% g(\vec{x}) = \langle \vec{w} , \vec{x} \rangle = \sum_{i=1} ^m \alpha _i \langle \vec{x}_i , \vec{x} \rangle = \vec{y}^T(G + \lambda I_m) ^{-1} \vec{k} %}
where
{% \vec{k}_i = \langle \vec{x}_i , \vec{x} \rangle %}
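
Continuing the same sketch, the prediction is just a weighted sum of inner products with the training points; the tiny numeric check at the end uses made-up data on the line {% y = 2x %}.

// Prediction: g(x) = sum_i alpha_i <x_i, x>
function predict(X, alpha, x) {
  return X.reduce((s, xi, i) => s + alpha[i] * dot(xi, x), 0);
}

// Tiny check: points on y = 2x should be recovered almost exactly.
const Xtrain = [[1], [2], [3]];
const ytrain = [2, 4, 6];
const alpha = fitDual(Xtrain, ytrain, 0.01);
console.log(predict(Xtrain, alpha, [2.5])); // ≈ 5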

Kernel Ridge Regression


If the regression is instead run on the mapped features, so that the loss term is
{% | y - \langle \vec{w} , \phi(\vec{x}) \rangle |^2 %}
then the Gram matrix becomes
{% G _{ij} = \; \langle \phi(\vec{x}_i) , \phi(\vec{x}_j) \rangle %}
and
{% \vec{k}_i = \langle \phi(\vec{x}_i) , \phi( \vec{x} ) \rangle %}
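
In code the change is equally small: every inner product simply goes through {% \kappa %}. The sketch below reuses the solve helper from above and picks a Gaussian kernel purely as an illustration; sigma is an assumed bandwidth parameter, not something fixed by the derivation.

// Gaussian (RBF) kernel with an assumed bandwidth sigma.
const rbf = (sigma) => (a, b) => {
  const d2 = a.reduce((s, ai, i) => s + (ai - b[i]) ** 2, 0);
  return Math.exp(-d2 / (2 * sigma * sigma));
};

// Same dual fit and prediction as before, with <.,.> replaced by kappa.
function fitKernel(X, y, lambda, kappa) {
  const G = X.map((xi) => X.map((xj) => kappa(xi, xj)));
  const A = G.map((row, i) => row.map((v, j) => v + (i === j ? lambda : 0)));
  return solve(A, y);
}

function predictKernel(X, alpha, x, kappa) {
  return X.reduce((s, xi, i) => s + alpha[i] * kappa(xi, x), 0);
}

// e.g. const a = fitKernel(Xtrain, ytrain, 0.01, rbf(1.0));
//      predictKernel(Xtrain, a, [2.5], rbf(1.0));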

let rg = await import('/lib/machine-learning/kernel/v1.0.0/regression.js');
					
Try it!