Kernel Methods

Overview

The word kernel is overloaded: it is used in several ways across mathematics and machine learning, and the definitions are not always compatible with one another. Which meaning applies depends on the context.

Kernel Definition

A simple, intuitive definition of a kernel is a function
{% k : E \times E \rightarrow \mathbb{R} %}
that measures the similarity between its two arguments: a higher value of {% k(x,y) %} indicates greater similarity. This is the opposite convention to a metric, where a smaller value indicates that two elements are closer together.
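
As a hedged illustration of the "similarity" reading, the sketch below uses the Gaussian (RBF) kernel {% k(x,y) = \exp(-\gamma \|x-y\|^2) %}. The choice of this particular kernel, the function name `rbf_kernel`, and the parameter `gamma` are assumptions made for the example and do not come from the text above.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance).

    Illustrative choice of kernel; gamma is an assumed width parameter.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

# Three sample points: `near` is close to `origin`, `far` is not.
origin = [0.0, 0.0]
near = [0.1, 0.0]
far = [3.0, 4.0]

sim_self = rbf_kernel(origin, origin)  # identical inputs give the maximum value
sim_near = rbf_kernel(origin, near)
sim_far = rbf_kernel(origin, far)

# Closer points receive a higher kernel value, the reverse of a distance.
print(sim_self, sim_near, sim_far)
```

Here the kernel value decreases as the Euclidean distance grows, which is the sense in which a kernel behaves as the opposite of a metric.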

For a list of kernel functions, see example kernels.
A more rigorous description follows.

Given a set {% X %}, a function {% k : X \times X \rightarrow \mathbb{R} %} is called a kernel if there exists a Hilbert space {% H %} and a map
{% \phi:X \rightarrow H %}
such that, for all {% x, x' \in X %},
{% k(x,x') = \langle \phi(x),\phi(x') \rangle_H %}
The map {% \phi %} is called a feature map (see feature extraction). It maps each point of the original dataset into a new space, which may have additional features. (The coordinates of a point in the original space are typically referred to as attributes.)
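
To make the feature-map definition concrete, here is a minimal sketch that checks it numerically for one kernel. It assumes the homogeneous degree-2 polynomial kernel {% k(x,x') = (x \cdot x')^2 %} on {% \mathbb{R}^2 %}, for which one valid feature map is {% \phi(x) = (x_1^2, \sqrt{2}\,x_1 x_2, x_2^2) %}. The kernel choice and the names `phi`, `inner`, and `poly_kernel` are illustrative and not taken from the text above.

```python
import math

def inner(u, v):
    """Standard Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def poly_kernel(x, y):
    """Homogeneous degree-2 polynomial kernel, computed in the original space."""
    return inner(x, y) ** 2

def phi(x):
    """Explicit feature map from R^2 to R^3 for the kernel above."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

x = (1.0, 2.0)
y = (3.0, -1.0)

kernel_value = poly_kernel(x, y)            # evaluated on the two attributes
feature_value = inner(phi(x), phi(y))       # evaluated on the three features

# The two quantities agree up to floating-point rounding.
print(kernel_value, feature_value)
```

The two attributes of each input become three features in the new space, and the kernel returns the inner product there without ever constructing {% \phi(x) %} explicitly.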