Kernel Density Estimation

Overview


Kernel density estimators are a non-parametric way to estimate a function from a sample set. Conceptually, the method is an extension of the nearest-neighbor algorithm: instead of picking the closest sample points, it uses a kernel to compute a weighted average over the whole sample dataset.

That is, assume we have a set of data points
{% D = \{ (x_1,y_1), ..., (x_n,y_n) \} %}
Then we create a function to approximate the function that generated the points. Following the logic of the nearest-neighbor algorithm, we compute a weighted average of the y values in the dataset, where each weight is given by the kernel function evaluated at the query point and the sample point.
{% f(x) = \frac{\sum_{i=1}^n k(x,x_i)\,y_i}{\sum_{j=1}^n k(x,x_j)} %}
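The weighted average above can be sketched directly in plain JavaScript. This is an illustrative implementation, not the library code used later; the names `makeEstimator` and `gaussian` are made up here for the example, and the Gaussian kernel is just one convenient choice of k.

```javascript
// Sketch of the estimator f(x): a kernel-weighted average of the y values.
// `points` is an array of [x, y] pairs; `kernel` is any function k(x, xi) >= 0.
function makeEstimator(points, kernel) {
  return function f(x) {
    let num = 0, den = 0;
    for (const [xi, yi] of points) {
      const w = kernel(x, xi);   // weight of sample i for query point x
      num += w * yi;             // numerator:   sum of k(x, x_i) * y_i
      den += w;                  // denominator: sum of k(x, x_j)
    }
    return den === 0 ? 0 : num / den;
  };
}

// A simple Gaussian kernel with bandwidth h (assumed for illustration).
const gaussian = h => (x, xi) => Math.exp(-((x - xi) ** 2) / (2 * h * h));

const est = makeEstimator([[0, 0], [1, 1], [2, 4]], gaussian(0.5));
est(1); // a weighted average pulled toward the y values of samples near x = 1
```

Note that points far from the query contribute exponentially small weights, so the estimate at x = 1 is dominated by the sample (1, 1).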


Figure: kernel density estimation, example 1



let dn = await import('/lib/machine-learning/kernel/v1.0.0/density.mjs');
let kl = await import('/lib/machine-learning/kernel/v1.0.0/kernel.mjs');

// 100 sample points with x from 1 to 10 and a noisy sine as y
let data = $from(1,10,100).map(p=>{
  return {
    x:p,
    y:Math.sin(p)+0.1*Math.random()
  }
});

// Epanechnikov kernel with bandwidth 0.5
let kernel = kl.epanechnikov(0.5);
let estimator = dn.estimator(data.map(p=>[p.x,p.y]), kernel);

// evaluate the estimator at x = 5
let test = estimator(5);
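The example above calls `kl.epanechnikov(0.5)` without showing what that kernel looks like. As a rough sketch (the actual library implementation may differ), the Epanechnikov kernel with bandwidth h gives weight 0.75(1 − u²) for u = (x − xᵢ)/h when |u| ≤ 1, and zero outside that window:

```javascript
// Sketch of an epanechnikov(h) factory (assumed shape of the library API).
// Returns k(x, xi): a parabolic bump of width 2h centered at xi.
function epanechnikov(h) {
  return function (x, xi) {
    const u = (x - xi) / h;
    return Math.abs(u) <= 1 ? 0.75 * (1 - u * u) : 0;
  };
}

const k = epanechnikov(0.5);
k(5, 5);   // 0.75  (full weight at the center)
k(5, 5.6); // 0     (|u| = 1.2, outside one bandwidth)
```

Unlike a Gaussian kernel, the Epanechnikov kernel has compact support, so samples more than one bandwidth away from the query point contribute nothing to the weighted average.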


The following shows the estimate produced for the noisy sine curve example above.

