Gaussian Models
Overview
Gaussian models are models in which all of the underlying distributions are Gaussian, or normal. This is usually a simplification, but it is often good enough,
and it makes the models tractable and computationally feasible. Gaussian models are often a good starting point.
Multivariate Normal Distributions
A multivariate normal distribution is a distribution over a vector of data in which each marginal distribution is normal.
For a D-dimensional vector x, the density of the multivariate normal distribution, in matrix notation, is
{% N(x|\mu, \Sigma) = \frac{1}{(2\pi)^{D/2} |\Sigma|^{1/2}} \exp \left[ -\frac{1}{2} (x-\mu)^T \Sigma^{-1} (x-\mu) \right] %}
The expression
{% (x-\mu)^T \Sigma^{-1} (x-\mu) %}
in the exponent is the square of the Mahalanobis distance between the data vector x and the mean vector {% \mu %} (see Machine Learning Distance).
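The density and the quadratic form in its exponent can be sketched in plain JavaScript for the two-dimensional case. This is a minimal illustration, not the page's library code: the function names `inverse2x2`, `mahalanobisSquared` and `normalDensity2d` and the sample parameters are assumptions made for the example.

```javascript
// Determinant and inverse of a 2x2 matrix [[a, b], [c, d]].
function inverse2x2(m) {
  const [[a, b], [c, d]] = m;
  const det = a * d - b * c;
  if (det === 0) throw new Error("covariance matrix is singular");
  return { det, inv: [[d / det, -b / det], [-c / det, a / det]] };
}

// Squared Mahalanobis distance (x - mu)^T Sigma^{-1} (x - mu) for D = 2.
function mahalanobisSquared(x, mu, sigma) {
  const { inv } = inverse2x2(sigma);
  const d0 = x[0] - mu[0];
  const d1 = x[1] - mu[1];
  return d0 * (inv[0][0] * d0 + inv[0][1] * d1) +
         d1 * (inv[1][0] * d0 + inv[1][1] * d1);
}

// Density N(x | mu, Sigma) for D = 2.
function normalDensity2d(x, mu, sigma) {
  const { det } = inverse2x2(sigma);
  const D = 2;
  const norm = Math.pow(2 * Math.PI, D / 2) * Math.sqrt(det);
  return Math.exp(-0.5 * mahalanobisSquared(x, mu, sigma)) / norm;
}

// Illustrative parameters (assumed, not taken from any dataset).
const mu = [0, 0];
const sigma = [[4, 0], [0, 1]];
console.log(mahalanobisSquared([2, 1], mu, sigma)); // 2^2/4 + 1^2/1 = 2
console.log(normalDensity2d(mu, mu, sigma));        // 1 / (4 * pi) at the mean
```

For higher dimensions the same formulas apply, but the inverse and determinant would need a general linear-algebra routine.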
Gaussian Discriminant Analysis
Gaussian Discriminant Analysis assumes that the data is generated by a set of multivariate normal distributions, one per class.
That is, the data falls into various classes (or categories), and each category is generated by its own
normal distribution. The task then becomes determining which category a sample point belongs to.
The usual way to accomplish this when the underlying distributions are normal is to use maximum likelihood: assign the point to the category
under which it has the highest probability. The class-conditional density for class c is
{% p(x|y=c,\theta) = N(x|\mu_c , \Sigma_c) %}
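With a uniform prior over the classes, this leads to the nearest centroids classifier, which assigns x to the class whose mean is closest in Mahalanobis distance:
{% y(x) = \arg\min_c (x-\mu_c)^T \Sigma_c^{-1} (x-\mu_c) %}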
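The argmin rule can be sketched as a loop over the classes that keeps the label with the smallest quadratic form. This is an illustrative sketch: the helper names `quadraticForm` and `classify`, the `precision` field (holding an already-inverted covariance matrix) and the class parameters are all hypothetical.

```javascript
// Quadratic form (x - mu)^T P (x - mu), where P is the inverse covariance.
function quadraticForm(x, mu, precision) {
  const diff = x.map((xi, i) => xi - mu[i]);
  let total = 0;
  for (let i = 0; i < diff.length; i++) {
    for (let j = 0; j < diff.length; j++) {
      total += diff[i] * precision[i][j] * diff[j];
    }
  }
  return total;
}

// Return the label of the class whose centroid is nearest to x.
function classify(x, classes) {
  let best = null;
  let bestDistance = Infinity;
  for (const c of classes) {
    const distance = quadraticForm(x, c.mu, c.precision);
    if (distance < bestDistance) {
      bestDistance = distance;
      best = c.label;
    }
  }
  return best;
}

// Hypothetical class parameters, chosen only for illustration.
const classes = [
  { label: "a", mu: [0, 0], precision: [[1, 0], [0, 1]] },
  { label: "b", mu: [4, 4], precision: [[1, 0], [0, 1]] },
];
console.log(classify([1, 1], classes)); // "a": distance 2 versus 18
console.log(classify([3, 5], classes)); // "b": distance 34 versus 2
```

Passing the inverse covariance directly keeps the sketch short; in practice it would be computed once per class from the estimated covariance matrix.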
Implementation
The following basic implementation uses the moments library and the norms library.
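// Load the moments and norms libraries.
let mt = await import('/lib/statistics/moments/v1.0.0/moments.js');
let nm = await import('/lib/linear-algebra/v1.0.0/norms.mjs');
// data1 and data2 are assumed to hold the sample rows for each class.
// Estimate the mean vector and covariance matrix of each class.
let mu1 = mt.mean(data1);
let mu2 = mt.mean(data2);
let covar1 = mt.covariance(data1);
let covar2 = mt.covariance(data2);
// Measure how far the test point is from each class mean.
let testPoint = [160,80];
let distance1 = nm.mahalanobis(testPoint,mu1,covar1);
let distance2 = nm.mahalanobis(testPoint,mu2,covar2);
// Assign the test point to the class with the smaller distance.
let category = distance1 < distance2 ? 1 : 2;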
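For readers without access to the moments library, the two estimates can be sketched in plain JavaScript. This assumes that `mt.mean` returns the column-wise mean of an array of rows and that `mt.covariance` returns a covariance matrix; the functions `meanVector` and `covarianceMatrix` below are stand-ins written for this illustration, and the choice of the sample (n - 1) divisor is also an assumption.

```javascript
// Column-wise mean of an array of rows.
function meanVector(rows) {
  const d = rows[0].length;
  const sums = new Array(d).fill(0);
  for (const row of rows) {
    for (let j = 0; j < d; j++) sums[j] += row[j];
  }
  return sums.map((s) => s / rows.length);
}

// Sample covariance matrix of an array of rows (divides by n - 1).
function covarianceMatrix(rows) {
  const n = rows.length;
  const mu = meanVector(rows);
  const d = mu.length;
  const cov = Array.from({ length: d }, () => new Array(d).fill(0));
  for (const row of rows) {
    for (let i = 0; i < d; i++) {
      for (let j = 0; j < d; j++) {
        cov[i][j] += (row[i] - mu[i]) * (row[j] - mu[j]);
      }
    }
  }
  return cov.map((r) => r.map((v) => v / (n - 1)));
}

// Made-up rows, for illustration only.
const rows = [[1, 2], [3, 6], [5, 10]];
console.log(meanVector(rows));       // [3, 6]
console.log(covarianceMatrix(rows)); // [[4, 8], [8, 16]]
```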
Example Distributions
As an example of Gaussian discriminant analysis, consider the height and weight data for men versus women. The data displays the characteristic
shape of Gaussian distributions.