Gaussian Models
Gaussian models are models in which all of the underlying distributions are Gaussian (normal). This is usually a simplification, but it is often good enough
and makes the models tractable and computationally feasible. Gaussian models are often a good starting point.
Multivariate Normal Distributions
A multivariate normal distribution is a distribution over a vector of data in which every linear combination of the components, and therefore each marginal distribution, is normal.
The formula for the multivariate normal density of a D-dimensional vector x, using matrix notation, is
{% N(x|\mu, \Sigma) = \frac{1}{(2\pi)^{D/2} |\Sigma|^{1/2}} \exp \left[ -\frac{1}{2} (x-\mu)^T \Sigma^{-1} (x-\mu) \right] %}
The quadratic form in the exponent,
{% (x-\mu)^T \Sigma^{-1} (x-\mu) %}
is the square of the
Mahalanobis distance
between the data vector x and the mean vector {% \mu %}
(see Machine Learning Distance).
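To make the density and the Mahalanobis distance concrete, here is a minimal sketch for the two-dimensional case. The helper names (`det2`, `inv2`, `mahalanobisSquared`, `normalPdf2`) are hypothetical and written only for illustration; they are not part of the libraries referenced on this page, and the 2x2 closed-form inverse is an assumption that keeps the example short.

```javascript
// Hypothetical helpers for the 2-D case (D = 2); names are illustrative only.

// Determinant of a 2x2 matrix [[a, b], [c, d]].
function det2(m) {
  return m[0][0] * m[1][1] - m[0][1] * m[1][0];
}

// Inverse of a 2x2 matrix (assumes the determinant is nonzero).
function inv2(m) {
  const d = det2(m);
  return [
    [m[1][1] / d, -m[0][1] / d],
    [-m[1][0] / d, m[0][0] / d],
  ];
}

// Squared Mahalanobis distance (x - mu)^T Sigma^{-1} (x - mu).
function mahalanobisSquared(x, mu, sigma) {
  const s = inv2(sigma);
  const d0 = x[0] - mu[0];
  const d1 = x[1] - mu[1];
  return d0 * (s[0][0] * d0 + s[0][1] * d1) + d1 * (s[1][0] * d0 + s[1][1] * d1);
}

// Multivariate normal density for D = 2, where (2*pi)^{D/2} = 2*pi.
function normalPdf2(x, mu, sigma) {
  const normalizer = 2 * Math.PI * Math.sqrt(det2(sigma));
  return Math.exp(-0.5 * mahalanobisSquared(x, mu, sigma)) / normalizer;
}

// With an identity covariance the squared distance is the squared
// Euclidean distance: 3^2 + 4^2 = 25.
console.log(mahalanobisSquared([3, 4], [0, 0], [[1, 0], [0, 1]]));
```

With a non-identity covariance, each direction is rescaled by its variance, so a point that is far away along a high-variance axis can still have a small Mahalanobis distance.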
Gaussian Discriminant Analysis
Gaussian discriminant analysis assumes that the data are generated by a set of multivariate normal distributions, one for each class (or category) of data.
The task is then to determine which category a sample point belongs to.
When the underlying distributions are normal, the usual way to do this is maximum likelihood: choose the category whose
class-conditional density assigns the sample point the highest value.
{% p(x|y=c,\theta) = N(x|\mu_c , \Sigma_c) %}
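Taking the logarithm of this density and assuming a covariance matrix {% \Sigma %} shared by all classes, the terms that do not depend on c drop out, which leads to the nearest centroid classifier,
{% y(x) = \arg\min_c (x-\mu_c)^T \Sigma^{-1} (x-\mu_c) %}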
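The classification rule above can be sketched in a few lines. This is a minimal illustration, not the implementation used by this page's libraries: the function names `quadForm` and `nearestCentroid` are hypothetical, the inverse covariance matrix is assumed to be precomputed by the caller, and the numbers in the usage example are made up.

```javascript
// Quadratic form (x - mu)^T SigmaInv (x - mu) for vectors of any dimension.
// sigmaInv is assumed to be the already-inverted shared covariance matrix.
function quadForm(x, mu, sigmaInv) {
  const diff = x.map((xi, i) => xi - mu[i]);
  let total = 0;
  for (let i = 0; i < diff.length; i++) {
    for (let j = 0; j < diff.length; j++) {
      total += diff[i] * sigmaInv[i][j] * diff[j];
    }
  }
  return total;
}

// Return the index c of the centroid mu_c that minimizes the quadratic form.
function nearestCentroid(x, centroids, sigmaInv) {
  let best = -1;
  let bestDistance = Infinity;
  centroids.forEach((mu, c) => {
    const d = quadForm(x, mu, sigmaInv);
    if (d < bestDistance) {
      bestDistance = d;
      best = c;
    }
  });
  return best;
}

// Made-up example: the second coordinate has variance 100 (inverse 0.01),
// so differences along it count for little.
const centroids = [[0, 0], [2, 10]];
const sigmaInv = [[1, 0], [0, 0.01]];
console.log(nearestCentroid([0.5, 9], centroids, sigmaInv));
```

In this example the point is closer to the second centroid in plain Euclidean terms, but it is assigned to the first (index 0) because the large variance of the second coordinate discounts the difference along that axis.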
The following basic implementation uses the moments library and the norms library.
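// Load the moments and norms libraries.
let mt = await import('/lib/statistics/moments/v1.0.0/moments.js');
let nm = await import('/lib/linear-algebra/v1.0.0/norms.mjs');
// data1 and data2 hold the samples of each class (assumed to be loaded already).
// Estimate the mean vector and covariance matrix of each class.
let mu1 = mt.mean(data1);
let mu2 = mt.mean(data2);
let covar1 = mt.covariance(data1);
let covar2 = mt.covariance(data2);
// Mahalanobis distance from the test point to each class centroid.
let testPoint = [160, 80];
let distance1 = nm.mahalanobis(testPoint, mu1, covar1);
let distance2 = nm.mahalanobis(testPoint, mu2, covar2);
// Assign the test point to the class with the smaller distance.
let predictedClass = distance1 < distance2 ? 1 : 2;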
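The internals of the moments library are not shown on this page. As a rough sketch of what mean and covariance estimators of this kind typically compute, the following self-contained version assumes each dataset is an array of equal-length row vectors and uses the unbiased (n - 1) normalization; the function names `sampleMean` and `sampleCovariance` are hypothetical, and the library's actual conventions may differ.

```javascript
// Component-wise mean of an array of row vectors.
function sampleMean(rows) {
  const n = rows.length;
  const d = rows[0].length;
  const mu = new Array(d).fill(0);
  for (const row of rows) {
    for (let j = 0; j < d; j++) {
      mu[j] += row[j] / n;
    }
  }
  return mu;
}

// Sample covariance matrix with the unbiased (n - 1) normalization.
// Requires at least two rows.
function sampleCovariance(rows) {
  const n = rows.length;
  const d = rows[0].length;
  const mu = sampleMean(rows);
  const cov = Array.from({ length: d }, () => new Array(d).fill(0));
  for (const row of rows) {
    for (let i = 0; i < d; i++) {
      for (let j = 0; j < d; j++) {
        cov[i][j] += ((row[i] - mu[i]) * (row[j] - mu[j])) / (n - 1);
      }
    }
  }
  return cov;
}

// Made-up toy data: the second column is exactly twice the first.
const rows = [[1, 2], [3, 6], [5, 10]];
console.log(sampleMean(rows));
console.log(sampleCovariance(rows));
```

Because the second column of the toy data is an exact multiple of the first, the resulting covariance matrix is singular and could not be inverted for a Mahalanobis distance; real data with some scatter in every direction avoids this.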
Example Distributions
As an example of Gaussian discriminant analysis, consider the height and weight data for men versus women. The data displays the characteristic
elliptical clusters of Gaussian-distributed data.