Ordinary Least Squares Regression

Overview

An ordinary least squares regression is the process of fitting a linear equation to a set of points, using the squared error as the function to be minimized. It is a widely used tool in statistics and is easy to compute because it has a closed-form analytic solution. For more information about regressions in general, please see regression.

Statement

Given a dataset {% \{(y_1,\vec{x}_1),(y_2,\vec{x}_2),\ldots,(y_n,\vec{x}_n)\} %}, a regression hypothesizes a relationship of the form
{% y_i = \alpha + \sum_{j=1}^{p} \beta_j x_{ij} + \epsilon_i %}
where {% p %} is the number of components of each {% \vec{x}_i %} and {% \epsilon_i %} is an error term,
or stated in matrix terms
{% \vec{y} = \textbf{X} \vec{\beta} + \vec{\epsilon} %}
Here, {% \textbf{X} %} is the matrix whose {% i %}th row is {% \vec{x}_i %}, augmented with a leading 1 so that the intercept {% \alpha %} is absorbed into {% \vec{\beta} %}.
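As a concrete illustration (the data values here are made up), the design matrix can be assembled by stacking the {% \vec{x}_i %} as rows, with a leading column of ones for the intercept:

```python
import numpy as np

# Hypothetical dataset of (y_i, x_i) pairs, each x_i with two features.
pairs = [(3.0, [1.0, 2.0]),
         (5.0, [2.0, 1.0]),
         (4.0, [0.0, 3.0])]

y = np.array([p[0] for p in pairs])
# Stack the x vectors as rows, prepending a 1 for the intercept term.
X = np.array([[1.0] + list(p[1]) for p in pairs])
print(X.shape)  # (3, 3): n = 3 observations, 1 + p = 3 columns
```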

The coefficients are chosen in order to minimize the squared error, defined as
{% \sum_{i=1}^{n} \left( y_i - \alpha - \sum_{j=1}^{p} \beta_j x_{ij} \right)^2 %}
That is, ordinary least squares uses the squared error as its loss function.
Stated in matrix terms, the least squares procedure minimizes
{% (\vec{y} - \textbf{X} \vec{\beta})^T(\vec{y} - \textbf{X} \vec{\beta}) %}
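Setting the gradient of this quadratic to zero yields the well-known closed-form solution {% \hat{\beta} = (\textbf{X}^T\textbf{X})^{-1}\textbf{X}^T\vec{y} %} (the normal equations). A minimal sketch in Python using NumPy, with illustrative data generated to follow roughly {% y = 2.04 + 3x %}:

```python
import numpy as np

# Illustrative data: y is approximately 2 + 3x plus small deviations.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 4.9, 8.2, 10.9, 14.1])

# Design matrix: a column of ones (intercept alpha) followed by x.
X = np.column_stack([np.ones_like(x), x])

# Solve the least squares problem; np.linalg.lstsq is preferred over
# explicitly inverting X^T X for numerical stability.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # [intercept, slope] = approximately [2.04, 3.0]
```

For this data the fit works out exactly to an intercept of 2.04 and a slope of 3.0, since the points were constructed to deviate symmetrically from that line.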

Regression Tools