Ordinary Least Squares Regression

Overview

An ordinary least squares regression is the process of fitting a linear equation to a set of points, using the squared error as the function to be minimized. It is a widely used tool in statistics and is easy to compute because it has a closed-form analytic solution. For more information about regressions in general, please see regression.

Statement

Given a dataset {% \{(y_1,\vec{x}_1),(y_2,\vec{x}_2),\ldots,(y_n,\vec{x}_n)\} %}, a regression hypothesizes a relationship of the form
{% y_i = \alpha + \sum_{j=1}^{p} \beta_j x_{ij} + \epsilon_i %}
where {% p %} is the number of components of each {% \vec{x}_i %} and {% \epsilon_i %} is an error term,
or stated in matrix terms
{% \vec{y} = \textbf{X} \vec{\beta} + \vec{\epsilon} %}
Here, {% \textbf{X} %} is the matrix whose {% i %}th row is {% \vec{x}_i %}, augmented with a leading 1 so that the intercept {% \alpha %} is absorbed into {% \vec{\beta} %}.
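As a concrete illustration (the data values here are made up), the design matrix can be assembled by stacking the {% \vec{x}_i %} as rows, with a leading column of ones for the intercept:

```python
import numpy as np

# Hypothetical dataset of (y_i, x_i) pairs, each x_i with two features.
pairs = [(3.0, [1.0, 2.0]),
         (5.0, [2.0, 1.0]),
         (4.0, [0.0, 3.0])]

y = np.array([p[0] for p in pairs])
# Stack the x vectors as rows, prepending a 1 for the intercept term.
X = np.array([[1.0] + list(p[1]) for p in pairs])
print(X.shape)  # (3, 3): n = 3 observations, 1 + p = 3 columns
```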

The coefficients are chosen in order to minimize the squared error, defined as
{% \sum_{i=1}^{n} \left( y_i - \alpha - \sum_{j=1}^{p} \beta_j x_{ij} \right)^2 %}
That is, ordinary least squares uses the squared error as its loss function.
Stated in matrix terms, the least squares procedure minimizes
{% (\vec{y} - \textbf{X} \vec{\beta})^T(\vec{y} - \textbf{X} \vec{\beta}) %}
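Setting the gradient of this quadratic to zero yields the well-known closed-form solution {% \hat{\beta} = (\textbf{X}^T\textbf{X})^{-1}\textbf{X}^T\vec{y} %} (the normal equations). A minimal sketch in Python using NumPy, with illustrative data generated to follow roughly {% y = 2.04 + 3x %}:

```python
import numpy as np

# Illustrative data: y is approximately 2 + 3x plus small deviations.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 4.9, 8.2, 10.9, 14.1])

# Design matrix: a column of ones (intercept alpha) followed by x.
X = np.column_stack([np.ones_like(x), x])

# Solve the least squares problem; np.linalg.lstsq is preferred over
# explicitly inverting X^T X for numerical stability.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # [intercept, slope] = approximately [2.04, 3.0]
```

For this data the fit works out exactly to an intercept of 2.04 and a slope of 3.0, since the points were constructed to deviate symmetrically from that line.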

Regression Tools