Ordinary Least Squares Regression

Overview


An ordinary least squares regression is the process of fitting a linear equation to a set of points, using the squared error as the function to be minimized. It is a widely used tool in statistics and is easy to compute because it has a closed-form analytic solution. For more information about regressions in general, please see regression.
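As a quick illustration of how easily such a fit can be computed, the following sketch (using NumPy and made-up data scattered around a known line) fits an intercept and slope with `np.linalg.lstsq`:

```python
import numpy as np

# Hypothetical data: points scattered around the line y = 2x + 1.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=x.size)

# Design matrix: a column of ones for the intercept, then the x values.
X = np.column_stack([np.ones_like(x), x])

# np.linalg.lstsq minimizes the squared error directly.
coeffs, residuals, rank, _ = np.linalg.lstsq(X, y, rcond=None)
intercept, slope = coeffs
print(intercept, slope)  # close to 1 and 2
```

The fitted coefficients recover the line's intercept and slope up to the noise in the data.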

Statement


Given a dataset {% \{(y_1,\vec{x}_1),(y_2,\vec{x}_2),\ldots,(y_n,\vec{x}_n)\} %} a regression hypothesizes a relationship of the form
{% y_i = \alpha + \sum_{j=1}^k \beta_j x_{ij} + \epsilon_i %}
where {% k %} is the number of predictors,
or, stated in matrix terms,
{% \vec{y} = X \vec{\beta} + \vec{\epsilon} %}
Here, {% X %} is the matrix whose rows are the {% \vec{x} %} values of each pair (with a leading column of ones so that the intercept {% \alpha %} can be absorbed into {% \vec{\beta} %}).

The coefficients are chosen to minimize the squared error, defined as
{% \sum_{i=1}^n (y_i - \alpha - \sum_{j=1}^k \beta_j x_{ij})^2 %}
That is, ordinary least squares chooses the squared error as its loss function.
Stated in matrix terms, the least squares procedure minimizes
{% (\vec{y} - X\vec{\beta})^T(\vec{y} - X\vec{\beta}) %}
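The minimizer of this quantity has the well-known closed form {% \hat{\beta} = (X^T X)^{-1} X^T \vec{y} %}, the normal equations. A minimal sketch, assuming simulated data with known coefficients (the variable names and data here are illustrative, not from the original):

```python
import numpy as np

# Hypothetical data generated from known coefficients.
rng = np.random.default_rng(1)
n, k = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])  # ones column = intercept
true_beta = np.array([0.5, 1.0, -2.0, 3.0])
y = X @ true_beta + rng.normal(scale=0.1, size=n)

# Normal equations: beta_hat = (X^T X)^{-1} X^T y.
# Solving the linear system is preferred over explicitly inverting X^T X.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)  # close to true_beta
```

With a well-conditioned design matrix, `np.linalg.solve` on the normal equations and `np.linalg.lstsq` give the same answer; the latter is numerically safer when {% X^T X %} is nearly singular.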

Topics


  • Mathematical Details
    • Residuals
    • Derivation of the Optimal Coefficients
    • Classical Assumptions
    • Large Sample Assumptions
    • Gauss-Markov Theorem (BLUE)
  • Inference
    • Normal Models - Statistics and Hypothesis Testing
    • Bootstrap
  • Feature Extraction and Kernel Regression

Additional Topics


  • Geometric Interpretation
  • Weighted Regression
  • Regularization
    • Ridge Regression and Regularization
    • L1 Regularization
  • Regression in Time Series
  • Model Selection
  • Regression Statistics through Re-Sampling

Regression Tools


  • Regression Library
  • Regression App
  • R Language
  • TensorFlow - shows how to run a linear regression using TensorFlow

User Tracks


The following examples demonstrate using Ordinary Least Squares Regression:
Linear Regression
Linear regression is a standard workhorse for statisticians and data scientists.
start