OLS Regression and L1 Regularization

Overview


The canonical example of a sparse model is OLS regression with L1 regularization, commonly known as the lasso.

When a standard OLS regression is run against a set of features, {% X_1, X_2,...,X_m %}, the regression will generically produce a nonzero coefficient for every feature included in the regression, even for features with no real relationship to the response.
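As a minimal sketch of this behavior (using NumPy and synthetic data, where only the first two of five features actually drive the response), an unregularized least-squares fit still assigns some nonzero weight to every feature:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: only the first 2 of 5 features actually drive y.
n, m = 100, 5
X = rng.normal(size=(n, m))
true_w = np.array([3.0, -2.0, 0.0, 0.0, 0.0])
y = X @ true_w + rng.normal(scale=0.1, size=n)

# Ordinary least squares via least-squares solve (with an intercept column).
X1 = np.column_stack([np.ones(n), X])
w_ols, *_ = np.linalg.lstsq(X1, y, rcond=None)

# With continuous noise, every fitted coefficient is (almost surely) nonzero,
# including those for the three irrelevant features.
print(np.count_nonzero(np.abs(w_ols[1:]) > 0))
```

The irrelevant features pick up small but nonzero coefficients from the noise; nothing in the OLS objective pushes them exactly to zero.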

L1 Regularization


The L1 regularized OLS regression is obtained by minimizing the loss function
{% f(\vec{w}) = \sum_{i=1}^{n} (y_i - (w_0 + \textbf{w}^T \textbf{x}_i))^2 + \lambda || \textbf{w} ||_1 %}
where
{% || \textbf{w} ||_1 = \sum_{i=1}^{m} |w_i| %}
The L1 regularized regression is known to produce sparse solutions, in which some of the features in the regression are assigned a coefficient of exactly zero. The number of features assigned a zero coefficient increases as {% \lambda %} is increased.
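This effect can be sketched with scikit-learn's `Lasso` estimator (which calls the regularization strength `alpha` rather than {% \lambda %}), again on synthetic data where only the first two of ten features matter:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)

# Synthetic data: only the first 2 of 10 features drive y.
n, m = 200, 10
X = rng.normal(size=(n, m))
true_w = np.zeros(m)
true_w[:2] = [3.0, -2.0]
y = X @ true_w + rng.normal(scale=0.1, size=n)

# As the regularization strength grows, more coefficients are
# driven exactly to zero.
zero_counts = []
for alpha in [0.01, 0.1, 1.0]:
    model = Lasso(alpha=alpha).fit(X, y)
    zero_counts.append(int(np.sum(model.coef_ == 0.0)))
    print(alpha, zero_counts[-1])
```

At small `alpha` the fit resembles OLS; at larger values the eight irrelevant features are zeroed out while the two informative ones survive (shrunk toward zero).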