Bayesianism

Overview


Bayesianism is an approach to statistics and machine learning that uses Bayes' rule to estimate unknown model parameters.

In standard (frequentist) statistics, model parameters are treated as fixed but unknown quantities. In a Bayesian model, the parameters ({% \vec{\theta} %}) are themselves random variables.

Given a sampled dataset {% D %} and a discrete set of candidate parameter values {% \vec{\theta}_1, \ldots, \vec{\theta}_n %}, Bayes' rule implies
{% \mathbb{P}(\vec{\theta}|D) = \frac{\mathbb{P} (D|\vec{\theta}) \mathbb{P}(\vec{\theta})}{\sum_{i=1}^n \mathbb{P}(D|\vec{\theta}_i) \mathbb{P}(\vec{\theta}_i)} %}
That is, if the probability of the sampled dataset {% D %} given each candidate parameter set {% \vec{\theta}_i %} is known, and the prior probability {% \mathbb{P}(\vec{\theta}_i) %} is known, then the posterior probability of any candidate parameter set can be computed. The denominator is the total probability of the data, {% \mathbb{P}(D) %}, obtained by summing over all candidates.
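The update above can be sketched numerically. This is a minimal, hypothetical example: the "dataset" is 7 heads observed in 10 coin flips, and the candidate parameters {% \vec{\theta}_i %} are three possible values of the coin's bias.

```python
import numpy as np
from math import comb

# Candidate parameter values theta_i and a uniform prior P(theta_i).
thetas = np.array([0.2, 0.5, 0.8])
prior = np.array([1/3, 1/3, 1/3])

# Hypothetical dataset D: 7 heads in 10 flips.
heads, flips = 7, 10

# Likelihood P(D | theta_i) of the observed data under each candidate
# (binomial probability of the observed head count).
likelihood = comb(flips, heads) * thetas**heads * (1 - thetas)**(flips - heads)

# Denominator: sum_i P(D | theta_i) P(theta_i), the total probability of D.
evidence = np.sum(likelihood * prior)

# Posterior P(theta_i | D) by Bayes' rule.
posterior = likelihood * prior / evidence
print(posterior)  # sums to 1; the largest mass falls on theta = 0.8
```

The posterior is just the prior reweighted by how well each candidate explains the data, renormalized by the evidence term in the denominator.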

Example


In a standard OLS regression, each observation is assumed to follow the equation
{% y = \alpha + \sum_{i=1}^n \beta_i x_i + \epsilon %}
where {% x_1, \ldots, x_n %} are the predictors and {% \epsilon %} is a noise term.
In this model, the parameters {% \beta_i %} are assumed to be fixed, and the regression is undertaken to estimate the most likely values of those parameters.

In a Bayesian interpretation, the {% \beta_i %} are assumed to be random variables themselves. The analyst will have a prior distribution that specifies the likelihood of various values of {% \beta_i %} based on the analysts assumptions or prior experience. The new dataset will help the analyst update here prior distribution to account for the likelihood of the observed dataset.