Estimation

Overview


Estimation is the process of determining a point estimate of a distribution parameter. Typically the parameters of interest are moments of the distribution, but any number that helps to specify the shape of the distribution in question can be an estimated parameter.

Example - Distribution Mean


As a simple example, an analyst may wish to determine the mean, {% \mu %}, of a distribution. She will follow a process or algorithm to compute a number, {% \hat{\mu} %}, which is in some sense close to the theoretical parameter. (In this case, the typical procedure is just to compute the average of the samples.)
{% \hat{\mu} = \frac{1}{n} \sum_i X_i %}
Assuming independent, identically distributed samples with variance {% \sigma^2 %}, the variance of the average can be computed as
{% Var(\hat{\mu}) = \frac{1}{n^2} \sum_{i=1}^n Var(X_i) = \frac{\sigma^2}{n} %}
The analyst will then want to understand how likely the algorithm she followed is to have produced a good estimate, and whether there are procedures that can compute a better estimate.
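
As an illustration, the following sketch (assuming NumPy is available, with an arbitrary synthetic normal sample; the mean and standard deviation used are not from the text) computes the sample mean and the estimated variance of that mean.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic sample: n draws from a normal distribution (illustrative values).
n = 1000
x = rng.normal(loc=5.0, scale=2.0, size=n)

# Point estimate of the mean: the sample average.
mu_hat = x.mean()

# Var(mu_hat) = sigma^2 / n for i.i.d. samples; here the sample variance
# is plugged in as an estimate of sigma^2.
var_mu_hat = x.var(ddof=1) / n

print(f"mu_hat = {mu_hat:.4f}, estimated Var(mu_hat) = {var_mu_hat:.6f}")
```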

Estimator Bias


Bias refers to the situation where the expected value of an estimated parameter is not equal to the true parameter. When the expectation of the estimate equals the parameter (in theory), the estimate is said to be unbiased.
{% Bias(\hat{\theta}) = \mathbb{E}[\hat{\theta}] - \theta %}

In the example above, the estimate is theoretically unbiased.
{% \mathbb{E}[\hat{\mu}] = \frac{1}{n} \sum_i \mathbb{E} [X_i] = \mu %}
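
A quick simulation backs this up. The sketch below (assuming NumPy, with an arbitrary true mean and sample size chosen for illustration) averages the estimator over many repeated samples and recovers the true mean.

```python
import numpy as np

rng = np.random.default_rng(1)

true_mu = 3.0
n, trials = 50, 20_000

# Draw many independent samples of size n and compute mu_hat for each.
samples = rng.normal(loc=true_mu, scale=1.5, size=(trials, n))
mu_hats = samples.mean(axis=1)

# The average of the estimates approximates E[mu_hat], which matches true_mu.
print(f"mean of mu_hat over {trials} trials: {mu_hats.mean():.4f} (true mu = {true_mu})")
```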

Example - Distribution Variance


When estimating the variance of a distribution, one may be tempted to calculate the following estimate.
{% \hat{\sigma^2} = \frac{1}{n} \sum_i (X_i - \hat{\mu})^2 %}
When using the estimate of the mean given above, this formula produces a biased estimate of the variance. The correct formula for an unbiased estimate is:
{% \hat{\sigma^2} = \frac{1}{n-1} \sum_i (X_i - \hat{\mu})^2 %}
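
The two formulas correspond to NumPy's `ddof=0` and `ddof=1` options for the sample variance. The sketch below (synthetic normal data with an assumed true variance of 4, chosen only for illustration) compares them across many repeated samples.

```python
import numpy as np

rng = np.random.default_rng(2)

true_var = 4.0          # sigma^2 for the synthetic data
n, trials = 10, 50_000  # small n makes the bias of the 1/n formula visible

samples = rng.normal(loc=0.0, scale=np.sqrt(true_var), size=(trials, n))

# 1/n formula (ddof=0): biased; it underestimates sigma^2 on average.
biased = samples.var(axis=1, ddof=0)

# 1/(n-1) formula (ddof=1): unbiased.
unbiased = samples.var(axis=1, ddof=1)

print(f"E[biased estimate]   ~ {biased.mean():.3f}  (expected (n-1)/n * sigma^2 = {(n-1)/n*true_var:.3f})")
print(f"E[unbiased estimate] ~ {unbiased.mean():.3f}  (true sigma^2 = {true_var})")
```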

Bias-Variance Tradeoff


The mean squared error of an estimator {% \hat{\theta} %} of a parameter {% \theta %} is the sum of the estimator's variance and its squared bias.
{% \mathbb{E}[(\hat{\theta} - \theta)^2] = Variance(\hat{\theta}) + Bias(\hat{\theta})^2 %}
To see this, expand the left-hand side.
{% \mathbb{E}[(\hat{\theta} - \theta)^2] %}
{% = \mathbb{E}[(\hat{\theta} - \mathbb{E}[\hat{\theta}] + \mathbb{E}[\hat{\theta}] - \theta)^2] %}
{% = \mathbb{E} [(\hat{\theta} - \mathbb{E}[\hat{\theta}])^2 + (\mathbb{E}[\hat{\theta}] - \theta)^2 +2(\hat{\theta} - \mathbb{E}[\hat{\theta}])(\mathbb{E}[\hat{\theta}] - \theta) ] %}
{% = \mathbb{E}[(\hat{\theta} - \mathbb{E}[\hat{\theta}])^2] + \mathbb{E}[(\mathbb{E}[\hat{\theta}]-\theta)^2] + \mathbb{E}[2(\hat{\theta} - \mathbb{E}[\hat{\theta}])(\mathbb{E}[\hat{\theta}]-\theta)] %}

The first term is the variance
{% Variance(\hat{\theta}) = \mathbb{E}[(\hat{\theta} - \mathbb{E}[\hat{\theta}])^2] %}

The second term is the bias squared
{% Bias(\hat{\theta})^2 = \mathbb{E}[(\mathbb{E}[\hat{\theta}]-\theta)^2] = (\mathbb{E}[\hat{\theta}]-\theta)^2 %}

The third term is equal to zero, because {% \mathbb{E}[\hat{\theta}] - \theta %} is a constant and {% \mathbb{E}[\hat{\theta} - \mathbb{E}[\hat{\theta}]] = 0 %}.
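
The decomposition can also be checked numerically. The sketch below (assuming NumPy, using the biased 1/n variance estimator from the previous section on synthetic normal data with an assumed true variance of 4) estimates the MSE, the variance, and the squared bias by simulation and confirms they add up.

```python
import numpy as np

rng = np.random.default_rng(3)

true_var = 4.0   # the parameter theta being estimated is sigma^2
n, trials = 10, 200_000

samples = rng.normal(loc=0.0, scale=np.sqrt(true_var), size=(trials, n))

# theta_hat: the biased 1/n variance estimator.
theta_hat = samples.var(axis=1, ddof=0)

mse = np.mean((theta_hat - true_var) ** 2)
variance = theta_hat.var()
bias_sq = (theta_hat.mean() - true_var) ** 2

# MSE should equal Variance + Bias^2 (up to Monte Carlo error).
print(f"MSE          ~ {mse:.4f}")
print(f"Var + Bias^2 ~ {variance + bias_sq:.4f}")
```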

Consistency


An estimate is said to be consistent if it converges to the true value as the sample size {% n %} grows
{% \hat{\theta}_n \rightarrow \theta %}
where the convergence is taken to be convergence in probability: for every {% \epsilon > 0 %}, {% P(|\hat{\theta}_n - \theta| > \epsilon) \rightarrow 0 %} as {% n \rightarrow \infty %}.
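
The sample mean is a standard example of a consistent estimator. The sketch below (assuming NumPy, with an arbitrary true mean and tolerance chosen for illustration) estimates the probability that the sample mean lands more than {% \epsilon %} away from the true value, and shows it shrinking as {% n %} grows.

```python
import numpy as np

rng = np.random.default_rng(4)

true_mu, eps = 3.0, 0.1

# For increasing sample sizes, estimate P(|mu_hat_n - mu| > eps) by simulation.
for n in (10, 100, 1_000, 10_000):
    mu_hats = rng.normal(loc=true_mu, scale=2.0, size=(1_000, n)).mean(axis=1)
    p_far = np.mean(np.abs(mu_hats - true_mu) > eps)
    print(f"n = {n:>6}: P(|mu_hat - mu| > {eps}) ~ {p_far:.3f}")
```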

Fitting to Data


The most common way to obtain an estimate from a dataset is to use the technique of Maximum Likelihood: choose the parameter value that maximizes the likelihood of the observed data.
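
As a brief illustration, the sketch below (assuming NumPy and SciPy, and using an exponential model with a made-up rate parameter, none of which are specified in the text) fits the rate by numerically maximizing the log-likelihood; for the exponential distribution this matches the closed-form MLE, one over the sample mean.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(5)

# Synthetic data from an exponential distribution with rate lambda = 2.
true_rate = 2.0
x = rng.exponential(scale=1.0 / true_rate, size=1_000)

def neg_log_likelihood(rate):
    # Exponential log-likelihood: n*log(rate) - rate * sum(x); negate to minimize.
    return -(len(x) * np.log(rate) - rate * x.sum())

result = minimize_scalar(neg_log_likelihood, bounds=(1e-6, 100.0), method="bounded")

print(f"numerical MLE:   {result.x:.4f}")
print(f"closed-form MLE: {1.0 / x.mean():.4f}")  # 1 / sample mean
```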