Estimation
Overview
Estimation is the process of determining a point estimate of a distribution parameter.
Typically the parameters of interest are
moments
of the distribution, but any number that helps to specify the shape of the distribution in question
can be an estimated parameter.
Example - Distribution Mean
As a simple example, an analyst may wish to determine the mean, {% \mu %}, of a distribution.
She will follow a procedure, or algorithm, to compute a number, {% \hat{\mu} %}, which is in some
sense close to the theoretical parameter. (In this case, the typical procedure is to compute
the average of the samples.)
{% \hat{\mu} = \frac{1}{n} \sum_i X_i %}
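A minimal Python sketch of this procedure (using NumPy, with hypothetical data drawn from a normal distribution whose true mean is 5.0):

```python
import numpy as np

# Hypothetical data: 1,000 draws from a normal distribution with true mean 5.0
rng = np.random.default_rng(0)
samples = rng.normal(loc=5.0, scale=2.0, size=1000)

# Point estimate of the mean: the sample average
mu_hat = samples.mean()
print(mu_hat)  # close to 5.0, but not exactly equal
```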
Estimator Bias
Bias refers to the situation where the expected value of an estimator
is not equal to the parameter it estimates. When the expectation of the estimator equals the parameter (in theory),
the estimate is said to be unbiased.
In the example above, the estimate is theoretically unbiased, since (assuming the {% X_i %} are identically distributed with mean {% \mu %})
{% \mathbb{E}[\hat{\mu}] = \frac{1}{n} \sum_i \mathbb{E} [X_i] = \mathbb{E}[X_i] = \mu %}
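Rather than a proof, a quick Monte Carlo check can make the unbiasedness plausible: repeat the sampling experiment many times and compare the average of the resulting estimates to the true mean. A sketch, assuming NumPy and an arbitrary true mean of 5.0:

```python
import numpy as np

rng = np.random.default_rng(1)
true_mu, n, trials = 5.0, 50, 20_000

# Repeat the experiment: draw `trials` samples of size n and compute mu_hat for each
estimates = rng.normal(loc=true_mu, scale=2.0, size=(trials, n)).mean(axis=1)

# The average of the estimates is very close to the true mean: no systematic bias
print(estimates.mean())  # approximately 5.0
```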
Example - Distribution Variance
When estimating the variance of a distribution, one may be tempted to calculate the following
estimate.
{% \hat{\sigma^2} = \frac{1}{n} \sum_i (X_i - \hat{\mu})^2 %}
When used with the estimate of the mean given above, this formula produces a biased estimate of the variance:
its expectation is {% \frac{n-1}{n} \sigma^2 %}, so it systematically underestimates the true variance.
The correct formula for an unbiased estimate divides by {% n-1 %} instead (Bessel's correction):
{% \hat{\sigma^2} = \frac{1}{n-1} \sum_i (X_i - \hat{\mu})^2 %}
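To see the difference numerically, one can average both formulas over many repeated samples. NumPy exposes both through the ddof argument of np.var (ddof=0 divides by n, ddof=1 by n-1); the true variance of 4.0 below is an arbitrary choice:

```python
import numpy as np

rng = np.random.default_rng(2)
true_var, n, trials = 4.0, 10, 50_000

samples = rng.normal(loc=0.0, scale=np.sqrt(true_var), size=(trials, n))

biased   = np.var(samples, axis=1, ddof=0).mean()  # divide by n
unbiased = np.var(samples, axis=1, ddof=1).mean()  # divide by n - 1

print(biased)    # roughly (n-1)/n * 4.0 = 3.6, systematically low
print(unbiased)  # roughly 4.0
```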
Bias Variance Tradeoff
The mean squared error of an estimator {% \hat{\theta} %} of a parameter {% \theta %} decomposes into a variance term and a squared-bias term:
{% \mathbb{E}[(\hat{\theta} - \theta)^2] = Variance(\hat{\theta}) + Bias(\hat{\theta})^2 %}
To see this, add and subtract {% \mathbb{E}[\hat{\theta}] %} inside the square and expand:
{% \mathbb{E}[(\hat{\theta} - \theta)^2] %}
{% = \mathbb{E}[(\hat{\theta} - \mathbb{E}[\hat{\theta}] + \mathbb{E}[\hat{\theta}] - \theta)^2] %}
{% = \mathbb{E} [(\hat{\theta} - \mathbb{E}[\hat{\theta}])^2 + (\mathbb{E}[\hat{\theta}] - \theta)^2 +2(\hat{\theta} - \mathbb{E}[\hat{\theta}])(\mathbb{E}[\hat{\theta}] - \theta) ] %}
{% = \mathbb{E}[(\hat{\theta} - \mathbb{E}[\hat{\theta}])^2] + \mathbb{E}[(\mathbb{E}[\hat{\theta}]-\theta)^2] + \mathbb{E}[2(\hat{\theta} - \mathbb{E}[\hat{\theta}])(\mathbb{E}[\hat{\theta}]-\theta)] %}
The first term is the variance
{% Variance(\hat{\theta}) = \mathbb{E}[(\hat{\theta} - \mathbb{E}[\hat{\theta}])^2] %}
The second term is the bias squared
{% Bias(\hat{\theta})^2 = \mathbb{E}[(\mathbb{E}[\hat{\theta}]-\theta)^2] = (\mathbb{E}[\hat{\theta}]-\theta)^2 %}
The third term is equal to zero, since {% \mathbb{E}[\hat{\theta}] - \theta %} is a constant and {% \mathbb{E}[\hat{\theta} - \mathbb{E}[\hat{\theta}]] = 0 %}
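The decomposition can also be checked empirically. The sketch below uses an assumed setup, estimating a normal mean with the deliberately biased estimator {% 0.9 \, \hat{\mu} %}, and compares the Monte Carlo mean squared error with the sum of the empirical variance and squared bias:

```python
import numpy as np

rng = np.random.default_rng(3)
true_theta, n, trials = 5.0, 20, 100_000

# A deliberately biased estimator: shrink the sample mean toward zero
theta_hat = 0.9 * rng.normal(loc=true_theta, scale=2.0, size=(trials, n)).mean(axis=1)

mse      = np.mean((theta_hat - true_theta) ** 2)
variance = np.var(theta_hat)                       # spread around the estimator's own mean
bias_sq  = (np.mean(theta_hat) - true_theta) ** 2  # squared systematic error

print(mse)                 # mean squared error
print(variance + bias_sq)  # agrees with the MSE up to Monte Carlo noise
```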
Consistency
An estimate is said to be consistent if it converges to the true value as the sample size grows,
{% \hat{\theta}_n \rightarrow \theta \quad \text{as } n \rightarrow \infty %}
where the convergence is taken to be
convergence in probability
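For the sample-mean example, consistency is just the law of large numbers: as the sample size grows, {% \hat{\mu} %} concentrates around {% \mu %}. A short sketch (assuming NumPy and a true mean of 5.0) illustrating the shrinking error:

```python
import numpy as np

rng = np.random.default_rng(4)
true_mu = 5.0

# The error of the sample mean shrinks as the sample size grows
for n in (10, 100, 10_000, 1_000_000):
    mu_hat_n = rng.normal(loc=true_mu, scale=2.0, size=n).mean()
    print(n, abs(mu_hat_n - true_mu))
```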
Maximum Likelihood Estimate
Maximum Likelihood