Resampling (Bootstrap)
Overview
The bootstrap method was introduced by Efron in 1979. It was formulated to provide a relatively simple way to
estimate a standard error (or other quantities describing an estimator's sampling distribution) when other methods are either too complex or not sufficiently accurate
(see statistical inference).
Procedure
From a given sample
{% S = \{x_1, x_2, \ldots, x_n\} %}
1. Draw a random sample of size n, with replacement, from S
2. Estimate the parameter in question from this resample
3. Repeat steps 1 and 2 many times, saving the result each time
4. Estimate the statistic of interest from the resulting set of parameter estimates (a code sketch follows)
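The procedure above can be sketched in Python with NumPy. The function name `bootstrap`, the default of 10,000 replicates, and the use of `numpy.random.default_rng` are illustrative choices, not part of the original text; this is a minimal sketch, not a definitive implementation.

```python
import numpy as np

def bootstrap(sample, statistic, n_boot=10_000, rng=None):
    """Return n_boot bootstrap replicates of `statistic` evaluated on
    resamples (drawn with replacement) of `sample`."""
    rng = np.random.default_rng() if rng is None else rng
    sample = np.asarray(sample)
    n = sample.shape[0]
    replicates = np.empty(n_boot)
    for b in range(n_boot):
        # Step 1: draw a random sample of size n from S, with replacement
        resample = sample[rng.integers(0, n, size=n)]
        # Step 2: estimate the parameter in question on the resample
        replicates[b] = statistic(resample)
    # Steps 3-4: the loop repeats steps 1 and 2; the caller summarizes the replicates
    return replicates
```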
Example: Standard Error of the Mean
1. Draw a random sample of size n, with replacement, from S
2. Calculate the average of the newly drawn resample
3. Repeat steps 1 and 2 many times, saving the result each time
4. Calculate the standard deviation of the list of averages; this is the bootstrap estimate of the standard error of the mean (see the sketch below)
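A minimal self-contained sketch of this example, assuming a normally distributed sample of size 50 and 10,000 bootstrap replicates (both illustrative choices). The bootstrap standard error should come out close to the usual s / sqrt(n) formula.

```python
import numpy as np

rng = np.random.default_rng(0)
S = rng.normal(loc=5.0, scale=2.0, size=50)    # hypothetical sample S, n = 50
n = S.shape[0]

n_boot = 10_000
means = np.empty(n_boot)
for b in range(n_boot):
    resample = S[rng.integers(0, n, size=n)]   # step 1: resample with replacement
    means[b] = resample.mean()                 # step 2: average of the resample

# Step 4: standard deviation of the list of averages = bootstrap SE of the mean
se_boot = means.std(ddof=1)
print(f"bootstrap SE of the mean:  {se_boot:.3f}")
print(f"textbook SE (s / sqrt(n)): {S.std(ddof=1) / np.sqrt(n):.3f}")
```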
Formal Description
Let
{% S = \{x_1, x_2, \ldots, x_n\} %}
be a sample in which each {% x_i %} is drawn independently from the same distribution function {% F %}.
The goal of the bootstrap is to estimate the distribution of a given statistic
{% T(F) = T(x_1, x_2, \ldots, x_n) %}
That is, we wish to estimate
{% \mathbb{P}_F(T(F) \leq x) %}
as a function of {% x %}.
The bootstrap procedure solves the problem by
finding a distribution {% \hat{F} %} that is close to the distribution
{% F %} and then calculating
{% \mathbb{P}_{\hat{F}}(T(\hat{F}) \leq x) %}
In practice {% \hat{F} %} is taken to be the empirical distribution of the sample, so drawing from {% \hat{F} %} amounts to resampling from {% S %} with replacement.
It can be shown that this estimate converges to the true value as n increases:
{% \sup_x \left| \mathbb{P}_F(T(F) \leq x) - \mathbb{P}_{\hat{F}}(T(\hat{F}) \leq x) \right| \rightarrow 0 %}
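Since {% \hat{F} %} is the empirical distribution, {% \mathbb{P}_{\hat{F}}(T(\hat{F}) \leq x) %} can be approximated by the fraction of bootstrap replicates that do not exceed {% x %}. A minimal sketch, assuming T is the sample mean, an exponential sample of size 100, and the threshold x = 1.1 (all illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
S = rng.exponential(scale=1.0, size=100)   # hypothetical sample from F (here Exp(1))
n = S.shape[0]

n_boot = 10_000
T_boot = np.empty(n_boot)
for b in range(n_boot):
    resample = S[rng.integers(0, n, size=n)]   # draw from F-hat, the empirical distribution
    T_boot[b] = resample.mean()                # the statistic T, here the sample mean

x = 1.1
# The ECDF of the bootstrap replicates at x approximates P_Fhat(T(Fhat) <= x)
p_hat = np.mean(T_boot <= x)
print(f"estimated P(T <= {x}) = {p_hat:.3f}")
```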
Demos and Tutorials