Overview
Ada Boost
The AdaBoost algorithm is a form of ensemble learning in which an inducer (machine learning model) uses a set of per-point weights
to control how much influence each data point has during training. Initially all points are weighted equally. Then, after the first
inducer is trained, the points the inducer classifies incorrectly are weighted higher than the ones it gets correct.
A simple way to create an equal-weight array is to use the map function on the array of data points.
let rs = await import('/lib/statistics/resampling/v1.0.0/resampling.js');
let data = [1,2,3,4,5,6,7,8,9];
let weights = data.map(p => 1 / data.length); // every point starts at 1/n
Creating new Weights
Once we have trained an inducer, for any point the new weight going forward is:
- {% current\:weight \times \exp\left[ -\tfrac{1}{2} \ln\left( \frac{1 - total\:error}{total\:error} \right) \right] %}
for a correct guess
- {% current\:weight \times \exp\left[ \tfrac{1}{2} \ln\left( \frac{1 - total\:error}{total\:error} \right) \right] %}
for an incorrect guess
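The update above can be sketched in code. This is a minimal illustration, not the page's library code; the `updateWeights` name and the `correct` boolean array are assumptions for the example. Total error here is the sum of the weights on the misclassified points, and the result is renormalized so the weights stay a distribution.

```javascript
// Sketch of the AdaBoost weight update for one round.
// weights: current per-point weights (assumed to sum to 1)
// correct: correct[i] is true if the inducer guessed point i correctly
function updateWeights(weights, correct) {
  // total error = sum of weights on the misclassified points
  const totalError = weights.reduce(
    (sum, w, i) => sum + (correct[i] ? 0 : w), 0);
  // 1/2 * ln((1 - total error) / total error)
  const alpha = 0.5 * Math.log((1 - totalError) / totalError);
  // correct guesses shrink, incorrect guesses grow
  const updated = weights.map(
    (w, i) => w * Math.exp(correct[i] ? -alpha : alpha));
  // renormalize so the new weights sum to 1
  const total = updated.reduce((a, b) => a + b, 0);
  return updated.map(w => w / total);
}
```

With four equally weighted points and one mistake, the misclassified point ends up carrying half the total weight, which is what forces the next inducer to focus on it.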
Weighted Error Function
One way to use the weights when training the next inducer is a weighted error function. A common example is using
the weighted Gini function as the impurity measure in a tree inducer.
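A weighted Gini impurity can be sketched as below; each point contributes its weight instead of a count of 1. The `weightedGini` name is an assumption for this example, not a function from the tree inducer itself.

```javascript
// Sketch of a weighted Gini impurity for the points at one tree node.
// labels: class label of each point; weights: current AdaBoost weights
function weightedGini(labels, weights) {
  const total = weights.reduce((a, b) => a + b, 0);
  // weighted mass of each class at this node
  const byClass = {};
  labels.forEach((label, i) => {
    byClass[label] = (byClass[label] || 0) + weights[i];
  });
  // gini = 1 - sum of squared weighted class proportions
  return 1 - Object.values(byClass)
    .reduce((sum, mass) => sum + (mass / total) ** 2, 0);
}
```

With equal weights this reduces to the ordinary Gini impurity; skewing the weights toward one class pulls the impurity down, so splits that isolate heavily weighted points look more attractive.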
Weighted Resampling
The other way to influence the new inducer, rather than using a weighted error function, is to resample the original data using
the new weights.
Code for resampling from a dataset can be found on the
resampling page.
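As a rough sketch of what weighted resampling looks like (the implementation on the resampling page may differ), each point's chance of being drawn is proportional to its weight:

```javascript
// Sketch: draw n points with replacement, where each point's chance of
// being picked is proportional to its weight (weights assumed to sum to 1).
function weightedResample(data, weights, n) {
  const sample = [];
  for (let k = 0; k < n; k++) {
    let r = Math.random();
    // walk the cumulative weights until r falls inside a point's slice
    for (let i = 0; i < data.length; i++) {
      r -= weights[i];
      if (r <= 0 || i === data.length - 1) {
        sample.push(data[i]);
        break;
      }
    }
  }
  return sample;
}
```

Points the last inducer missed carry more weight, so they are drawn more often and the next inducer sees them more frequently even though its error function is unweighted.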