Model Evaluation

Overview


Model evaluation is the process of estimating how well a trained machine learning model performs. In classical statistics, evaluation is often accomplished by assuming that the data are generated by a particular distribution and then performing a hypothesis test to assess the goodness of fit of the fitted parameters.

In machine learning, this is often not done, for a couple of reasons:

  • A model is constructed that does not have an analytical distribution
  • The modeler does not wish to make any assumptions about the underlying distribution

Training Set


One way to test the effectiveness of an algorithm is to split the dataset into a training set (the data used to train the algorithm) and a test set. The test set is used to compute the error of the algorithm after training. It is important that the test set not include any data from the set used to train the algorithm; otherwise the measured error will tend to understate the error on new data.

JavaScript provides the slice method on arrays, which selects a subarray from a starting index up to, but not including, an ending index.


let data = [1,2,3,4,5,6,7,8,9];
// first two elements: [1, 2]
let first = data.slice(0, 2);
// remaining elements: [3, 4, 5, 6, 7, 8, 9]
let back = data.slice(2, data.length);
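
The split and the test-set error described above can be combined in a small sketch. The helper names (trainTestSplit, trainMean, meanSquaredError) and the 0.8 training fraction are assumptions made for illustration, and the "model" is deliberately trivial: it just predicts the mean of the training values.

```javascript
// Hypothetical helper: keep the first trainFraction of the points for
// training and the rest for testing. Assumes data is already shuffled.
function trainTestSplit(data, trainFraction) {
  const cut = Math.floor(data.length * trainFraction);
  return {
    trainingSet: data.slice(0, cut), // indices 0 .. cut-1
    testSet: data.slice(cut)         // indices cut .. end
  };
}

// Toy "model": predict the mean of the training values.
function trainMean(values) {
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Mean squared error of a constant prediction over a set of values.
function meanSquaredError(prediction, values) {
  const total = values.reduce((sum, v) => sum + (prediction - v) ** 2, 0);
  return total / values.length;
}

const values = [2, 4, 6, 8, 10];
const { trainingSet, testSet } = trainTestSplit(values, 0.8);
const model = trainMean(trainingSet);               // uses training data only
const testError = meanSquaredError(model, testSet); // uses test data only
```

The point of the sketch is the separation of roles: the model never sees testSet while it is being fitted, so testError is computed on data that played no part in training.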
				


If the original dataset is not randomized, you can randomize the array before splitting by using the resampling library. Please see the resampling page for more info.


let rs = await import('/lib/statistics/resampling/v1.0.0/resampling.js')
let data = [1,2,3,4,5,6,7,8,9];
let random = rs.randomize(data);
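
If the resampling library is not available, a shuffle can be written by hand. The sketch below uses the Fisher-Yates algorithm; it is an illustrative alternative and is not claimed to be how rs.randomize is implemented. The function name shuffle and the 6/3 split are assumptions.

```javascript
// Fisher-Yates shuffle: returns a shuffled copy and leaves the input untouched.
function shuffle(data) {
  const copy = data.slice();
  for (let i = copy.length - 1; i > 0; i--) {
    // pick a random index j with 0 <= j <= i and swap positions i and j
    const j = Math.floor(Math.random() * (i + 1));
    const tmp = copy[i];
    copy[i] = copy[j];
    copy[j] = tmp;
  }
  return copy;
}

const original = [1, 2, 3, 4, 5, 6, 7, 8, 9];
const shuffled = shuffle(original);

// Split only after shuffling, so both sets are random samples.
const shuffledTrain = shuffled.slice(0, 6);
const shuffledTest = shuffled.slice(6);
```

Shuffling a copy rather than the original keeps the source data in its initial order, which makes it easier to repeat an experiment with a different shuffle.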
				

Cross Validation


Cross validation is used when the dataset is not very large and you can't afford to lose any data points in the learning process. Cross validation splits the dataset into k equal-sized (or nearly equal) subsets. The model is then trained k times. Each time, one of the k subsets is used as the test set, and the remaining subsets are used to train the model.

The following algorithm splits a dataset into sets of length equal to size (the last set may be shorter if the data does not divide evenly).


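let size = 10;   // number of points per set
let sets = [];   // all sets created so far
let set = [];    // the set currently being filled
for(let i=0;i<data.length;i++){
  // start a new set every `size` points
  if(i%size === 0){
    set = [];
    sets.push(set);
  }
  set.push(data[i]);
}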
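
The full procedure (train k times, each time holding out one set) can be sketched as follows. The names kFolds and crossValidate are hypothetical, the train and error callbacks are supplied by the caller, and averaging the k errors into one number is a common convention assumed here rather than something prescribed above.

```javascript
// Deal the points out round-robin into k folds whose sizes differ by at most one.
function kFolds(data, k) {
  const folds = [];
  for (let f = 0; f < k; f++) folds.push([]);
  for (let i = 0; i < data.length; i++) {
    folds[i % k].push(data[i]);
  }
  return folds;
}

// train(trainSet) returns a model; error(model, testSet) returns a number.
function crossValidate(data, k, train, error) {
  const folds = kFolds(data, k);
  let total = 0;
  for (let f = 0; f < k; f++) {
    const heldOut = folds[f];                                  // test set for this run
    const rest = folds.filter((_, idx) => idx !== f).flat();   // all other folds
    const fitted = train(rest);
    total += error(fitted, heldOut);
  }
  return total / k; // average error over the k runs
}

// Toy model: predict the mean; error: mean squared difference.
const fitMean = set => set.reduce((a, b) => a + b, 0) / set.length;
const mse = (m, set) => set.reduce((a, v) => a + (m - v) ** 2, 0) / set.length;

const cvError = crossValidate([2, 4, 6, 8, 10, 12], 3, fitMean, mse);
```

Here the folds are [2, 8], [4, 10] and [6, 12]; every point is used for testing exactly once and for training k - 1 times.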
				


An extreme version of this approach, known as leave-one-out cross validation, is to create test sets of size 1. That is, the model is trained on all the points of the dataset except one and then tested on the one removed point. This is repeated for each point in the dataset.
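
A minimal sketch of this size-1 variant is shown below. The function name leaveOneOut, the callback signatures, and the averaging of the per-point errors are assumptions made for illustration.

```javascript
// train(trainSet) returns a model; error(model, point) scores a single point.
function leaveOneOut(data, train, error) {
  let total = 0;
  for (let i = 0; i < data.length; i++) {
    // every point except the i-th is used for training
    const trainSet = data.filter((_, idx) => idx !== i);
    const fittedModel = train(trainSet);
    total += error(fittedModel, data[i]); // test on the removed point
  }
  return total / data.length;
}

// Toy model: predict the mean; error: squared difference.
const meanModel = set => set.reduce((a, b) => a + b, 0) / set.length;
const squaredError = (m, point) => (m - point) ** 2;

const looError = leaveOneOut([1, 2, 3], meanModel, squaredError);
```

Because the model is retrained once per data point, the cost grows with the size of the dataset, which is one reason this variant suits small datasets best.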

Performance Measures

