Winsorization

Overview


Winsorization is a process for limiting the effect of outliers in a series of numbers. First it computes the mean and standard deviation of the series. Then it looks for any number that lies more than a specified number of standard deviations from the mean. Any number outside that range is replaced by the nearest limit of the range, that is, the mean plus or minus the specified number of standard deviations.
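
The steps above can be sketched in plain JavaScript without any library. This is only an illustration: the function name winsorizeOnce is hypothetical, and it assumes a population standard deviation (dividing by n), which may differ from what the moments library computes.

```javascript
// Sketch of one winsorization pass (hypothetical helper, not the library's code).
// Assumption: population standard deviation (divide by n, not n - 1).
function winsorizeOnce(values, k) {
  const n = values.length;
  if (n === 0) return [];

  // Mean and standard deviation of the series.
  const mean = values.reduce((sum, v) => sum + v, 0) / n;
  const variance = values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / n;
  const stdDev = Math.sqrt(variance);

  // Limits at k standard deviations on either side of the mean.
  const lower = mean - k * stdDev;
  const upper = mean + k * stdDev;

  // Values outside the limits are moved to the nearest limit.
  return values.map(v => Math.min(Math.max(v, lower), upper));
}

const prices = [100, 102, 134, 101, 102, 102, 103, 100, 100, 102];
const clipped = winsorizeOnce(prices, 2);
console.log(clipped);
```

Under that assumption, the only value changed is 134, which is pulled in to the upper limit of roughly 124.3; every other price already lies inside the range.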

The winsorization algorithm can be run multiple times. After the outliers are moved in, the adjusted set of numbers has a new mean and standard deviation, so a further pass may move values in again.
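
A minimal sketch of repeated passes follows. The names seriesStats and winsorizeRepeated and the passes parameter are hypothetical, and a population standard deviation is assumed again. Each pass recomputes the mean and standard deviation from the output of the previous pass.

```javascript
// Sketch of repeated winsorization (hypothetical helpers, not the library's code).
// Assumption: population standard deviation (divide by n).
function seriesStats(values) {
  const n = values.length;
  const mean = values.reduce((sum, v) => sum + v, 0) / n;
  const variance = values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / n;
  return { mean, stdDev: Math.sqrt(variance) };
}

function winsorizeRepeated(values, k, passes) {
  let current = values.slice();
  for (let i = 0; i < passes && current.length > 0; i++) {
    // Recompute the limits from the already-adjusted series.
    const { mean, stdDev } = seriesStats(current);
    const lower = mean - k * stdDev;
    const upper = mean + k * stdDev;
    current = current.map(v => Math.min(Math.max(v, lower), upper));
  }
  return current;
}

const samplePrices = [100, 102, 134, 101, 102, 102, 103, 100, 100, 102];
const onePass = winsorizeRepeated(samplePrices, 2, 1);
const twoPasses = winsorizeRepeated(samplePrices, 2, 2);
console.log(onePass[2], twoPasses[2]);
```

For this data the second pass pulls the former outlier in further, because clipping it in the first pass reduces both the mean and the standard deviation.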

Implementation


The following code demonstrates a winsorization using only the moments API.


let mt = await import('/lib/statistics/moments/v1.0.0/moments.mjs');
let data = [{price:100},{price:102},{price:134},{price:101},{price:102},{price:102},{price:103},{price:100},{price:100},{price:102}];

// Extract the raw prices from the records.
let prices = data.map(p => p.price);

// Mean and standard deviation of the series.
let average = mt.average(prices);
let stdDev = mt.stdDev(prices);

// Limits at 2 standard deviations on either side of the mean.
let lower = average - 2 * stdDev;
let upper = average + 2 * stdDev;

// Replace any price outside the limits with the nearest limit.
let dataset = prices.map(p => {
  if (p > upper) p = upper;
  else if (p < lower) p = lower;
  return { price: p };
});

$console.log('answer is ' + JSON.stringify(dataset));
					

Winsorization API


The winsorization library encapsulates the steps above in a single call.


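let wn = await import('/lib/statistics/winsorization/v1.0.0/winsorization.mjs');
let data = [{price:100},{price:102},{price:134},{price:101},{price:102},{price:102},{price:103},{price:100},{price:100},{price:102}];

// Winsorize the prices; the 2 mirrors the 2 standard deviations used in the manual example.
let answer = wn.winsorize(data.map(p => p.price), 2);

$console.log('answer is ' + JSON.stringify(answer));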
					
