Differencing
Overview
Differencing is a feature extraction technique for time series. It is used primarily to transform
the time series being measured into one that is (at least approximately) stationary.
Differencing
In order to run and interpret statistics on a dataset, certain assumptions need to be made. Often, it is assumed that the data
is generated from some distribution, where each data point is independent of the others. This assumption is usually not sensible with
time series data; that is, it is hard to believe that a measurement at a given time is independent of the others.
Typically, statisticians look to transform a given time series into one that satisfies the necessary conditions. However, independence is usually
difficult to achieve. There is an alternative, though. For time series datasets of sufficient size, one can usually get the benefits attributed to independent
datasets if the dataset can be assumed to be stationary, meaning that its statistical properties (such as its mean and variance) do not change over time.
Stationarity is usually achieved by differencing the dataset, that is, by replacing each value with its change from the previous value.
Differencing can be achieved rather simply using array methods such as map or by using the $list API.
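// Sample price series.
let data = [{price:100},{price:101},{price:102},{price:101},
            {price:99},{price:102},{price:102},{price:103}];
// Copy each item and attach the change from the previous price.
// The first item has no predecessor, so it gets no diff property.
let differences = data.map((p, i, items) => {
  let item = { ...p };
  if (i > 0) item.diff = p.price - items[i-1].price;
  return item;
});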
Returns
Calculating returns is a standard way to difference a time series. It is usually the best method when the time series
cannot dip below zero, as is the case for many time series found in finance (prices, for example).
The exact calculation of a return is something of a matter of choice. The standard arithmetic return is given as follows:
{% return_i = \frac{P_i - P_{i-1}}{P_{i-1}} = \frac{P_i}{P_{i-1}} - 1 %}
For purely theoretical reasons (especially when dealing with continuous time series), a logarithm of the price ratio is often taken, giving the following
definition of a return:
{% return_i = \ln\left(\frac{P_i}{P_{i-1}}\right) %}
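Both kinds of return can be computed with the same array methods used for plain differencing. The following is a minimal sketch, not a definitive implementation: the function names arithmeticReturns and logReturns are hypothetical, the prices are the sample values used earlier, and the arithmetic return is assumed to be the relative change from the previous price.

```javascript
// Sample prices (same values as the differencing example).
const prices = [100, 101, 102, 101, 99, 102, 102, 103];

// Hypothetical helper: arithmetic return, assumed to be
// (P_i - P_{i-1}) / P_{i-1}.
// slice(1) drops the first price, so index i in the callback
// points at the previous price in the original series.
function arithmeticReturns(series) {
  return series.slice(1).map((p, i) => (p - series[i]) / series[i]);
}

// Hypothetical helper: log return, ln(P_i / P_{i-1}).
// Only meaningful when every price is strictly positive.
function logReturns(series) {
  return series.slice(1).map((p, i) => Math.log(p / series[i]));
}

const simple = arithmeticReturns(prices);
const logged = logReturns(prices);
```

Each result has one fewer element than the input, because the first price has no predecessor. For small price changes the two definitions give nearly the same values.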
Integration
Integration, in a time series context, is the reverse of differencing. That is, instead of calculating differences
between consecutive points, it computes a running (cumulative) sum. Integration is used to recover the original series from a differenced series, given the first value of the original series.
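The round trip from a series to its differences and back can be sketched as follows. This is an illustrative example under stated assumptions: the helper names difference and integrate are hypothetical, and the first value of the original series is assumed to be known, since differencing discards it.

```javascript
// Sample prices (same values as the differencing example).
const prices = [100, 101, 102, 101, 99, 102, 102, 103];

// Hypothetical helper: first differences, x_i - x_{i-1}.
function difference(series) {
  return series.slice(1).map((x, i) => x - series[i]);
}

// Hypothetical helper: rebuild a series from its first value
// and its differences by keeping a running sum.
function integrate(initial, diffs) {
  const out = [initial];
  for (const d of diffs) {
    out.push(out[out.length - 1] + d);
  }
  return out;
}

const diffs = difference(prices);
const recovered = integrate(prices[0], diffs);
```

With integer inputs such as these, recovered matches prices exactly. With floating-point data, small rounding errors can accumulate in the running sum.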
Tools and Implementation
There are multiple ways to implement differencing on a time series. The library hosts the following tools:
The Indicators app provides a graphical way to implement differencing.