Differencing

Overview


Differencing is a feature extraction technique for time series. It is used primarily to transform a time series so that it can reasonably be treated as stationary.

Differencing


In order to run and interpret statistics on a dataset, certain assumptions need to be made. Often, it is assumed that the data is generated from some distribution and that each data point is independent of the others. This assumption is usually not sensible for time series data: it is hard to believe that a measurement at a given time is independent of the measurements around it.

Typically, statisticians look to transform a given time series into one that satisfies the necessary conditions. However, independence is usually difficult to achieve. There is an alternative, though. For time series datasets of sufficient size, one can usually get the benefits attributed to independent datasets if the dataset can be assumed to be stationary, meaning that its statistical properties (such as its mean and variance) do not change over time.

Stationarity is usually achieved by differencing the dataset, that is, by replacing each value with its change from the previous value. Differencing can be done rather simply using array methods such as map or by using the $list API.


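// Sample price series
let data = [{price:100},{price:101},{price:102},{price:101},
            {price:99},{price:102},{price:102},{price:103}];
// Copy each record and attach the change from the previous price.
// The first record has no predecessor, so it gets no diff field.
let differences = data.map((p, i, items) => {
  let item = { ...p };
  if (i > 0) item.diff = p.price - items[i-1].price;
  return item;
});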

Returns


Calculating returns is a standard way to difference a time series. It is usually the best method when the time series cannot dip below zero, as is the case for many time series found in finance, such as prices.

The exact calculation of a return is something of a matter of choice. The standard arithmetic return is the relative change in price, given as follows:
{% return_i = \frac{P_i - P_{i-1}}{P_{i-1}} %}
For purely theoretical reasons (especially when dealing with continuous time series), the logarithm of the price ratio is often taken instead, giving the following definition of a return:
{% return_i = \ln(\frac{P_i}{P_{i-1}}) %}
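As a minimal sketch, both kinds of return can be computed with the same map pattern used for plain differences. Here the arithmetic return is taken as the relative price change, (P_i - P_{i-1}) / P_{i-1}, and the log return as the natural logarithm of the price ratio. The helper name `addReturns` and the field names `ret` and `logRet` are assumptions made for illustration, not part of any library.

```javascript
// Hypothetical helper: attaches arithmetic and log returns to each record.
// The names addReturns, ret and logRet are illustrative assumptions.
function addReturns(items) {
  return items.map((p, i) => {
    const item = { ...p };
    if (i > 0) {
      const prev = items[i - 1].price;
      item.ret = (p.price - prev) / prev;      // arithmetic return
      item.logRet = Math.log(p.price / prev);  // log return
    }
    return item;
  });
}

const prices = [{ price: 100 }, { price: 101 }, { price: 102 }, { price: 101 }];
const withReturns = addReturns(prices);
```

The first record has no previous price, so it carries neither field. For small price changes the two values are close, since the log return equals ln(1 + arithmetic return).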

Integration


Integration, in a time series context, is the reverse of differencing. That is, instead of calculating differences between consecutive points, it computes a running (cumulative) sum of the points. Integration is used to recover the original series from a differenced series, given the initial value of the original series.
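The round trip can be sketched as follows, assuming the series is a plain array of numbers and that the first value of the original series is kept alongside the differences. The function names `difference` and `integrate` are hypothetical helpers written for this example.

```javascript
// Hypothetical helpers for illustration; not part of any library.

// Returns the differences between consecutive values
// (one element shorter than the input).
function difference(values) {
  return values.slice(1).map((v, i) => v - values[i]);
}

// Rebuilds a series from its differences by cumulative summation,
// starting from the known initial value.
function integrate(diffs, initial) {
  const series = [initial];
  for (const d of diffs) {
    series.push(series[series.length - 1] + d);
  }
  return series;
}

const original = [100, 101, 102, 101, 99, 102, 102, 103];
const diffs = difference(original);
const recovered = integrate(diffs, original[0]);
```

Without the initial value, the differences determine the original series only up to a constant offset.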

Tools and Implementation


There are multiple ways to implement differencing on a time series. The library hosts the following libraries:



The Indicators app provides a graphical way to implement differencing.
