Time Series - Building a Dataset
Overview
Time series data often comes from multiple sources. Before analyzing it, you usually need to combine those sources
into a single dataset.
Building a Single Dataset
Time series analysis differs from many other kinds of data analysis in that it often requires a significant amount of
data preparation before the analysis itself can begin.
The first challenge is often just building the dataset, because the data typically arrives as several separate datasets. As an
example, when building a stock trading model, you may need to measure correlations between the returns of multiple stocks.
The prices may come as separate datasets, such as the following:
let stock1Prices = [{date:'2020-01-01', price:100}, {date:'2020-01-02', price:102}, {date:'2020-01-03', price:101},];
let stock2Prices = [{date:'2020-01-01', price:200}, {date:'2020-01-02', price:199}, {date:'2020-01-03', price:201},];
The dataset you wish to construct would look like the following:
let data = [{date:'2020-01-01', stock1:100, stock2:200},{date:'2020-01-02', stock1:102, stock2:199},{date:'2020-01-03', stock1:101, stock2:201},]
Notice that each record has a date and the price of each stock on that date. Combining datasets on a shared key in this way
is known as joining the data. Information about joining data on the davinci platform can be found
here.
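As a rough illustration of what a join does, the following sketch combines the two price arrays by date in plain JavaScript. It does not use the davinci platform's own join facilities; joinByDate is a hypothetical helper written only for this example, and it assumes every record has a date string in ISO format (so that string order matches date order) and a price field.

```javascript
// Hypothetical helper (not part of any library): joins several price series on date.
// `seriesByName` maps an output column name to an array of {date, price} records.
function joinByDate(seriesByName) {
  const names = Object.keys(seriesByName);
  const rowsByDate = new Map();

  for (const name of names) {
    for (const { date, price } of seriesByName[name]) {
      // Create the combined record the first time a date is seen.
      if (!rowsByDate.has(date)) {
        rowsByDate.set(date, { date });
      }
      rowsByDate.get(date)[name] = price;
    }
  }

  // Inner join: keep only dates that appear in every series, ordered by date.
  return [...rowsByDate.values()]
    .filter((row) => names.every((name) => name in row))
    .sort((a, b) => a.date.localeCompare(b.date));
}

const stock1Prices = [
  { date: '2020-01-01', price: 100 },
  { date: '2020-01-02', price: 102 },
  { date: '2020-01-03', price: 101 },
];
const stock2Prices = [
  { date: '2020-01-01', price: 200 },
  { date: '2020-01-02', price: 199 },
  { date: '2020-01-03', price: 201 },
];

const data = joinByDate({ stock1: stock1Prices, stock2: stock2Prices });
console.log(data);
```

The sketch drops any date that is missing from one of the series. Whether to drop such dates or fill them in depends on the analysis, so treat that choice as an assumption of this example.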
The first step in analyzing a dataset is
feature extraction.
Before analyzing time series data, you often have to
prepare it. Preparation can include transforming the data,
preprocessing or cleansing it, and differencing.
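Of the preparation steps listed, differencing is the easiest to show in a few lines: each value is replaced by its change from the previous value. The sketch below is a minimal plain JavaScript version; difference is a hypothetical helper written for this example, and it assumes the records are already sorted by date.

```javascript
// Hypothetical helper (not part of any library): first difference of one numeric field.
// Assumes `records` is already sorted by date, oldest first.
function difference(records, field) {
  const result = [];
  // Start at 1 because the first record has no previous value to subtract.
  for (let i = 1; i < records.length; i++) {
    result.push({
      date: records[i].date,
      [field]: records[i][field] - records[i - 1][field],
    });
  }
  return result;
}

const prices = [
  { date: '2020-01-01', stock1: 100 },
  { date: '2020-01-02', stock1: 102 },
  { date: '2020-01-03', stock1: 101 },
];

const changes = difference(prices, 'stock1');
console.log(changes);
// [ { date: '2020-01-02', stock1: 2 }, { date: '2020-01-03', stock1: -1 } ]
```

The result is one record shorter than the input, since the first date has nothing to be compared against.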