Time Series - Cleansing a Dataset

Overview


Cleansing a dataset that represents a time series can be more challenging than cleansing other datasets, where the data is assumed to be i.i.d. (independent and identically distributed). For example, a corrupted record in an i.i.d. dataset can simply be removed. In a time series, removing a record breaks the regular spacing between observations, which complicates any later step that depends on adjacent records.

As an example, when the dataset represents asset prices, you cannot simply remove one record and then proceed to difference the data and continue with the analysis. Differencing between non-adjacent records introduces additional noise, because the resulting difference spans more than one time step.
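
The effect can be illustrated with a minimal Python sketch. The price values, the position of the corrupted record, and the helper name `diff_adjacent` are all hypothetical and chosen only for illustration. The sketch compares dropping the corrupted record before differencing with keeping its slot and marking it as missing.

```python
from typing import List, Optional


def diff_adjacent(prices: List[Optional[float]]) -> List[Optional[float]]:
    """First differences; None wherever either neighbour is missing."""
    out: List[Optional[float]] = []
    for prev, curr in zip(prices, prices[1:]):
        if prev is None or curr is None:
            out.append(None)
        else:
            out.append(curr - prev)
    return out


# Hypothetical daily prices; the value at index 2 is assumed to be corrupted.
prices = [100.0, 101.0, 9999.0, 103.0, 104.0]
bad_index = 2

# Approach 1: drop the corrupted record, then difference.
dropped = prices[:bad_index] + prices[bad_index + 1:]
diffs_after_drop = diff_adjacent(dropped)
# The middle difference (103 - 101) silently spans two days.

# Approach 2: keep the slot, mark it as missing, then difference.
masked = list(prices)
masked[bad_index] = None
diffs_masked = diff_adjacent(masked)

print(diffs_after_drop)  # [1.0, 2.0, 1.0]
print(diffs_masked)      # [1.0, None, None, 1.0]
```

In the first result the value 2.0 looks like an ordinary one-day change, although it covers two days. In the second result the gap stays visible, so it can be handled explicitly, for example by one of the missing-value techniques listed under Topics.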

Topics


  • Missing Values
  • Processing Outliers