Data Cleansing


Prior to any analysis, the data should be examined for values that are "dirty" in some way: either the data is missing, or the information has been corrupted, meaning the recorded value is not representative of the process it was recorded from, typically because some noise has been included.

Data Preparation

  • Outliers - methods for dealing with records that appear not to have been generated by the data process being studied.
  • Missing - methods for dealing with missing data.
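
As a rough illustration of the two topics listed above, the following sketch first separates out missing entries and then flags possible outliers among the remaining values. It is only one way to do this: the 1.5 × IQR fence is a common rule of thumb chosen here for illustration, not a method prescribed by this page, and the function names and the `readings` sample are hypothetical.

```python
import math
import statistics


def split_missing(values):
    """Separate present values from missing ones (None or NaN).

    Returns (present, missing_idx), where present is a list of
    (original_index, value) pairs.
    """
    present, missing_idx = [], []
    for i, v in enumerate(values):
        if v is None or (isinstance(v, float) and math.isnan(v)):
            missing_idx.append(i)
        else:
            present.append((i, v))
    return present, missing_idx


def iqr_outliers(present, k=1.5):
    """Return original indices of values outside [Q1 - k*IQR, Q3 + k*IQR].

    Assumption: the interquartile-range fence is used as the outlier rule;
    other rules (z-scores, domain limits) may suit a given data set better.
    """
    data = [v for _, v in present]
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in present if v < lower or v > upper]


# Hypothetical sample: two missing entries and one suspicious reading.
readings = [10.1, 9.8, None, 10.3, 55.0, 10.0, float("nan"), 9.9]

present, missing_idx = split_missing(readings)
outlier_idx = iqr_outliers(present)

print("missing at positions:", missing_idx)
print("possible outliers at positions:", outlier_idx)
```

Flagging a value is not the same as deciding what to do with it; whether a flagged record is dropped, corrected, or kept depends on the process being studied.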
