Stationarity

Overview


Stationarity is a central concept in time series analysis. The observations of a time series are not independently generated, which causes problems when applying statistical methods built on the idea of random sampling. Many of the tools of statistical inference assume that the samples are i.i.d., that is, independent and drawn from identical distributions.

The concept of stationarity addresses many of these problems by guaranteeing that the usual statistics remain valid as the number of sample points grows large.

Stationarity


A time series is stationary if the joint distribution function of any set of points in the series is unchanged when the whole set is shifted in time:
{% F(x_i, x_{i+1}, \dots, x_{i+n}) = F(x_j, x_{j+1}, \dots, x_{j+n}) %}
A process X is weakly stationary if {% \mathbb{E}(x_t) %} is independent of t and {% \mathrm{Cov}(x_{t+h}, x_t) %} is independent of t (it may depend on the lag h). A stationary process with finite second moments is, by definition, weakly stationary.
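As a quick illustration, the following sketch (with arbitrary illustrative parameters) simulates many realizations of a zero-mean AR(1) process, which is weakly stationary when {% |\delta| < 1 %}, and checks numerically that the mean and the lag-5 covariance do not depend on t:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_ar1(n, delta=0.5, sigma=1.0):
    """Simulate a zero-mean AR(1) process x_{t+1} = delta * x_t + eps_t."""
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = delta * x[t - 1] + sigma * rng.normal()
    return x

# Estimate E(x_t) and Cov(x_{t+h}, x_t) at two widely separated times t
# by averaging over many independent realizations of the process.
paths = np.array([simulate_ar1(200) for _ in range(5000)])
for t in (50, 150):
    mean_t = paths[:, t].mean()
    cov_t = np.cov(paths[:, t + 5], paths[:, t])[0, 1]  # lag h = 5
    print(f"t={t}: mean ~ {mean_t:.3f}, lag-5 cov ~ {cov_t:.3f}")
# Both quantities agree across t (up to simulation noise), as weak
# stationarity requires.
```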

Stationarity and Statistical Inference


The importance of stationarity in time series analysis lies in its relationship to statistical inference.

The workhorse of statistical inference is the independent and identically distributed (i.i.d.) sampling assumption. That is, when presented with a dataset, say {% \{x_1, x_2, \dots, x_n\} %}, the analyst assumes the data were drawn from a population in such a way that each draw is independent of the others and comes from the same distribution. (Often this means random sampling from a population with replacement.)

An estimate of the average or expected value of a draw can then be calculated as
{% \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i %}
Then, by independence, the variance of this average is
{% \mathrm{Var}(\bar{x}) = \frac{1}{n^2} \sum_{i=1}^{n} \mathrm{Var}(x_i) = \frac{\sigma^2}{n} %}
That is, the estimate gets more accurate (in probability) as the number of samples increases.
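This {% 1/n %} shrinkage is easy to verify by simulation; the sketch below repeats the sampling experiment many times and compares the spread of the sample mean with {% \sigma/\sqrt{n} %} (all parameter values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

# Under i.i.d. sampling, Var(sample mean) = sigma^2 / n, so its standard
# deviation is sigma / sqrt(n).  Check by repeating the experiment.
sigma = 2.0
for n in (10, 100, 1000):
    means = rng.normal(0, sigma, size=(10000, n)).mean(axis=1)
    print(f"n={n:5d}: sd of sample mean = {means.std():.4f}, "
          f"sigma/sqrt(n) = {sigma / np.sqrt(n):.4f}")
```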

When a time series is not stationary, it poses several problems for statistical inference. First, if the mean depends on t, then the estimate above no longer targets a single well-defined quantity. Second, points in a time series are rarely independent, so the variance formula above does not apply directly.

When using ordinary least squares regression on a time series, the strict exogeneity assumption is often violated. (see regression in a time series)
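The first problem can be seen in a short simulation: for a random walk (a simple non-stationary series), the sample mean does not settle down as more points are observed. The sketch below uses arbitrary illustrative parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

# For a random walk, the sample mean over a window does not stabilize as
# the window grows: each realization wanders to its own level, so the
# "average" does not estimate any fixed quantity.
for n in (100, 1000, 10000):
    walks = rng.normal(size=(1000, n)).cumsum(axis=1)  # 1000 random walks
    sample_means = walks.mean(axis=1)
    print(f"n={n:5d}: spread of sample means across realizations "
          f"= {sample_means.std():.1f}")
# The spread grows with n instead of shrinking like 1/sqrt(n).
```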

Intuition


The typical solution to the statistical inference problem is to transform the time series in question into one that is stationary. The stationary series has moments that are independent of time, so that an estimate can be formed. In addition, for a well-behaved stationary process, the relevance of the value of one data point to another decreases with the time distance between them.
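A common such transformation is first-differencing; the sketch below (with arbitrary parameters) differences a random walk, whose increments are i.i.d. noise and therefore stationary:

```python
import numpy as np

rng = np.random.default_rng(0)

# A random walk x_t = x_{t-1} + eps_t is non-stationary, but its first
# differences x_t - x_{t-1} = eps_t are i.i.d. noise, which is stationary.
walk = rng.normal(size=1000).cumsum()
diffs = np.diff(walk)

print("random walk: mean of 1st / 2nd half:",
      round(walk[:500].mean(), 2), "/", round(walk[500:].mean(), 2))
print("differences: mean of 1st / 2nd half:",
      round(diffs[:500].mean(), 2), "/", round(diffs[500:].mean(), 2))
```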

Consider a time series such as the following:
{% x_{t+1} = \delta x_t + \epsilon_t %}
If {% |\delta| < 1 %}, then the impact of {% x_t %} on {% x_{t+T} %} decreases as T grows large. That is, with zero-mean noise,
{% \mathbb{E}(x_{t+T} \mid x_t) = \delta^T x_t \rightarrow 0 %}
Thus, in a long stationary series, widely separated points begin to look independent. In particular, the tools of ordinary least squares regression can then be applied by appealing to large-sample properties.
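The geometric decay {% \delta^T %} can be tabulated directly; the value {% \delta = 0.8 %} below is an arbitrary illustration:

```python
# The influence of x_t on x_{t+T} in the AR(1) model decays like delta^T.
delta = 0.8
for T in (1, 5, 10, 25, 50):
    print(f"T={T:3d}: delta^T = {delta**T:.5f}")
# By T = 50 the conditional dependence is negligible, so widely separated
# observations behave almost like independent draws.
```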

Stochastic Trends


The presence of non-stationarity (stochastic trends) can invalidate regression hypothesis-testing statistics. The following experiment considers two randomly generated series, each either stationary or not, and the results of regressing one on the other (a sketch of the simulation is given below).

When both x and y are non-stationary, the p-values almost always show a significant relationship, even though each series is randomly generated and independent of the other. When only one of the series is non-stationary, no relationship is typically detected. (However, the derivation of the p-values assumes stationarity of all variables, so even though the sampling experiment suggests the statistics can still be used in this case, that use is not justified by a full mathematical proof and should be treated with care.)
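A minimal sketch of this spurious-regression experiment, assuming the statsmodels library is available (sample size, trial count, and significance level are arbitrary choices):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

def slope_pvalue(x, y):
    """p-value on the slope when regressing y on x (with an intercept)."""
    return sm.OLS(y, sm.add_constant(x)).fit().pvalues[1]

n, trials = 200, 500
rejections = {"both non-stationary": 0, "one non-stationary": 0}
for _ in range(trials):
    walk_x = rng.normal(size=n).cumsum()  # random walk: non-stationary
    walk_y = rng.normal(size=n).cumsum()  # independent random walk
    noise_y = rng.normal(size=n)          # white noise: stationary
    rejections["both non-stationary"] += slope_pvalue(walk_x, walk_y) < 0.05
    rejections["one non-stationary"] += slope_pvalue(walk_x, noise_y) < 0.05

for case, count in rejections.items():
    print(f"{case}: 'significant' at 5% in {100 * count / trials:.0f}% of trials")
# Two independent random walks show a spurious relationship far more often
# than the nominal 5%; with a stationary response this does not happen.
```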

Tests for Stationarity


  • Auto-Correlation - a simple way to visually assess whether a time series is stationary
  • Unit Root Tests - a formal method of testing for stationarity; a sketch of both checks follows this list.
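A minimal sketch of both checks, assuming the statsmodels library, applied to a simulated random walk and to white noise:

```python
import numpy as np
from statsmodels.tsa.stattools import acf, adfuller

rng = np.random.default_rng(0)
walk = rng.normal(size=500).cumsum()  # non-stationary random walk
noise = rng.normal(size=500)          # stationary white noise

# Autocorrelation: stays near 1 out to long lags for a non-stationary
# series, but dies off quickly for a stationary one.
print("ACF at lag 20 (walk, noise):",
      round(acf(walk, nlags=20)[-1], 2), round(acf(noise, nlags=20)[-1], 2))

# Augmented Dickey-Fuller unit-root test: the null hypothesis is a unit
# root (non-stationarity), so a small p-value is evidence of stationarity.
for name, series in (("walk", walk), ("noise", noise)):
    stat, pvalue = adfuller(series)[:2]
    print(f"ADF {name}: statistic = {stat:.2f}, p-value = {pvalue:.3f}")
```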
