Statistical Considerations of Factor Choice
Overview
A key consideration when choosing the factors for a risk model is that each factor can be shown
to be correlated, to some degree, with the asset's returns. This is not simply a matter of measuring the sample
correlation between the variables: random noise can give a sampled dataset a correlation that is not present
in the underlying distribution.
The typical way to handle this is to apply the methods of hypothesis testing.
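As a rough illustration of the point about noise, the sketch below draws a "factor" and a "return" series that are independent by construction and counts how often their sample correlation still looks material. The function names, the 0.3 threshold, and the sample sizes (12 and 250 observations) are assumptions chosen for the example, not values from this article.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def spurious_fraction(n_obs, threshold, n_trials=2000, seed=0):
    """Fraction of trials in which two INDEPENDENT series show |corr| > threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        # Both series are pure noise, so the true correlation is zero.
        factor = [rng.gauss(0, 1) for _ in range(n_obs)]
        returns = [rng.gauss(0, 1) for _ in range(n_obs)]
        if abs(pearson(factor, returns)) > threshold:
            hits += 1
    return hits / n_trials


# Hypothetical sample sizes: one year of monthly data vs. one year of daily data.
small = spurious_fraction(n_obs=12, threshold=0.3)
large = spurious_fraction(n_obs=250, threshold=0.3)
print(f"12 observations:  {small:.1%} of trials exceed |corr| = 0.3")
print(f"250 observations: {large:.1%} of trials exceed |corr| = 0.3")
```

With only a handful of observations, a sizeable share of purely random pairs clears the threshold, which is why a measured correlation needs a significance test before it is taken as evidence for a factor.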
Hypothesis Tests
Factor models are usually expressed as a linear relationship.
{% Portfolio \, Return = \alpha + \beta_1 X_1 + \dots + \beta_n X_n %}
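When stated this way, the model can be fit using OLS (ordinary least squares) linear regression.
Hypothesis testing with OLS regression is a well-developed field. The standard test asks whether each
coefficient {% \beta_i %} differs from zero by more than sampling noise alone would explain.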
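A minimal sketch of such a test for the single-factor case is shown here, using only the Python standard library. The data are synthetic, and the true values (alpha = 0.001, beta = 0.8), the noise levels, and the function name `ols_single_factor` are assumptions made for the example. The p-value uses a normal approximation to the t distribution, which is reasonable only for large samples; a statistics package would use the exact t distribution.

```python
import math
import random
from statistics import NormalDist


def ols_single_factor(x, y):
    """Fit y = alpha + beta * x by OLS and test H0: beta = 0."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))

    beta = sxy / sxx
    alpha = my - beta * mx

    # Residual variance with n - 2 degrees of freedom (two fitted parameters).
    residuals = [yi - (alpha + beta * xi) for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in residuals) / (n - 2)

    se_beta = math.sqrt(s2 / sxx)
    t_stat = beta / se_beta
    # Two-sided p-value, normal approximation (assumption: large n).
    p_approx = 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))
    return {"alpha": alpha, "beta": beta, "se_beta": se_beta,
            "t_stat": t_stat, "p_approx": p_approx}


# Synthetic example: 500 observations with a known factor exposure.
rng = random.Random(42)
factor = [rng.gauss(0, 0.01) for _ in range(500)]
portfolio = [0.001 + 0.8 * f + rng.gauss(0, 0.01) for f in factor]

fit = ols_single_factor(factor, portfolio)
print(f"beta = {fit['beta']:.3f}, t = {fit['t_stat']:.1f}, p ~ {fit['p_approx']:.2g}")
```

A large absolute t-statistic (small p-value) rejects the hypothesis that the factor has no linear relationship with the portfolio return. The multi-factor case follows the same logic, with one t-statistic per coefficient.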
Hypothesis Test Complications
With financial models, there are complications. In particular:
- The hypothesis test assumes that the samples are i.i.d. (independent and identically distributed). Both of these
assumptions are questionable for financial assets and can at best be treated as approximations.
- The t-statistic and p-value used in regression hypothesis testing assume that the regression residuals
are normally distributed, which is known to hold only approximately.
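Both assumptions can be checked, at least roughly, on the residuals of a fitted model. The sketch below shows two common diagnostics: the lag-1 autocorrelation as a check on independence, and the Jarque-Bera statistic as a check on normality. The series are synthetic stand-ins for regression residuals, and the parameters (an AR(1) coefficient of 0.5, a 10% chance of a five-times-larger shock) are assumptions chosen to make the effects visible.

```python
import math
import random


def lag1_autocorrelation(series):
    """Sample autocorrelation at lag 1; near zero for independent data."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    num = sum(dev[i] * dev[i - 1] for i in range(1, n))
    den = sum(d * d for d in dev)
    return num / den


def jarque_bera(series):
    """Jarque-Bera statistic and its asymptotic p-value.

    Under normality the statistic is asymptotically chi-square with
    2 degrees of freedom, whose survival function is exp(-x / 2).
    """
    n = len(series)
    mean = sum(series) / n
    m2 = sum((v - mean) ** 2 for v in series) / n
    m3 = sum((v - mean) ** 3 for v in series) / n
    m4 = sum((v - mean) ** 4 for v in series) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    stat = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return stat, math.exp(-stat / 2.0)


rng = random.Random(7)
n = 1000

# Well-behaved reference: independent, normally distributed.
iid_normal = [rng.gauss(0, 1) for _ in range(n)]

# Fat tails: occasional shocks with five times the usual volatility.
fat_tailed = [rng.gauss(0, 5 if rng.random() < 0.1 else 1) for _ in range(n)]

# Serial dependence: AR(1) process with coefficient 0.5.
ar1 = [0.0]
for _ in range(n - 1):
    ar1.append(0.5 * ar1[-1] + rng.gauss(0, 1))

ac_iid = lag1_autocorrelation(iid_normal)
ac_ar1 = lag1_autocorrelation(ar1)
jb_normal, p_normal = jarque_bera(iid_normal)
jb_fat, p_fat = jarque_bera(fat_tailed)

print(f"lag-1 autocorrelation: i.i.d. {ac_iid:.3f}, AR(1) {ac_ar1:.3f}")
print(f"Jarque-Bera: normal {jb_normal:.2f}, fat-tailed {jb_fat:.2f}")
```

A Jarque-Bera statistic above roughly 5.99 rejects normality at the 5% level, and a lag-1 autocorrelation well away from zero suggests the independence assumption is violated. When either check fails, the reported t-statistics and p-values should be read as approximate.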