Survival Analysis - Factors
Overview
Often, the probability of an event occurring depends on factors other than age. When this is true, the analysis becomes more
complicated.
As an example, consider modeling the age at death of a person based on whether the individual smokes or not. If an individual does not smoke,
then it is reasonably straightforward to calculate a mortality table; however, there is no guarantee that the individual will remain a
non-smoker throughout their life. This means that the factors themselves can change over time, and the analysis has to account for that.
Modeling the Hazard Rate
The simplest way to address the dilemma outlined above is to model the hazard rate of the process, that is, the probability of the
event happening within a small time interval given that it has not happened up to that point. Over such a short interval, the underlying
factors can be taken to be constant. The most common way to fit the data to a set of factors is through logistic regression.
(Note: in this case, age is generally treated as just another factor.)
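As a rough illustration of fitting a hazard with logistic regression, the sketch below uses a small, entirely hypothetical table of person-period counts (the age bands, cell sizes, and event counts are made up for the example). It fits logit(h) = b0 + b_age * age_band + b_smoker * smoker by plain gradient ascent; the names CELLS, fit_hazard_model, and hazard are assumptions of this sketch, and in practice a statistics package would normally do the fitting.

```python
import math

# Hypothetical person-period counts, one row per (age band, smoker) cell.
# Columns: age_band (0 = younger, 1 = older), smoker (0/1),
#          person-years at risk, events observed during those person-years.
CELLS = [
    (0, 0, 1000, 10),
    (0, 1, 1000, 20),
    (1, 0, 1000, 20),
    (1, 1, 1000, 40),
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_hazard_model(cells, steps=20000, lr=1.0):
    """Fit logit(h) = b0 + b_age*age_band + b_smoker*smoker by gradient ascent."""
    beta = [0.0, 0.0, 0.0]
    total = sum(n for _, _, n, _ in cells)
    for _ in range(steps):
        grad = [0.0, 0.0, 0.0]
        for age_band, smoker, n, d in cells:
            x = (1.0, float(age_band), float(smoker))
            p = sigmoid(sum(b * xi for b, xi in zip(beta, x)))
            # Binomial log-likelihood gradient: (observed - expected) * x
            for j in range(3):
                grad[j] += (d - n * p) * x[j]
        # Step along the gradient of the average log-likelihood.
        beta = [b + lr * g / total for b, g in zip(beta, grad)]
    return beta


def hazard(beta, age_band, smoker):
    """Fitted one-period event probability for a given factor combination."""
    return sigmoid(beta[0] + beta[1] * age_band + beta[2] * smoker)


beta = fit_hazard_model(CELLS)
```

Here age enters the model as one more column next to the smoking indicator, and hazard(beta, age_band, smoker) returns the fitted one-period event probability for any combination of the two.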
Modeling the Survival Function
If the survival function itself is required, then the modeling becomes more complex than fitting the hazard function. The underlying
factors can change over time, and this needs to be accounted for.
In the simplest case, one may assume that the factors are constant and then integrate the estimated hazard rate over time to get the
cumulative hazard; the survival function is the exponential of its negative.
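A minimal sketch of this step, assuming the factors are held fixed so that the hazard depends on time only. For a continuous-time hazard h(u), the survival function is S(t) = exp(-∫ h(u) du), integrated from 0 to t and approximated here with the trapezoidal rule. For per-interval event probabilities, such as those produced by a logistic regression, the analogue is a running product of (1 - h_k). The function names are hypothetical.

```python
import math


def survival_continuous(hazard_fn, t, steps=1000):
    """S(t) = exp(-integral_0^t h(u) du), using the trapezoidal rule."""
    dt = t / steps
    cumulative = 0.0
    for i in range(steps):
        left, right = i * dt, (i + 1) * dt
        cumulative += 0.5 * (hazard_fn(left) + hazard_fn(right)) * dt
    return math.exp(-cumulative)


def survival_discrete(interval_hazards):
    """Survival after each interval: running product of (1 - h_k)."""
    curve, alive = [], 1.0
    for h in interval_hazards:
        alive *= 1.0 - h
        curve.append(alive)
    return curve
```

For example, survival_continuous(lambda u: 0.1, 10.0) approximates exp(-1) for a constant hazard of 0.1 per unit time, and survival_discrete([0.1, 0.2]) gives the survival probabilities after the first and second intervals.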
When this assumption is unrealistic, one needs to build a model of the factors themselves and how they change over time. Once this is done,
a simple way to get the survival function is to simulate the evolution of the underlying factors and then to simulate the occurrence of the event
given each factor realization. The results can be aggregated to arrive at an answer.
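The simulation approach can be sketched as follows, using the smoking example. Everything specific here is an assumption made for illustration: smoking status is modeled as a two-state yearly process with made-up start and quit probabilities, each individual starts as a non-smoker, and the yearly event probability depends only on the current status.

```python
import random


def simulate_survival(hazard_by_state, p_start, p_quit,
                      n_paths=10000, horizon=30, seed=0):
    """Estimate S(t) for t = 1..horizon by simulating factor paths and events.

    hazard_by_state: yearly event probability for "smoker" and "non_smoker"
    p_start: yearly probability that a non-smoker starts smoking
    p_quit:  yearly probability that a smoker quits
    """
    rng = random.Random(seed)
    alive_counts = [0] * horizon
    for _ in range(n_paths):
        state = "non_smoker"
        for year in range(horizon):
            # 1. Simulate the event given the current factor value.
            if rng.random() < hazard_by_state[state]:
                break
            alive_counts[year] += 1
            # 2. Evolve the factor for the next year.
            if state == "non_smoker":
                if rng.random() < p_start:
                    state = "smoker"
            elif rng.random() < p_quit:
                state = "non_smoker"
    # 3. Aggregate: fraction of paths still event-free after each year.
    return [count / n_paths for count in alive_counts]
```

A call such as simulate_survival({"non_smoker": 0.01, "smoker": 0.03}, p_start=0.05, p_quit=0.1) returns one estimated survival probability per year. With both transition probabilities set to zero, the estimate should approach the constant-factor result obtained by accumulating the hazard directly.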