Fitting Data in Survival Analysis

Survival Analysis - Factors

Overview

Often the probability of an event occurring depends on factors other than age. When this is true, the analysis becomes more complicated.

As an example, consider modeling the age at death of a person based on whether the individual smokes. If an individual does not smoke, it is reasonably straightforward to calculate a mortality table; however, there is no guarantee that the individual will remain a non-smoker throughout their life. This means that

Modeling the Hazard Rate

The simplest way to address the dilemma outlined above is to model the hazard rate of the process, that is, the probability of the event happening within a small time interval given that it has not happened up to that point. Over such a short interval, the underlying factors can be taken to be constant. The most common way to fit the data to a set of factors is through the use of logistic regression. (Note: in this case, age is generally treated as just another factor in the mix.)
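To make the idea concrete, here is a minimal sketch of fitting a per-interval hazard with logistic regression, using only the Python standard library. Everything in it is an assumption made for illustration: the synthetic person-interval records, the coefficient values in `TRUE_WEIGHTS`, the centering of age, and the helper names `hazard` and `fit_logistic`. In practice an established statistics package would normally replace the hand-written fitting loop.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def hazard(weights, features):
    """Probability of the event in one interval, given the factor values."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    return sigmoid(z)

def fit_logistic(rows, labels, steps=600, lr=0.5):
    """Fit [intercept, w1, w2, ...] by batch gradient ascent on the log-likelihood."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            err = y - hazard(weights, x)
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        weights = [w + lr * g / len(rows) for w, g in zip(weights, grad)]
    return weights

# Synthetic person-interval records (illustrative numbers, not real mortality data).
# Each record is one individual observed over one interval, with age as just
# another factor next to smoking status.
rng = random.Random(0)
TRUE_WEIGHTS = [-2.0, 0.4, 1.0]  # assumed: intercept, centered age, smoker
rows, labels = [], []
for _ in range(1000):
    age_c = (rng.uniform(20, 90) - 50) / 10  # age centered at 50, in decades
    smoker = rng.randint(0, 1)
    x = [age_c, smoker]
    rows.append(x)
    labels.append(1 if rng.random() < hazard(TRUE_WEIGHTS, x) else 0)

weights = fit_logistic(rows, labels)
h_non_smoker = hazard(weights, [0.0, 0])  # estimated hazard at age 50, non-smoker
h_smoker = hazard(weights, [0.0, 1])      # estimated hazard at age 50, smoker
print(weights, h_non_smoker, h_smoker)
```

Each row describes a single interval, so a factor that changes over a lifetime simply appears with different values in different rows for the same individual.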

Modeling the Survival Function

If the survival function itself is required, the modeling becomes more complex than fitting the hazard rate. The underlying factors can change over time, and this needs to be accounted for.

In the simplest case, one may assume that the factors are constant, integrate the estimated hazard rate over time, and exponentiate the negative of that integral to get the survival function.
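For a continuous-time hazard rate h(t), the standard relationship is S(t) = exp(-integral of h(u) du from 0 to t). The sketch below evaluates that integral numerically with the trapezoidal rule. The function name `survival_from_hazard` and the two example hazards (a constant rate and a Gompertz-style rate with made-up parameters) are assumptions chosen because both have closed-form answers to compare against.

```python
import math

def survival_from_hazard(hazard, t, steps=1000):
    """Approximate S(t) = exp(-integral_0^t hazard(u) du) with the trapezoidal rule."""
    if t <= 0:
        return 1.0
    dt = t / steps
    total = 0.5 * (hazard(0.0) + hazard(t))
    for i in range(1, steps):
        total += hazard(i * dt)
    return math.exp(-total * dt)

# Constant hazard: the exact survival function is exp(-rate * t).
def constant_hazard(u):
    return 0.02

s_const = survival_from_hazard(constant_hazard, 10.0)
s_const_exact = math.exp(-0.02 * 10.0)

# Gompertz-style hazard a * exp(b * u), with illustrative parameters.
A, B = 1e-4, 0.09

def gompertz_hazard(u):
    return A * math.exp(B * u)

s_gomp = survival_from_hazard(gompertz_hazard, 70.0)
s_gomp_exact = math.exp(-(A / B) * (math.exp(B * 70.0) - 1.0))

print(s_const, s_const_exact)
print(s_gomp, s_gomp_exact)
```

If the hazard was instead estimated as a probability per discrete interval, the analogous calculation is the running product of (1 - hazard) over the intervals.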

When this assumption is unrealistic, one needs to build a model of the factors themselves and how they change over time. Once this is done, a simple way to get the survival function is to simulate the evolution of the underlying factors and then to simulate the occurrence of the event given each factor realization. The results can be aggregated to arrive at an answer.
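The simulation approach can be sketched as follows, continuing the smoking example. This is only an illustration under stated assumptions: smoking status is modeled as a two-state yearly Markov chain with made-up transition probabilities `p_start` and `p_quit`, and `yearly_hazard` is a hypothetical stand-in for a fitted hazard model, not an estimate from real data.

```python
import random

def yearly_hazard(age, smoker):
    """Hypothetical annual probability of death; stands in for a fitted model."""
    base = 0.0005 * 1.09 ** (age - 30)
    return min(1.0, base * (2.0 if smoker else 1.0))

def simulate_death_age(rng, start_age=30, max_age=110, smoker=False,
                       p_start=0.02, p_quit=0.05):
    """Simulate one life: draw the event each year, then evolve the factor."""
    age = start_age
    while age < max_age:
        if rng.random() < yearly_hazard(age, smoker):
            return age
        # Factor model: smoking status may switch once per year.
        if smoker:
            if rng.random() < p_quit:
                smoker = False
        elif rng.random() < p_start:
            smoker = True
        age += 1
    return max_age  # paths still alive at max_age are cut off there

def survival_curve(n_paths=5000, seed=0, start_age=30, max_age=110):
    """Aggregate simulated paths into the fraction still alive after each age."""
    rng = random.Random(seed)
    deaths = [simulate_death_age(rng, start_age, max_age) for _ in range(n_paths)]
    return {age: sum(d > age for d in deaths) / n_paths
            for age in range(start_age, max_age)}

curve = survival_curve()
for age in (40, 60, 80, 100):
    print(age, curve[age])
```

Increasing `n_paths` reduces the Monte Carlo noise in the aggregated curve, at the cost of run time.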
