Causality

Overview


Causality is the concept that one event may "cause" another event. It is different from statistical correlation, where two events or variables merely tend to occur or vary together. Correlations are sometimes accidental. Often, correlated variables are related only through their shared relation to a third variable.

Consider a study indicating that patients who take a vaccine are less likely to get a particular disease. While it may be the case that the vaccine prevented the disease, it may also be that individuals who take vaccines are more likely to lead healthy lifestyles or to take other medications, and that these are the reasons they were less likely to get the disease.

Causal modeling seeks to determine whether two variables are related through causation or are merely correlated.
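
The effect of a third variable can be illustrated with a small simulation. The sketch below is only an illustration under invented assumptions: the probabilities are made up, the names `simulate` and `disease_rate` are hypothetical, and the simulated vaccine is deliberately given no effect at all, which is not a claim about any real vaccine. Lifestyle alone drives both vaccine uptake and disease risk, yet the vaccinated group still shows less disease overall. Comparing people within the same lifestyle group makes the apparent effect disappear.

```python
import random

def simulate(n=100_000, seed=0):
    """Simulate a population where a healthy lifestyle (the hidden third
    variable) raises vaccine uptake and lowers disease risk, while the
    vaccine itself has NO effect. All probabilities are invented."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        healthy = rng.random() < 0.5
        vaccinated = rng.random() < (0.8 if healthy else 0.2)
        # Disease risk depends only on lifestyle, not on vaccination.
        diseased = rng.random() < (0.05 if healthy else 0.20)
        rows.append((healthy, vaccinated, diseased))
    return rows

def disease_rate(rows, vaccinated, healthy=None):
    """Share of diseased people among those with the given vaccination
    status, optionally restricted to one lifestyle group."""
    group = [d for h, v, d in rows
             if v == vaccinated and (healthy is None or h == healthy)]
    return sum(group) / len(group)

rows = simulate()
# Overall: vaccinated people appear much less likely to get the disease.
print("all:      ", disease_rate(rows, True), disease_rate(rows, False))
# Within each lifestyle group the two rates are approximately equal.
print("healthy:  ", disease_rate(rows, True, True), disease_rate(rows, False, True))
print("unhealthy:", disease_rate(rows, True, False), disease_rate(rows, False, False))
```

With these invented numbers the overall disease rates are roughly 8% for the vaccinated and 17% for the unvaccinated, even though vaccination plays no role in the simulated disease process.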

Definitions


Despite being a seemingly fundamental concept, causality has turned out to be difficult to define. At its core, causality seems to encompass the following:

  • A notion of simplification - in truth, a given event is probably influenced by many factors, many of which are either assumed or ignored when one says that A causes B
  • A notion of dependence - for example, even if A and B are correlated, A is not considered a direct cause of B if there is some variable C such that A and B are statistically independent given C (see conditional probability).
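
The dependence notion can be checked exactly on a small example. The following sketch assumes a hypothetical distribution over three binary variables in which C influences both A and B; the numbers and the helper names `bern` and `prob` are invented for illustration. It shows that A and B are correlated when C is ignored, but independent once C is given.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical distribution: C influences both A and B.
p_c = {0: F(1, 2), 1: F(1, 2)}              # P(C = c)
p_a_given_c = {0: F(1, 10), 1: F(9, 10)}    # P(A = 1 | C = c)
p_b_given_c = {0: F(2, 10), 1: F(8, 10)}    # P(B = 1 | C = c)

def bern(p_one, value):
    """Probability that a binary variable equals `value`, given P(value = 1)."""
    return p_one if value == 1 else 1 - p_one

# Joint distribution P(A, B, C) = P(C) * P(A | C) * P(B | C)
joint = {
    (a, b, c): p_c[c] * bern(p_a_given_c[c], a) * bern(p_b_given_c[c], b)
    for a, b, c in product((0, 1), repeat=3)
}

def prob(**fixed):
    """Marginal probability of the given assignments, e.g. prob(a=1, c=0)."""
    return sum(p for (a, b, c), p in joint.items()
               if all({"a": a, "b": b, "c": c}[k] == v for k, v in fixed.items()))

# Ignoring C, A and B are dependent: P(A=1, B=1) != P(A=1) * P(B=1)
dependent = prob(a=1, b=1) != prob(a=1) * prob(b=1)

# Given C, they are independent: P(A, B | C) == P(A | C) * P(B | C)
independent_given_c = all(
    prob(a=a, b=b, c=c) / prob(c=c)
    == (prob(a=a, c=c) / prob(c=c)) * (prob(b=b, c=c) / prob(c=c))
    for a, b, c in product((0, 1), repeat=3)
)
print(dependent, independent_given_c)  # True True
```

Exact fractions are used instead of floating-point numbers so that the equality checks are not affected by rounding.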


For the definitions of causation used in the US legal system, see legal definitions.

Causality Graph


Causality is often described by a graph-like structure. Nodes in the graph represent events, and arrows represent causation. That is, an arrow pointing from A to B represents A as a cause of B.

In the following, A is a cause of both events B and C.

    B ← A → C

The first step of causal modeling is often to list the relevant variables and to build a graph representing the assumed causal relationships. When the causal relationships are not known, some techniques can be employed to try to learn the causal structure from data.
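
One simple way to write down such a graph is as a mapping from each cause to the events it points to. The sketch below encodes the example graph in which A is a cause of B and C; the helper names `causes_of` and `all_effects` are hypothetical, and a dedicated graph library could be used instead.

```python
# Each key is a cause; its list holds the events its arrows point to.
causal_graph = {
    "A": ["B", "C"],
    "B": [],
    "C": [],
}

def causes_of(graph, event):
    """Return the direct causes of `event` (nodes with an arrow into it)."""
    return [cause for cause, effects in graph.items() if event in effects]

def all_effects(graph, event):
    """Return every event reachable from `event` by following arrows."""
    found = set()
    stack = list(graph.get(event, []))
    while stack:
        node = stack.pop()
        if node not in found:
            found.add(node)
            stack.extend(graph.get(node, []))
    return found

print(causes_of(causal_graph, "B"))    # ['A']
print(all_effects(causal_graph, "A"))  # the set containing 'B' and 'C'
```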

Once a causal structure has been created, it may be necessary to learn the degree of association between each cause and its effect, because not every cause is sufficient to guarantee that the linked event will occur. For example, suppose a match is lit and thrown into a pile of kindling. If the kindling catches fire, then the lit match is surely the cause of the fire. However, the kindling may not always catch fire.
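
A minimal sketch of learning the degree of association is to estimate, from recorded observations, how often the effect follows the cause. The observations below are invented for the match example, and `conditional_frequency` is a hypothetical helper name; a relative frequency is only one of several possible estimators.

```python
from collections import Counter

# Hypothetical observations: (match_thrown, fire_started)
observations = [
    (True, True), (True, True), (True, False), (True, True),
    (False, False), (False, False), (True, False), (False, False),
]

def conditional_frequency(observations, cause_value):
    """Estimate P(fire | match == cause_value) as a relative frequency.
    Returns None when no observation has the requested cause value."""
    outcomes = Counter(fire for match, fire in observations
                       if match == cause_value)
    total = sum(outcomes.values())
    return outcomes[True] / total if total else None

p_fire_given_match = conditional_frequency(observations, True)
p_fire_given_no_match = conditional_frequency(observations, False)
print(p_fire_given_match, p_fire_given_no_match)  # 0.6 0.0
```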

Lastly, once the causal structure is built and the degrees of association are learned, the structure may be used to calculate the probability of an event occurring given the presence of its various causes.
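
As a rough illustration of such a calculation, the sketch below assumes a hypothetical structure with two causes of a fire, a thrown match and dry kindling, and invented probabilities. The function name `prob_fire` is also an assumption. When the state of the kindling is unknown, it is summed out using its prior probability, which assumes the two causes are independent of each other.

```python
# Hypothetical structure: Match -> Fire <- DryKindling
p_dry = 0.7          # P(kindling is dry), invented
p_fire = {           # P(fire | match, dry), invented
    (True, True): 0.9,
    (True, False): 0.2,
    (False, True): 0.01,
    (False, False): 0.0,
}

def prob_fire(match, dry=None):
    """Return P(fire | match, dry) if `dry` is known, else P(fire | match).

    With `dry` unknown, the kindling state is summed out using its prior,
    assuming the match and the kindling state are independent."""
    if dry is not None:
        return p_fire[(match, dry)]
    return p_dry * p_fire[(match, True)] + (1 - p_dry) * p_fire[(match, False)]

print(prob_fire(True))             # about 0.69
print(prob_fire(True, dry=False))  # 0.2
```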

Models


  • Probabilistic Graphical Models are used to represent a multi-variable probability distribution as a graph. The graph is a compact representation that can make computation and visualization easier. Causal models are often represented as Bayesian networks.
  • Granger Causality - a definition of causality often used in economics, based on whether past values of one time series help predict another.
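
The idea behind Granger causality can be sketched by comparing two forecasts of a series: one that uses only the series' own past, and one that also uses the past of a candidate cause. The example below is a deliberately simplified sketch, not a full test: it uses a single lag, no intercept, synthetic data in which x is constructed to drive y, and hypothetical helper names (`rss_one`, `rss_two`, `granger_f`). A real analysis would normally use an established statistics package, several lags, and a p-value.

```python
import random

def rss_one(y, z):
    """Residual sum of squares of the least-squares fit y ~ b*z."""
    b = sum(yi * zi for yi, zi in zip(y, z)) / sum(zi * zi for zi in z)
    return sum((yi - b * zi) ** 2 for yi, zi in zip(y, z))

def rss_two(y, z1, z2):
    """Residual sum of squares of the fit y ~ b1*z1 + b2*z2,
    solving the 2x2 normal equations with Cramer's rule."""
    s11 = sum(a * a for a in z1)
    s22 = sum(b * b for b in z2)
    s12 = sum(a * b for a, b in zip(z1, z2))
    t1 = sum(a * c for a, c in zip(z1, y))
    t2 = sum(b * c for b, c in zip(z2, y))
    det = s11 * s22 - s12 * s12
    b1 = (t1 * s22 - t2 * s12) / det
    b2 = (t2 * s11 - t1 * s12) / det
    return sum((c - b1 * a - b2 * b) ** 2 for a, b, c in zip(z1, z2, y))

def granger_f(cause, effect):
    """F statistic for 'the lagged cause improves a lag-1 forecast of effect'."""
    target = effect[1:]
    own_lag, other_lag = effect[:-1], cause[:-1]
    rss_restricted = rss_one(target, own_lag)          # own past only
    rss_full = rss_two(target, own_lag, other_lag)     # own past + cause's past
    n = len(target)
    # One extra coefficient in the full model, two coefficients in total.
    return (rss_restricted - rss_full) / (rss_full / (n - 2))

# Synthetic data: y depends on its own past and on the past of x.
rng = random.Random(0)
x, y = [rng.gauss(0, 1)], [0.0]
for _ in range(500):
    y.append(0.5 * y[-1] + 0.8 * x[-1] + rng.gauss(0, 1))
    x.append(rng.gauss(0, 1))

f_x_to_y = granger_f(x, y)  # expected to be large: past x helps predict y
f_y_to_x = granger_f(y, x)  # expected to be small: past y does not help predict x
print(f_x_to_y, f_y_to_x)
```

A large statistic in one direction and a small one in the other suggests that x helps forecast y but not the reverse. This is a statement about predictability, so it does not rule out a hidden third variable driving both series.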