Association and Causation | Health Knowledge

A principal aim of epidemiology is to assess the causes of disease. However, since most epidemiological studies are by nature observational rather than experimental, a number of possible explanations for an observed association need to be considered before we can infer that a cause-effect relationship exists. Specifically, causation needs to be distinguished from mere association – the link between two variables (often an exposure and an outcome). An observed association may in fact be due to the effects of one or more of the following:

Chance (random error)
Bias (systematic error)
Confounding
Reverse causality
True causality

A discussion of chance, bias and confounding can be found in the subsequent chapters and in the chapter “Sources of variation”.

Reverse causality describes a situation in which an association between an exposure and an outcome arises not because the exposure causes the outcome, but because the defined “outcome” actually brings about a change in the defined “exposure”. For example, a study may find an association between using recreational drugs (exposure) and poor mental wellbeing (outcome) and thus conclude that using drugs is likely to impair wellbeing. A reverse causality explanation could be that people with poor mental wellbeing are more likely to use recreational drugs as, say, a means of escapism.

Judging Causality

An observed statistical association between a risk factor and a disease does not necessarily lead us to infer a causal relationship; conversely, the absence of an association does not necessarily imply the absence of a causal relationship.

A judgment about whether an observed statistical association represents a cause-effect relationship between exposure and disease requires inferences far beyond the data from a single study.

The Bradford Hill criteria, listed below, are widely used in epidemiology as a framework with which to assess whether an observed association is likely to be causal.¹

Strength of association – The stronger the association, or magnitude of the risk, between a risk factor and outcome, the more likely the relationship is thought to be causal.
Consistency – The same findings have been observed among different populations, using different study designs and at different times.
Specificity – There is a one-to-one relationship between the exposure and outcome. Note that this is uncommon in reality.
Temporal sequence – The exposure must precede the outcome (to exclude reverse causation).
Biological gradient – Changes in the intensity of the exposure result in a change in the severity or risk of the outcome (i.e. a dose-response relationship).
Biological plausibility – There is a potential biological mechanism which explains the association.
Coherence – The relationship found agrees with the current knowledge of the natural history/biology of the disease.
Experiment – Removal of the exposure alters the frequency of the outcome.
Analogy – The relationship is in line with (i.e. analogous to) other established cause-effect relationships. For example, knowing of the teratogenic effects of thalidomide, we may accept a cause-effect relationship for a similar agent based on slighter evidence.
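As an illustration of the first criterion, strength of association is usually expressed as a risk ratio or odds ratio calculated from a 2x2 table of exposure against outcome. The sketch below is a minimal example, not a prescribed method: the function names and the cohort counts are hypothetical, and the confidence interval uses the common large-sample (log-scale) approximation, which is an assumption added here.

```python
import math


def risk_ratio(a, b, c, d):
    """Risk ratio from a 2x2 table.

    a: exposed with outcome,   b: exposed without outcome
    c: unexposed with outcome, d: unexposed without outcome
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed


def odds_ratio(a, b, c, d):
    """Odds ratio (cross-product ratio) from the same 2x2 table."""
    return (a * d) / (b * c)


def risk_ratio_ci(a, b, c, d, z=1.96):
    """Approximate 95% confidence interval for the risk ratio (log scale)."""
    rr = risk_ratio(a, b, c, d)
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    return rr * math.exp(-z * se_log_rr), rr * math.exp(z * se_log_rr)


# Hypothetical cohort: 30 of 100 exposed and 10 of 100 unexposed develop the outcome.
counts = (30, 70, 10, 90)
rr = risk_ratio(*counts)
oratio = odds_ratio(*counts)
lower, upper = risk_ratio_ci(*counts)
print(f"risk ratio = {rr:.2f}, odds ratio = {oratio:.2f}")
print(f"approximate 95% CI for the risk ratio: {lower:.2f} to {upper:.2f}")
```

In these made-up data the risk in the exposed group is three times that in the unexposed group. A large ratio makes chance or modest bias a less convincing explanation, but it does not by itself establish causation.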

Although widely used, the criteria are not without criticism. Rothman argues that Hill did not propose these criteria as a checklist for evaluating whether a reported association might be interpreted as causal, but they have been widely applied in this way. He contends that the Bradford Hill criteria fail to deliver on the hope of clearly distinguishing causal from non-causal relations.²

For example, the first criterion 'strength of association' does not take into account the fact that not every component cause will have a strong association with the disease it produces, or that strength of association also depends on the prevalence of other factors.²
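The dependence on other factors can be sketched with a simple calculation. The example assumes a hypothetical scenario in which factor A causes disease only when a second factor B is also present, while an unrelated background cause acts in everyone with a small probability, and B and the background cause are independent of A. All names and numbers are illustrative assumptions, not values from the cited sources.

```python
def risk_ratio_for_a(prevalence_b, background_risk):
    """Risk ratio for factor A when A causes disease only together with factor B.

    Assumes B and the background cause are independent of A and of each other.
    """
    # Exposed to A: disease occurs if B is present or the background cause acts.
    risk_exposed = 1 - (1 - prevalence_b) * (1 - background_risk)
    # Unexposed to A: only the background cause can produce disease.
    risk_unexposed = background_risk
    return risk_exposed / risk_unexposed


# Same causal role for A, different prevalence of the complementary factor B.
for prevalence_b in (0.1, 0.5, 0.9):
    rr = risk_ratio_for_a(prevalence_b, background_risk=0.01)
    print(f"prevalence of B = {prevalence_b:.1f}: risk ratio for A = {rr:.1f}")
```

Under these assumptions the risk ratio for A rises from roughly 11 to roughly 90 as B becomes more common, even though the biological role of A is unchanged. The observed strength of an association therefore reflects the population studied as well as the cause itself.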

In terms of the third criterion, 'specificity', which suggests that a relationship is more likely to be causal if the exposure is related to a single outcome, Rothman argues that this criterion is misleading as a cause may have many effects, for example smoking.²

The fifth criterion, ‘biological gradient’, suggests that the plausibility of a causal association is increased if a dose-response curve can be demonstrated.³ However, such relationships may also result from confounding or other biases.²,³
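A small calculation shows how confounding alone can produce an apparent dose-response gradient. In this hypothetical scenario the exposure has no effect at all: risk depends only on whether a confounder is present, but the confounder is more common at higher exposure levels. The risks and prevalences are invented for illustration.

```python
# Risk of the outcome depends only on the confounder, not on the exposure.
RISK_WITH_CONFOUNDER = 0.20
RISK_WITHOUT_CONFOUNDER = 0.05

# Hypothetical prevalence of the confounder at each exposure level.
confounder_prevalence = {"none": 0.1, "low": 0.5, "high": 0.9}


def crude_risk(prevalence):
    """Overall risk at one exposure level, averaging over the confounder."""
    return (prevalence * RISK_WITH_CONFOUNDER
            + (1 - prevalence) * RISK_WITHOUT_CONFOUNDER)


crude = {level: crude_risk(p) for level, p in confounder_prevalence.items()}

for level, risk in crude.items():
    print(f"exposure {level:>4}: crude risk = {risk:.3f}")
```

The crude risk climbs steadily across exposure levels, which looks like a dose-response relationship. Within each stratum of the confounder, however, the risk is identical at every exposure level, so the gradient is entirely an artefact of confounding.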

According to Rothman, the only criterion that can be considered a true causal criterion is 'temporality', that is, that the cause precedes the effect. It may be difficult, however, to ascertain the time sequence for cause and effect.²

The process of causal inference is complex and arriving at a tentative inference of a causal or non-causal nature of an association is a subjective process. For a comprehensive discussion on causality, refer to Rothman.²

References

1. Hill AB. The environment and disease: association or causation? Proc R Soc Med 1965; 58: 295-300.
2. Rothman KJ. Epidemiology: An Introduction (2nd edition). Oxford University Press, 2012.
3. Lucas RM, McMichael A. Association or causation: evaluating links between 'environment and disease'. Bull World Health Organ 2005; 83(10): 792-795.