Why Correlation Matters

Correlation: Causation’s Main Ingredient

“Correlation doesn’t equal causation!” It’s the rallying cry against dubious science and irrational claims. And it’s true. But correlation is still critical for establishing causation.

Correlation doesn’t equal causation just as an egg doesn’t equal an omelette. More ingredients are required, but this is the major ingredient, and you can’t have the result without that major ingredient.

What is Correlation?

Correlation is how scientists refer to a relationship between two things of interest, usually referred to as variables. Variables include just about anything you can imagine that varies and can be measured. Height and weight are two examples of variables, and these variables are also related. As height increases, weight tends to increase. Similarly, height and sex are related. Men tend to be taller than women.

These relationships can be positive (when one increases or decreases, the other does the same) or negative (when one increases, the other decreases). The direction of the relationship does not matter. What matters is that when one variable changes, the other also changes in a predictable pattern.
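As a rough illustration of positive and negative relationships, the sketch below computes the Pearson correlation coefficient, one common way to put a number on a relationship between two variables. A value near +1 indicates a strong positive relationship, a value near -1 a strong negative one, and a value near 0 little or no linear relationship. The function name pearson_r and all of the measurements are assumptions invented for this example; they are not data from any study.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How much the two variables move together around their means.
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # How much each variable moves on its own.
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when a variable does not vary")
    return co_movement / sqrt(spread_x * spread_y)

# Hypothetical measurements, invented for illustration only.
heights_cm = [150, 160, 165, 172, 180, 188]
weights_kg = [52, 58, 63, 70, 79, 85]
daylight_hours = [8, 10, 12, 14, 16, 18]
heating_bill = [200, 170, 150, 110, 80, 60]

r_positive = pearson_r(heights_cm, weights_kg)        # taller goes with heavier
r_negative = pearson_r(daylight_hours, heating_bill)  # more daylight goes with lower bills

print("height vs. weight:", round(r_positive, 2))
print("daylight vs. heating bill:", round(r_negative, 2))
```

The sign of the result shows the direction of the relationship and its size shows how closely the two variables move together. Neither one says anything about why they move together.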

Establishing this relationship, known as correlation, is the first step toward establishing causation, i.e., that one variable is causing the other variable to change.

Does above average height cause above average weight? Or is it just a coincidence?

Here is where things get tricky. Relationships can be causal, or not. A causal relationship means that one variable caused the other to change. Sometimes, however, a third variable caused the relationship to exist. And sometimes relationships exist for no reason at all. These are called spurious relationships.

For example, there is a correlation between the number of letters in the winning words of the Scripps National Spelling Bee and the number of deaths from venomous spiders. During years with longer winning words, more individuals die from venomous spider bites. However, one does not cause the other, and there is no connected variable that explains this relationship. This correlation is spurious, not causal.
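One way to see how a relationship can appear "for no reason at all" is to compare many pairs of series that are unrelated by construction. The simulation below is a minimal sketch, not an analysis of the spelling bee or spider data. It draws pairs of independent random series, each standing in for ten yearly values, and counts how often they look strongly correlated purely by chance. The series length, the number of pairs, and the 0.7 cutoff are all arbitrary assumptions chosen for illustration.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return co_movement / sqrt(spread_x * spread_y)

rng = random.Random(42)  # fixed seed so the run is repeatable

N_YEARS = 10      # assumed length of each yearly series
N_PAIRS = 2000    # assumed number of unrelated pairs to compare
THRESHOLD = 0.7   # assumed cutoff for "looks strongly related"

strong = 0
for _ in range(N_PAIRS):
    # Two series generated independently: neither influences the other,
    # and no third variable influences both.
    series_a = [rng.gauss(0, 1) for _ in range(N_YEARS)]
    series_b = [rng.gauss(0, 1) for _ in range(N_YEARS)]
    if abs(pearson_r(series_a, series_b)) >= THRESHOLD:
        strong += 1

fraction_strong = strong / N_PAIRS
print(f"{strong} of {N_PAIRS} unrelated pairs had |r| >= {THRESHOLD}")
```

With short series and enough comparisons, some unrelated pairs will clear almost any cutoff. A single striking correlation pulled from a large pool of comparisons is therefore weak evidence on its own.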

How do you know the difference?

Establishing correlation means that scientific research has confirmed that two things are related: when one changes, the other changes in a predictable way. Establishing causation means that scientific research has confirmed that one of these variables causes the other to change. We cannot establish causation (that one variable is responsible for changing the other) until we first establish correlation (that changes in the two variables occur together). Just as we cannot make an omelette without an egg, we cannot establish causation without the main ingredient: correlation.

Many research methodologies focus exclusively on whether or not a correlation exists. This is especially true for research into whether certain substances are harmful. If researchers look for a relationship between a substance and an effect and, after meeting the requirements for high-quality studies, are unable to find one, it can be concluded scientifically that some other substance or group of substances is responsible for the effect, not the substance in question.

This is the exact approach our team is using to evaluate the claim that lavender and tea tree oils cause endocrine disruption in young boys. The claim proposes that exposure to these oils through skincare products causes young boys to grow breast buds and may pose a threat to young girls as well. Our team is using two high-quality research designs to establish correlation. If, after completing this project, correlation cannot be established, then it is clear that the oils do not cause these outcomes. No correlation = no causation.

How do you establish correlation?

A relationship between two variables has to be identified scientifically; it usually cannot be simply observed. Researchers have to collect data on the two variables and on any other variables that may create the false impression that the two are related when they are not. (These are called confounding factors.) Once the researchers control for these confounding factors, they can establish whether or not a relationship actually exists.
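To make "controlling for a confounding factor" more concrete, here is a minimal sketch using simulated data and the standard first-order partial correlation formula. It assumes a single confounder and straight-line relationships. Real studies generally rely on richer methods such as regression models, matching, or stratification, and this is not a description of any particular study's method. The variable names (confounder, exposure, outcome), the function names, and the simulated numbers are all hypothetical.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    co_movement = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return co_movement / sqrt(spread_x * spread_y)

def partial_r(xs, ys, zs):
    """Correlation between xs and ys after controlling for zs
    (first-order partial correlation)."""
    r_xy = pearson_r(xs, ys)
    r_xz = pearson_r(xs, zs)
    r_yz = pearson_r(ys, zs)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

rng = random.Random(0)  # fixed seed so the run is repeatable

# Simulated data: the confounder drives BOTH other variables.
# By construction, exposure has no direct effect on outcome.
confounder = [rng.gauss(0, 1) for _ in range(5000)]
exposure = [2 * z + rng.gauss(0, 1) for z in confounder]
outcome = [3 * z + rng.gauss(0, 1) for z in confounder]

raw_r = pearson_r(exposure, outcome)
adjusted_r = partial_r(exposure, outcome, confounder)

print("before controlling for the confounder:", round(raw_r, 2))
print("after controlling for the confounder:", round(adjusted_r, 2))
```

In this simulation the exposure and the outcome look strongly related until the confounder is taken into account, at which point the relationship all but disappears. This is the kind of false impression that controlling for confounding factors is meant to remove.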

Several different types of epidemiological studies are designed to look for correlation, not causation. These include cohort studies, case-control studies, cross-sectional studies, and many population-based studies. While establishing correlation does not fully establish causation, causation does not exist without correlation.

