Correlation Still Isn’t Causation

Scott O. Lilienfeld

Letter/Observer Forum

February 1, 2006

In his letter to the editor [Observer, December 2005], Justin M. Joffe took issue with my assertion in a “Teaching Tips” column [Observer, September 2005] that “distinguishing correlation from causation” is a crucial critical-thinking skill that all psychology instructors should impart. Joffe went so far as to label my claim “pseudoscientific” and “at best a half truth” and therefore “a half lie.” Lest puzzled readers of the Observer conclude that psychology teachers no longer need be concerned about teaching their students the difference between correlation and causation, we should examine Joffe’s arguments carefully.

In fact, Joffe’s claims are undermined by two serious errors in reasoning. First, Joffe asserts that it is often possible to rule out with considerable certainty the presence of plausible third variables in correlational designs, and that in such cases one can be “about as sure of causal relationships as you are with most experiments.” Here, Joffe commits the mistake, increasingly common in postmodernist circles, of neglecting to distinguish among research designs that occupy different locations in the hierarchy of evidentiary certainty. Joffe is correct that one can often generate plausible causal hypotheses from correlational designs, but he is mistaken that these designs frequently permit causal inferences as definitive as those of experimental designs. The central problem with conclusively inferring causality from correlational designs is that such designs are bedeviled by a plethora of potential third variables, which the investigator may not have measured or even thought to measure (Meehl, 1970). Joffe’s argument presumes omniscience on the part of the researcher: namely, that he or she is aware of all the relevant third variables that could complicate the interpretation of a correlational study’s conclusions. That presumption is rarely, if ever, warranted.

Indeed, the very example Joffe proffered as an illustration, a positive correlation between babies’ birth weights (Variable B) and their maternal grandfathers’ incomes (Variable A), reveals the flaw in his argument. Joffe appears to maintain that because Variable A precedes Variable B, we can conclude with reasonable certainty that A caused B, albeit indirectly. By drawing this conclusion, Joffe commits the post hoc, ergo propter hoc (after this, therefore because of this) error. The fact that A precedes B does not necessarily imply a causal relation between them. For example, the fact that the recent spate of television crime shows preceded the recent increase in Atlantic hurricanes does not imply that television crime shows produce hurricanes. Moreover, Joffe’s assertion that the correlation between babies’ birth weights and their maternal grandfathers’ incomes allows us “to rule out, with some plausibility, a third variable” ignores the myriad viable third-variable explanations that could account for this finding. To take just one example, long-term familial residence in a grossly impoverished neighborhood marked by poor prenatal nutrition could readily give rise to both low maternal grandfather income and low grandchild birth weight. Joffe is also incorrect that this correlation is incompatible with indirect genetic influences on both variables.
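
The third-variable account described above can be made concrete with a small simulation. The Python sketch below is only an illustration under stated assumptions, not an analysis of any real data: Z stands for a hypothetical standardized measure of neighborhood deprivation, A for grandfather income, and B for birth weight, and the effect sizes and noise levels are invented. By construction A never influences B, yet the two correlate at about 0.5 in expectation, because each equals -Z plus independent unit-variance noise. The association disappears only for an investigator who thought to measure Z.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def partial_corr(r_ab, r_az, r_bz):
    """First-order partial correlation of A and B, controlling for Z."""
    return (r_ab - r_az * r_bz) / math.sqrt((1 - r_az ** 2) * (1 - r_bz ** 2))


def simulate(n=20000, seed=0):
    """Return (corr(A, B), corr(A, B | Z)) for invented, illustrative data."""
    rng = random.Random(seed)
    # Z: hypothetical third variable (e.g., neighborhood deprivation).
    z = [rng.gauss(0, 1) for _ in range(n)]
    # A and B each depend on Z plus independent noise; A never enters B.
    a = [-zi + rng.gauss(0, 1) for zi in z]
    b = [-zi + rng.gauss(0, 1) for zi in z]
    r_ab = pearson(a, b)
    r_ab_given_z = partial_corr(r_ab, pearson(a, z), pearson(b, z))
    return r_ab, r_ab_given_z


r_ab, r_ab_given_z = simulate()
print(f"corr(A, B) ignoring Z:        {r_ab:.2f}")
print(f"corr(A, B) controlling for Z: {r_ab_given_z:.2f}")
```

The partial correlation is used here simply because it is a standard way to ask what remains of an association once a measured variable is held constant. It offers no protection against third variables that were never measured, which is the situation Meehl (1970) warns about.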

Second, Joffe argues that distinguishing correlation from causation presumes a simplistic model of causation, specifically, that the causal influence must be both necessary and sufficient. He is mistaken. As Meehl (1977) observed, in an article that should be required reading for all psychologists, causal influences run the gamut from strong (e.g., an etiological factor that is both necessary and sufficient) to moderate (e.g., an etiological factor that is necessary but not sufficient) to weak (e.g., an etiological factor that is neither necessary nor sufficient but that increases overall risk for a condition). There is nothing inherent in the term “causation” that forces us to adopt a simplistic model of etiology, nor does the didactic task of distinguishing correlation from causation preclude us from discussing more nuanced models of causality.

Joffe and I are in complete agreement that psychology instructors must do more than chant the mantra of “correlation isn’t causation,” and that they should encourage students to recognize that causation is a more complex and multifaceted concept than meets the eye. Nevertheless, I am happy to reassure concerned readers of the Observer who were about to discard their old statistics lecture notes that they can safely continue teaching their students to distinguish correlation from causation.

— Scott O. Lilienfeld
Emory University

References

Meehl, P.E. (1970). Nuisance variables and the ex post facto design. In M. Radner & S. Winokur (Eds.), Minnesota studies in the philosophy of science: Vol. IV. Analyses of theories and methods of physics and psychology (pp. 373-402). Minneapolis: University of Minnesota Press.
Meehl, P.E. (1977). Specific etiology and other forms of strong influence: Some quantitative meanings. Journal of Medicine and Philosophy, 2, 33-53.
