A week or so ago, I wrote up some new research showing how easy it is for psychological scientists to generate false experimental results. The point of the report, published online in the journal Psychological Science, was not that researchers are deliberately, or mischievously, reporting bogus findings. The point was instead that commonly accepted practices for reporting and analyzing data can lead inadvertently to invalid conclusions. According to the authors of the paper, scientists at the University of Pennsylvania and Berkeley, the commonly accepted “false positive” rate of 5 percent could in reality run as high as 60 percent if all these practices come into play. The report was provocative and worrisome.
This write-up, which I also put on The Huffington Post, got a fair amount of attention on Twitter and elsewhere. But it didn’t get a lot of attention in the mainstream press. When I spoke to science journalists about the study and its implications, the typical response was: Well, it’s one thing to show that such misrepresentations are possible, and quite another to show that they actually are prevalent. In other words, is there really any reason for the consumer of behavioral science to be wary of the entire enterprise, or is this inside baseball?
Fair enough. That wasn’t addressed in the study. But now comes word of a new study, also slated for publication in Psychological Science, which does look at precisely this question. And the results are no less worrisome. Three scientists—Leslie John of Harvard Business School, George Loewenstein of Carnegie Mellon, and Drazen Prelec of MIT—surveyed more than 2,000 psychological scientists at major U.S. universities about the use of “questionable research practices” in their work. These practices are very similar to the ones addressed in the earlier study: They include failing to report all of a study’s dependent variables; deciding to collect more (or less) data during the course of a study (what’s called “testing until significant”); and reporting an unexpected finding as if it had been predicted from the start. The scientists also asked about the prevalence of more serious “scientific felonies”: claiming that results are unaffected by demographic variables when the researchers know they are affected; and the outright falsification of data.
The scientists did not expect their subjects to be forthcoming about scientific fraud, or even for that matter about the “gray zone of accepted practice.” So they devised a new anonymous survey methodology with explicit incentives for truth-telling. It’s a bit complicated, but basically it promises charity donations depending on the truthfulness of responses, which is determined by an algorithm known as the “Bayesian truth serum.” By creating the (correct) belief that dishonest responses will hurt the respondent’s charity of choice, the scientists boosted the moral stakes riding on each answer—and presumably got a more honest picture of actual laboratory practices.
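For the technically curious, the core idea behind Prelec’s “truth serum” is that respondents answer a question and also predict how others will answer; answers that turn out to be more common than the group collectively predicted earn a higher score. The following is only a minimal toy sketch of that scoring idea for a single yes/no question—my own illustration, with invented function names, smoothing, and numbers, not the survey’s actual implementation:

```python
import math

def bts_scores(answers, predictions, alpha=1.0):
    """Toy Bayesian-truth-serum-style scoring for one yes/no question.

    answers[i]     : 1 if respondent i answered "yes", else 0
    predictions[i] : respondent i's predicted fraction of "yes" answers
    Returns one score per respondent; "surprisingly common" answers
    (more common than collectively predicted) score highest.
    """
    n = len(answers)
    # Actual endorsement frequencies, lightly smoothed to avoid log(0).
    x_yes = (sum(answers) + 0.5) / (n + 1.0)
    x = {1: x_yes, 0: 1.0 - x_yes}
    # Geometric mean of everyone's predicted "yes" frequency.
    gm_yes = math.exp(sum(math.log(min(max(p, 1e-6), 1 - 1e-6))
                          for p in predictions) / n)
    y = {1: gm_yes, 0: 1.0 - gm_yes}
    scores = []
    for a, p in zip(answers, predictions):
        p = min(max(p, 1e-6), 1 - 1e-6)
        # Information score: bonus for an answer that is more common
        # than the group predicted.
        info = math.log(x[a] / y[a])
        # Prediction score: how well this respondent's forecast matches
        # the actual answer distribution (a negative KL-style penalty).
        pred = (x[1] * math.log(p / x[1])
                + x[0] * math.log((1.0 - p) / x[0]))
        scores.append(info + alpha * pred)
    return scores
```

In a scheme like this, truth-telling is (under the method’s assumptions) the best strategy, because your own honest answer is your best evidence about what others will say—which is what lets the researchers tie real stakes, such as charity donations, to the resulting scores.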
It’s not an entirely positive picture. One in ten research psychologists appears to have actually falsified scientific data, and the majority have engaged in some of the more ambiguous “questionable” practices. These numbers are higher than previous studies have suggested—no doubt due to this study’s truth-telling incentives. The survey also asked which of these practices were “defensible.” Overall, the respondents rated them somewhere between “possibly defensible” and “defensible.” And unsurprisingly, those who engaged in the practices were more likely to rationalize them. A relatively large proportion of respondents also admitted to having had doubts about the integrity of scientific research on at least one occasion.
The authors emphasize that, while outright falsification of data is never justified, many of these other practices can be defended under some circumstances. Indeed, scientists often appear to engage in these practices unknowingly, and in that sense the practices are not even misdemeanors. Yet other justifications in this gray area were what the authors label “contentious”: for example, dropping dependent measures to tell a “more coherent story” or to increase the likelihood of publication. Such justifications are often a by-product of the pressure to publish, the authors conclude. The inherent ambiguity of scientific evidence may cause researchers to delude themselves that it’s okay to ignore nuisance data.
The authors believe this is a significant problem for the field, one requiring attention. These common practices, even if understandable, are wasting researchers’ time and stalling scientific progress, because researchers are fruitlessly trying to build on results that are not real and won’t replicate. Even more “disheartening,” they conclude, is that “unrealistically elegant results” can only be matched by using more of the same dubious methods, creating a “race to the bottom.” In short, these practices have become “the steroids of scientific competition, artificially enhancing performance while providing considerable latitude for rationalization and self-deception.”
Wray Herbert’s book, On Second Thought, is out in paperback. Excerpts from his two blogs—“We’re Only Human” and “Full Frontal Psychology”—appear regularly in Scientific American and in The Huffington Post.