Why Preregistration Makes Me Nervous
I must admit that when I first heard of the effort to get psychological scientists to preregister their studies (that is, to submit to a journal a study’s hypotheses and a plan for how the data will be analyzed before that study has been run), I had a moment of panic. It seemed, on the surface, entirely too regulated for my tastes. I have since calmed down and now see the usefulness of preregistration — indeed, APS has been at the forefront of encouraging preregistration to make our science more transparent and reliable. Manuscripts accepted for publication in Psychological Science are eligible to earn three separate badges designed to promote open science (Eich, 2014). (Editor’s Note: Clinical Psychological Science now offers badges as well. See story on p. 13.) These are
- the Preregistered badge (for preregistering the design and analysis plan of the reported research and reporting the results as planned);
- the Open Materials badge (for making the components of the methods needed to reproduce the study publicly available); and
- the Open Data badge (for making the data needed to reproduce the reported results publicly available).
The badge system has recently been shown to improve open scientific practices (Kidwell et al., 2016).
But my initial reaction to preregistration still needs to be carefully examined and taken seriously, as others may have reacted (and continue to react) in the same way. Casual observation suggests that the Preregistration badge is the least used. For me at least, two concerns continue to keep me up at night.
The first is the fear that preregistration will stifle discovery. Science isn’t just about testing hypotheses — it’s also about discovering hypotheses grounded in phenomena that are worthy of study. Aren’t we supposed to let the data guide us in our exploration? How can we make new discoveries if our studies need to be catalogued before they are run?
The second concern is that preregistration seems like it applies only to certain types of studies — experimental studies done in the lab under controlled conditions. What about observational research, field research, and research with uncommon participants, to name just a few that might not fit neatly into the preregistration script?
This month’s column will be devoted to ruminating about the worry that preregistration will stifle discovery. Next month’s column will focus on the worry that preregistration applies only to a certain type of study.
APS Fellow Paul Rozin, following the late social psychologist Solomon Asch, reminds us that there are stages to conducting science. The first stage is devoted to discovering phenomena, describing them appropriately (i.e., figuring out which aspects of the phenomenon define it and are essential to it), and exploring the robustness and generality of the phenomenon. Only after this step has been taken (and it is not a trivial one) should we move on to exploring causal factors — mechanisms that precede the phenomenon and are involved in bringing it about, and functions that follow the phenomenon and lead to its recurrence.
Preregistration is appropriate for Stage 2 hypothesis-testing studies, but it is hard to reconcile with Stage 1 discovery studies. And according to both Rozin and Asch, psychological science has been too quick to move on to Stage 2 studies. We need to know that a phenomenon is robust and generalizable across conditions and participants (or restricted in interesting ways to a set of conditions and participants) before rushing to try to explain the mechanisms responsible for that phenomenon. Understanding a mechanism is worthwhile only if it is a mechanism that underlies a central aspect of our cognitive, social, and biological selves.
Jean Piaget’s work is a lovely illustration of the Rozin/Asch point that science proceeds first by discovering the phenomena that define us. Piaget’s observations about the steps children go through in learning how to conceptualize the world have defined the field of developmental psychology. The phenomena he identified are robust and generalizable across cultures and cohorts, and they need to be taken seriously by any theory purporting to account for development. Piaget’s own attempts at explaining these phenomena, at uncovering their mechanisms, have been seriously challenged. But the phenomena he described stand as behaviors whose development needs to be explained, and the field is still working on this task. Piaget’s initial studies — which could not have been preregistered and were, moreover, experimentally quite messy (more on this point in next month’s column) — have ensured that carrying out subsequent experiments to probe causal mechanisms is worth doing.
Only after a phenomenon has been found to be robust is it worth exploring its causal mechanisms — and for that step, carefully controlled lab studies and statistical methods are essential. A question that thinking about the stages of science brings to mind is whether the statistical methods used in hypothesis-testing studies are appropriate for hypothesis-generating studies. If not, our field may need to invent new statistics to assess the reliability of phenomena discovered during Stage 1.
Preregistration is one way that we can make it clear to ourselves, to our reviewers, and to our readers what our hypotheses are. But we should not take preregistration as a goal unto itself. Before we can register our hypotheses, we need to discover them and to make sure that the mechanisms we seek are for phenomena that are worth explaining.
Asch, S. E. (1987). Social psychology. New York, NY: Oxford University Press. (Original work published 1952)
Eich, E. (2014). Business not as usual. Psychological Science, 25, 3–6.
Kidwell, M. C., Lazarević, L. B., Baranski, E., Hardwicke, T. E., Piechowski, S., Falkenberg, L.-S., … Nosek, B. A. (2016). Badges to acknowledge open practices: A simple, low-cost, effective method for increasing transparency. PLoS Biology, 14. doi:10.1371/journal.pbio.1002456
Rozin, P. (2001). Social psychology and science: Some lessons from Solomon Asch. Personality and Social Psychology Review, 5, 2–14.
Rozin, P. (2006). Domain denigration and process preference in academic psychology. Perspectives on Psychological Science, 1, 365–376.
Rozin, P. (2009). What kind of empirical research should we publish, fund, and reward? A different perspective. Perspectives on Psychological Science, 4, 435–439.
(I work at the Center for Open Science on preregistration, among other initiatives)
Thank you for voicing these concerns about preregistration. May I encourage the field to consider the relative costs and benefits of balancing the conflicting needs to reduce both Type 1 and Type 2 errors? That is, the risk of mistakenly claiming an effect is real versus mistakenly missing an unexpected effect? Detecting the Type 1 error rate is rather difficult and infrequently undertaken. So we must do what we can to measure and minimize it, for example by rewarding replication studies.
Since the concerns that you raise are perhaps the most frequently cited concerns about preregistration, might I recommend some creative solutions to address them? For example, you could require reports using preregistered designs to report some number of un-registered analyses, thereby forcing researchers to “untie” their hands and granting permission to dig through the data to find something unexpected. Though I personally don’t think that is really necessary (there are ample rewards in the field for finding and publishing unexpected trends), those who are worried that preregistration will stifle innovation could perhaps recommend this or similar safeguards to help balance competing interests in scientific discovery and inference.
I’m happy to hash through any ideas that can help make more clear the process by which scientific discoveries are made and reported.
Thank you for your honest comment. I believe your worries are unfounded. I also used to be skeptical of preregistration (for reasons that I won’t go into) but I have since become convinced that it is only likely to improve our science. Preregistration does not devalue exploratory research. Quite to the contrary, it enhances it because you can clearly distinguish which aspects of a study are exploratory and which are confirmatory. Nobody is stopping you from conducting exploratory analyses in addition to your preregistered design. Nobody stops you from first conducting an exploratory study in which you search for regularities, and then confirming them in a preregistered experiment. In fact, this is precisely what preregistration is good for.
With regard to your second concern, yes there are probably many kinds of study where preregistration is not applicable. In that case, nobody suggests that it should be used. Although I’d wager that in many instances, it would be entirely possible to preregister a confirmatory test also in those cases (and again, this does not preclude including exploratory aspects in the study).
I do agree that there are risks with preregistration and only time will tell how they develop. For example, the quality of a basic preregistration (i.e. uploading an intro & methods document to OSF) can only be as good as the level of detail in the document. Moreover, if nobody actually checks how much deviation there is between the preregistration and the final study, then the preregistration is probably meaningless. Both these problems are why I think in the long run the only useful form of preregistration is by means of Registered Reports that are first peer-reviewed before the data collection begins.
With Registered Reports I perceive other risks: for instance, peer reviewers could reject a study they perceive to have too many exploratory additions. This should never be a reason to reject a study, provided the preregistered plan was followed, unless the final paper makes outrageous claims unsupported by the evidence. Peer reviewers might also make unrealistic demands for changes during the initial review of the methods. I think both these problems can be prevented by good editing, but there is certainly room for discussion on how to achieve this.
In any case, I don’t believe that these issues are severe enough, let alone that there is much evidence for them so far, that you should have sleepless nights.
Great to have the issue of preregistration brought to the fore by Susan Goldin-Meadow in her Presidential Column. But as the Editor of Psychological Science I would like to comment on two aspects of the column.
The definition of “preregistration” in the column is potentially confusing. The column defines preregistration as submitting a preregistration plan to a journal. But a researcher can preregister a study and then conduct it and later write it up and then submit it, pointing to the locked, date-stamped preregistration as evidence that the study was preregistered. All of the preregistered studies published in Psychological Science to date were of this sort. In my own lab my students and I are preregistering all of our projects, but we have not yet submitted a proposal to a journal before conducting the research. The journal Psychological Science does not, at this time, consider for publication proposals before the data have been collected (although some other journals do, such as Cortex, and Perspectives on Psychological Science has published a number of important Registered Replication Reports that were accepted prior to data collection, and Psych Science has sometimes negotiated with authors to add a preregistered follow-up study). My point is that there is a big difference between (a) preregistration per se and (b) submitting a proposal to a journal prior to conducting the research. See http://www.psychologicalscience.org/index.php/publications/journals/psychological_science/preregistration for further information about preregistration.
Also, in my view preregistration works beautifully for exploratory work (field observations etc). True that at a very preliminary stage of such work the prereg would probably be very brief. For example, the young Jane Goodall might have preregistered something like “I plan to go to Gombe Stream National Park and unobtrusively observe chimpanzees, making notes of my subjective impressions of their behaviour; I have no planned measures and no a priori predictions; I just hope to learn about chimps by observing them in their natural habitat.” You might ask “What’s the use of such a vague preregistration?” but it is valuable precisely because it helps the researcher to remember (and the reviewers and editors to know) that the researcher did not go in with specified measures and hypotheses.
All of the concerns raised in this article (and flagged for the next column) have been addressed at length in the FAQs about Registered Reports on the Open Science Framework.
In particular, please see the dedicated section on “Scientific Creativity and Exploration”: https://osf.io/8mpji/wiki/FAQ%205:%20Scientific%20Creativity%20and%20Exploration/
I would also like to add that, having now edited over 25 Registered Reports for Cortex and Royal Society Open Science, I have seen first hand that it places no bar whatsoever on exploratory analyses, creativity or serendipity – authors are required only to distinguish the outcomes of exploratory analyses from pre-registered confirmatory analyses. And what is gained is an empirical report that is immune to publication bias, as free from researcher bias as any study can be, methodologically meticulous, and statistically rigorous.
In fact, far from suppressing exploratory science, I would argue that Registered Reports provide one of the few (and possibly the ONLY) format where exploratory analysis can be presented honestly in its native form – exploration – rather than being shoehorned into an ill-fitting confirmatory framework, a distortion that damages theory development and perverts our incentive structures.
Readers can see this for themselves in the Registered Reports that have been published so far at Cortex.
It is a shame to see the APS Presidential Column used to advance views that are clearly based on no experience whatsoever of pre-registration in either an authorial or editorial capacity, and little attempt to study the published counterarguments to these concerns, which have been publicly available now in multiple forums for over 3 years.
Statistical tools such as p-values and confidence intervals are meaningful only for strictly confirmatory analyses. In turn, preregistration is one of very few ways to check and confirm that the presented analyses were indeed confirmatory. Two conclusions follow:
(1) Researchers who do exploratory work cannot interpret the outcome of their statistical tests in a meaningful way. The problem is one of multiple comparisons with the number of comparisons unknown (De Groot, 1956/2014) — in other words, cherry-picking.
(2) Researchers who wish to interpret the outcome of their statistical tests in a meaningful way are forced to preregister their analyses. Preregistration is the price one pays for being allowed anywhere near a statistical test.
Exploratory research will always have a place and is absolutely essential for scientific discovery. But too often exploration masquerades as confirmation. For example, if you’re doing exploration, inferential statistics are out of place. Our statistical tests are designed as confirmatory: you conduct the key hypothesis test you planned ahead of time. On the other hand, if you explore the data and choose to present the most meaningful, interpretable, or significant results, p<.05 is largely meaningless with regard to the risk of false positives.
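The point about cherry-picking can be made concrete with a small simulation. The sketch below (parameters are illustrative, not drawn from any cited study) draws p-values under a true null hypothesis: a single planned test is falsely significant about 5% of the time, but reporting the best of 20 exploratory tests is falsely significant roughly 64% of the time, matching the familywise rate 1 − (1 − .05)^20.

```python
import random

random.seed(1)
ALPHA = 0.05
N_TESTS = 20        # exploratory comparisons per "study" (illustrative)
N_STUDIES = 10_000  # simulated studies

def null_p_value():
    # Under a true null hypothesis, p-values are uniform on [0, 1].
    return random.random()

# Confirmatory practice: one preregistered test per study.
confirmatory_hits = sum(null_p_value() < ALPHA for _ in range(N_STUDIES))

# Cherry-picking: run N_TESTS tests, report only the smallest p-value.
exploratory_hits = sum(
    min(null_p_value() for _ in range(N_TESTS)) < ALPHA
    for _ in range(N_STUDIES)
)

print(f"Confirmatory false-positive rate: {confirmatory_hits / N_STUDIES:.3f}")
print(f"Cherry-picked false-positive rate: {exploratory_hits / N_STUDIES:.3f}")
# Theoretical cherry-picked rate: 1 - (1 - 0.05)**20, about 0.64
```

The simulation assumes every null is true, which is the worst case for selective reporting; with real effects mixed in, the same mechanism still inflates the apparent evidence for whichever comparison happened to come out smallest.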
I too had these concerns about preregistration. But I’ve come around to the idea and advocate for submitting at least a minimal preregistration for every study done in my lab:
Note that, as Stephen Lindsay writes above, we’re talking about the registration of a protocol, not a submission for peer review prior to data collection. The second is much more expensive in terms of time and effort and (I believe) not necessarily appropriate for every project.
Integrated into my standard advising practice, I now see preregistration as nothing more than a practice that we carried out informally in meetings and lab notebooks: specifying outcomes of interest, sample sizes, and predictions ahead of time. The value of new technical tools is that we can remind ourselves of these after the fact so as to keep from being led astray by treating exploratory discoveries as confirmatory hypothesis tests.
Not surprisingly, I am in full agreement with Susan Goldin-Meadow’s concerns about possible overextensions of the idea of preregistration. So long as it is not a condition for publication, as opposed to a “badge”, I suppose it can be useful in some conditions. Preregistration of exploratory work, as presented in one of the comments, would not work in my opinion. Most of my own work is exploratory, and one of the wonderful things about such work is that I often change my mind about what I should be looking for or at. The freedom to follow where the phenomena lead one is very important. Fortunately, Pavlov was able to do that. In my own experience, which is largely but not entirely in survey research, the first post-exploratory study, which would be a candidate for preregistration, turns out most of the time to be an unanticipated pilot study, which reveals variables, conditions, and ambiguities that had not been anticipated. And when you think you have done it “right,” with preregistration, it is usually the case that reviewers point out that you really haven’t done it correctly.
I think a fair percentage of the most influential studies in psychology could not have been preregistered. They involve surprises. Of course, in a well-studied area, where the score is 20 studies supporting a hypothesis and 20 against, and where the design has been well explored, preregistration is likely to be more effective.
Paul Rozin U of PA