Psychological Science Editor in Chief D. Stephen Lindsay is promoting a variety of ways to strengthen the scientific rigor of articles published in APS’s flagship journal. Lindsay was named Editor in Chief this past spring after serving as Interim Editor for nearly a year. The Observer recently asked him about his plans for the journal.
Observer (OBS): Are there some steps you had deferred while you were Interim Editor that you now want to take as Editor in Chief?
D. Stephen Lindsay (SL): Yes, there are several actions that I thought it would have been presumptuous to take as Interim Editor. As one example, authors are now required to indicate when submitting a manuscript whether the data files required to replicate the analyses reported in the submission are available to reviewers and, if not, to explain why. Sometimes there are legitimate reasons that some or all of the data cannot be publicly posted, but whenever it is appropriate to make data available to reviewers, I want to encourage that practice. As another example, I plan to take steps to encourage preregistration of research. Under some conditions, Psychological Science editors will work with authors on a case-by-case basis to agree to a preregistration plan, with the understanding that the outcome will be reported regardless of whether it evidences an effect.
OBS: As part of your effort to increase replicability in the studies published in the journal, you’ve urged editors to be on the lookout for evidence of p-hacking, overinterpretation of correlations from small samples, and misinterpretation of nonsignificant results. Are you finding that these problems are large and persistent?
SL: I believe that it will take some years to educate the field on these matters. A core part of the problem, I suspect, is that intuition (even intelligent intuition) leads us far astray when it comes to statistical power. Psychologists have known for decades that people are overly influenced by small samples, but many of us have relied on faulty intuitions to gauge how large a sample is needed to get reliable tests of hypotheses. At the 2016 APS Annual Convention in Chicago, [APS Fellow] Scott Maxwell showed that across between-subjects studies with 20 subjects in each of two conditions that observe an effect size of Cohen’s d = 0.70 (a large effect), the modal value of the true population effect size is d = 0.19 (a smallish effect). When sample sizes are small, only studies that yield exaggerated estimates of the true effect size can be statistically significant. I think many psychologists are trying to make adjustments, but they don’t always realize how effect size and sample size interact. Bakker et al. just published an article in Psychological Science on psychologists’ intuitions regarding sample size and statistical power, and it is pretty clear from that study that we have some way to go.
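The selection effect Lindsay describes is easy to demonstrate by simulation. The sketch below (a minimal stdlib-only illustration; the function name and parameter choices are mine, not Maxwell's) simulates many two-group studies with 20 subjects per group and a small true effect of d = 0.19, then averages the observed Cohen's d among only the studies that reached p < .05. Because a two-sample t test with n = 20 per group requires an observed |d| of roughly 0.64 to reach significance, the significant studies alone paint a badly inflated picture of the effect:

```python
import random
import statistics

random.seed(1)

def simulate_observed_d(true_d, n=20, reps=20000, t_crit=2.024):
    """Simulate two-group studies (n per group) with true effect true_d.

    Return the mean observed |Cohen's d| among studies reaching p < .05
    (2.024 is the two-tailed critical t for df = 2n - 2 = 38).
    """
    sig_ds = []
    for _ in range(reps):
        a = [random.gauss(0.0, 1.0) for _ in range(n)]
        b = [random.gauss(true_d, 1.0) for _ in range(n)]
        pooled_sd = ((statistics.variance(a) + statistics.variance(b)) / 2) ** 0.5
        d_obs = (statistics.mean(b) - statistics.mean(a)) / pooled_sd
        t = d_obs * (n / 2) ** 0.5  # two-sample t from d: t = d * sqrt(n/2)
        if abs(t) >= t_crit:
            sig_ds.append(abs(d_obs))
    return sum(sig_ds) / len(sig_ds)

# True effect is small (d = 0.19), but the average *significant* result
# is a large effect, in the vicinity of d = 0.70.
print(round(simulate_observed_d(0.19), 2))
```

Running this shows the "winner's curse" in miniature: with small samples, statistical significance acts as a filter that passes only exaggerated estimates.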
OBS: Your predecessor, APS Fellow Eric Eich, pushed for increased attention to statistical power, transparent scientific practices, and consideration of alternatives to null-hypothesis significance testing. Do you feel scientists submitting papers to the journal are responding to those standards?
SL: Eric Eich’s initiatives have us moving in the right direction, and awareness and concern about these issues continue to grow. As one bit of evidence for that, Kidwell et al. recently published evidence that the rate of posting data has risen dramatically for articles published in Psychological Science since the badges were instituted. I also note that at the recent APS convention, the sessions on open science and statistics were absolutely packed, with people sitting on the floor in the aisles and standing at the back of the room. But uptake of this stuff takes time and is uneven across the field. For example, one of Eric’s initiatives was to require authors to explain, as part of the manuscript submission process, how they determined their sample sizes. I haven’t done a formal analysis, but my casual impression is that the modal response is along the lines of, “We tested N because that is typical in this research domain.” That may be fine in some domains, but we have good reason to believe that many research domains have histories of publishing underpowered studies that consequently exaggerate effect size, so precedent is often a weak basis for determining sample size. Others say that they did a power analysis using an effect size of Cohen’s d = 0.50 but give no hint of why that is a solid estimate of effect size. So we are making progress, but it is likely to be gradual. As they say in Arabic, shwaya shwaya, or “little by little.”
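For readers curious what a power analysis at d = 0.50 actually implies, here is a minimal sketch (the function name is mine) using the standard normal-approximation formula for a two-tailed, two-sample comparison at 80% power. The approximation slightly undershoots the t-based answer, which is commonly reported as 64 per group:

```python
import math
from statistics import NormalDist

def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-tailed, two-sample test:
    n = 2 * ((z_{alpha/2} + z_{power}) / d) ** 2  (normal approximation).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 for alpha = .05
    z_power = NormalDist().inv_cdf(power)          # ~0.84 for 80% power
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)

print(n_per_group(0.50))  # 63 by this approximation; the exact t-based figure is 64
```

The point of the exercise: even a "medium" assumed effect of d = 0.50 calls for roughly three times the 20-per-group samples discussed above, and if the true effect is smaller, the required n grows with 1/d².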
OBS: Early this year, you assembled a pool of Statistical Advisors to help ensure that Psychological Science submissions are justified by the methods and statistics used. How much have you called on these advisors so far, and what kind of results are you seeing?
SL: The Psychological Science Statistical Advisors have been super helpful to me and to several of the journal’s Senior and Associate Editors. In some cases, the Statistical Advisors simply serve as reviewers with particular emphasis on stats/methods. In other cases, it is only when preparing to write an action letter that an editor discovers that he or she is uncertain about statistical or methodological issues. In such cases, it is terrific to have appropriately qualified people who are willing and able to comment on a specific matter in a timely way. Also, statistical issues sometimes are brought to my attention after an article has been published in Psychological Science. There, too, several of the Statistical Advisors have been very helpful. Maybe I shouldn’t say this aloud, but part of the Statistical Advisor role is symbolic. It is not an empty symbol, though: I want to shout from the rooftops that Psychological Science is committed to scientific rigor.
OBS: If there is one precedent-setting achievement you’d like to make during your tenure as Editor, what would it be?
SL: I’d love to see Columbia University stats professor Andy Gelman proclaim in his widely read blog that Psychological Science is setting the standard for scientific rigor.
Bakker, M., Hartgerink, C. H. J., Wicherts, J. M., & van der Maas, H. L. J. (2016). Researchers’ intuitions about power in psychological research. Psychological Science, 27, 1069–1077. doi:10.1177/0956797616647519
Kidwell, M. C., Lazarević, L. B., Baranski, E., Hardwicke, T. E., Piechowski, S., Falkenberg, L.-S., … Nosek, B. A. (2016). Badges to acknowledge open practices: A simple, low-cost, effective method for increasing transparency. PLoS Biology, 14. doi:10.1371/journal.pbio.1002456