Letter/Observer Forum

Use Big Samples Online to Increase Replicability

Hostess Brands — makers of Wonder Bread, Twinkies, Ding Dongs, and other products — recently filed for bankruptcy. There are many complex reasons the company ran into trouble, but there is also an obvious one: If a company wants to deliver more bread to the shelf, it has to pay for more workers, equipment, supplies, shipping, and so on. Food businesses do not easily increase in scale.

Google, on the other hand, is not considering bankruptcy. In 2004, Google received around 300 million search queries per day. As of 2011, that number had increased by more than an order of magnitude, to over 3 billion. A search algorithm that works for one search can work just as well for 10. There are costs associated with increasing a website’s scale, but they are relatively minor compared with those of most businesses. Online businesses are relatively easy to scale.

As a psychological scientist, my research used to resemble the Hostess business model. To run twice as many participants in a study generally required that I and/or my research assistants spend twice as much time in the lab, and I had to wait about twice as long to get the data.

Now I run the majority of my studies online, mainly using Amazon’s Mechanical Turk to recruit participants. If I offer participants $2.00 for a 30-minute study, I can reliably get 20, or 200, or more people to complete my study within 24 hours. In other words, my research now scales like Google’s search. If I want to run more participants, I just type a larger number into the box on the Amazon webpage. The costs exist — I have to sort out (and sometimes code) the data, and I have to pay more participants — but they tend to be minimal compared with the costs of running studies in the lab.

The world of psychology is awash with concerns about replicability. One way to increase replicability is to run more participants, and running studies online makes doing so eminently feasible. Those of us who began our careers running subjects in the lab tend to have a Hostess mindset: We don't want to run any more participants than we need to. If I ran 40 participants in a study in 2005, my first instinct is to run 40 in a similar study in 2013. Our science would benefit from adopting a Google mindset: If 40 participants were enough in 2005, why not run 100 in 2013? In addition to increasing statistical power, larger samples reduce the chance of Type II errors (missing real effects) and make it less likely that a chance fluctuation in a small sample is mistaken for a real effect (a Type I error).
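To make the power argument concrete, here is a rough sketch comparing the letter's two sample sizes (40 vs. 100 participants, split into two groups) for a medium effect. It is a back-of-the-envelope calculation, not from the letter itself: it uses a normal approximation, so exact t-test power differs slightly, and the effect size d = 0.5 is an illustrative assumption.

```python
import math

def normal_cdf(x):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def power_two_sample(d, n_per_group):
    """Approximate power of a two-sided, two-sample test at alpha = .05.

    Normal approximation: reject when |z| > 1.96, where the test statistic
    is centered at d * sqrt(n_per_group / 2) under the alternative.
    """
    z_crit = 1.96  # two-sided critical value for alpha = .05
    ncp = d * math.sqrt(n_per_group / 2)
    return normal_cdf(ncp - z_crit) + normal_cdf(-ncp - z_crit)

print(f"40 participants total:  power = {power_two_sample(0.5, 20):.2f}")   # roughly .35
print(f"100 participants total: power = {power_two_sample(0.5, 50):.2f}")   # roughly .71
```

Under these assumptions, the 2005-sized study detects a real medium effect barely a third of the time, while the larger online sample roughly doubles the odds.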

I recently reaped the benefit of using a large sample. I conducted a study that looked very promising after we had collected data from about 40 participants. Having recently read about the dangers of running subjects until p < .05 and then stopping, I decided to run more participants. Unfortunately, the effect faded away; fortunately, I found out the truth. Adopting a Google mindset, and using larger samples online, will not solve all replicability problems. But it can help.
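The danger of running subjects until p < .05 and then stopping can be checked with a quick simulation. This sketch is my own illustration, not from the letter: it assumes a one-sample design where the null hypothesis is true, peeks at the data every 10 participants from 20 to 100, and uses a normal approximation in place of the exact t-test.

```python
import math
import random

def normal_cdf(x):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def p_value(xs):
    """Two-sided p-value for mean = 0 (normal approximation to the t-test)."""
    n = len(xs)
    m = sum(xs) / n
    var = sum((x - m) ** 2 for x in xs) / (n - 1)
    z = m / math.sqrt(var / n)
    return 2 * (1 - normal_cdf(abs(z)))

def stops_with_false_positive(rng, peeks=range(20, 101, 10)):
    """Simulate optional stopping under the null: peek after every 10 subjects,
    stop and declare an effect as soon as p < .05."""
    xs = []
    for n in peeks:
        while len(xs) < n:
            xs.append(rng.gauss(0, 1))  # null is true: no real effect
        if p_value(xs) < 0.05:
            return True  # stopped early and claimed a (spurious) effect
    return False

rng = random.Random(1)
sims = 2000
rate = sum(stops_with_false_positive(rng) for _ in range(sims)) / sims
print(f"false-positive rate with optional stopping: {rate:.3f}")
```

Even though each individual test uses the nominal .05 threshold, peeking nine times pushes the overall false-positive rate well above .05, which is exactly why committing to a larger fixed sample (and then actually collecting it) is the safer habit.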
