Discerning Discoveries

How people gauge the magnitude of scientific findings

People make ill-informed decisions every day. They buy costly food supplements that have measurable but minimal health benefits. They reject medicines that carry the slightest of safety risks.

Many public health and safety officials blame misinformation for driving people toward risky choices and useless expenses. But research points to another factor—the nebulous ways that science is communicated. News outlets often report scientific findings without contextualizing the impact of the results, which may lead their audiences to overestimate the magnitude of the discovery.  

“If people erroneously assume that scientific findings are practically significant, they may adopt ineffective health, dietary, and other lifestyle interventions on the basis of limited information,” two University of Michigan researchers wrote in a recent Psychological Science article.

The authors, research psychologists Audrey L. Michal and Priti Shah, explored how people comprehend and react to statistical information—or the lack of it. They emphasized the importance of reporting not just the effect of an intervention, but its magnitude.  

Across two studies, Michal and Shah found that people tend to assume an effect is meaningfully large when they don’t see actual numerical data. For instance, they may read that an experimental treatment group fared better than a control group, but they don’t think to ask, “How much better?” 

By contrast, people make more sensible decisions when they understand the statistical risks and benefits, scientists have discovered. Researchers who study risk perception have found, for example, that showing patients the numeric risk of a medication’s side effects increases their willingness to use the drug as prescribed (Peters et al., 2014).  

Weighing the value of a curriculum  

In the first of their two studies, Michal and Shah recruited nearly 400 online participants and randomly assigned them to one of four groups, each of which read one of three versions of an article about a fictional education research study. That study involved more than 500 4th graders who took a national math test at the beginning and the end of the school year. During that year, half the students completed a new math curriculum that cost considerably more than the regular curriculum. 

In all three versions of the article, only 25% of the students who received the old curriculum performed at a proficient level at both the beginning and end of the year, while the new curriculum sparked some gains in the students’ collective math performance. 

But the participants’ attitudes about the new curriculum varied depending on which version of the article they read:  

  • In the “no-effect-size” version, the article stated that an unspecified percentage of students completing the new math program showed improved proficiency.  
  • In the “large-effect-size” version, the article stated that proficiency rates climbed 10% for students who completed the new curriculum.  
  • In the “small-effect-size” version, the article stated that proficiency rates among students receiving the new curriculum rose 2% by year-end.  

The small-effect-size condition was assigned to two different participant groups. A “prompt” group was asked to rate whether the proficiency bump was meaningful and worth the new curriculum’s cost. Participants in a separate “no prompt” group were not asked to provide the rating.  

After reading the article, all participants rated their likelihood of recommending the new curriculum to the school system. Lastly, they completed a measure of their numeracy skills—the ability to use math in real-world situations.  

Numeracy skills, the study showed, influenced magnitude perceptions. Overall, participants in the no-effect-size and large-effect-size groups were more likely to recommend the curriculum than those in the two small-effect-size groups. But within those latter groups, participants with low numeracy scores were more likely than high scorers to endorse the curriculum despite the trifling proficiency gains. 

“We initially believed that drawing those participants’ attention explicitly to the small magnitude of the result would decrease their enthusiasm for the intervention,” Michal told the Observer. “But it did not seem to be enough to make some people realize that small effects are less meaningful.”   

Importantly, only 10% of the participants in the no-effect-size group spontaneously pointed out that the article lacked any numerical information about the curriculum’s impact.  

Is it worth the energy?  

Michal and Shah replicated the findings in a second study. They recruited another 400 online participants and instructed them to read about a study of a new energy bar (with “cara berries”) that boosted mental alertness but cost twice as much as regular energy bars. In this fictional study, 340 individuals were randomly assigned to eat either a conventional bar or the new cara berry energy bar every day for a week. They were asked about their mental alertness at both the beginning and end of the study period. Participants who ate the cara berry bars reported sharper mental alertness by the end of the week compared with those who ate the regular bars.  

As in Study 1, Michal and Shah randomly assigned the participants to read different versions of the fictional study’s results. One version showed cara berry bar eaters reporting an average 0.06-point increase in mental alertness, another showed an average 1.5- to 3-point increase, and a third showed an unspecified level of improvement. Those in the small-effect-size “prompt” group were asked to rate how meaningful they found the new energy bar’s effects and whether that improvement justified the product’s price.  

Study 2 yielded results similar to those of the first study: the small-effect-size groups rated the product significantly lower than the other two groups, and participants who scored low on numeracy skills showed more enthusiasm for the new bars even when they read that the improvements to mental alertness were small.  

Overall, the research demonstrated that people tend to assume unspecified gains from an intervention are meaningfully large.  

“Our results suggest that describing scientific findings only in general terms can be potentially misleading,” the authors wrote. “People may erroneously act on trivial effects because they assume that those findings are practically meaningful, similar to how general claims in advertisements can mislead consumers by implying that a product contains meaningful amounts of a substance when in fact the amount is trivial.”  

Thus, scientists and the media can help consumers make more-informed decisions by reporting effect magnitudes rather than just general effects, the researchers added. And they can translate numerical effects into concrete terms for people who struggle with magnitude judgments.  

Indeed, other researchers have demonstrated strategies for illustrating effect sizes. A review of studies from 60 countries, involving nearly 28,000 participants with varying levels of numeracy and graphic literacy, showed that visual aids illustrating risk information can help people make informed health and medical decisions (Garcia-Retamero & Cokely, 2017). Another group of researchers developed a set of automated tools to help people grasp unfamiliar measurements by comparing them to familiar objects (e.g., a small increase could be illustrated as the length of a guitar pick relative to the length of a double bed) (Hullman et al., 2018).  

Although Michal and Shah replicated their findings from the first study, they noted that their experiments involved relatively familiar contexts. Changes in magnitude may be less intuitive in other contexts, such as tiny global temperature changes that have major impacts on weather patterns. The researchers called for further research on the ways people interpret magnitude information in various situations.  

However, Michal and Shah said their findings suggest that reporting the practical significance of effects can help people make better decisions about their health, their finances, and more. A scientifically proven practice or intervention may not be worth the cost, time, or effort to adopt if its benefits are trivial. 


References 

Garcia-Retamero, R., & Cokely, E. T. (2017). Designing visual aids that promote risk literacy: A systematic review of health research and evidence-based design heuristics. Human Factors, 59(4), 582–627. https://doi.org/10.1177/0018720817690634 

Hullman, J., Kim, Y. S., Nguyen, F., Speers, L., & Agrawala, M. (2018). Improving comprehension of measurements using concrete re-expression strategies. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems (CHI ’18), Association for Computing Machinery (Paper 34, pp. 1–12). https://doi.org/10.1145/3173574.3173608 

Michal, A. L., & Shah, P. (2024). A practical significance bias in laypeople’s evaluation of scientific findings. Psychological Science. https://doi.org/10.1177/09567976241231506 

Peters, E., Hart, P. S., Tusler, M., & Fraenkel, L. (2014). Numbers matter to informed patient choices: A randomized design across age and numeracy levels. Medical Decision Making, 34(4), 430–442. https://doi.org/10.1177/0272989X13511705 

