Addressing Selection Bias in Disparities Research

Psychological research often focuses on disparities, but how do the populations studied impact the reliability of findings?
In this episode of Under the Cortex, Özge Gürcanlı Fischer Baum hosts Wen Wei Loh and Dongning Ren who recently published an article on this topic in APS’s journal Advances in Methods and Practices in Psychological Science. They discuss how non-representative samples can influence study conclusions and highlight solutions for strengthening study validity.
Send us your thoughts and questions at [email protected].
Unedited Transcript
[00:00:07.320] – APS’s Özge Gürcanlı Fischer Baum
Psychological research often depends on statistical testing, but how do psychologists choose the populations they study, particularly when focusing on disparities? Can using non-representative samples affect internal validity? How do inclusion and exclusion criteria shape the reliability of scientific conclusions? I am Özge Gürcanlı Fischer Baum with the Association for Psychological Science. In today’s episode, I’m joined by Wen Wei Loh and Dongning Ren from Maastricht University, who recently published an article on this topic in APS’s journal, Advances in Methods and Practices in Psychological Science. Together, we are going to address the critical role of selection bias in disparities research. Wen Wei and Dongning, thank you for joining me today. Welcome to Under the Cortex. I want to start with our first question. What type of psychologists are you?
[00:01:08.660] – Dongning Ren
I’ll go first. I’m Dongning Ren. I’m a social psychologist. Currently, I research social experiences from a social justice perspective. What motivates me in my work is to identify and address systemic barriers to diversity and inclusion.
[00:01:25.230] – Wen Wei Loh
My name is Wen Wei Loh. I’m a statistical methodologist interested in bringing causal modeling to various disciplines, including psychology. What motivates me is working with applied researchers to answer causal questions in the social and health sciences.
[00:01:41.620] – APS’s Özge Gürcanlı Fischer Baum
Yeah. And then my question is for you, what initially got you interested in studying group-based disparities?
[00:01:51.610] – Dongning Ren
Yeah, I’ve always been interested in group-based disparities based on my own lived experiences as a minority member. But what really got me interested in studying the topic as a psychologist was a textbook on social psychology. This textbook is authored by two wonderful sociologists, Cathy Johnson and Karen Hegtvedt at Emory, whom I got to meet during my visit at Emory University. The book is titled Social Psychology: Individuals, Interaction, and Inequality. I’m sure there are many excellent social psychology textbooks out there. What makes this one unique to me is that it not only talks about social psychological theories, but also how micro-level processes, the interpersonal processes such as social exclusion and discrimination, translate into macro-level outcomes in society. It gave me a really fresh perspective on social psychology and how we can use social psychological theories to address larger-scale societal problems, for example, inequality. Reading that book was truly an enlightening experience.
[00:03:07.860] – APS’s Özge Gürcanlı Fischer Baum
Yeah. Wen Wei, I’m turning to you. You said you are a methodologist and you enjoy working with applied psychologists. What got you interested in studying group-based disparities with your colleague, Dongning?
[00:03:23.800] – Wen Wei Loh
Yeah, just like she mentioned, I think lived experiences definitely influence some of the work that I’ve done. She pointed out to me that there is this body of work on group-based disparities, and we identified some challenges to drawing causal conclusions based on the existing literature. We thought it might be interesting to help applied researchers learn how to answer these causal questions better.
[00:03:53.440] – APS’s Özge Gürcanlı Fischer Baum
Yeah. So helping with the causality with your methodological power. Yeah, wonderful. In the paper, you mentioned the critical role of selection bias in disparities research. For those unfamiliar, could you provide a simple explanation of what selection bias is and why it poses such a challenge?
[00:04:13.650] – Wen Wei Loh
That’s a great question. I’m going to start with a very specific definition of what selection bias is generally, and then we can get into why it matters for disparities research. Let’s start with your analytic sample. You have gathered your observations, you have collected data from them, and you calculate an association between some predictor, such as a treatment, and the outcome using data from these individuals. But there is also the true, unknown association in the population that’s eligible for your study, before anyone was actually selected into your analytic sample. Selection bias occurs when these two associations, one from your analytic sample and one from your population before selection occurred, systematically differ.
[00:05:07.790] – Dongning Ren
I think that’s a very precise definition. I’m not a statistician, so I think about selection bias in less technical terms. Selection bias occurs when some individuals in the target population are more likely to be selected into the study sample than others. We then end up with a non-representative sample. When we think about a non-representative sample, we often talk about generalizability and external validity in psychology. But the problem of a non-representative sample doesn’t just stop there. It could hurt internal validity as well. That means any conclusions we draw using that sample could simply be wrong. By wrong, I mean the effect estimate could be all over the place. It could be bigger or smaller than the true effect, or even in the wrong direction. One thing about selection bias that I should add is that it’s a really broad issue. It could happen for different reasons. It could happen because the researchers used certain inclusion or exclusion criteria when they processed their data, or chose a certain recruitment strategy at the data collection stage, or chose a study site that’s more accessible to some participants than others.
[00:06:31.780] – Dongning Ren
It could also happen because of participants, because participants themselves could self-select into some studies based on their interests, background, preferences, and attitudes. These are all different reasons why it could occur, so it’s not surprising that it’s a very broad issue. It could happen in different fields and research areas, definitely not just in psychology or disparities research. To give you a quick example, say we want to understand how sleep quality affects physical health. We might end up with a sample of relatively healthy participants, because those who struggle with health issues are less likely to participate.
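Dongning’s sleep example can be sketched in a short simulation. This is a hypothetical illustration (the variable names, effect sizes, and selection mechanism are invented for the sketch, not taken from the paper): when healthier people are more likely to participate, the sleep–health association in the participating sample comes out attenuated, one of the ways an estimate can be smaller than the true effect.

```python
import numpy as np

# Hypothetical sketch of self-selection on health (numbers illustrative):
# better sleep improves health, but healthier people are more likely to
# take part in the study, so the analytic sample under-represents those
# with health problems.
rng = np.random.default_rng(0)
n = 200_000
sleep = rng.normal(size=n)                             # standardized sleep quality
health = 0.5 * sleep + rng.normal(scale=0.87, size=n)  # true slope = 0.5

# Participation probability rises with health (self-selection):
participates = rng.random(n) < 1 / (1 + np.exp(-2 * health))

def slope(x, y):
    """OLS slope of y on x."""
    return np.cov(x, y)[0, 1] / np.var(x)

slope_population = slope(sleep, health)  # close to the true 0.5
slope_sample = slope(sleep[participates], health[participates])
print(f"population slope: {slope_population:.2f}")
print(f"participant-only slope: {slope_sample:.2f}")  # attenuated
```

Selecting on (a noisy function of) the outcome restricts its variation in the analytic sample, which is why the estimated slope shrinks here; under other selection mechanisms the estimate could just as well be inflated or reversed.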
[00:07:19.570] – APS’s Özge Gürcanlı Fischer Baum
Yeah, that’s a great supporting example. Like you said, a non-representative sample becomes a statistical problem right away. It can cause internal validity issues and wrong conclusions. Thank you for that clarification. In your paper, specifically, you mentioned that several key dimensions like gender, race, ethnicity, and socioeconomic status are fundamental causes of inequality. How do these factors interact with selection bias in research?
[00:07:51.040] – Dongning Ren
That’s a very interesting question to think about. Just to rephrase that question a little bit, you’re asking how selection bias manifests in disparities research. That’s a really good question, and that’s the core part of the paper. But actually, we didn’t start with selection bias and then try to think about how it affects disparities research. It was, in fact, the other way around. We were working on a project about inequality, and we were digging into the broad literature on inequality. What caught our attention was that there are two very popular research designs in the field. I will talk about what those designs are in a second. No matter what social dimensions researchers are interested in, it could be gender, race, SES, what have you, and no matter what outcomes they’re looking at, health outcomes, social outcomes, economic outcomes, these two designs are widely used, really intuitive, really popular. But if you take a closer look, both designs are subject to selection bias. We felt, well, this is a problem, because inequality research is really important. Inequality is a serious societal issue.
[00:09:16.980] – Dongning Ren
The validity of the conclusions matters tremendously. It matters for the experiences of vulnerable populations. It can affect how a layperson or the general public perceives inequality, and how the media talks about it. It could also affect the decisions of organizations, governments, and policymakers. That’s how we decided, well, let’s talk about selection bias in disparities research.
[00:09:51.630] – APS’s Özge Gürcanlı Fischer Baum
You highlight two specific study designs that can lead to selection bias: outcome-dependent and outcome-associated selection. Could you walk us through what each of these terms means and how they differ?
[00:10:07.610] – Dongning Ren
Yeah, absolutely. I’ll start with the first one, outcome-dependent selection. Let’s work through an example together. Say APS gives an award every year, and we want to find out the gender gap between women and men in receiving this award. What we could do is compile a list of award winners and check the number of women against the number of men. Say we find out there had been 50 winners identified as women and 50 as men, 50 and 50. Now, this doesn’t mean there’s no gender gap. What if we found there were 60 women and 40 men? Does that mean women were more likely to be awarded? The answer to both questions is no. If you have 50/50, it’s not evidence for gender equality. If you have 60 women and 40 men, it also doesn’t mean that women were more likely to be awarded. The reason is that to examine and quantify the gender gap, we can’t just look at the winners. That is the selected sample. What we need to look at is the population: all the psychologists who are eligible for the award. That includes both winners and those who didn’t win.
[00:11:39.740] – Dongning Ren
If you look at the population, you could calculate the chance of winning among women and the chance of winning among men. Comparing these two numbers could tell us the gender difference in winning the award: did women and men have equal opportunity to win? The design in the hypothetical example, where we only look at winners, is very popular and quite intuitive. But it amounts to looking only at a sample that has a certain outcome. Depending on what you’re interested in as an outcome variable, you could look at those who have gotten the award, those who got funding, those who got hired or were promoted, and then see how the numbers differ across social groups in that selected sample. But the design here is problematic because of selection bias. We refer to this design as outcome-dependent selection. So that’s the first one. The second one is called outcome-associated selection. That one is a bit trickier. Wen Wei, would you like to explain what outcome-associated selection is?
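The award example can be made concrete with a small numerical sketch (the figures below are hypothetical, not the numerical example from the paper): a 50/50 split among winners is perfectly compatible with very unequal chances of winning once the eligible population is considered.

```python
# Hypothetical figures: an eligible pool of 1,000 women and 3,000 men,
# with 50 winners of each gender.
n_women, n_men = 1_000, 3_000          # eligible population
winners_women, winners_men = 50, 50    # the selected sample of winners

# Outcome-dependent selection: looking only at winners suggests parity.
share_women = winners_women / (winners_women + winners_men)  # 0.5

# Looking at the full eligible population instead:
p_win_women = winners_women / n_women  # 5.0%
p_win_men = winners_men / n_men        # ~1.7%

print(f"share of women among winners: {share_women:.0%}")
print(f"P(win | woman) = {p_win_women:.1%}, P(win | man) = {p_win_men:.1%}")
# The 50/50 split among winners masks a threefold gap in the chance
# of winning, because selection was conditioned on the outcome.
```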
[00:12:59.070] – Wen Wei Loh
Sure. Let me try to give a conceptual explanation. In outcome-dependent selection, the outcome influences or causes selection. As we heard before, you only select winners. Winning is your outcome, so you only select those who won. Now, in outcome-associated selection, the outcome and selection are simply associated with each other because they share a mutual or common cause. I’m going to use the same example as in our paper. Suppose instead of just selecting winners, which was outcome-dependent selection, you selected a sample of individuals who were nominated for the award. This, of course, eventually consists of both winners and non-winners. We’re still interested in the chances of winning the award, which is the same outcome. But then, factors such as the prestige of the doctoral program or degree-granting university, research area, academic record, and professional network could all influence both being nominated, which is how we selected these individuals, and winning the award, which is the outcome. Such common causes, when they are left unmeasured, result in the outcome and selection being associated with each other, and that brings about outcome-associated selection.
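Outcome-associated selection can also be sketched in a small simulation. This is a hypothetical setup, not the paper’s numerical example: “prestige” stands in for the common causes Wen Wei lists, nomination depends on both gender and prestige, and winning depends on prestige but, by construction, not on gender.

```python
import numpy as np

# Hypothetical simulation: prestige U is a common cause of nomination
# and winning; gender G has no true effect on winning. Restricting the
# analysis to nominees nonetheless produces a spurious gender gap.
rng = np.random.default_rng(0)
n = 400_000
G = rng.integers(0, 2, n)                  # 0 = man, 1 = woman
U = rng.normal(size=n)                     # prestige (common cause)
nominated = U + 1.0 * G + rng.normal(size=n) > 1.5
win = U + rng.normal(size=n) > 1.0         # depends on U, not on G

# True gender gap in the eligible population: essentially zero.
pop_gap = win[G == 1].mean() - win[G == 0].mean()

# Gap estimated only among nominees: biased, because conditioning on
# nomination (a consequence of both G and U) links G to U, and U
# drives winning.
nom_gap = win[nominated & (G == 1)].mean() - win[nominated & (G == 0)].mean()

print(f"population gender gap: {pop_gap:+.3f}")    # near zero
print(f"nominee-only gender gap: {nom_gap:+.3f}")  # clearly negative
```

In this sketch women are nominated more readily, so the nominated women have lower prestige on average than the nominated men and win less often, a gap that does not exist in the population.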
[00:14:20.280] – APS’s Özge Gürcanlı Fischer Baum
Yeah. What I hear from both of you is basically the need for a good design: we cannot just get stuck on these simple outcome definitions; we need to look at the broader climate, the broader environment where those outcomes are coming from, so we can understand the data trends better. You already mentioned this, but I would like to ask it one more time for our listeners. Do these study designs lead to errors in understanding the true disparities of interest? And if they do, what type of errors are we talking about?
[00:15:01.670] – Wen Wei Loh
Yes, these study designs are absolutely linked to potential errors in understanding the disparities. Depending on the study design that you adopt, the estimate from your analytical sample could be either larger than the true disparity or smaller. Sometimes even the sign might be reversed, so you might find that the gender gap is in a different direction than the actual disparity. In the paper, we actually describe a numerical example where researchers could mistakenly conclude that there is no gender gap, even though there truly was a gender gap.
[00:15:37.800] – APS’s Özge Gürcanlı Fischer Baum
What are some practical suggestions for avoiding selection bias in disparities research?
[00:15:43.920] – Dongning Ren
From my perspective, one of the first things that we could do in our field is to become more aware of selection bias: to know what it is, recognize its presence and its impact, how bad it can get, and why some of the very intuitive and popular designs give us misleading conclusions. I believe this awareness is important both for researchers who are doing the research and for readers who are reading, interpreting, and evaluating the validity of the findings.
[00:16:19.890] – APS’s Özge Gürcanlı Fischer Baum
Yeah. So your paper offers some suggestions to be thoughtful about these issues. Could you share a few of these strategies with us?
[00:16:28.240] – Wen Wei Loh
Yeah. So just like you mentioned earlier, the study design really makes a difference in terms of whether or not you can actually eliminate or reduce the chances of errors. Let’s start with outcome-dependent selection. This is a study design that simply rules out using commonly employed methods to resolve selection bias. So our suggestion is to avoid outcome-dependent selection at the study design stage where possible. Now, for outcome-associated selection, it is possible to mitigate selection bias and reduce the chances of errors occurring, but this will have to be addressed at both the design and the analysis stages. First of all, at the design stage, all the covariates that are potentially common causes of your outcome and selection must be identified and accurately measured. Going back to my example, the common factors such as the prestige of the university or the degree-granting program, professional network, and academic record, which are causes of both an individual being nominated and winning the award, have to be identified and accurately measured. Then at the analysis stage, when you’re actually trying to estimate the effect of gender on winning an award, these causes must be properly adjusted for, or used in a sensitivity analysis to gauge the impact of selection bias on the results.
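The design-plus-analysis recipe can be sketched by extending the same kind of hypothetical simulation (names and numbers invented for illustration): if the common cause of nomination and winning is measured at the design stage, stratifying on it at the analysis stage recovers the true (here, null) gender effect within the nominated sample.

```python
import numpy as np

# Hypothetical setup: prestige U is a common cause of nomination and
# winning; gender G has no true effect on winning.
rng = np.random.default_rng(1)
n = 400_000
G = rng.integers(0, 2, n)
U = rng.normal(size=n)                     # measured common cause
nominated = U + 1.0 * G + rng.normal(size=n) > 1.5
win = U + rng.normal(size=n) > 1.0

g, u, y = G[nominated], U[nominated], win[nominated]

# Naive gap among nominees: biased by outcome-associated selection.
naive_gap = y[g == 1].mean() - y[g == 0].mean()

# Adjusted gap: stratify nominees into narrow bins of U and average
# the within-bin gender gaps, weighted by bin size.
edges = np.arange(-3, 5, 0.25)
idx = np.digitize(u, edges)
gaps, weights = [], []
for b in np.unique(idx):
    in_bin = idx == b
    women, men = in_bin & (g == 1), in_bin & (g == 0)
    if women.sum() > 50 and men.sum() > 50:  # skip sparse bins
        gaps.append(y[women].mean() - y[men].mean())
        weights.append(in_bin.sum())
adjusted_gap = np.average(gaps, weights=weights)

print(f"naive gap among nominees: {naive_gap:+.3f}")    # biased
print(f"U-adjusted gap:           {adjusted_gap:+.3f}")  # near true zero
```

Stratification is just one way to adjust; regression or weighting would serve the same purpose, and when the common causes can only be partially measured, a sensitivity analysis, as Wen Wei notes, can gauge how much unmeasured selection would change the conclusion.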
[00:17:57.110] – Wen Wei Loh
So I think what I just described can be overwhelming, because it can be a little bit difficult to conceptualize all these covariates that you need to think about. In general, we actually encourage applied researchers to use causal diagrams to visualize the causal relations among variables, both measured and unmeasured. In the current context, causal diagrams can also represent the study design. Earlier, you asked how we can use the study design to minimize the chances of errors occurring; causal diagrams can help with that. As an example, you could consider including a selection node in your causal diagram to represent the process or factors influencing how certain individuals were selected into your analytic sample while others were left out of the analysis. In our own work, we have found causal diagrams to be very intuitive and helpful in understanding the research questions and guiding causal analysis.
[00:18:59.160] – APS’s Özge Gürcanlı Fischer Baum
It’s a great suggestion. It is also important for replication; knowing exactly what other research groups did is very helpful. What about the next step? These are great suggestions, thank you so much for your answers. In your opinion, what is the next step researchers should take to improve the rigor and accuracy of group-based disparities research?
[00:19:24.580] – Wen Wei Loh
I think before taking all these steps, the first one to me is being explicit: being explicit and acknowledging that certain research questions are causal in nature. That, to me, is a good start. Based on my reading of the literature, many questions in group-based disparities research, such as understanding the impact of social dimensions including gender, race, and SES, among many others, are framed as causal inquiries. Going back to the question of what got me interested in this: as I was reading the literature, this was what I found, that a lot of the work is really trying to uncover causality. For example, when we talk about the gender gap in receiving an award, we are essentially asking whether gender is a cause, and we want to know whether the probability of winning differs because of gender. For such causal questions, taking the steps to define the target population, clarify your causal estimand, and thoughtfully consider the study design and analytical methods will all help to improve the rigor and accuracy of causal conclusions in group-based disparities research.
[00:20:36.670] – Dongning Ren
I think that’s a very good answer. That’s exactly my thinking as well. Before we think about the very detailed statistical methods or sensitivity analyses, what is important is the research question stage. Often we have a research question framed as an association when essentially we’re asking a causal question. I think clarifying that and stating it explicitly will help us think more carefully about this.
[00:21:05.860] – APS’s Özge Gürcanlı Fischer Baum
Thank you very much. This was a great conversation. Thank you, both of you. It is important for us as psychologists to talk about the methodological side of things. I truly enjoyed our conversations, and hopefully our listeners will, too. As a final question, I would like to ask you, is there a key takeaway or final thought that you would like to leave with our listeners?
[00:21:31.250] – Dongning Ren
I’d like to say that selection bias may not be easy to detect in disparities research, but it could easily lead to wrong conclusions. We hope our paper is useful for researchers to recognize selection bias and feel more ready to deal with it, in inequality research and in psychological science broadly. We would also like to use this opportunity to thank our editor, Dave Sbarra, and the two reviewers. Our paper grew stronger, more accessible, and more readable during the review process. We thank them for their time and excellent suggestions.
[00:22:13.540] – Wen Wei Loh
Thank you for having us on your podcast. It’s a real privilege to be invited, and we appreciate the opportunity to share our work with your listeners.
[00:22:21.380] – APS’s Özge Gürcanlı Fischer Baum
Yeah, thank you. It was a pleasure. Thank you very much. This is Özge Gürcanlı Fischer Baum with APS, and I have been speaking to Wen Wei Loh and Dongning Ren from Maastricht University. If you want to know more about this research, visit psychologicalscience.org. Do you have questions or suggestions for us? Please contact us at [email protected].