Teaching Metacognition in Humans Versus Artificial Intelligence

Aimed at integrating cutting-edge psychological science into the classroom, columns about teaching Current Directions in Psychological Science offer advice and how-to guidance about teaching a particular area of research or topic in psychological science that has been the focus of an article in the APS journal Current Directions in Psychological Science.
That’s a great answer! These are excellent points! That’s a really good observation!
The above responses are common when people use large language models (LLMs), such as ChatGPT, for feedback. Many users enjoy the friendly, supportive, and reinforcing manner in which LLMs communicate. However, some users, academics, and ethicists have worried that LLM responses may be overly positive and sycophantic, expressing a high degree of confidence even when providing responses that are factually uncertain, disputed, or incorrect (Carro, 2024).
According to Steyvers and Peters (2025), humans’ ability to communicate uncertainty requires metacognition: the ability to monitor and assess the depth of one’s own knowledge. Put simply, it is being aware of what you don’t know. That makes metacognition critical to many daily activities, including learning. In social contexts, well-calibrated metacognition is necessary for building trust, integrating knowledge, and making sound decisions. Because LLMs are increasingly being used to integrate knowledge and “collaborate” on decision making, they need to be able to communicate uncertainty to users.
Unfortunately, LLMs often do not express uncertainty (Zhou et al., 2024). In such cases, people may have difficulty detecting LLM errors when they are not experts on the topic (Bower et al., 2024). Confidently expressed LLM information may subsequently inflate the nonexpert’s confidence and increase their reliance on LLMs. Concerningly, users are most likely to rely on LLM responses when LLMs express high confidence on a topic for which the user has low confidence (Tejeda et al., 2022).
A key question is why LLMs fail to express uncertainty. On one hand, LLMs could deliver incorrect information with a high degree of confidence because they do not recognize that the information is incorrect (metacognitive failure). On the other hand, LLMs may “know” that they are delivering uncertain, disputed, or incorrect answers, but do so anyway because they have been trained to be people pleasers (sycophancy).
To distinguish these two views, first consider the evidence from explicit confidence ratings, such as when an LLM or a person is prompted to give a percentage from 0 to 100 (“90% confident”):
- Both humans and LLMs show modest metacognitive sensitivity: When explicitly prompted for confidence ratings, high confidence is typically—but not always—associated with correct answers.
- Both humans and LLMs show modest metacognitive calibration: Both provide confidence ratings that exceed the trial-by-trial accuracy, indicating overconfidence.
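For readers who want to see how these two measures differ, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the analysis used in the cited studies: the function names and the six data points are hypothetical, overconfidence is summarized as mean confidence minus mean accuracy, and sensitivity is summarized as the chance that a correct answer received higher confidence than an incorrect one.

```python
def overconfidence(confidences, correct):
    """Mean confidence minus mean accuracy.

    Positive values indicate overconfidence (a calibration problem).
    """
    mean_confidence = sum(confidences) / len(confidences)
    accuracy = sum(correct) / len(correct)
    return mean_confidence - accuracy


def sensitivity(confidences, correct):
    """Chance that a randomly chosen correct answer received higher
    confidence than a randomly chosen incorrect one (ties count half).

    0.5 means confidence does not separate right from wrong answers;
    1.0 means it separates them perfectly.
    """
    right = [c for c, ok in zip(confidences, correct) if ok]
    wrong = [c for c, ok in zip(confidences, correct) if not ok]
    if not right or not wrong:
        raise ValueError("need at least one correct and one incorrect answer")
    wins = sum((r > w) + 0.5 * (r == w) for r in right for w in wrong)
    return wins / (len(right) * len(wrong))


# Hypothetical ratings (as proportions) and scored answers (1 = correct).
confidences = [0.95, 0.90, 0.85, 0.80, 0.90, 0.70]
correct = [1, 1, 0, 1, 0, 0]

gap = overconfidence(confidences, correct)
auc = sensitivity(confidences, correct)
print(f"overconfidence: {gap:.2f}, sensitivity: {auc:.2f}")
```

In this made-up example, confidence is somewhat higher for correct answers (modest sensitivity), yet average confidence sits well above average accuracy (poor calibration). The two measures can dissociate, which is why both are reported.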
Based solely on explicit assessments, one would conclude that LLMs suffer metacognitive failures like humans do (though not necessarily for the same reasons). However, recent work suggests that the problem may not be only a lack of metacognition: LLMs may also fail to communicate their uncertainty (Steyvers et al., 2025).
Implicit assessments of LLM metacognition, such as the token likelihood method, can be understood in the context of a multiple-choice question. In this case, the LLM will process the user’s prompt and internally assign different likelihoods to each possible answer. The LLM often responds confidently with a single answer, but it does not typically share token likelihood values without explicit prompting. Interestingly, these implicit values correspond to accuracy better than the LLM’s explicit confidence ratings (Xiong et al., 2023). Thus, LLMs may recognize uncertainty at a computational level, but do not express it to users. It’s possible that we are to blame: LLMs have been trained with human feedback and learned that humans generally prefer responses that sound confident.
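The token likelihood idea can be sketched in a few lines of Python. This is a simplified illustration, not the procedure from the cited papers: the log-probabilities below are invented, the function name is hypothetical, and the sketch assumes that each answer option corresponds to a single token whose log-probability is available from the model.

```python
import math


def option_probabilities(logprobs):
    """Turn the log-probabilities of the answer-option tokens into
    probabilities that sum to 1 across the options."""
    top = max(logprobs.values())
    # Subtracting the largest value first keeps exp() numerically stable.
    weights = {option: math.exp(lp - top) for option, lp in logprobs.items()}
    total = sum(weights.values())
    return {option: w / total for option, w in weights.items()}


# Invented log-probabilities for the tokens "A" through "D".
logprobs = {"A": -0.4, "B": -1.6, "C": -2.9, "D": -3.5}

probs = option_probabilities(logprobs)
answer = max(probs, key=probs.get)
implicit_confidence = probs[answer]

print(f"answer: {answer}, implicit confidence: {implicit_confidence:.2f}")
```

The user would see only the chosen answer. The probability attached to that answer is the implicit confidence, which stays hidden unless someone inspects it.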
Speaking of which, GPT-5 thought this column was “strong, timely … clear, theoretically grounded, pedagogically useful.” It gave a 92% confidence rating on the accuracy of the content. A− is fine I guess, but why not higher? Well, it said I glossed over the token likelihood method (true!), but, mostly, GPT-5 didn’t like the parts where I called it sycophantic.
To demonstrate to students that confidence is not equivalent to accuracy, incorporate an LLM activity with a think–pair–share approach into class. Direct students to get into pairs or small groups and ask the LLM of their choice four questions based on class content, followed by prompts to generate confidence ratings. To maximize retrieval practice, encourage students to generate two questions based on fundamental, established topics from class (e.g., from Chapters 1–2) and two questions on topics that the textbook or instructor has indicated are still debated. Below are examples of one established topic and one debated topic:
i) True or False: Wilhelm Wundt started the first psychology laboratory. Give a confidence rating from 0 to 100.
ii) True or False: Watching television before bed causes sleep disturbances. Give a confidence rating from 0 to 100.
Next, have students plot the relationship between confidence (x-axis) and estimated accuracy (y-axis) and discuss the outcomes in their small groups, paying particular attention to how the LLM qualitatively expressed uncertainty before being explicitly prompted to give a confidence rating. Instructors should walk around the classroom to listen to each group’s discussion; they can ask the groups that found surprising results to share their observations with the whole class.
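Instructors who want to pool results across groups before plotting could tabulate them with a short script like the one below. This is one possible approach rather than a required part of the activity: the function name, the 20-point bin width, and the sample records are all assumptions made for illustration.

```python
from collections import defaultdict


def calibration_points(records, bin_width=20):
    """Group (confidence, correct) records into confidence bins.

    records: iterable of (confidence from 0 to 100, True/False) pairs.
    Returns one (mean confidence, percent correct, count) tuple per
    non-empty bin, ordered from low to high confidence.
    """
    last_bin = 100 // bin_width - 1
    bins = defaultdict(list)
    for confidence, ok in records:
        # A rating of exactly 100 is folded into the top bin.
        index = min(int(confidence // bin_width), last_bin)
        bins[index].append((confidence, ok))

    points = []
    for index in sorted(bins):
        items = bins[index]
        mean_confidence = sum(c for c, _ in items) / len(items)
        percent_correct = 100 * sum(1 for _, ok in items if ok) / len(items)
        points.append((mean_confidence, percent_correct, len(items)))
    return points


# Made-up class data: (LLM confidence rating, was the answer judged correct?)
records = [(95, True), (90, True), (85, False),
           (70, True), (65, False), (100, True)]

points = calibration_points(records)
for mean_confidence, percent_correct, count in points:
    print(f"confidence {mean_confidence:.1f} -> {percent_correct:.0f}% correct (n={count})")
```

Plotting mean confidence on the x-axis against percent correct on the y-axis gives the graph described above. Points that fall below the diagonal indicate overconfidence.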
Bower, A. H., Han, N., Soni, A., Eckstein, M. P., & Steyvers, M. (2024). How experts and novices judge other people’s knowledgeability from language use. Psychonomic Bulletin & Review, 31(4), 1627–1637.
Carro, M. V. (2024). Flattering to deceive: The impact of sycophantic behavior on user trust in large language model. ArXiv, 2412.02802.
Steyvers, M., Tejeda, H., Kumar, A., Belem, C., Karny, S., Hu, X., … & Smyth, P. (2025). What large language models know and what people think they know. Nature Machine Intelligence, 7(2), 221–231.
Tejeda, H., Kumar, A., Smyth, P., & Steyvers, M. (2022). AI-assisted decision-making: A cognitive modeling approach to infer latent reliance strategies. Computational Brain & Behavior, 5(4), 491–508.
Xiong, M., Hu, Z., Lu, X., Li, Y., Fu, J., He, J., & Hooi, B. (2023). Can LLMs express their uncertainty? An empirical evaluation of confidence elicitation in LLMs. ArXiv, 2306.13063.
Zhou, K., Hwang, J. D., Ren, X., & Sap, M. (2024). Relying on the unreliable: The impact of language models’ reluctance to express uncertainty. In L-W. Ku, A. Martins, & V. Srikumar (Eds.), Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long papers) (pp. 3623–3642). Association for Computational Linguistics.