Metacognition of ChatGPT in confidence judgements

2026 · 0 citations · Journal Article · Gold Open Access

Authors

Shun YOSHIZAWA · Tokai University

Ayako Onzo · Tokyo University of the Arts

SHIN NOZAWA · The University of Tokyo

Tsugumi Takano · Tokyo University of the Arts

Tetsuo Ishikawa · Tokyo University of the Arts

Ken Mogi · Tokyo University of the Arts

Abstract

Recent advances in Large Language Models (LLMs) have raised critical concerns regarding AI alignment and safety, particularly with respect to the reliability of their outputs. In humans, metacognition plays a key role in making cognition robust and adaptive. LLMs frequently express high confidence in their responses, raising the question of whether such confidence reflects human-like metacognitive capability. In this study, we systematically compared humans and GPT-4 across multiple task formats to examine how confidence relates to performance. GPT-4 consistently outperformed humans in task accuracy. This advantage was not accompanied by human-like confidence behavior: human confidence closely tracked variations in accuracy, whereas GPT-4's confidence did not. Humans adjusted their confidence more sensitively to changes in accuracy, whereas GPT-4 showed a shallow confidence–accuracy mapping. Humans exhibited higher and more stable metacognitive sensitivity and efficiency, while GPT-4 showed condition-specific variability. These findings reveal a dissociation between task-level performance and metacognitive behavior in GPT-4, suggesting that its confidence reflects structural properties of its outputs rather than genuine internal uncertainty monitoring. Taken together, these findings suggest that GPT-4 lacks robust metacognitive abilities compared to humans, or at least that its metacognitive processes differ significantly from those of humans.

Topics & Keywords

Artificial Intelligence in Healthcare and Education · Explainable Artificial Intelligence (XAI) · Ethics and Social Impacts of AI

Publication Details

Published in: Frontiers in Artificial Intelligence

Volume 9

DOI: 10.3389/frai.2026.1694192

Field-Weighted Citation Impact: 0.00
