A trade-off between reasoning ability and metacognitive sensitivity in large language models
Abstract
Modern large language models (LLMs) exhibit remarkable reasoning abilities, yet it remains unclear whether gains in reasoning ability are accompanied by corresponding improvements in metacognitive sensitivity, which refers to the ability to discriminate, on an item-by-item basis, between correct and incorrect inferences. Here we systematically evaluate this relationship in five experiments across 17 models spanning five LLM series. Although reasoning ability reliably scales with model size, metacognitive sensitivity does not; in some series, the smallest models even exhibit higher metacognitive sensitivity than their larger counterparts. This deviation from scaling laws arises from the intermediate reasoning steps produced by large models: long and deterministic reasoning traces boost performance but hinder accurate self-evaluation. Furthermore, distilling the reasoning traces of large models into small models transfers the impairment in metacognitive sensitivity. These findings demonstrate that metacognitive sensitivity is not an automatic by-product of improved reasoning.
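Metacognitive sensitivity in this item-by-item sense is commonly operationalized as how well per-item confidence separates correct from incorrect responses, for instance via the area under the ROC curve (AUROC). The abstract does not state the estimator the authors use, so the sketch below is only an illustration of the general idea; the function name, the AUROC choice, and the toy data are assumptions, not the preprint's method.

```python
import numpy as np

def metacognitive_auroc(confidences, correct):
    """Illustrative (assumed) sensitivity measure: AUROC for discriminating
    correct from incorrect answers using per-item confidence.
    0.5 = no sensitivity; 1.0 = perfect item-by-item discrimination."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=bool)
    pos = confidences[correct]      # confidences on correct items
    neg = confidences[~correct]     # confidences on incorrect items
    if len(pos) == 0 or len(neg) == 0:
        return float("nan")         # undefined if one class is empty
    # Probability that a randomly chosen correct item receives higher
    # confidence than a randomly chosen incorrect one (ties count 0.5).
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

# Toy usage with hypothetical per-item confidences and correctness flags.
conf = [0.9, 0.8, 0.7, 0.95, 0.6, 0.55]
acc  = [1,   1,   0,   1,    0,   1]
print(f"AUROC = {metacognitive_auroc(conf, acc):.3f}")  # 0.750 on this toy data
```

On this view, a model can raise its accuracy while its AUROC stays flat or falls, which is the dissociation between reasoning ability and metacognitive sensitivity that the abstract describes.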