Can Centaur Truly Simulate Human Cognition? The Fundamental Limitation of Instruction Understanding

Abstract

Recent advances in cognitive modeling have demonstrated the potential of large language models (LLMs) to unify diverse aspects of human cognition. The Centaur model, an LLM fine-tuned on cognitive tasks, achieves high performance across 160 psychological experiments, suggesting that a single model may capture multiple cognitive processes. However, it remains unclear whether this success stems from genuine task understanding or from the exploitation of superficial statistical cues. To test this, we systematically manipulated Centaur’s input by (1) removing task instructions, (2) removing all contextual information, and (3) providing misleading instructions. All three manipulations remove information that humans need to perform the tasks. Results show that Centaur often maintains high performance under these manipulations, outperforming both baseline cognitive models and the un-fine-tuned base LLM (Llama) given correct instructions. These findings indicate that Centaur’s success likely relies on superficial statistical cues rather than true instruction comprehension. Our study highlights the need for more diverse out-of-distribution tests of LLM-based cognitive models.