Word meaning, not surface statistics, is essential for predictive language processing

Andrey Zyryanov
Victoria Pierz
Yulia Oganian

Abstract

Humans comprehend language incrementally, updating the representation of sentence meaning with each incoming word. These updates are guided by the distance between each perceived word and prior expectations, that is, the prediction error. The alignment between large language models (LLMs) and cortical activity inspires the hypothesis that the cortical computation of prediction error is Surface-based, driven by statistical patterns of word form co-occurrence. In contrast, psycholinguistic models propose that prediction error computation is Meaning-based, driven by word semantics. We used polysemic words with ambiguous semantics to distinguish these models: if the computation is Meaning-based, ambiguity would introduce uncertainty into meaning representations and hence into the prediction error; if it is Surface-based, ambiguity would not affect the prediction error. We examined how ambiguity influenced prediction error signatures in self-paced reading times and magnetoencephalographic (MEG) neural responses during sentence processing. While an LLM-based proxy of prediction error robustly predicted reading times and neural responses to unambiguous words, it failed to predict either under ambiguity. That is, prediction error computation was altered by uncertainty in word meaning, which supports the Meaning-based model and corroborates the essential role of word meaning in predictive language processing. Our findings highlight an important limitation of LLMs as in silico models of the human language faculty.

Version published to 10.64898/2026.05.15.724229 on bioRxiv
May 15, 2026

Related articles

Beyond next-word prediction: hierarchical linguistic composition modulates LLM-brain alignment in time

This article has 2 authors:
1. Junyuan Zhao
2. Jonathan R. Brennan
This article has no evaluations
Latest version May 16, 2026

Temporal Dissociation of Syntactic Disambiguation and Memory Retrieval during Sentence Processing: Naturalistic MEG Evidence from Interpretable Models

This article has 5 authors:
1. Donald Dunagan
2. Dylan Scott Low
3. Shisen Yue
4. Lars Meyer
5. John T. Hale
This article has no evaluations
Latest version Apr 21, 2026

Meaning for reading pseudowords: errors reveal semantic influences on pseudoword reading after stroke

This article has 6 authors:
1. Ryan Staples
2. Elizabeth J. Anderson
3. Sara M. Dyslin
4. Alycia B. Laks
5. Andrew T. DeMarco
6. Peter E. Turkeltaub
This article has no evaluations
Latest version May 15, 2026
