Human-like sequential sound-to-meaning transfer drives artificial speech comprehension

Shenshen Zhang
Siqi Li
Ruolin Yang
Guanpeng Chen
Xing Tian
Qian Wang
Fang Fang

Abstract

Artificial intelligence has reached a pivotal threshold. Multimodal large models can approach human-level speech comprehension by rapidly transforming sound into meaning. However, whether this process relies on human-like mechanisms remains unknown. Here, we compared the human brain with twelve speech language models (SLMs) using a phonology–semantics confusion paradigm. Stereo-electroencephalography revealed two mechanisms of phonology-to-semantics (P2S) transfer in the human brain: a local sequential transformation within specific neuronal populations, and a global cross-regional hierarchy of P2S representations. Only brain–model alignment in the local sequential manner predicted model performance. Correspondingly, targeted lesioning of local sequential P2S-transfer model units markedly impaired comprehension performance, while activation steering of these units improved performance. In addition, such local sequential P2S-transfer model units were identified across languages. Together, this study establishes local sequential P2S transformation as a fundamental computational principle shared across biological and artificial intelligence, offering a mechanistic bridge for future brain-inspired speech systems.
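The two interventions named in the abstract, lesioning and activation steering of selected model units, can be illustrated with a toy sketch. This is not the authors' procedure: the abstract does not say how units were selected, which layers were targeted, or how steering was applied. The function names, the unit indices, and the steering strength below are all hypothetical, and a single list of numbers stands in for one layer's hidden activations.

```python
from typing import List, Sequence


def lesion_units(activations: Sequence[float], unit_ids: Sequence[int]) -> List[float]:
    """Return a copy of the activations with the selected units silenced (set to zero)."""
    lesioned = list(activations)
    for i in unit_ids:
        lesioned[i] = 0.0
    return lesioned


def steer_units(
    activations: Sequence[float], unit_ids: Sequence[int], strength: float
) -> List[float]:
    """Return a copy of the activations with a constant offset added to the selected units."""
    steered = list(activations)
    for i in unit_ids:
        steered[i] += strength
    return steered


# Toy hidden state for one layer (illustrative values only).
hidden = [0.5, -1.0, 2.0, 0.25]

# Hypothetical indices of units flagged as "local sequential P2S-transfer" units.
p2s_units = [1, 2]

lesioned = lesion_units(hidden, p2s_units)      # selected units zeroed
steered = steer_units(hidden, p2s_units, 0.5)   # selected units shifted by +0.5
```

In a real speech language model, the same idea would typically be applied to intermediate activations during the forward pass, with comprehension performance measured before and after the intervention.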

Version published to bioRxiv on May 15, 2026 (DOI: 10.64898/2026.05.13.723203)

Related articles

Bridging the neural synchronization to linguistic structures and natural speech comprehension
Jordi Martorell, Giovanni M. Di Liberto, Nicola Molinaro, Lars Meyer
Latest version Mar 25, 2026

Premotor cortex uses a compositional neural geometry to plan words
Benyamin Abramovich Krasa, Erin M. Kunz, Foram Kamdar, Donald Avansino, Nick Hahn, Akansha Singh, Nicholas S. Card, Maitreyee Wairagkar, Carrina Iacobacci, Leigh R. Hochberg, David M. Brandman, Sergey D. Stavisky, Jaimie M. Henderson, Francis R. Willett, Shaul Druckmann
Latest version Apr 30, 2026

Beyond next-word prediction: hierarchical linguistic composition modulates LLM-brain alignment in time
Junyuan Zhao, Jonathan R. Brennan
Latest version May 16, 2026
