Multilingual Computational Models Capture a Shared Meaning Component in Brain Responses across 21 Languages
Abstract
At the heart of language neuroscience lies a fundamental question: How does the brain process the rich variety of languages? Multilingual neural network models offer a way to address this question because they represent linguistic content across languages in a shared space. Leveraging these advances, we evaluated the similarity of linguistic representations in speakers of 21 languages. We combined existing fMRI data (12 languages across 4 language families) with newly collected data (9 languages across 4 families) to test encoding models that predict brain activity in the language network from multilingual model representations. These representations reliably predicted brain responses within each language. Critically, the encoding models transferred zero-shot across languages: a model trained to predict brain activity in one set of languages also accounted for responses in a held-out language. These results imply a shared cross-lingual component, which appears to be related to a shared meaning space.
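To make the encoding-model logic concrete, here is a minimal sketch of zero-shot cross-lingual transfer, not the authors' actual pipeline. It uses scikit-learn ridge regression; the per-language "embeddings" and "fMRI responses" are random stand-ins (with a shared linear mapping injected so transfer can succeed), and the language codes, array sizes, and helper function are hypothetical. In practice the features would come from a multilingual language model and the targets from language-network responses.

```python
# Minimal sketch of cross-lingual encoding-model transfer (illustrative only;
# feature and response arrays are random stand-ins, not real data).
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)

n_stimuli, n_features, n_voxels = 200, 768, 50  # hypothetical sizes
languages = ["en", "fr", "zh", "sw"]            # hypothetical language set

# Stand-in data: a shared feature-to-response mapping W_shared injects a
# common cross-lingual component into every language's simulated responses.
W_shared = rng.normal(size=(n_features, n_voxels))
data = {}
for lang in languages:
    X = rng.normal(size=(n_stimuli, n_features))          # model embeddings
    Y = X @ W_shared + rng.normal(scale=5.0, size=(n_stimuli, n_voxels))
    data[lang] = (X, Y)

def voxelwise_corr(y_true, y_pred):
    """Mean Pearson correlation between observed and predicted responses."""
    yt = y_true - y_true.mean(axis=0)
    yp = y_pred - y_pred.mean(axis=0)
    r = (yt * yp).sum(axis=0) / (
        np.linalg.norm(yt, axis=0) * np.linalg.norm(yp, axis=0)
    )
    return r.mean()

# Zero-shot transfer: fit on all languages except one, test on the held-out one.
for held_out in languages:
    X_train = np.vstack([data[l][0] for l in languages if l != held_out])
    Y_train = np.vstack([data[l][1] for l in languages if l != held_out])
    model = Ridge(alpha=1.0).fit(X_train, Y_train)
    X_test, Y_test = data[held_out]
    score = voxelwise_corr(Y_test, model.predict(X_test))
    print(f"held-out {held_out}: mean r = {score:.3f}")
```

Because the simulated responses share a common mapping from the feature space, the model fitted on the training languages generalizes to the held-out one; this mirrors the abstract's claim that a shared cross-lingual component supports zero-shot prediction.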