The Cross-linguistic Coordination of Overt Attention and Speech Production as Evidence for a Language of Vision

Abstract

A central question in cognition is how representations are integrated across modalities such as language and vision. One prominent hypothesis posits an abstract, pre-linguistic “language of vision” that organises meaning compositionally, enabling cross-modal integration. This hypothesis predicts that the language of vision operates universally, independent of linguistic surface features such as word order. We conducted eye-tracking experiments in which participants described visual scenes in English, Portuguese, and Japanese. By analysing spoken descriptions alongside eye-movement sequences segmented into the planning and articulation phases of each utterance, we demonstrate that semantic similarity between sentences strongly predicts the similarity of the associated scan patterns in all three languages, even across scenes and for sentences in different languages. In contrast, the effect of syntactic similarity was secondary and transient: it was restricted to within-language and within-scene comparisons, and temporally confined to the early planning phase of the utterance. Our findings support interactive accounts in which a universal language of vision provides stable semantic scaffolding for cross-modal coordination, while syntax acts as a local constraint that operates primarily during the linearisation of the message.
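
The abstract does not spell out the analysis pipeline, but the core claim — that pairwise semantic similarity between sentences predicts pairwise similarity between scan patterns — is a representational-similarity-style comparison. The sketch below is a minimal, hedged illustration of one conventional way such a prediction could be tested; the toy embeddings, the Levenshtein-based scan-pattern similarity, and the Spearman correlation are all assumptions chosen for illustration, not the authors' method.

```python
# Hedged sketch: correlate sentence-similarity with scanpath-similarity.
# Assumptions (not from the paper): sentence meaning approximated by toy
# embedding vectors; scan patterns represented as sequences of fixated
# region labels; scanpath similarity as 1 - normalized edit distance;
# the semantics-to-scanpath link tested with a Spearman correlation.
import numpy as np
from scipy.stats import spearmanr

def cosine_sim(a, b):
    # cosine similarity between two sentence-embedding vectors
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def levenshtein(s, t):
    # classic dynamic-programming edit distance over region labels
    m, n = len(s), len(t)
    d = np.zeros((m + 1, n + 1), dtype=int)
    d[:, 0] = np.arange(m + 1)
    d[0, :] = np.arange(n + 1)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if s[i - 1] == t[j - 1] else 1
            d[i, j] = min(d[i - 1, j] + 1,      # deletion
                          d[i, j - 1] + 1,      # insertion
                          d[i - 1, j - 1] + cost)  # substitution
    return int(d[m, n])

def scanpath_sim(s, t):
    # 1.0 means identical fixation sequences, 0.0 maximally different
    return 1.0 - levenshtein(s, t) / max(len(s), len(t))

# Toy data: three trials, each pairing a sentence embedding with a scan
# pattern (sequence of fixated scene regions). Real data would come from
# an eye-tracker and a sentence-embedding model.
embeddings = [np.array([1.0, 0.2, 0.0]),
              np.array([0.9, 0.3, 0.1]),
              np.array([0.0, 0.1, 1.0])]
scanpaths = [["dog", "ball", "dog"],
             ["dog", "ball", "park"],
             ["tree", "sky", "tree"]]

# Build the two pairwise-similarity vectors (upper triangle only) and
# test whether semantic similarity tracks scanpath similarity.
n = len(embeddings)
pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
sem = [cosine_sim(embeddings[i], embeddings[j]) for i, j in pairs]
scan = [scanpath_sim(scanpaths[i], scanpaths[j]) for i, j in pairs]

rho, p = spearmanr(sem, scan)
print(f"Spearman rho = {rho:.2f} (p = {p:.3f})")
```

On this toy data the semantically close trial pair also shares most of its fixation sequence, so the correlation is positive; the same logic scales to real similarity matrices, where within-language versus cross-language and planning-phase versus articulation-phase comparisons can be run separately, as the abstract describes.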
