The psychophysics of compositionality: Relational scene perception occurs in a canonical order

Zekun Sun
Chaz Firestone
Alon Hafri

Abstract

We see not only objects and their features (e.g., glass vases or wooden tables) but also relations between them (e.g., a vase on a table). An emerging view accounts for such relational representations by positing that visual perception is compositional: Much like language, where words combine to form phrases and sentences, many visual representations contain discrete constituents that combine systematically. This perspective raises a fundamental question: What principles guide the composition of relational representations, and how are they built over time? Here, we tested the hypothesis that the mind constructs relational representations in a canonical order. Inspired by a distinction from cognitive linguistics, we predicted that 'reference' objects (typically large, stable, and able to physically control other objects; e.g., tables) take precedence over 'figure' objects (e.g., vases) during scene composition. In Experiment 1, participants who arranged items to match linguistic descriptions (e.g., "The vase is on the table", "The table is supporting the vase") consistently placed reference objects first (e.g., table, then vase). Experiments 2–5 extended these findings to visual recognition itself: participants were faster to verify scene descriptions when reference objects appeared before figure objects in a scene, rather than vice versa. This Reference-first advantage emerged rapidly (within 100 ms), persisted in a purely visual task, and reflected abstract principles (e.g., physical forces) beyond simple differences in size or shape. Our findings reveal psychophysical principles underlying compositionality in visual processing: the mind builds representations of object relations sequentially, guided by the objects' roles in those relations.

Version published to 10.31234/osf.io/97z4n_v3 on OSF Preprints: Oct 7, 2025
Version published to 10.31234/osf.io/97z4n_v2 on OSF Preprints: Sep 20, 2025
Version published to 10.31234/osf.io/97z4n_v1 on OSF Preprints: Apr 23, 2025

Related articles

How cognitive salience and cue frequency shape grammar: evidence from animacy

This article has 4 authors:
1. Francesca Franzon
2. Valentina N. Pescuma
3. Alessia Zampieri
4. Davide Crepaldi
This article has no evaluations. Latest version: Jan 23, 2026

Surprise isn’t symmetrical: Adults’ looking suggests non-perceptual considerations during dishabituation

This article has 5 authors:
1. Qiong Cao
2. Anjie Cao
3. Gal Raz
4. Joshua Tenenbaum
5. Shari Liu
This article has no evaluations. Latest version: Jan 23, 2026
