Decoupling lower-level and higher-level visual features in naturalistic scenes
Abstract
In the natural world, higher-level visual features (e.g., scenes and objects) systematically co-occur with specific lower-level features (e.g., textures and patterns). This natural covariation makes it difficult to isolate aspects of behavior or brain activity that are specific to one level of the feature hierarchy. For example, a particular brain or behavioral response may reflect recognition of a “jungle” scene, or it may reflect sensitivity to leaf textures and dense vertical lines that constitute jungle scenes. To address this challenge, we developed a novel approach for decoupling higher- and lower-level features in naturalistic images. We employed a Stable Diffusion image generation framework to create a naturalistic image set in which scene–object pairings and texture–pattern pairings are factorially combined, enabling independent manipulations of higher-level and lower-level image content. To validate our approach, we used state-of-the-art image processing models to confirm that higher-level features are represented similarly regardless of their lower-level constituent features and, conversely, that lower-level features are represented similarly regardless of the higher-level objects/scenes to which they contribute. We then confirmed that human participants can readily and rapidly categorize these features in our generated images. We present the image generation framework and introduce the SPOT (Scenes, Patterns, Objects, Textures) Grid, a publicly available stimulus set accompanied by image-specific metrics (upon request: https://forms.gle/X2HpzRYUGigFS1S27). This resource provides new opportunities for investigating hierarchical visual representations, feature binding, and abstraction across perception, behavior, and the brain.
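
As a rough illustration of what "factorially combined" means here, the sketch below crosses a set of higher-level labels with a set of lower-level labels so that every pairing occurs exactly once. It is not the authors' pipeline: apart from "jungle" and "leaf texture", which echo the example in the abstract, the labels are placeholders, and the prompt template and the `build_factorial_grid` helper are hypothetical names introduced only for this example.

```python
from itertools import product

# Placeholder labels for illustration only. The abstract does not list the
# actual SPOT Grid categories, so these are assumptions.
HIGHER_LEVEL = ["jungle scene", "higher-level B", "higher-level C"]
LOWER_LEVEL = ["leaf texture", "lower-level B"]


def build_factorial_grid(higher, lower):
    """Cross every higher-level label with every lower-level label.

    Each cell of the grid holds one (higher, lower) pairing and a
    hypothetical text prompt that an image generator could be given.
    """
    grid = []
    for h, l in product(higher, lower):
        grid.append(
            {
                "higher": h,
                "lower": l,
                # Hypothetical prompt wording, not taken from the paper.
                "prompt": f"{h} composed of {l}",
            }
        )
    return grid


grid = build_factorial_grid(HIGHER_LEVEL, LOWER_LEVEL)
for cell in grid:
    print(cell["prompt"])
```

Because each higher-level label appears once with every lower-level label, any response that tracks the higher-level label across a row of this grid cannot be attributed to a single lower-level feature, and the same holds in the other direction.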