Determinants of visual ambiguity resolution
Abstract
Visual inputs during natural perception are highly ambiguous: objects are frequently occluded, lighting conditions vary, and object identification depends significantly on prior experience. Yet why do certain images remain unidentifiable while others are recognized immediately, and which visual features drive subjective clarification? To address these questions, we developed a dataset of 1,854 ambiguous images and collected more than 100,000 participant ratings of their identifiability before and after participants saw undistorted versions of the images. Relating the representations that a brain-inspired neural network model forms in response to our images to human ratings, we show that subjective identification depends largely on the extent to which higher-level visual features from the original images are preserved in their ambiguous counterparts. Notably, the predominance of higher-level features over lower-level ones weakens after participants disambiguate the images. In line with these results, an image-level regression analysis showed that the subjective identification of ambiguous images was best explained by high-level visual dimensions. Moreover, we found that ambiguity resolution was accompanied by a notable decrease in semantic distance and greater consistency in object naming among participants. However, the relationship between information gained after disambiguation and subjective identification was non-linear, indicating that acquiring more information does not necessarily enhance subjective clarity. Instead, we observed a U-shaped relationship, suggesting that subjective identification improves when the acquired information either strongly matches or strongly mismatches prior predictions. Collectively, these findings highlight fundamental principles underlying the mapping between human visual perception and memory, advancing our understanding of how we resolve ambiguity and extract meaning from incomplete visual information.
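To make the notion of a U-shaped relationship concrete, the following is a minimal sketch of one common way to check for it: fit a quadratic by least squares and verify that the curve opens upward with its minimum inside the observed range. This is not the analysis used in the study. The variable names `info_gain` and `rating_change`, the helper `is_u_shaped`, and the synthetic data are all hypothetical and serve only as illustration.

```python
import numpy as np


def fit_quadratic(x, y):
    """Least-squares fit of y = a*x**2 + b*x + c; returns (a, b, c)."""
    a, b, c = np.polyfit(x, y, deg=2)
    return a, b, c


def is_u_shaped(x, y):
    """Heuristic check: the fitted parabola opens upward (a > 0)
    and its vertex lies strictly inside the observed range of x."""
    a, b, _ = fit_quadratic(x, y)
    if a <= 0:
        return False
    vertex = -b / (2 * a)
    return bool(x.min() < vertex < x.max())


# Synthetic illustration only (assumed data, not results from the study):
# the change in rating is largest at both extremes of information gain.
rng = np.random.default_rng(0)
info_gain = rng.uniform(-1.0, 1.0, size=500)
rating_change = 2.0 * info_gain**2 + rng.normal(0.0, 0.1, size=500)

print(is_u_shaped(info_gain, rating_change))
```

A positive quadratic coefficient alone is a weak criterion, since a monotone but convex trend can also produce one. Requiring the vertex to fall inside the data range guards against the most obvious such case, though a stricter test would estimate the slopes on either side of the minimum separately.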