Distinct contributions of memorability and object recognition to the representational goals of the macaque inferior temporal cortex
Abstract
The primate inferior temporal (IT) cortex, at the apex of the ventral visual stream, encodes information that supports diverse representational goals, from recognizing objects to determining which images are likely to be remembered. Specific artificial neural networks (ANNs), which currently serve as the leading computational hypotheses of ventral stream processing, are typically trained exclusively for object recognition. We asked whether incorporating image memorability as an additional optimization objective could improve ANN–brain alignment. Models optimized for memorability explained additional, non-overlapping variance in IT responses beyond that captured by recognition-optimized networks, indicating that memorability and recognition rely on partly independent dimensions of IT representation. Notably, these models also exhibited fewer non–brain-like units, bringing their representational geometry closer to that of IT. Furthermore, networks jointly optimized for both objectives were more predictive of human memorability than memorability-only models, while maintaining their alignment with human object recognition performance patterns. Together, these findings suggest that IT encodes multiple representational goals and that models trained solely for recognition provide an incomplete account of ventral stream computation.
Significance
Brain regions often serve multiple representational goals, and identifying those goals is critical because they provide the key to building better encoding models of the system. The primate ventral visual stream has traditionally been understood as a pathway for object recognition, with the inferior temporal (IT) cortex regarded as its core substrate. However, IT responses also predict image memorability—a robust phenomenon whereby some images are consistently remembered better than others. Here we show that memorability constitutes a separable representational goal of IT. ANNs optimized for memorability explained neural variance not captured by recognition models, and the two objectives produced distinct representational geometries. Critically, models jointly optimized for both recognition and memorability provided the best match to IT responses, improved prediction of human memorability, and preserved recognition performance. These findings highlight memorability as an organizing principle of IT and demonstrate that multi-goal optimization yields more brain-like computational models of vision.
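The claim that memorability models explain variance "not captured by recognition models" can be illustrated with a nested-regression comparison. The sketch below is not the authors' pipeline: it assumes ordinary least squares, in-sample R², and synthetic data, and every name in it (`r_squared`, `unique_variance`, `recog_feats`, `memo_feats`) is hypothetical. A real analysis would typically use cross-validated predictions and recorded IT responses.

```python
import numpy as np

def r_squared(features, responses):
    """In-sample R^2 of an ordinary least-squares fit with an intercept."""
    design = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(design, responses, rcond=None)
    residual = responses - design @ coef
    centered = responses - responses.mean()
    return 1.0 - (residual @ residual) / (centered @ centered)

def unique_variance(recog_feats, memo_feats, responses):
    """Variance explained by memorability features beyond recognition features.

    Computed as R^2(recognition + memorability) - R^2(recognition only).
    """
    r2_recog = r_squared(recog_feats, responses)
    r2_joint = r_squared(np.column_stack([recog_feats, memo_feats]), responses)
    return r2_joint - r2_recog

# Synthetic stand-ins (assumption): features from two hypothetical models and
# one simulated neural site driven by both feature sets plus noise.
rng = np.random.default_rng(0)
n_images = 500
recog_feats = rng.normal(size=(n_images, 5))
memo_feats = rng.normal(size=(n_images, 5))
responses = (
    recog_feats[:, 0]
    + 0.5 * memo_feats[:, 0]
    + rng.normal(scale=0.5, size=n_images)
)

gain = unique_variance(recog_feats, memo_feats, responses)
print(f"Unique variance attributed to memorability features: {gain:.3f}")
```

A clearly positive gain for a site indicates that the second feature set carries predictive information the first lacks. Because in-sample R² can never decrease when predictors are added, held-out evaluation is needed before drawing that conclusion from real data.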