Biologically Inspired Deep Neural Network Models for Visual Emotion Processing
Abstract
The perception of opportunities and threats in complex visual scenes is one of the main functions of the human visual system. The underlying neurophysiology is often studied by having observers view pictures varying in affective content. While deep neural networks (DNNs) have shown promise in modeling visual object recognition, their capacity to model visual affective processing remains poorly understood. In this study, we propose a biologically inspired deep neural network, the Visual Cortex Amygdala (VCA) model, which integrates a vision transformer with an amygdala-mimetic module designed to reflect both the anatomical hierarchy and the self-attention-based computational mechanisms of the human brain. We evaluated the model along three dimensions: (1) predictive accuracy for emotional valence and arousal, (2) representational alignment with human amygdala activity, and (3) neural mechanisms of emotion representation. Our results showed that (1) the model accurately predicted human emotion ratings on 1,182 images from the International Affective Picture System (IAPS) dataset (valence: r ≈ 0.9; arousal: r ≈ 0.7), (2) the model's internal representations aligned with functional magnetic resonance imaging (fMRI) data from the human amygdala, and (3) at the single-neuron level, individual units in the amygdala module developed emotion selectivity, while at the population level, the representational geometry became progressively more aligned with psychological models of emotion. We additionally explored (1) the effect of visual encoding and (2) the effects of architecture and computational mechanisms on emotional assessment.
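The abstract does not specify implementation details, but the architecture it describes has a natural minimal form: a vision-transformer-style encoder whose token sequence feeds a self-attention "amygdala" module, read out by separate valence and arousal regression heads. The PyTorch sketch below illustrates that pattern; the class names (VCASketch, AmygdalaModule), layer sizes, patch embedding, and two-head readout are illustrative assumptions, not the authors' implementation, and a real model would use a pretrained vision transformer rather than the stand-in encoder shown here.

```python
import torch
import torch.nn as nn


class AmygdalaModule(nn.Module):
    """Hypothetical amygdala-mimetic block: self-attention over the
    encoder's token sequence, followed by normalization and pooling."""

    def __init__(self, dim: int = 768, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, num_patches, dim) from the vision transformer
        attended, _ = self.attn(tokens, tokens, tokens)
        return self.norm(attended).mean(dim=1)  # pool over patches


class VCASketch(nn.Module):
    """Toy stand-in for a VCA-style model: patch encoder -> amygdala
    module -> separate valence and arousal regression heads."""

    def __init__(self, dim: int = 768):
        super().__init__()
        # Stand-in patch embedding; the paper's model would use a
        # pretrained vision transformer as the visual-cortex stage.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, dim, kernel_size=16, stride=16),  # 16x16 patches
            nn.Flatten(2),  # (batch, dim, num_patches)
        )
        self.amygdala = AmygdalaModule(dim)
        self.valence_head = nn.Linear(dim, 1)
        self.arousal_head = nn.Linear(dim, 1)

    def forward(self, images: torch.Tensor):
        tokens = self.encoder(images).transpose(1, 2)  # (batch, patches, dim)
        features = self.amygdala(tokens)
        return self.valence_head(features), self.arousal_head(features)


model = VCASketch()
valence, arousal = model(torch.randn(2, 3, 224, 224))
print(valence.shape, arousal.shape)  # torch.Size([2, 1]) torch.Size([2, 1])
```

Under this setup, the abstract's reported accuracies would correspond to Pearson correlations between the two heads' outputs and the IAPS human ratings, computed over held-out images.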