Method of Multi-Label Visual Emotion Recognition Fusing Fore-Background Features
Abstract
Visual multi-label emotion recognition often overlooks the influence on emotion of the background in which a person is placed and of foreground cues such as social interactions among individuals; it also tends to simplify multi-label recognition into multiple independent binary classification tasks, ignoring the global correlations among emotion labels. To address these issues, this paper proposes a multi-label visual emotion recognition method that fuses foreground and background features. The method consists of two components: a Fore-Background-Aware Emotion Recognition model (FB-ER) and a Multi-Label Emotion Recognition Classifier model (ML-ERC). FB-ER is a three-branch multi-feature hybrid fusion network: it extracts body features through a Core Region Unit (CR-Unit), represents background features as background keywords, and extracts depth-map information to model social interactions among individuals as foreground features. The three features are fused at both the feature level and the decision level. ML-ERC captures the relationships among emotion labels by constructing a label co-occurrence probability matrix and a cosine similarity matrix, and applies a Graph Convolutional Network (GCN) to learn the correlations among labels, thereby generating a classifier that accounts for emotion relevance. Finally, the visual features are combined with the classifier generated by ML-ERC to enable multi-label recognition of 26 emotions. Evaluated on the Emotic dataset, the proposed method shows clear improvements over state-of-the-art methods, increasing mAP by 0.732% and the Jaccard coefficient by 0.007.
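To make the ML-ERC idea concrete, the following is a minimal PyTorch-style sketch of a GCN-based label-correlation classifier of the kind described above: label embeddings are propagated over an adjacency matrix built from a co-occurrence probability matrix and a cosine similarity matrix, yielding one classifier weight vector per emotion label that is applied to the fused visual features. The layer sizes, the two-layer depth, the blending weight `alpha`, and all names are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GCNLabelClassifier(nn.Module):
    """GCN over emotion-label embeddings that outputs per-label classifier
    weights, applied to visual features via a dot product (26 labels)."""

    def __init__(self, num_labels=26, label_dim=300, hidden_dim=512, feat_dim=1024):
        super().__init__()
        # Two stacked graph-convolution layers: a linear transform whose
        # outputs are mixed across labels by the normalized adjacency matrix.
        self.gc1 = nn.Linear(label_dim, hidden_dim)
        self.gc2 = nn.Linear(hidden_dim, feat_dim)

    def forward(self, label_embed, adj, visual_feat):
        # label_embed: (26, label_dim) word embeddings of the emotion labels
        # adj:         (26, 26) normalized adjacency blending the label
        #              co-occurrence probability and cosine similarity matrices
        # visual_feat: (batch, feat_dim) fused fore/background visual features
        h = F.relu(adj @ self.gc1(label_embed))   # (26, hidden_dim)
        classifier = adj @ self.gc2(h)            # (26, feat_dim): one weight vector per label
        logits = visual_feat @ classifier.t()     # (batch, 26) multi-label scores
        return logits


def build_adjacency(cooccur_prob, cosine_sim, alpha=0.5):
    """Blend the two label-relation matrices and symmetrically normalize:
    D^{-1/2} (A + I) D^{-1/2}. The blending weight alpha is an assumption."""
    a = alpha * cooccur_prob + (1 - alpha) * cosine_sim
    a = a + torch.eye(a.size(0))                  # add self-loops
    d_inv_sqrt = torch.diag(a.sum(dim=1).pow(-0.5))
    return d_inv_sqrt @ a @ d_inv_sqrt


# Usage sketch with random stand-ins for the real label statistics and features.
num_labels, batch = 26, 4
adj = build_adjacency(torch.rand(num_labels, num_labels),
                      torch.rand(num_labels, num_labels))
model = GCNLabelClassifier(num_labels=num_labels)
logits = model(torch.randn(num_labels, 300), adj, torch.randn(batch, 1024))
probs = torch.sigmoid(logits)                     # independent per-label probabilities
```

In this sketch the GCN output acts as the classifier weights themselves, so label correlations learned from the co-occurrence and similarity matrices are baked into the final decision rather than handled as separate binary classifiers.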