Deeply Supervised Self-Attention Learning Model for Person Re-Identification

Abstract

Person Re-Identification (Re-ID) involves matching pedestrian images across camera networks with non-overlapping views. The task is challenging due to variations in illumination, viewpoint, background clutter, and occlusion, which often degrade model performance and consistency. Addressing these issues requires features that are robust to such variations. We introduce a deep self-attention module for Re-ID, designed to learn and fuse spatial and cross-channel relationships throughout the feature extraction process. By decomposing attention into channel and spatial dimensions, our approach enhances the robustness of the learned features. The module uses global feature aggregation and normalization, yielding more discriminative and complementary features. We formulate Re-ID as a fine-grained classification problem, optimizing the model with a multi-class cross-entropy loss augmented by losses from deeply supervised intermediate layers, partitioned regions, and the final classifier. Extensive experiments on the DukeMTMC-ReID and Market-1501 datasets show that our model surpasses the baseline, with Rank-1 accuracy improvements of 6.6% and 6.2%, respectively. It also delivers competitive, and in some cases superior, performance relative to current state-of-the-art methods. (Code for this work is available at https://github.com/bmiftah/DSP-Person-ReID.)
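The abstract does not spell out the module's internals; the sketch below is a minimal PyTorch illustration of one plausible reading, in which attention is decomposed into a channel branch and a spatial branch, each driven by global aggregation (average and max pooling), and the deep-supervision objective sums cross-entropy terms from intermediate, part-region, and final classifiers. The reduction ratio, kernel size, residual fusion, and equal loss weighting are all assumptions for illustration, not details taken from the paper.

```python
# Illustrative sketch only: layer shapes, reduction ratio, and loss weights
# are assumptions, not the authors' reported configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelAttention(nn.Module):
    """Aggregate global spatial context into per-channel weights."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        avg = self.mlp(F.adaptive_avg_pool2d(x, 1).view(b, c))
        mx = self.mlp(F.adaptive_max_pool2d(x, 1).view(b, c))
        w = torch.sigmoid(avg + mx).view(b, c, 1, 1)
        return x * w

class SpatialAttention(nn.Module):
    """Aggregate across channels to weight spatial locations."""
    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = x.mean(dim=1, keepdim=True)          # (b, 1, h, w)
        mx, _ = x.max(dim=1, keepdim=True)         # (b, 1, h, w)
        w = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * w

class AttentionBlock(nn.Module):
    """Channel attention followed by spatial attention, fused residually."""
    def __init__(self, channels: int):
        super().__init__()
        self.ca = ChannelAttention(channels)
        self.sa = SpatialAttention()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.sa(self.ca(x))

def deeply_supervised_loss(intermediate_logits, part_logits, final_logits, labels):
    """Sum cross-entropy over intermediate-layer, part-region, and final
    classifiers. Equal weighting of the terms is an assumption."""
    loss = F.cross_entropy(final_logits, labels)
    for logits in list(intermediate_logits) + list(part_logits):
        loss = loss + F.cross_entropy(logits, labels)
    return loss

# Example: apply the block to a hypothetical mid-level Re-ID feature map.
block = AttentionBlock(channels=256)
feat = block(torch.randn(8, 256, 24, 12))
```

In this reading, the block can be dropped after each backbone stage so that spatial and cross-channel relationships are refined throughout feature extraction, while the auxiliary classifiers provide the deep supervision described above.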
