ML-RID: Mutual Learning Enhanced Visible-Infrared Person Re-identification

Abstract

In computer vision, Person Re-Identification (ReID) is essential for surveillance, matching images of individuals across different cameras and lighting conditions. Despite progress in visible-light scenarios using CNNs, existing methods struggle in low-light conditions, prompting the need for Visible-Infrared (VI) ReID. The challenge lies in bridging the semantic gap between the visible and infrared modalities, which arises from their distinct imaging properties. Existing methods are limited, focusing on complex feature extractors without fully utilizing intermediate representations to reduce cross-modality discrepancies. Infrared images lack color information, and differences in reflectivity between modalities hinder the extraction of robust features for identification. To alleviate these problems, we introduce ML-RID, which leverages Mutual Learning (ML) to enhance performance on the VI-ReID task. Specifically, we propose an Adaptive Feature Fusion Module (AFFM) that dynamically fuses visible and infrared features, compensating for semantic deficiencies and enhancing feature representation. A Feature Projection Module (FPM) then aligns predictions across modalities, while the Jensen-Shannon Divergence is used to enforce prediction consistency and reliability, encouraging each modality to learn from the other's strengths. A penalty term is proposed to maintain the diversity of the auxiliary modality, helping it transfer knowledge as a "teacher". Experiments on two datasets show that ML-RID outperforms current models, with visualizations verifying its effectiveness, marking a step forward in cross-modality Re-ID.
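
The abstract describes the AFFM and the Jensen-Shannon consistency objective only at a high level. As a rough illustration of these two ideas, the PyTorch sketch below shows a gated fusion of visible and infrared feature maps and a Jensen-Shannon divergence between the two modalities' class predictions. The class and function names, the gating design, and the tensor shapes are assumptions made for illustration, not the authors' ML-RID implementation.

```python
# Minimal sketch of (a) a gated visible-infrared feature fusion and (b) a
# Jensen-Shannon prediction-consistency loss. All names and design choices
# here are illustrative assumptions, not the paper's actual modules.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AdaptiveFeatureFusion(nn.Module):
    """Hypothetical AFFM-style gated fusion of visible and infrared feature maps."""

    def __init__(self, channels: int):
        super().__init__()
        # Predict a per-channel, per-location gate in [0, 1] from both modalities.
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, feat_vis: torch.Tensor, feat_ir: torch.Tensor) -> torch.Tensor:
        w = self.gate(torch.cat([feat_vis, feat_ir], dim=1))
        # Convex combination: the gate decides how much each modality contributes.
        return w * feat_vis + (1.0 - w) * feat_ir


def js_consistency(logits_vis: torch.Tensor, logits_ir: torch.Tensor,
                   eps: float = 1e-8) -> torch.Tensor:
    """Jensen-Shannon divergence between the two modalities' class predictions."""
    p = F.softmax(logits_vis, dim=1)
    q = F.softmax(logits_ir, dim=1)
    m = 0.5 * (p + q)
    log_m = m.clamp_min(eps).log()
    # F.kl_div(log_input, target) computes KL(target || input) given log-space input.
    kl_pm = F.kl_div(log_m, p, reduction="batchmean")
    kl_qm = F.kl_div(log_m, q, reduction="batchmean")
    return 0.5 * (kl_pm + kl_qm)


if __name__ == "__main__":
    # Toy shapes for illustration only.
    fuse = AdaptiveFeatureFusion(channels=256)
    fused = fuse(torch.randn(4, 256, 24, 12), torch.randn(4, 256, 24, 12))
    loss = js_consistency(torch.randn(4, 100), torch.randn(4, 100))
    print(fused.shape, loss.item())
```

Because the Jensen-Shannon divergence is symmetric and bounded, it is a common choice for encouraging agreement between two branches' predictions without letting either branch dominate the other.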
