A progressive multimodal image fusion algorithm
Abstract
Image fusion encounters significant challenges when dealing with diverse image types, particularly due to modality discrepancies and imbalances in information content. To address these issues, we introduce a novel three-branch progressive gated multi-scene image fusion framework (TBPGFusion) designed to efficiently integrate images from heterogeneous modalities. The proposed model consists of three primary components: a three-branch progressive feature extraction module, a feature fusion module, and a secondary gated fusion module. Within the feature extraction module, a downsampling progressive fusion branch is employed, and a Cross-modality Differential Aware Fusion (CMDAF) unit compensates for differential information using same-layer features extracted from the three downsampled branches. The feature fusion module then applies an improved Gated Fusion Module (IGFM) in conjunction with an L2-norm strategy to perform an initial fusion of the three-branch feature representations. The secondary gated fusion module further refines the result by integrating the differential-information compensation features with the preliminarily fused features, again using the IGFM. Progressive upsampling then reduces the number of channels and reconstructs the final fused image. Experimental results indicate that the TBPGFusion framework effectively mitigates the challenges posed by modality differences and imbalanced information richness, demonstrating superior performance across multiple evaluation metrics. Notably, the model excels at preserving structural similarity and fine details, thereby substantially improving the overall quality of the fused images.
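Since the abstract only outlines the architecture, the following is a minimal PyTorch-style sketch of how a CMDAF-style differential compensation unit and a simple gated fusion step could be wired together. All class names, channel sizes, and the exact gating formulation here are illustrative assumptions for two branches, not the authors' implementation of CMDAF or the IGFM.

```python
# Minimal sketch (assumptions, not the authors' code): a CMDAF-style
# differential compensation unit and a simple gated fusion step.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CMDAFSketch(nn.Module):
    """Compensates each branch with sigmoid-weighted differential features
    from the other branch (illustrative formulation)."""

    def forward(self, feat_a: torch.Tensor, feat_b: torch.Tensor):
        diff_ab = feat_a - feat_b  # information A carries that B lacks
        diff_ba = feat_b - feat_a  # information B carries that A lacks
        # Global-average-pooled sigmoid weights decide how much differential
        # information to inject back into the opposite branch.
        w_ab = torch.sigmoid(F.adaptive_avg_pool2d(diff_ab, 1))
        w_ba = torch.sigmoid(F.adaptive_avg_pool2d(diff_ba, 1))
        return feat_a + w_ba * diff_ba, feat_b + w_ab * diff_ab


class GatedFusionSketch(nn.Module):
    """A basic gated fusion: a learned sigmoid gate blends two feature maps.
    The paper's IGFM is presumably more elaborate than this placeholder."""

    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1)

    def forward(self, feat_a: torch.Tensor, feat_b: torch.Tensor):
        g = torch.sigmoid(self.gate(torch.cat([feat_a, feat_b], dim=1)))
        return g * feat_a + (1.0 - g) * feat_b


if __name__ == "__main__":
    # Toy check: two 64-channel feature maps stand in for two modalities.
    a = torch.randn(1, 64, 32, 32)
    b = torch.randn(1, 64, 32, 32)
    a_c, b_c = CMDAFSketch()(a, b)
    fused = GatedFusionSketch(64)(a_c, b_c)
    print(fused.shape)  # torch.Size([1, 64, 32, 32])
```

In the full framework described above, such compensation and gating steps would be applied per scale across the three downsampled branches before the secondary gated fusion and progressive upsampling; this sketch only illustrates the core compensate-then-gate idea for a single pair of feature maps.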