DHAFGan: A Dense Hybrid Attention Fusion Generative Adversarial Network for Infrared and Visible Image Fusion
Abstract
To address the shortcomings of current infrared and visible image fusion algorithms, including insufficient perception of typical features, poor visual quality of the fused results, and underutilization of important secondary information, this paper proposes an infrared and visible image fusion algorithm based on shallow-deep feature extraction and dual-channel hybrid attention. First, a shallow-deep feature extraction module is constructed. It uses shallow convolutional layers and deep multi-scale receptive-field units to extract surface-level features and deep semantic information from the source images, respectively, achieving multi-level multimodal feature extraction. Second, a Dual-Channel Hybrid Attention Fusion Module (DCAFM) is constructed, in which spatial attention focuses on the salient regions of the image and channel attention strengthens informative feature channels, enhancing the fusion of multimodal features. Finally, primary and secondary feature loss functions are formulated to constrain both the generator and the discriminator, facilitating the extraction of latent secondary feature information from the source images. Experimental results on the DroneVehicle dataset demonstrate that the proposed algorithm achieves superior performance in both subjective visual evaluation and objective metrics. Quantitative evaluations show that our method outperforms seven state-of-the-art approaches, achieving the highest scores in standard deviation (SD = 9.3541), mutual information (MI = 2.4321), and peak signal-to-noise ratio (PSNR = 65.7852), while ranking second in average gradient (AG = 3.9854). The fused images generated by our method not only conform to human visual perception characteristics but also retain rich detail, effectively preserving both dominant and subtle features from the source modalities.
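To make the fusion step concrete, the following is a minimal PyTorch sketch of a dual-channel hybrid attention fusion module in the spirit of the DCAFM described above, combining channel attention and spatial attention over concatenated infrared and visible features. The abstract does not specify layer sizes or the exact attention formulation, so the class names, channel counts, reduction ratio, and kernel size below are illustrative assumptions rather than the authors' implementation.

```python
# Hedged sketch of a DCAFM-style fusion block (assumed CBAM-like design).
import torch
import torch.nn as nn


class ChannelAttention(nn.Module):
    """Reweights feature channels using pooled global statistics."""

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        avg = self.mlp(x.mean(dim=(2, 3)))   # global average-pooling branch
        mx = self.mlp(x.amax(dim=(2, 3)))    # global max-pooling branch
        weights = torch.sigmoid(avg + mx).view(b, c, 1, 1)
        return x * weights


class SpatialAttention(nn.Module):
    """Highlights salient spatial regions with a single-channel attention map."""

    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = x.mean(dim=1, keepdim=True)    # per-pixel mean over channels
        mx = x.amax(dim=1, keepdim=True)     # per-pixel max over channels
        attn = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
        return x * attn


class DCAFM(nn.Module):
    """Fuses infrared and visible feature maps with hybrid channel/spatial attention."""

    def __init__(self, channels: int = 64):
        super().__init__()
        self.channel_attn = ChannelAttention(channels * 2)
        self.spatial_attn = SpatialAttention()
        self.fuse = nn.Conv2d(channels * 2, channels, kernel_size=1)

    def forward(self, feat_ir: torch.Tensor, feat_vis: torch.Tensor) -> torch.Tensor:
        x = torch.cat([feat_ir, feat_vis], dim=1)  # stack modality features
        x = self.channel_attn(x)                   # strengthen informative channels
        x = self.spatial_attn(x)                   # focus on salient regions
        return self.fuse(x)                        # project back to the feature width


if __name__ == "__main__":
    ir = torch.randn(1, 64, 128, 128)
    vis = torch.randn(1, 64, 128, 128)
    fused = DCAFM(channels=64)(ir, vis)
    print(fused.shape)  # torch.Size([1, 64, 128, 128])
```

In this sketch the channel branch reweights concatenated modality features and the spatial branch then emphasizes salient regions, matching the two roles the abstract assigns to channel and spatial attention; how the paper actually orders or combines the two branches is not stated and would need to be checked against the full text.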