LAFNet: Lightweight attention fusion network for real-time semantic segmentation
Abstract
Real-time semantic segmentation has become an important research topic in computer vision in recent years and is a key technology for scene understanding. However, existing models struggle to meet practical industrial requirements because of their large computational overhead and redundant parameters. In this paper, we propose a lightweight attention fusion network (LAFNet) to alleviate the current challenges in semantic segmentation. First, we propose a deep feature extraction (DFE) module for the encoder, which captures contextual information across multiple stages of the input image and provides a sufficient receptive field. Next, to better recover the original resolution in the decoder, we propose a shallow detail extraction (SDE) module that preserves more spatial detail. In addition, a feature information fusion (FIF) module is designed to fuse semantic and spatial information more effectively. Finally, in the decoder stage, a multiscale information fusion (MIF) module is designed to reuse feature information from different stages and to fuse those features more effectively. Experimental results show that LAFNet achieves 71.5% mIoU on the Cityscapes dataset and 67% mIoU on the CamVid dataset with only 0.56M parameters, without any pre-training or pre-processing. Compared with most existing state-of-the-art models, LAFNet uses fewer parameters while maintaining high segmentation accuracy.
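The abstract does not specify the internals of the fusion modules. As a hedged illustration of the general idea behind attention-based fusion of semantic and spatial branches (the names, shapes, and gating scheme below are assumptions for exposition, not the paper's actual FIF design), a minimal NumPy sketch might look like this:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def attention_fuse(semantic, spatial):
    """Hypothetical channel-attention fusion sketch.

    semantic, spatial: arrays of shape (C, H, W) at the same resolution.
    A per-channel descriptor from the semantic branch gates the spatial
    branch before the two are summed.
    """
    assert semantic.shape == spatial.shape
    # Global average pooling over spatial dimensions -> (C,) descriptor.
    desc = semantic.mean(axis=(1, 2))
    # Sigmoid gate, broadcast back to (C, 1, 1) for elementwise scaling.
    gate = sigmoid(desc)[:, None, None]
    return semantic + gate * spatial

# Usage: fuse two feature maps with 8 channels at 16x16 resolution.
sem = np.random.randn(8, 16, 16)
spa = np.random.randn(8, 16, 16)
fused = attention_fuse(sem, spa)  # shape (8, 16, 16)
```

In a real network the descriptor would typically pass through learned layers before gating; the sketch only shows the gating-and-sum structure common to attention fusion blocks.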