SCFI-ESeg: Enhancing Semantic Segmentation with Spatial and Content Feature Integration

Abstract

In recent years, the development of deep learning has driven significant progress in semantic segmentation. However, existing models still struggle to exploit spatial information and to capture multi-level, multi-scale information. To address these issues, this paper proposes a novel semantic segmentation model, SCFI-ESeg, built on the Segmenter framework. By integrating spatial information and content features, SCFI-ESeg significantly improves segmentation accuracy. The model introduces a Spatial Feature Enhancement Module (SFEM), a Multi-Stage Attention Module (MStA), and Dense Continuous Atrous Spatial Pyramid Pooling (DCASPP), which together strengthen its ability to express spatial and semantic information. SFEM leverages the encoder's query features to specifically enhance spatial information, improving the model's perception of image details. MStA strengthens the interaction between high-level and low-level features through multi-stage feature fusion, effectively integrating features at different levels. DCASPP extracts features under varying receptive fields and merges them with pooling results, improving the network's understanding of multi-scale information. Experimental results show that SCFI-ESeg performs well on the public ADE20K and Pascal Context datasets, particularly in complex scenes. Across experimental variants, it achieves an average improvement of 1.6% over the baseline model on ADE20K, and a 2.5% improvement with the ViT-tiny configuration, while maintaining a low computational cost and parameter count.
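The abstract describes DCASPP only at a high level: atrous convolutions over varying receptive fields whose outputs are merged with pooling results. As a rough illustration of that idea, the PyTorch sketch below combines densely connected dilated convolutions with an image-level pooling branch, in the spirit of DenseASPP-style designs. The class name `DCASPPSketch`, the dilation rates, and the channel widths are assumptions for illustration, not the paper's actual configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DCASPPSketch(nn.Module):
    """Illustrative dense atrous pyramid with a pooled branch.

    Dilation rates, channel widths, and fusion order are placeholders;
    the abstract does not specify SCFI-ESeg's exact design.
    """

    def __init__(self, in_ch=256, mid_ch=64, out_ch=256, rates=(3, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList()
        ch = in_ch
        for r in rates:
            # Each atrous branch sees the input plus all previous branch
            # outputs (dense connectivity), progressively enlarging the
            # effective receptive field.
            self.branches.append(nn.Sequential(
                nn.Conv2d(ch, mid_ch, 3, padding=r, dilation=r, bias=False),
                nn.BatchNorm2d(mid_ch),
                nn.ReLU(inplace=True),
            ))
            ch += mid_ch
        # Image-level pooling branch, merged with the atrous features.
        self.pool = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_ch, mid_ch, 1, bias=False),
            nn.ReLU(inplace=True),
        )
        self.project = nn.Sequential(
            nn.Conv2d(ch + mid_ch, out_ch, 1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        feats = [x]
        for branch in self.branches:
            feats.append(branch(torch.cat(feats, dim=1)))
        pooled = F.interpolate(self.pool(x), size=x.shape[-2:],
                               mode="bilinear", align_corners=False)
        return self.project(torch.cat(feats + [pooled], dim=1))

# Example: a 256-channel feature map keeps its spatial size and channel count.
# y = DCASPPSketch(256)(torch.randn(2, 256, 32, 32))  # -> (2, 256, 32, 32)
```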
