DTULC Dataset and CDGANet: Advancing Urban Land Cover Segmentation with High-Resolution Satellite Imagery

Abstract

Semantic segmentation of high-resolution remote sensing imagery remains challenging due to the coexistence of multi-scale objects, complex spatial contexts, and ambiguous boundaries. While existing convolutional and transformer-based methods have made strides in natural-scene understanding, they often underperform on remote sensing data because of insufficient detail retention and inefficient multi-scale feature modeling. To address these limitations, we propose CDGANet, a novel architecture integrating a Cross-Layer Detail-Aware Module (CDM) and a Group Collaborative Attention Mechanism (GCAM). Built on a ConvNeXt backbone, CDGANet uses the CDM to fuse high-level semantics with low-level textures via self-attention, preserving boundary precision, while the GCAM processes multi-scale features in parallel groups to improve small-object discrimination. We also introduce DTULC, a high-resolution urban land cover benchmark derived from Gaofen-2 satellite imagery that captures the diverse landscapes of Datong City, Shanxi Province. Experiments show that CDGANet outperforms U-Net, PSPNet, and Swin Transformer, achieving state-of-the-art results on DTULC with 74.23% mPA, 58.91% mIoU, and a 72.24% F1-score. Ablation studies confirm that the GCAM and CDM together improve mIoU by 7.07% over the baseline. This work advances fine-grained land cover analysis and offers practical value for ecological monitoring and sustainable urban planning.
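For readers who want a concrete picture of the two modules, the sketch below is a minimal PyTorch interpretation of the ideas the abstract describes: cross-attention in which upsampled high-level semantics query low-level texture detail (CDM), and channel groups processed in parallel at different receptive-field scales with gated recombination (GCAM). All module names, tensor shapes, head counts, and dilation rates here are our assumptions inferred from the abstract; the authors' actual implementation may differ.

```python
# Hypothetical sketch of the CDM / GCAM ideas; not the authors' released code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CrossLayerDetailModule(nn.Module):
    """Fuses high-level semantics with low-level textures via cross-attention.

    The low-level map supplies keys/values (texture detail); the upsampled
    high-level map supplies queries (semantics). Head count is an assumption.
    """

    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, low: torch.Tensor, high: torch.Tensor) -> torch.Tensor:
        # Upsample high-level features to the low-level spatial resolution.
        high = F.interpolate(high, size=low.shape[-2:], mode="bilinear",
                             align_corners=False)
        b, c, h, w = low.shape
        q = high.flatten(2).transpose(1, 2)   # (B, HW, C) queries: semantics
        kv = low.flatten(2).transpose(1, 2)   # (B, HW, C) keys/values: detail
        fused, _ = self.attn(q, kv, kv)
        fused = self.norm(fused + q)          # residual connection + norm
        return fused.transpose(1, 2).reshape(b, c, h, w)


class GroupCollaborativeAttention(nn.Module):
    """Splits channels into parallel groups, each seeing a different
    receptive-field scale (here: depthwise convs with growing dilation),
    then recombines with a gated residual. Group count and dilation
    rates are illustrative assumptions.
    """

    def __init__(self, channels: int, groups: int = 4):
        super().__init__()
        assert channels % groups == 0
        gc = channels // groups
        self.groups = groups
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(gc, gc, 3, padding=2 ** i, dilation=2 ** i, groups=gc),
                nn.BatchNorm2d(gc),
                nn.GELU(),
            )
            for i in range(groups)
        )
        # Lightweight channel attention over the recombined groups.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        chunks = x.chunk(self.groups, dim=1)
        out = torch.cat([b(c) for b, c in zip(self.branches, chunks)], dim=1)
        return out * self.gate(out) + x       # gated residual fusion


if __name__ == "__main__":
    low = torch.randn(2, 96, 64, 64)    # early-stage ConvNeXt-like features
    high = torch.randn(2, 96, 16, 16)   # late-stage features, same channels
    print(CrossLayerDetailModule(96)(low, high).shape)   # (2, 96, 64, 64)
    print(GroupCollaborativeAttention(96)(low).shape)    # (2, 96, 64, 64)
```

The dilated depthwise branches merely stand in for whatever per-group operator the paper uses; the point the abstract makes is that each group attends at a different scale in parallel before fusion, which is what this sketch reproduces.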
